Wednesday, March 02, 2016 3:00 PM - 4:00 PM
Main Campus - Engineering Classroom Wing - 257: Newton Lab
Alex Gittens; International Computer Science Institute; University of California, Berkeley
Why (some) nonlinear embeddings capture compositionality linearly

Dimensionality reduction methods have been used to represent words with vectors in NLP applications since at least the introduction of latent semantic indexing in the late 1980s, but word embeddings developed in the past several years have exhibited a robust ability to map semantics in a surprisingly straightforward manner onto simple linear algebraic operations. These embeddings are trained on cooccurrence statistics and intuitively justified by appealing to the distributional hypothesis of Harris and Firth, but are typically presented in an ad hoc algorithmic manner. We consider the canonical skip-gram Word2vec embedding, one of the best-known of these recent word embeddings, and establish a corresponding generative model that maps the composition of words onto the addition of their vectors.
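For readers unfamiliar with the setup, the following is a minimal sketch of the additive-composition property the talk examines, assuming a gensim-style skip-gram implementation; the toy corpus, hyperparameters, and example phrase are illustrative placeholders, not material from the talk.

```python
# Illustrative sketch: train a skip-gram Word2vec model and compose a phrase
# by adding the vectors of its constituent words. Corpus and settings are
# assumptions chosen only to make the example self-contained and runnable.
from gensim.models import Word2Vec

# Tiny placeholder corpus; real embeddings are trained on large text corpora
# whose cooccurrence statistics drive the learned geometry.
sentences = [
    ["new", "york", "is", "a", "large", "city"],
    ["san", "francisco", "is", "a", "city", "on", "the", "coast"],
    ["new", "york", "and", "san", "francisco", "are", "american", "cities"],
]

# sg=1 selects the skip-gram architecture discussed in the abstract.
model = Word2Vec(
    sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=200, seed=0
)

# Linear compositionality: represent the phrase "new york" by the sum of its
# word vectors, then look up the nearest words to that composed vector.
phrase_vec = model.wv["new"] + model.wv["york"]
print(model.wv.similar_by_vector(phrase_vec, topn=5))
```

On a corpus this small the neighbors are not meaningful; the point is only the mechanic of composing words by vector addition, which the talk's generative model is intended to explain.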