# word2vec-from-scratch

Open In Colab

In this notebook, we explore the models proposed by Mikolov et al. in [1]. We build the Skip-gram and CBOW models from scratch, train them on a relatively small corpus, implement an analogy function based on cosine similarity, and provide examples that use the trained models and the analogy function to perform the word analogy task. We train embeddings of three different dimensionalities to build intuition for how the number of dimensions influences the results.
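As an illustration, a minimal NumPy sketch of such an analogy function might look as follows. The embedding matrix `embeddings` and the vocabulary mappings `word_to_idx` / `idx_to_word` are assumed names, not necessarily those used in the notebook:

```python
import numpy as np

def analogy(a, b, c, embeddings, word_to_idx, idx_to_word, k=1):
    """Solve "a is to b as c is to ?" via cosine similarity.

    `embeddings` is assumed to be a (vocab_size, dim) array of trained
    word vectors; `word_to_idx` / `idx_to_word` map words to row indices.
    """
    # Target vector: v(b) - v(a) + v(c), e.g. king - man + woman.
    target = (embeddings[word_to_idx[b]]
              - embeddings[word_to_idx[a]]
              + embeddings[word_to_idx[c]])

    # Cosine similarity between the target and every word vector.
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(target)
    sims = embeddings @ target / np.maximum(norms, 1e-8)

    # Exclude the query words themselves, then return the top-k matches.
    for w in (a, b, c):
        sims[word_to_idx[w]] = -np.inf
    best = np.argsort(sims)[::-1][:k]
    return [idx_to_word[i] for i in best]

# Usage (hypothetical names): analogy("man", "king", "woman", E, w2i, i2w)
# would ideally return ["queen"].
```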

[1] Mikolov, Tomas, et al. "Efficient Estimation of Word Representations in Vector Space." arXiv preprint arXiv:1301.3781 (2013).