Word2Vec vs. Doc2Vec Word Vectors

I am doing some analysis on document similarity and am also interested in word similarity. I know that doc2vec builds on word2vec and, in its default (PV-DM) mode, also trains word vectors that we can access.

My question is:

Should we expect these word vectors, and by association methods such as most_similar, to be 'better' than word2vec's, or are they essentially going to be the same? If in the future I only want word similarity, should I just default to word2vec?

Topic: doc2vec, word2vec, nlp

Category: Data Science


If you only care about word similarity, then apply Occam's Razor and use word2vec. There is no need to increase model complexity if the extra capability (document vectors) is not going to be used.

Also, embedding quality is driven primarily by the size and diversity of the training corpus; the choice of algorithm has a much smaller effect.
