How to train custom word2vec embeddings to find related articles?
I am a beginner in machine learning. My project is to build an AI-based search engine that shows related articles when a user searches on a website. For this I decided to train my own embeddings.
I found two methods for this:
- One is to train a network to predict the next word (i.e. inputs = [the quick, the quick brown, the quick brown fox] and outputs = [brown, fox, lazy]).
- The other method is to train on pairs of nearby words (i.e. [brown, fox], [brown, quick], [brown, the]) -- see the sketch just after this list.
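To make sure I understand the difference between the two, here is a minimal sketch of how I think the training data would be built under each method (plain Python on a toy sentence; the window size of 1 is just my assumption):

```python
# Toy corpus: one tokenized sentence
tokens = ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]

# Method 1: next-word prediction
# input = all words seen so far, target = the following word
next_word_pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
# e.g. (["the", "quick"], "brown"), (["the", "quick", "brown"], "fox"), ...

# Method 2: nearby-word (skip-gram style) pairs
# input = center word, target = each word inside the context window
window = 1  # assumed window size
skipgram_pairs = []
for i, center in enumerate(tokens):
    for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
        if j != i:
            skipgram_pairs.append((center, tokens[j]))
# e.g. ("brown", "quick"), ("brown", "fox"), ...

print(next_word_pairs[:3])
print(skipgram_pairs[:6])
```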
Which method should I use? Also, after training, how should I convert a sentence into a single vector so that I can apply cosine similarity? For example, the sentence "the quick brown fox" will return 4 vectors (one per word); how should I combine them into one vector to compare against another sentence with cosine similarity (which takes only one vector per side)?
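To show the problem concretely, here is a sketch using gensim (the tiny corpus is only to get a trained model; averaging the word vectors is just one approach I am considering, and I am not sure it is the right way to do this):

```python
import numpy as np
from gensim.models import Word2Vec

# Tiny toy corpus just so the model trains; in reality I would train on my articles
corpus = [["the", "quick", "brown", "fox"], ["the", "lazy", "dog"]]
model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, sg=1)

def sentence_vector(words, model):
    # Each word gives one vector; here I simply average them into a single
    # sentence vector -- this combining step is what I am unsure about
    vectors = [model.wv[w] for w in words if w in model.wv]
    return np.mean(vectors, axis=0)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = sentence_vector(["the", "quick", "brown", "fox"], model)
v2 = sentence_vector(["the", "lazy", "dog"], model)
print(cosine_similarity(v1, v2))
```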
Tags: embeddings, word-embeddings, nlp