How to train millions of doc2vec embeddings using a GPU?

I am trying to train a doc2vec model on user browsing history (URLs tagged to a user_id), using the Chainer deep learning framework.

There are more than 20 million embeddings (user_ids and URLs) to initialize, which do not fit in GPU memory (12 GB available at most). Training on the CPU is very slow.
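For scale, here is a rough estimate (assuming, say, 300-dimensional float32 vectors for the sake of the calculation) showing why even the embedding table alone overflows 12 GB:

```python
# Back-of-the-envelope size of the embedding table alone
# (300 dimensions and float32 are assumptions for this estimate).
n_embeddings = 20_000_000   # user_ids + urls
dim = 300
bytes_per_float = 4         # float32

table_gb = n_embeddings * dim * bytes_per_float / 1024**3
print(f"{table_gb:.1f} GB")  # ~22.4 GB, before optimizer state and activations
```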

I am attempting this with the Chainer code given here.

Please advise on any options I could try.

Topic word-embeddings deep-learning nlp

Category Data Science


One option is to switch to a deep learning framework that supports distributed training.
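Since you are already on Chainer, one concrete route is its own distributed-training extension, ChainerMN. Below is a minimal data-parallel sketch; the `DocEmbedModel` class, the toy sizes, and the synthetic dataset are placeholders for illustration, not the code linked in the question.

```python
# Sketch of data-parallel doc2vec-style training with ChainerMN.
# Launch with MPI, e.g.:  mpiexec -n 4 python train_doc2vec.py
import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L
import chainermn
from chainer import training


class DocEmbedModel(chainer.Chain):
    """Minimal PV-DBOW-style model: a document vector predicts a word id."""

    def __init__(self, n_docs, n_vocab, n_units):
        super().__init__()
        with self.init_scope():
            self.doc_embed = L.EmbedID(n_docs, n_units)
            self.out = L.Linear(n_units, n_vocab)

    def __call__(self, doc_ids, word_ids):
        h = self.doc_embed(doc_ids)
        return F.softmax_cross_entropy(self.out(h), word_ids)


comm = chainermn.create_communicator('pure_nccl')  # NCCL backend for GPUs
device = comm.intra_rank                            # one GPU per MPI process

# Toy sizes so the sketch runs anywhere; in the real setting n_docs
# would be the ~20M user_id/url table.
model = DocEmbedModel(n_docs=100_000, n_vocab=50_000, n_units=300)
chainer.cuda.get_device_from_id(device).use()
model.to_gpu()

# Wrapping the optimizer all-reduces gradients across workers.
optimizer = chainermn.create_multi_node_optimizer(
    chainer.optimizers.Adam(), comm)
optimizer.setup(model)

# Synthetic (doc_id, word_id) pairs; rank 0 scatters a shard to each worker.
if comm.rank == 0:
    doc_ids = np.random.randint(0, 100_000, size=1_000_000).astype(np.int32)
    word_ids = np.random.randint(0, 50_000, size=1_000_000).astype(np.int32)
    dataset = chainer.datasets.TupleDataset(doc_ids, word_ids)
else:
    dataset = None
dataset = chainermn.scatter_dataset(dataset, comm, shuffle=True)

train_iter = chainer.iterators.SerialIterator(dataset, batch_size=1024)
updater = training.StandardUpdater(train_iter, optimizer, device=device)
trainer = training.Trainer(updater, (5, 'epoch'), out='result')
trainer.run()
```

Keep in mind that plain data parallelism replicates the full embedding matrix on every GPU, so on its own it will not squeeze a 20M-row table into 12 GB; you would still need a smaller embedding dimension, half-precision weights, or some form of embedding sharding / host-memory embeddings on top of it.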
