How to train neural word embeddings?

So I am new to Deep Learning and NLP. I have read several blog posts on Medium and Towards Data Science, as well as papers, that talk about pre-training word embeddings in an unsupervised fashion and then using them in a supervised DNN. But recently I read a blog post which suggested that training the word embeddings while training the neural network gives better results. This is the other link.

So my question is which one should I follow?

Some YouTube videos that I referred to:

  1. Deep Learning for NLP without Magic Part 1, 2 and 3

Topic word-embeddings deep-learning neural-network nlp

Category Data Science


It depends on access to training data, computational budget, and desired performance level.

Training embeddings from scratch requires both training data and computational resources. If you have access to both, there is an increased chance of improved performance for the subsequent supervised learning model.
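For concreteness, here is a minimal sketch (PyTorch) of the from-scratch option: the embedding layer is randomly initialised and learned jointly with the supervised model. The vocabulary size, embedding dimension, and the simple mean-pooling classifier are illustrative placeholders, not a recommendation.

```python
import torch
import torch.nn as nn

class TextClassifier(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=100, num_classes=2):
        super().__init__()
        # Randomly initialised embeddings, trained together with the classifier
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        pooled = emb.mean(dim=1)          # mean pooling over the token dimension
        return self.fc(pooled)

model = TextClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # embeddings receive gradients too
```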

Using pre-trained embeddings requires no training of your own, and thus no training data or computational resources. Those embeddings may or may not be useful for the specific supervised learning task.
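A sketch of that option, assuming the pre-trained vectors have already been loaded into a NumPy array (replaced here by a random stand-in): they are copied into a frozen embedding layer and never updated during supervised training.

```python
import numpy as np
import torch
import torch.nn as nn

# Stand-in for vectors loaded from a GloVe/word2vec file; shapes are illustrative
vocab_size, embed_dim = 20000, 100
pretrained_matrix = np.random.randn(vocab_size, embed_dim).astype("float32")

embedding = nn.Embedding.from_pretrained(
    torch.from_numpy(pretrained_matrix),
    freeze=True,   # keep the pre-trained vectors fixed: no task-specific updates
)
```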

One option is a hybrid approach: take pre-trained embeddings and then fine-tune them with project-specific data. This keeps the advantages of using pre-trained embeddings while still leveraging task-specific data, but it assumes you have access to the necessary computational budget and technical skills.
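The hybrid option is essentially the same construction with `freeze=False`, so gradients from the supervised task can adapt the pre-trained vectors (again a sketch with a random stand-in for the real vectors).

```python
import numpy as np
import torch
import torch.nn as nn

vocab_size, embed_dim = 20000, 100
pretrained_matrix = np.random.randn(vocab_size, embed_dim).astype("float32")  # stand-in

embedding = nn.Embedding.from_pretrained(
    torch.from_numpy(pretrained_matrix),
    freeze=False,  # allow fine-tuning on the task-specific data
)
```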


There is only one way to know: try both methods and take the one that gives the best results. I would say that, in general, pre-trained embeddings usually give better results. You can also start with pre-trained embeddings as the initial values and let the embeddings keep training, perhaps with a smaller learning rate.
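One way to realise that last idea, sketched with a hypothetical embedding layer and classifier head: optimizer parameter groups give the pre-trained embeddings a smaller learning rate than the rest of the network.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(20000, 100)  # would be initialised from pre-trained vectors
classifier = nn.Linear(100, 2)

# Smaller learning rate for the embeddings, a normal one for the rest of the model
optimizer = torch.optim.Adam([
    {"params": embedding.parameters(), "lr": 1e-4},
    {"params": classifier.parameters(), "lr": 1e-3},
])
```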

In any case, the current state of the art for text classification is ULMFiT (https://arxiv.org/abs/1801.06146), which actually does neither of these. It pre-trains the embeddings and an RNN as a language model on Wikipedia and on the target text, and then fine-tunes the whole model on the target text.
