Contextual word embeddings from pretrained word2vec vectors
I would like to create word embeddings that take context into account, so that the vector for the word Jaguar [animal] would be different from the vector for the word Jaguar [car brand].
As you know, word2vec gives only one representation for a given word, and I would like to take already-pretrained embeddings and enrich them with context. So far I've tried a simple approach: averaging the vector of the word with the vector of a category word, for example like this.
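Here is a minimal sketch of that averaging approach, assuming the pretrained vectors are loaded with gensim (the model file name below is just a placeholder):

```python
# A sketch of the averaging approach, assuming pretrained word2vec
# vectors loaded with gensim (the file name is only a placeholder).
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("pretrained_vectors.bin", binary=True)

def sense_vector(word, category_word):
    # Crude contextualization: average the word's vector with the
    # vector of a word describing its category.
    return (wv[word] + wv[category_word]) / 2

jaguar_animal = sense_vector("jaguar", "animal")  # Jaguar [animal]
jaguar_car = sense_vector("jaguar", "car")        # Jaguar [car brand]
```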
Now I would like to try creating and training a neural network that takes entire sentences as input, e.g.:
- Jaguar F-PACE is a great SUV sports car.
- Among cats, only tigers and lions are bigger than jaguars.
The network would then perform text classification (I have a dataset with several categories such as animals, cars, etc.), but what I'm really after is the by-product: new representations of the word jaguar in its different contexts, i.e. two different embeddings.
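Here is a rough sketch of the kind of network I imagine, in PyTorch. The tiny toy dataset and all names are made up, and in practice the embedding layer would be initialized from the pretrained word2vec vectors rather than randomly:

```python
import torch
import torch.nn as nn

# Toy data: (sentence, category). During training the category is known,
# so each ambiguous word can be replaced with a sense-tagged token.
data = [
    ("jaguar f-pace is a great suv sports car", "cars"),
    ("among cats only tigers and lions are bigger than jaguars", "animals"),
]
categories = sorted({c for _, c in data})
cat2id = {c: i for i, c in enumerate(categories)}
ambiguous = {"jaguar", "jaguars"}

def tag(sentence, category):
    # "jaguar" in a car sentence becomes the token "jaguar_cars", etc.
    return [f"{t}_{category}" if t in ambiguous else t
            for t in sentence.split()]

tokenized = [(tag(s, c), c) for s, c in data]
vocab = sorted({t for tokens, _ in tokenized for t in tokens})
tok2id = {t: i for i, t in enumerate(vocab)}

emb = nn.Embedding(len(tok2id), 300)   # in practice: copy word2vec rows in here
clf = nn.Linear(300, len(categories))  # linear classifier on the sentence vector
opt = torch.optim.Adam(list(emb.parameters()) + list(clf.parameters()), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):
    for tokens, category in tokenized:
        ids = torch.tensor([tok2id[t] for t in tokens])
        sentence_vec = emb(ids).mean(dim=0)  # average of token embeddings
        loss = loss_fn(clf(sentence_vec).unsqueeze(0),
                       torch.tensor([cat2id[category]]))
        opt.zero_grad()
        loss.backward()
        opt.step()

# After training, each sense has its own fine-tuned embedding:
jaguar_car = emb.weight[tok2id["jaguar_cars"]].detach()
jaguar_animal = emb.weight[tok2id["jaguars_animals"]].detach()
```

The idea is that training the classifier pushes the two sense tokens apart, so the by-product is one embedding per sense. I'm not sure this is a sensible design, though.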
To simplify things, I'm assuming a limited number of embeddings per word. I have a dataset of a dozen or so words; each word has two or three meanings, and each meaning has dozens of example sentences. It's a small dataset to start with, as the whole project is heavily experimental.
Does anyone have an idea how I could build such a network? I'll admit I'm a beginner and have no idea how to go about it.
Topic text-classification word-embeddings deep-learning neural-network nlp
Category Data Science