Can I average the BERT embeddings of multiple instances of the same word to get one vector representation of the word?

In the project I'm working on right now, I would like to get one embedding for every unique lemma in a corpus. Could I get this by averaging the embeddings of every instance of that lemma?

For example, say there are 500 tokens of the lemma walk in the corpus, regardless of inflected form (walks, walked, walking, etc.). Could I then add, average, or concatenate these 500 embeddings to get one embedding that accurately represents all of them?

If this approach works, which operation should I use on the embeddings to get the best result?
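
For concreteness, here is a rough sketch of the averaging I have in mind, using the Hugging Face transformers library. The `lemma_embedding` helper, the hard-coded surface forms, and the toy sentences are just illustrations, and the subword handling is simplified: words that BERT splits into multiple WordPieces are averaged piece-wise first.

```python
import torch
from collections import defaultdict
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def lemma_embedding(sentences, forms):
    """Average the contextual embeddings of every occurrence of any
    surface form in `forms` across `sentences`."""
    vectors = []
    for sentence in sentences:
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768)
        # Group token positions by the word they belong to; fast tokenizers
        # expose this mapping via word_ids(), and [CLS]/[SEP] map to None.
        groups = defaultdict(list)
        for pos, wid in enumerate(enc.word_ids(0)):
            if wid is not None:
                groups[wid].append(pos)
        tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
        for positions in groups.values():
            # Rebuild the surface form from its pieces ("walk", "##ing").
            word = "".join(tokens[p].lstrip("#") for p in positions)
            if word in forms:
                # Average the pieces of this one occurrence, then collect it.
                vectors.append(hidden[positions].mean(dim=0))
    # One vector per occurrence -> one mean vector for the lemma.
    return torch.stack(vectors).mean(dim=0) if vectors else None

sentences = ["She walks to work every day.", "They walked home together."]
vec = lemma_embedding(sentences, {"walk", "walks", "walked", "walking"})
print(vec.shape)  # torch.Size([768])
```

In a real run the occurrences would of course come from the corpus's lemmatization rather than a hand-written set of forms; the question is whether the final mean (or a sum, or a concatenation) is a sensible single representation.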

Tags: corpus, bert, word-embeddings

Category: Data Science
