Attention to get context of words
Word2vec techniques define a word's context as a window of k words around it, and use that window to learn vector representations for the words in the corpus. Attention networks can extract the important information from a sequence. I was wondering: can attention networks help define context better, which could then be used for learning the word embeddings?

I wasn't able to find any article or paper that learns word embeddings this way. The papers I did find work in the opposite order: they first learn the vectors using word2vec methods, and only then feed those vectors into an attention network to extract the most important information. This got me thinking: is it possible to do what I am suggesting (or has it been done already? In that case, please link a few references.)

Please excuse the silly doubts; I am new to NLP and would love some help. Thanks!
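To make the question concrete, here is a minimal sketch of one way the idea could look (this is my own illustration, not from any paper): a CBOW-style forward pass where the context words in the window are combined with attention weights instead of a uniform average. All names (`E`, `O`, `attention_cbow_probs`) and the tiny vocabulary are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["the", "cat", "sat", "on", "mat"]
V, d = len(vocab), 8
E = rng.normal(size=(V, d))  # input (context) embedding matrix
O = rng.normal(size=(V, d))  # output (target) embedding matrix

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_cbow_probs(context_ids, target_id):
    """Score each context word by attention against the target word's
    vector, instead of averaging the window uniformly as plain CBOW does."""
    C = E[context_ids]                        # (k, d) context vectors
    q = O[target_id]                          # query: target word's vector
    weights = softmax(C @ q / np.sqrt(d))     # attention over the window
    h = weights @ C                           # attention-weighted context
    return softmax(O @ h)                     # distribution over the vocab

ids = {w: i for i, w in enumerate(vocab)}
probs = attention_cbow_probs([ids["the"], ids["sat"], ids["on"]], ids["cat"])
```

In a real model both embedding matrices and the attention would be trained jointly (e.g. by maximizing the probability of the true target word), so the attention weights end up defining a learned, word-dependent context rather than a fixed window average.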
Topic context-vector attention-mechanism word2vec nlp
Category Data Science