When are subword n-grams trained in fastText? (Enriching Word Vectors with Subword Information)
When is the training for the subword n-grams done? Is it done simultaneously with the training of the word representations, or at the end, after the word representations have been created?
fastText implements this paper, in which word representations are enriched with subword information. Here, the representation of each word is the sum of the representations of its character n-grams (plus a special sequence for the whole word). The n-gram vectors are trained just like the skip-gram model: the n-gram sum stands in for the center word and is scored against the context words.
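To make the setup concrete, here is a minimal sketch (in Python, not the actual fastText implementation) of how a word could be scored against a context word via its character n-grams; the function names, toy dimensions, and lazy embedding tables are my own assumptions, loosely following the paper's description:

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams with boundary markers, as in the paper."""
    w = f"<{word}>"
    grams = [w[i:i + n] for n in range(n_min, n_max + 1)
             for i in range(len(w) - n + 1)]
    return grams + [w]  # the special sequence for the whole word is kept too

# Hypothetical toy embedding tables: one row per n-gram, one per context word.
rng = np.random.default_rng(0)
dim = 10
ngram_vectors = {}    # z_g in the paper
context_vectors = {}  # v_c in the paper

def vec(table, key):
    """Look up (or lazily create) an embedding row for a key."""
    if key not in table:
        table[key] = rng.normal(scale=0.1, size=dim)
    return table[key]

def score(word, context_word):
    """Subword score: dot product of the n-gram sum with the context vector."""
    u_w = sum(vec(ngram_vectors, g) for g in char_ngrams(word))
    v_c = vec(context_vectors, context_word)
    return u_w @ v_c

print(score("where", "is"))
```

Note that in this sketch the word has no standalone vector: its representation is entirely the sum over its n-grams (including the special whole-word sequence), which is part of why I am unsure where a separate word-training step would come in.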
Since there seem to be two tasks to train here, how is the training done: in sequence, or together? The paper mentions two scoring functions, one over word vectors and one over the subword n-grams, and I am unclear whether these two objectives have been combined into one or are optimized separately.
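For reference, the two scoring functions I mean are the plain skip-gram score and the subword score from the paper (in the paper's notation, $\mathcal{G}_w$ is the set of n-grams of $w$, $z_g$ an n-gram vector, and $v_c$ a context vector):

$$s(w, c) = u_w^{\top} v_c \qquad \text{versus} \qquad s(w, c) = \sum_{g \in \mathcal{G}_w} z_g^{\top} v_c$$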
Topic: fasttext, word-embeddings, nlp, machine-learning
Category: Data Science