Gensim FastText: how to get the vocab or word index
I am trying out gensim's FastText, using the sample code from the gensim documentation with one small change: the corpus argument is passed as corpus_iterable.
https://radimrehurek.com/gensim/models/fasttext.html
gensim_version == 4.0.1
from gensim.models import FastText
from gensim.test.utils import common_texts # some example sentences
print(common_texts[0])
['human', 'interface', 'computer']
print(len(common_texts))
9
model = FastText(vector_size=4, window=3, min_count=1) # instantiate
model.build_vocab(corpus_iterable=common_texts)
model.train(corpus_iterable=common_texts, total_examples=len(common_texts), epochs=10)
It works, but is there any way to get the vocab for the model? For example, the TensorFlow Tokenizer has a word_index attribute that returns all the words. Is there something similar here?
Topic fasttext gensim word-embeddings nlp machine-learning
Category Data Science