Hellinger Distance in Gensim
I have a set of documents, each represented by a set of words that describes its content:
Doc1: {fish, moose, wildlife, hunting, bears, polar}
Doc2: {energy, fuel, costs, oil, gas}
Doc3: {wildlife, hunt, polar, fishing}
Looking at these documents, I can see that Doc1 and Doc3 are very similar.
I want a distance metric for these bag-of-words representations. I followed some Gensim tutorials, but as I understand it, they first train a model and then use that model to calculate the Hellinger distance. In my case, I do not have any training data. How can I compute this distance without training data?
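For reference, the Hellinger distance itself does not need a trained model: it is defined between any two discrete probability distributions. Below is a minimal sketch in plain Python that assumes each document is turned into a normalized word-frequency distribution. The helper names `to_distribution` and `hellinger` are illustrative, not part of Gensim. Gensim also provides `gensim.matutils.hellinger`, which, as far as I know, accepts bag-of-words vectors directly, but I have not verified that it normalizes them.

```python
import math
from collections import Counter

# The three example documents from the question, as word lists.
docs = {
    "Doc1": ["fish", "moose", "wildlife", "hunting", "bears", "polar"],
    "Doc2": ["energy", "fuel", "costs", "oil", "gas"],
    "Doc3": ["wildlife", "hunt", "polar", "fishing"],
}

def to_distribution(words):
    """Turn a word list into a normalized frequency distribution (sums to 1)."""
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def hellinger(p, q):
    """Hellinger distance between two distributions stored as dicts.

    Words missing from one distribution are treated as probability 0.
    The result lies in [0, 1]; 1 means the documents share no words.
    """
    vocab = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(w, 0.0)) - math.sqrt(q.get(w, 0.0))) ** 2
        for w in vocab
    )
    return math.sqrt(0.5 * squared)

dists = {name: to_distribution(words) for name, words in docs.items()}
d13 = hellinger(dists["Doc1"], dists["Doc3"])  # share "wildlife" and "polar"
d12 = hellinger(dists["Doc1"], dists["Doc2"])  # share no words
print(f"Doc1 vs Doc3: {d13:.4f}")
print(f"Doc1 vs Doc2: {d12:.4f}")
```

In this sketch Doc1 and Doc3 come out closer than Doc1 and Doc2, but only because of the exact matches "wildlife" and "polar". Pairs such as "hunting"/"hunt" and "fish"/"fishing" count as different words unless the text is stemmed or lemmatized first.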
Topic gensim text-mining topic-model data-mining machine-learning
Category Data Science