I am working on creating a number of Top2Vec models on Reddit threads. I am essentially varying the HDBSCAN cluster size parameter so that the clusters of the Doc2Vec embeddings represent different numbers of topics. I want to compare the resulting models using their coherence scores. I tried Gensim's coherence score but failed: I got an error message indicating that a word in the topics is not included in the dictionary. I also tried tmtoolkit. While I …
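For reference, here is a minimal sketch of the Gensim approach I am attempting, with a workaround that filters each topic's words down to the dictionary's vocabulary (unseen words appear to be what triggers the error). The names `top2vec_model` and `tokenized_docs` are placeholders for my trained model and the token lists it was built from, assuming the tokenization matches what Top2Vec used internally:

```python
from gensim.corpora import Dictionary
from gensim.models.coherencemodel import CoherenceModel

# Dictionary built from the same tokenized documents the model was trained on.
dictionary = Dictionary(tokenized_docs)

# Top2Vec's get_topics() returns the top words per topic.
topic_words, word_scores, topic_nums = top2vec_model.get_topics()

# Drop any topic word the dictionary never saw; these are what raise the
# "word not in dictionary" error inside CoherenceModel.
topics = [
    [word for word in words if word in dictionary.token2id]
    for words in topic_words
]

cm = CoherenceModel(
    topics=topics,
    texts=tokenized_docs,
    dictionary=dictionary,
    coherence="c_v",
)
print(cm.get_coherence())
```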
For an experiment with topic models, I have calculated four coherence values using Gensim's implementation: c_v, u_mass, c_uci, and c_npmi. From this paper, I know that c_v correlates most strongly with human interpretation, so it seems to be the best score to use for topic evaluation. However, are there arguments for using the other measures? And how should these values be interpreted? They seem to lie in different ranges.
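For context, this is roughly how I computed the four values; `lda_model`, `corpus`, `texts`, and `dictionary` are placeholders for my trained model and its inputs:

```python
from gensim.models.coherencemodel import CoherenceModel

for measure in ("c_v", "u_mass", "c_uci", "c_npmi"):
    cm = CoherenceModel(
        model=lda_model,
        corpus=corpus,        # used by u_mass
        texts=texts,          # used by the sliding-window measures
        dictionary=dictionary,
        coherence=measure,
    )
    print(f"{measure}: {cm.get_coherence():.4f}")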
On my corpora, I am running LDA with different settings (I experiment with different numbers of topics, different n-grams, and TF-IDF versus regular BOW). Now, I want to rank these setups to select the single best topic model to continue working with. In order to rank them, I have calculated both the coherence value and the perplexity for all the different settings, as is done here. In the link, the number of topics is selected using the coherence …
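To make the comparison concrete, this is roughly the loop I am running, sketched with placeholder names (`texts` for the tokenized corpus, a matching `dictionary`, and a small grid of topic counts):

```python
import numpy as np
from gensim.models import LdaModel
from gensim.models.coherencemodel import CoherenceModel

corpus = [dictionary.doc2bow(doc) for doc in texts]

results = []
for num_topics in (5, 10, 20, 40):
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=num_topics, random_state=42, passes=10)
    coherence = CoherenceModel(model=lda, texts=texts,
                               dictionary=dictionary,
                               coherence="c_v").get_coherence()
    # log_perplexity returns the per-word likelihood bound, not
    # perplexity itself; perplexity is 2 ** (-bound), lower is better.
    perplexity = np.exp2(-lda.log_perplexity(corpus))
    results.append((num_topics, coherence, perplexity))

# Rank by coherence (higher is better).
for num_topics, coh, perp in sorted(results, key=lambda r: -r[1]):
    print(f"k={num_topics}: c_v={coh:.4f}, perplexity={perp:.1f}")
```

One thing I noticed while doing this is that Gensim's `log_perplexity` returns the per-word likelihood bound rather than perplexity itself, hence the `2 ** (-bound)` conversion in the sketch.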