Using KL divergence to improve a BOW model
For a university project, I chose to do sentiment analysis on a dataset of Google Play store reviews. I obtained decent results classifying the data with a bag-of-words (BOW) model and an ADALINE classifier.
I would like to improve my model by incorporating bigrams relevant to the topic (Negative or Positive) into my feature set. I found this paper, which uses KL divergence to measure the relevance of unigrams/bigrams to a topic.
The only problem is that I am having trouble understanding what C refers to in equation (2.2). Does it refer to the unique words associated with topic C, the set of documents on a topic, or the words in a document?
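For context, here is how I currently interpret the score, under the assumption that C denotes the set of documents on a topic and that the relevance of a term t is its term-level KL contribution p(t|C) · log(p(t|C) / p(t)). The function and variable names below are my own, not from the paper:

```python
import math
from collections import Counter

def kl_term_score(term, topic_docs, all_docs):
    """Term-level KL contribution: p(t|C) * log(p(t|C) / p(t)).

    Assumes C is the set of documents on the topic (one possible
    reading of equation (2.2)); p(t) is estimated over all documents.
    """
    topic_counts = Counter(tok for doc in topic_docs for tok in doc)
    all_counts = Counter(tok for doc in all_docs for tok in doc)
    p_t_given_c = topic_counts[term] / sum(topic_counts.values())
    p_t = all_counts[term] / sum(all_counts.values())
    if p_t_given_c == 0 or p_t == 0:
        return 0.0  # term absent from topic or corpus: no contribution
    return p_t_given_c * math.log(p_t_given_c / p_t)

# Toy example: "crash" is over-represented in the Negative topic,
# so its score is positive; "app" appears evenly and scores near zero.
docs_neg = [["bad", "app", "crash"], ["crash", "bad"]]
docs_pos = [["great", "app"], ["love", "great"]]
score = kl_term_score("crash", docs_neg, docs_neg + docs_pos)
```

Under this reading, terms concentrated in the topic's documents get high scores, which is what I would then use to pick topic-relevant bigrams. If C instead means the unique words of the topic or the words in a single document, the estimates of p(t|C) would change accordingly.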
Topic: bag-of-words, ngrams, classification
Category: Data Science