Weighting of words in lexicon based sentiment analysis

Question

voltage

February 10, 2022, 23:08

I have a question regarding my current project. I am trying to do a lexicon-based sentiment analysis on my data, where I calculate the sentiment score as follows:

$$ Score = \frac{\sum_{i}{word_i}}{\mid words \mid} $$

Depending on the score, the text is classified as either negative or positive. I have also calculated the salience and frequency of every word in the article, and would like to know whether it is possible to use them in the sentiment analysis formula above.

 word  | salience | frequency
 sad   | 0.8      | 3
 happy | 0.5      | 2

Topic nltk sentiment-analysis nlp

Category Data Science

Valentin Calomme · Accepted Answer · May 8, 2020, 07:33

Yes, you can. Not quite sure what else to add. Your formula can then look like:

$$ Score = \sum_{i}{f(salience_i, frequency_i, sentiment_i)}$$

Here $f$ is a function that weights each word's sentiment score by its salience and frequency. How you define $f$ is up to you.
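
As a minimal sketch of one possible $f$, assume each word's sentiment comes from a lexicon with values in [-1, 1] and that the weight is simply salience times frequency. The lexicon values and the name `weighted_score` are hypothetical. Dividing by the total weight and thresholding at zero are design choices made here for illustration, not something the question specifies.

```python
# Hypothetical lexicon: sentiment values in [-1, 1] (illustrative only).
LEXICON = {"sad": -0.7, "happy": 0.8}

# (word, salience, frequency) rows, as in the question's table.
WORDS = [("sad", 0.8, 3), ("happy", 0.5, 2)]


def weighted_score(words, lexicon):
    """Weighted mean sentiment, with weight = salience * frequency."""
    total = 0.0
    total_weight = 0.0
    for word, salience, frequency in words:
        if word not in lexicon:
            continue  # words missing from the lexicon are ignored
        weight = salience * frequency
        total += weight * lexicon[word]
        total_weight += weight
    # Dividing by the total weight keeps the score in the lexicon's range.
    return total / total_weight if total_weight else 0.0


score = weighted_score(WORDS, LEXICON)
# Assumed decision rule: scores above zero count as positive.
label = "positive" if score > 0 else "negative"
print(round(score, 3), label)
```

With this choice a highly salient, frequent word pulls the score toward its own sentiment more strongly than a rare, peripheral one. Other choices, such as taking the logarithm of the frequency, fit the same template.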

What if you don't know which $f$ to use?

Now, bear with me, this isn't something I've tried per se, but this could be an interesting approach. You could use a recurrent neural network and your input could be the salience, frequency, and sentiment score for each word. Not only will your RNN "create" (ideally) the best $f$ for your particular problem, but it will also use the sequential information of the words, which may even improve your results.
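
To make the input format concrete, here is a rough sketch of a single-unit recurrent forward pass in plain Python, where each time step receives the (salience, frequency, sentiment) triple of one word. The weights are arbitrary placeholders, not trained values, and every name is hypothetical. In practice the weights would be learned from labelled articles with a deep learning library.

```python
import math

# One feature triple per word, in reading order:
# (salience, frequency, lexicon sentiment). Values are illustrative.
SEQUENCE = [
    (0.8, 3, -0.7),  # "sad"
    (0.5, 2, 0.8),   # "happy"
]

# Placeholder parameters; a real model would learn these from labelled data.
W_IN = (0.5, 0.1, 1.5)   # input weights, one per feature
W_REC = 0.3              # recurrent weight on the previous hidden state
W_OUT = 2.0              # output weight on the final hidden state


def rnn_forward(sequence):
    """Run a one-unit tanh RNN over the sequence, return a value in (0, 1)."""
    hidden = 0.0
    for features in sequence:
        pre_activation = W_REC * hidden + sum(
            w * x for w, x in zip(W_IN, features)
        )
        hidden = math.tanh(pre_activation)
    # Sigmoid squashes the final state into a probability-like value.
    return 1.0 / (1.0 + math.exp(-W_OUT * hidden))


probability = rnn_forward(SEQUENCE)
print(f"P(positive) = {probability:.3f}")
```

Because the hidden state carries over from word to word, feeding the same words in a different order generally gives a different output, which is the sequential information mentioned above.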
