Sentiment data for Emoji
For experimentation, we'd like to use the Emoji embedded in many Tweets as ground truth/training data for simple quantitative sentiment analysis. Tweets are usually too unstructured for NLP to work well.
Anyway, there are 722 Emoji in Unicode 6.0, and probably another 250 will be added in Unicode 7.0.
Is there a database (like SentiWordNet, for example) that contains sentiment annotations for them?
(Note that SentiWordNet does allow for ambiguous meanings, too. Consider, e.g., "funny", which is not just positive: "this tastes funny" is probably not positive. The same will hold for ;-), for example. But I don't think this is harder for Emoji than it is for regular words.)
Also, if you have experience with using them for sentiment analysis, I'd be interested to hear about it.
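
To make the intended use concrete, here is a minimal sketch of how such annotations could turn Tweets into labeled training data. The scores in `EMOJI_SENTIMENT` are made-up placeholders, not taken from any real database (a real one is exactly what I'm looking for), and the function names are just for illustration. It also assumes each Emoji is a single code point, which does not hold for all of them.

```python
# Hypothetical placeholder scores; a real annotated lexicon would replace this.
EMOJI_SENTIMENT = {
    "\U0001F600": 1.0,   # grinning face
    "\U0001F60D": 1.0,   # smiling face with heart-shaped eyes
    "\U0001F622": -1.0,  # crying face
    "\U0001F620": -1.0,  # angry face
}


def label_tweet(text):
    """Return 'pos', 'neg', or None (no known Emoji, or the scores cancel out)."""
    # Iterates code point by code point, so multi-code-point Emoji are not handled.
    score = sum(EMOJI_SENTIMENT.get(ch, 0.0) for ch in text)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return None


def build_training_set(tweets):
    """Keep only Tweets with a clear Emoji label and strip the Emoji from the text."""
    examples = []
    for tweet in tweets:
        label = label_tweet(tweet)
        if label is None:
            continue  # unlabeled or ambiguous, skip it
        cleaned = "".join(ch for ch in tweet if ch not in EMOJI_SENTIMENT).strip()
        examples.append((cleaned, label))
    return examples
```

The Emoji are removed from the text so that a classifier trained on the result has to learn from the words rather than from the label itself. Tweets with mixed signals are simply dropped here; with ambiguity annotations like those in SentiWordNet, they could be weighted instead.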
Topic classification parsing machine-learning
Category Data Science