How can I predict the popularity of a reddit.com post with a hidden Markov model (HMM)?

If I collect some posts from reddit.com, how can I predict whether a given post will become trending/hot/popular in the future? I would like to use a hidden Markov model for this, but I don't know how to define the hidden states and the observation sequence. Can anyone give me a suggestion? Thanks!

Topic markov

Category Data Science


An HMM doesn't really make sense here (echoing what Dries said). To justify using one, you would have to ask: "Can Reddit posts be represented by a Markov process?" I can't think of a way to make the answer yes while still taking advantage of the features that relate to a post's popularity.

Consider the possible feature set: the time it was posted, the user posting it, the type of post (link / image / text), the subreddit, the number of subscribers to that subreddit, a positivity / negativity score, the number of words in the title, etc. Don't count out these features.
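As a rough illustration of the feature set above, here is a minimal sketch that turns one post into a numeric feature vector. The dictionary keys (`created_utc`, `post_type`, `subreddit_subscribers`, `title`, `sentiment`) are hypothetical names chosen for this example, not fields of any particular API, and the sentiment score is assumed to have been computed elsewhere. The posting user is left out because encoding it would need extra data such as account history.

```python
from datetime import datetime, timezone

# Post types named in the answer; used for a one-hot encoding.
POST_TYPES = ("link", "image", "text")


def extract_features(post):
    """Turn one post (a plain dict with hypothetical keys) into a list of floats."""
    posted = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)
    one_hot_type = [1.0 if post["post_type"] == t else 0.0 for t in POST_TYPES]
    return [
        float(posted.hour),                     # hour of day (UTC) it was posted
        float(posted.weekday()),                # day of week, Monday = 0
        float(post["subreddit_subscribers"]),   # size of the subreddit
        float(len(post["title"].split())),      # number of words in the title
        float(post["sentiment"]),               # assumed precomputed positivity score
    ] + one_hot_type


# Made-up example post, for illustration only.
example = {
    "created_utc": 1_600_000_000,
    "post_type": "image",
    "subreddit_subscribers": 250_000,
    "title": "My cat learned to open doors",
    "sentiment": 0.6,
}
features = extract_features(example)
```

A vector like this can be fed to almost any standard classifier or regressor, which is part of why these features are hard to exploit inside an HMM.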


I don't think it makes a lot of sense to use HMMs for this problem. What I would suggest instead is some kind of text-based classifier. If you want to use a cool technique, you could train a neural network on the text of successful posts.
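To make the idea of a text-based classifier concrete, here is a minimal sketch. It deliberately swaps the neural network for a much simpler bag-of-words Naive Bayes model with Laplace smoothing, which is a common baseline to try first. The titles and labels are made-up toy data, and the function names are chosen for this example only.

```python
import math
from collections import Counter


def train_naive_bayes(titles, labels):
    """Count words per class (1 = popular, 0 = not popular)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for title, label in zip(titles, labels):
        word_counts[label].update(title.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_popular(model, title):
    """Return True if the 'popular' class has the higher log-probability."""
    word_counts, class_counts, vocab = model
    total_posts = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log(class_counts[label] / total_posts)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in title.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return scores[1] > scores[0]


# Made-up toy training data, for illustration only.
titles = [
    "cute cat photo",
    "cat does backflip",
    "tax form question",
    "question about tax law",
]
labels = [1, 1, 0, 0]
model = train_naive_bayes(titles, labels)
```

With real data you would need thousands of labeled titles and a held-out test set before trusting any prediction; the sketch assumes both classes appear in the training data.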

On the other hand, if you want an easy technique, you could build a predictor of popularity such as a regression model (for example, one that tries to predict the number of upvotes).
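A minimal sketch of the regression idea, assuming a single feature (the subreddit's subscriber count) and a log transform on both sides, since upvote counts tend to be heavily skewed. The numbers are made-up toy data, and `fit_line` and `predict_upvotes` are names invented for this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up toy data: subscriber counts and upvotes of four posts.
subscribers = [1_000, 10_000, 100_000, 1_000_000]
upvotes = [3, 12, 40, 150]

# Fit in log space: log10(subscribers) -> log(1 + upvotes).
xs = [math.log10(s) for s in subscribers]
ys = [math.log1p(u) for u in upvotes]
slope, intercept = fit_line(xs, ys)


def predict_upvotes(subscriber_count):
    """Map the log-space prediction back to an upvote count."""
    return math.expm1(slope * math.log10(subscriber_count) + intercept)
```

In practice you would regress on several features at once (for instance with a library such as scikit-learn), but the single-feature version shows the shape of the approach.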
