How to cluster a Twitter data set?
I have a Twitter data set and I want to extract the topics it covers. I decided to group my tweets into clusters using an unsupervised machine learning algorithm such as k-means. I made this choice because the training process in supervised approaches is time-consuming.
As a first step, after cleaning my tweets, I will extract features (e.g. hashtags) from them and enrich them with side information from knowledge bases (e.g. Wikipedia). Second, the tweets will be represented in a vector space. Next, using k-means with a given K=6, the enriched tweets will be grouped into 6 clusters.
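To make the planned pipeline concrete, here is a minimal sketch of the vector-space and k-means steps, written with only the Python standard library. The sample tweets, the function names (`tokenize`, `tfidf_vectors`, `kmeans`), the TF-IDF weighting, and the fixed seed are all my own assumptions for illustration; the Wikipedia enrichment step is left out, and the toy data uses k=3 rather than K=6 because it has only six tweets.

```python
import math
import random
import re
from collections import Counter

# Hypothetical sample tweets; replace with the cleaned data set.
tweets = [
    "Great goal in the #football final tonight",
    "#football transfer news: striker signs new deal",
    "New #python release speeds up the interpreter",
    "Learning #python for data analysis",
    "Stock markets fall as #economy slows",
    "Central bank raises rates, #economy reacts",
]

def tokenize(text):
    # Keep words and hashtags, lowercased.
    return re.findall(r"#?\w+", text.lower())

def tfidf_vectors(docs):
    # Represent each tweet as an L2-normalised TF-IDF vector.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    n = len(docs)
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        vec = [0.0] * len(vocab)
        for term, count in Counter(doc).items():
            idf = math.log((1 + n) / (1 + df[term])) + 1.0  # smoothed IDF
            vec[index[term]] = count * idf
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def kmeans(vectors, k, iterations=20, seed=0):
    # Plain Lloyd's algorithm with randomly chosen initial centroids.
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

vocab, vectors = tfidf_vectors(tweets)
labels, centroids = kmeans(vectors, k=3)
for label, tweet in zip(labels, tweets):
    print(label, tweet)
```

On a real data set a library implementation would probably be preferable, and the result depends on the initial centroids, so it may be worth running with several seeds.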
However, I don’t know how to automatically identify the topics related to these clusters. Are there any solutions?
Topic social-network-analysis nlp clustering data-mining machine-learning
Category Data Science