LSA Model Improvement

I followed gensim's Core Tutorial and built an LSA model for classification, topic modeling, and document similarity on the newsgroups dataset.

My code is available here.

I need help with the following three issues.

  1. Topic Classification: I get only 50% accuracy with a KNN classifier.
  2. Topic Modeling: The top words for each of the 20 topics don't stand out.
  3. Document Similarity: I wrote a small test script and found that document similarity doesn't produce great results either.

I plan to follow up with other models such as LDA. However, I am eager to know whether I can improve my current LSA model. Any help would be much appreciated. Thanks!
