Extract document-topic vectors from an LDA model

How can I extract the document-topic matrix from an LDA model and use it as input features for an SVM classifier? I am using Gensim for the implementation.

Topic gensim classification feature-extraction lda python

Category Data Science


I've done this before in Gensim; hopefully this helps:

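train_vecs = []
num_topics = lda_train.num_topics  # 20 in this example
for bow in train_corpus:
    # minimum_probability=0.0 returns every topic, not just the dominant ones
    doc_topics = lda_train.get_document_topics(bow, minimum_probability=0.0)
    # fill a dense vector by topic id so every document has the same length
    topic_vec = [0.0] * num_topics
    for topic_id, prob in doc_topics:
        topic_vec[topic_id] = prob
    train_vecs.append(topic_vec)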

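The above gives a 20-dimensional vector for every document: one probability per topic, assuming 'lda_train' was trained with 20 topics. Passing minimum_probability=0.0 is what makes Gensim return all topics instead of only the dominant ones, so the rows of 'train_vecs' form the document-topic matrix and can be used directly as classifier features. 'train_corpus' is the result of doing something like this in Gensim, where 'id2word' is a Gensim Dictionary and 'bigram' holds the tokenized documents after they have been passed through a 'Phrases' model: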

train_corpus = [id2word.doc2bow(text) for text in bigram]
