How to construct the document-topic matrix using the word-topic and topic-word matrix calculated using Latent Dirichlet Allocation?

I cannot find this explained anywhere, not even in the work of the author of LDA, David M. Blei.

Gensim and sklearn just work, but I want to know how to use the two matrices to construct the document-topic matrix (Spark MLlib's LDA only gives me the two matrices, not the document-topic matrix).

Topic apache-spark lda python

Category Data Science


using the word-topic and topic-word matrix

These would be the same thing, wouldn't they? The model generates a words x topics matrix and a topics x documents matrix, so there is not much left to calculate; the model practically spits it out.

Gibbs sampling gives you the counts Ntw[1..T][1..W] and Ndt[1..D][1..T], where, for example, Ndt[1][5] is the number of words in document 1 that are assigned to topic 5.
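If you do have the Ndt counts, one common way to turn them into a document-topic matrix is to smooth each row with the Dirichlet prior and normalize it so it sums to 1. The sketch below is only an illustration of that idea, not what Spark or Gensim do internally: the function name `document_topic_matrix`, the `alpha` value and the example counts are all made up here, and Python lists are 0-indexed, unlike the 1-indexed notation above.

```python
def document_topic_matrix(ndt, alpha=0.1):
    """Convert per-document topic counts into per-document topic proportions.

    ndt[d][t] is the number of words in document d assigned to topic t.
    alpha is an assumed symmetric Dirichlet prior on the document-topic
    distribution; alpha=0 gives plain relative frequencies.
    """
    theta = []
    for counts in ndt:
        num_topics = len(counts)
        # Denominator: words in the document plus the total prior mass.
        total = sum(counts) + num_topics * alpha
        theta.append([(c + alpha) / total for c in counts])
    return theta


# Hypothetical counts for 2 documents and 3 topics.
ndt = [[8, 1, 1], [0, 5, 5]]
theta = document_topic_matrix(ndt, alpha=0.1)

for d, row in enumerate(theta):
    print(d, [round(p, 3) for p in row])
```

Each row of `theta` is then the estimated topic distribution of one document. The same normalization applied to the rows of Ntw (with its own prior) would give the topic-word distributions.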
