Hierarchical Dirichlet process results

I am thinking about using a hierarchical Dirichlet process (HDP) to model a patent dataset. I've seen that HDP uses a base distribution and assumes that every topic is drawn from that base distribution. The problem is: first, I'm wondering what the main results of the HDP procedure are (in the case of LDA we obtain two matrices that we can use to construct word clouds and graphs, but here I'm not sure about the results), and what is the exact …
Category: Data Science

Two sets of topics/words in Topic Modeling

In short, the question is: I have two sets of words per document, and I would like to extract two sets of topics per document, one corresponding to each word set. To be more precise: a document $d$ can be modelled as the union of two word sets (WordSetA, WordSetB), where WordSetA ∪ WordSetB is the set of all words in $d$. The goal is to find two sets of topics corresponding to the word sets (TopicSetA and TopicSetB), where TopicSetA is a mixture of …
Category: Data Science

What hyperparameter values does the MALLET LDA model use by default? Is it true that the formula is alpha = 5.0/num_topics?

I am trying to figure out the default $\alpha$ and $\eta$ values used by MALLET LDA, but there is not much information on this. I did find a couple of answers, with no proper references, saying that the symmetric $\alpha$ can be calculated as 5.0/num_topics. Why is that? Why can't I use 1.0/num_topics to calculate the symmetric $\alpha$, just like in standard LDA? Can someone please help me understand and link me to references? Thanks in advance.
Category: Data Science
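For what it's worth, MALLET's `--alpha` option is documented as the *sum* of the Dirichlet prior over all topics (default 5.0), so the symmetric per-topic value works out to 5.0/num_topics; using 1.0/num_topics would simply correspond to an alpha sum of 1.0. In code:

```python
# MALLET's --alpha parameter is the sum of the Dirichlet prior over topics
# (its documented default is 5.0), so the symmetric per-topic alpha is:
def mallet_symmetric_alpha(num_topics, alpha_sum=5.0):
    return alpha_sum / num_topics

print(mallet_symmetric_alpha(20))                  # 0.25
print(mallet_symmetric_alpha(20, alpha_sum=1.0))   # 0.05, the 1/K convention
```

So the 5.0/num_topics formula is not a different kind of prior, just a different (and equally valid) choice of alpha sum.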

Chinese restaurant process vs Dirichlet process

On the Wikipedia page for the Dirichlet process, the connection between the Chinese restaurant process and the Dirichlet process is stated as follows: if one associates draws from the base measure $H$ with every table, the resulting distribution over the sample space $S$ is a random sample of a Dirichlet process. What does it mean to associate draws from the base measure $H$ with every table? It doesn't make any sense to me.
Category: Data Science
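One way to read that sentence is as a generative simulation: seat customers according to the Chinese restaurant process, and whenever a new table opens, label it with a fresh draw from $H$; each customer's value is then their table's label. A pure-Python sketch (with $H$ taken to be a standard normal and concentration $\alpha = 1$, both choices made up for illustration):

```python
import random

random.seed(0)
alpha = 1.0          # CRP concentration parameter (assumed value)
H = random.gauss     # base measure H: here a standard normal

tables = []          # number of customers seated at each table
labels = []          # the draw from H "associated with" each table
samples = []         # the resulting sample from the Dirichlet process

for n in range(100):
    # A new customer joins an existing table with probability proportional
    # to its occupancy, or opens a new table with probability prop. to alpha.
    weights = tables + [alpha]
    k = random.choices(range(len(weights)), weights=weights)[0]
    if k == len(tables):
        tables.append(1)
        labels.append(H(0, 1))   # associate a fresh draw from H with the new table
    else:
        tables[k] += 1
    samples.append(labels[k])    # the customer's value is the table's label

# Many customers share a table, so values repeat: the empirical
# distribution of `samples` approximates a draw from DP(alpha, H).
print(len(labels), "distinct values among", len(samples), "samples")
```

The "association" is just the labelling step: the CRP decides the partition into tables, and the draws from $H$ decide what value each block of the partition takes, which together give the discrete random measure the DP describes.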

Combine two sets of clusters

I have two sets of topics obtained from two different sets of newspaper articles. In other words, Cluster_1 = $\{x_1, x_2, \ldots, x_n\}$ contains the main topics of the 'X' newspaper set, and Cluster_2 = $\{y_1, y_2, \ldots, y_n\}$ contains the main topics of the 'Y' newspaper set. Now I want to find clusters in the two sets that are similar/related, considering the cluster attributes as in the example below. Example 1: **X1 in Cluster_1** is mostly …
Category: Data Science
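One simple baseline (an assumption on my part, since the attributes are truncated above): if each topic is represented by its set of top words, topics can be matched across the two sets by Jaccard similarity. A sketch with made-up topic word sets:

```python
# Hypothetical topics, each represented by its top words.
cluster_1 = {"x1": {"election", "vote", "party"},
             "x2": {"match", "goal", "league"}}
cluster_2 = {"y1": {"vote", "party", "campaign"},
             "y2": {"stock", "market", "trade"}}

def jaccard(a, b):
    """Jaccard similarity of two word sets: |a ∩ b| / |a ∪ b|."""
    return len(a & b) / len(a | b)

# For each topic in Cluster_1, find its best match in Cluster_2.
matches = {}
for name1, words1 in cluster_1.items():
    best = max(cluster_2, key=lambda name2: jaccard(words1, cluster_2[name2]))
    matches[name1] = (best, jaccard(words1, cluster_2[best]))

print(matches)  # x1 pairs with y1 (similarity 0.5); x2 has no overlap
```

A similarity threshold would be needed in practice to leave genuinely unrelated topics unmatched, and cosine similarity over the topic-word probability vectors is a common alternative to set overlap.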

How do you work with Latent Dirichlet Allocation in practice?

One needs to provide LDA with a predefined number of latent topics. Say I have a text corpus in which I hypothesize there are 10 major topics, each composed of 10 minor subtopics. My objective is to be able to define proximity between documents. 1) How do you estimate the number of topics in practice? Empirically? With another method like the hierarchical Dirichlet process (HDP)? 2) Do you build several models? For major and minor topics …
Topic: dirichlet
Category: Data Science
