I am thinking about using a Hierarchical Dirichlet Process (HDP) to model a patent dataset. I've seen that HDP uses a base distribution and assumes that every topic is drawn from that base distribution. The problem is: first, I'm wondering what the main results of the HDP procedure are (in the case of LDA we obtain two matrices, document-topic and topic-word, that we can use to construct word clouds and graphs, but in this case I'm not sure about the results) and what is the exact …
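To make the question concrete, here is a minimal sketch of the outputs I would expect, assuming gensim's HdpModel (the documents here are toy placeholders, not my patent data):

```python
from gensim.corpora import Dictionary
from gensim.models import HdpModel

# Toy stand-in for the patent corpus (placeholder data).
docs = [["wireless", "antenna", "signal"],
        ["battery", "charge", "lithium"],
        ["antenna", "battery", "circuit"]]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

hdp = HdpModel(corpus=corpus, id2word=dictionary)

# Topic-word weights for the topics HDP instantiated
# (the analogue of LDA's topic-word matrix, but with the
# number of topics inferred rather than fixed).
print(hdp.show_topics(num_topics=5, formatted=True))

# Per-document topic weights (the analogue of LDA's
# document-topic matrix).
print(hdp[corpus[0]])
```

Is it correct that these two outputs play the same role as LDA's two matrices, just with the topic count inferred from the data?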
In short, the question is: I have two sets of words per document, and I would like to extract two sets of topics per document, one per word set. To be more precise: a document $d$ can be modelled as the union of two sets of words (WordSetA, WordSetB), where $\text{WordSetA} \cup \text{WordSetB}$ contains all words in $d$. The goal is to find two sets of topics corresponding to these word sets (TopicSetA and TopicSetB), where TopicSetA is a mixture of …
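The simplest baseline I can think of is fitting two independent LDA models, one per word set, though that ignores any coupling between the two sets. A minimal sketch of that baseline, assuming gensim and placeholder data:

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Placeholder documents already split into the two word sets.
docs_a = [["court", "ruling", "appeal"], ["judge", "court", "verdict"]]
docs_b = [["merger", "stock", "shares"], ["profit", "stock", "market"]]

def fit_lda(docs, num_topics=2):
    dictionary = Dictionary(docs)
    corpus = [dictionary.doc2bow(doc) for doc in docs]
    model = LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics)
    return model, corpus

lda_a, corpus_a = fit_lda(docs_a)  # TopicSetA is learned from WordSetA only
lda_b, corpus_b = fit_lda(docs_b)  # TopicSetB is learned from WordSetB only

# Per-document mixtures over each topic set.
print(lda_a.get_document_topics(corpus_a[0]))
print(lda_b.get_document_topics(corpus_b[0]))
```

Is there a model that does this jointly instead of with two separate runs?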
I am trying to figure out the default $\alpha$ and $\eta$ values used by Mallet LDA, but there is not a lot of information on this. I did find a couple of answers, with no proper references, saying that the symmetric $\alpha$ can be calculated as 5.0/num_topics. Why is that? Why can't I use 1.0/num_topics to calculate the symmetric $\alpha$, just as in standard LDA? Can someone please help me understand and link me to references? Thanks in advance.
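For concreteness, here is the arithmetic I mean, assuming the 5.0/num_topics claim about Mallet is accurate (that is exactly the part I would like a reference for):

```python
# Per-topic symmetric alpha under the two conventions I've seen,
# for an example run with 20 topics.
num_topics = 20

# Reportedly Mallet's default: a total alpha mass of 5.0 spread
# symmetrically over the topics (unverified, hence this question).
alpha_mallet = 5.0 / num_topics    # 0.25

# The 1/K convention used in standard LDA implementations.
alpha_standard = 1.0 / num_topics  # 0.05

print(alpha_mallet, alpha_standard)
```

So the practical question is why the total prior mass would be 5.0 rather than 1.0.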
I have come across Latent Dirichlet Allocation (LDA) on multiple occasions while reading about sentiment analysis and recommender systems. Where can I find good reading material that explains the concept in depth, ideally one that works through an example?
On the Wikipedia Dirichlet process page, regarding the connection between the Chinese restaurant process and the Dirichlet process, it states the following: "If one associates draws from the base measure H with every table, the resulting distribution over the sample space S is a random sample of a Dirichlet process." What does it mean to associate draws from the base measure H with every table? It doesn't make any sense to me.
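To show where I get stuck, here is a minimal simulation of the construction as I understand it, with H chosen (by me, purely for illustration) to be a standard normal:

```python
import random

alpha = 1.0          # concentration parameter
n_customers = 10
random.seed(0)

table_counts = []    # number of customers seated at each table
table_draws = []     # the draw from H "associated with" each table

samples = []
for n in range(n_customers):
    # New table with probability alpha / (n + alpha),
    # existing table k with probability count_k / (n + alpha).
    if random.random() < alpha / (n + alpha):
        table_counts.append(1)
        table_draws.append(random.gauss(0.0, 1.0))  # one draw from H = N(0, 1)
        k = len(table_counts) - 1
    else:
        # Pick an existing table proportionally to its occupancy.
        r = random.uniform(0, n)
        k, acc = 0, table_counts[0]
        while r > acc:
            k += 1
            acc += table_counts[k]
        table_counts[k] += 1
    # The customer's sampled value is its table's associated draw from H.
    samples.append(table_draws[k])

print(samples)  # repeated values = the discrete atoms of the DP sample
```

Is the "association" just the `table_draws` list here, i.e. every table gets one value from H and all its customers share that value?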
I have two sets of topics obtained from two different sets of newspaper articles. In other words, Cluster_1 = $\{x_1, x_2, \ldots, x_n\}$ includes the main topics of the 'X' newspaper set and Cluster_2 = $\{y_1, y_2, \ldots, y_n\}$ includes the main topics of the 'Y' newspaper set. Now I want to find topics in the two sets that are similar/related by considering their attributes, as in the example below. Example 1, **X1 in Cluster_1** is mostly …
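What I have tried so far is pairwise cosine similarity between the topics' word-weight vectors; a sketch with made-up toy vectors, assuming each topic is represented over a shared vocabulary:

```python
import math

# Toy topic-word weight vectors over a shared 4-word vocabulary
# (made-up numbers, standing in for my real topic models).
cluster_1 = {"x1": [0.6, 0.3, 0.1, 0.0], "x2": [0.0, 0.1, 0.4, 0.5]}
cluster_2 = {"y1": [0.5, 0.4, 0.1, 0.0], "y2": [0.1, 0.0, 0.5, 0.4]}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Pairwise similarity between topics across the two clusters.
for xi, u in cluster_1.items():
    for yj, v in cluster_2.items():
        print(xi, yj, round(cosine(u, v), 3))
```

Is there a more principled measure than cosine similarity for matching topics across corpora?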
One needs to provide LDA with a predefined number of latent topics. Let's say I have a text corpus in which I hypothesize there are 10 major topics, each composed of 10 minor subtopics. My objective is to be able to define proximity between documents. 1) How do you estimate the number of topics in practice? Empirically? With another method, like the Hierarchical Dirichlet Process (HDP)? 2) Do you build several models? For major and minor topics …
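For question 1, the "empirical" approach I have in mind is a coherence sweep over candidate topic counts; a minimal sketch assuming gensim and placeholder texts:

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

# Placeholder tokenized corpus (stands in for the real one).
texts = [["economy", "market", "stock"],
         ["match", "goal", "league"],
         ["economy", "budget", "tax"],
         ["league", "season", "team"]]

dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

# Fit one model per candidate K and score topic coherence;
# the K with the best coherence is the "empirical" estimate.
for k in (2, 5, 10):
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k, random_state=0)
    cm = CoherenceModel(model=lda, texts=texts, dictionary=dictionary, coherence="c_v")
    print(k, cm.get_coherence())
```

Is this sweep the standard practice, or is HDP preferred precisely because it avoids the sweep?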