Conditional Entropy and Mutual Information - Clustering evaluation
I am doing clustering and I have the true labels for my data. For evaluation, I use the weighted average of the entropy values of the true labels within each predicted cluster. While going over alternatives, I also came across Mutual Information as a similar approach. On my data, the two seem to give similar results.
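For reference, this is roughly how I compute the two numbers (a minimal sketch on toy labels, assuming scikit-learn and NumPy; my real pipeline only differs in the data):

```python
import numpy as np
from sklearn.metrics import mutual_info_score


def weighted_average_entropy(true_labels, pred_clusters):
    """Weighted average of the true-label entropy within each predicted cluster,
    i.e. the empirical conditional entropy H(V|U), in nats."""
    true_labels = np.asarray(true_labels)
    pred_clusters = np.asarray(pred_clusters)
    n = len(true_labels)
    total = 0.0
    for cluster in np.unique(pred_clusters):
        # true-label distribution inside this predicted cluster
        members = true_labels[pred_clusters == cluster]
        counts = np.bincount(members)
        probs = counts[counts > 0] / len(members)
        cluster_entropy = -np.sum(probs * np.log(probs))
        # weight by cluster size
        total += (len(members) / n) * cluster_entropy
    return total


# toy example labels (not my real data)
true_labels = np.array([0, 0, 0, 1, 1, 1, 2, 2])
pred_clusters = np.array([0, 0, 1, 1, 1, 1, 2, 2])

h_v_given_u = weighted_average_entropy(true_labels, pred_clusters)
mi = mutual_info_score(true_labels, pred_clusters)  # I(U, V), also in nats
print(h_v_given_u, mi)
```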
However, there is one issue that puzzles me.
Given the predicted clusters $U$ and the true clusters $V$, mutual information is defined as $$ I(U,V) = H(U) - H(U|V) $$ or, equivalently, $$ I(U,V) = H(V) - H(V|U). $$ If my math is correct, the weighted average entropy that I'm using corresponds to the conditional entropy term $H(V|U)$, and minimizing it aligns with maximizing the mutual information.
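To spell out that step (in case my reasoning is off somewhere), writing $n_u$ for the size of predicted cluster $u$ and $n$ for the total number of points: $$ H(V \mid U) = \sum_{u} p(u)\, H(V \mid U = u) = -\sum_{u} \frac{n_u}{n} \sum_{v} p(v \mid u) \log p(v \mid u), $$ which, as far as I can tell, is exactly the weighted average of the per-cluster entropies, with weights proportional to cluster size.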
What I cannot see is how the weighted average entropy would differ from mutual information, and why we would need the entropy terms $H(U)$ or $H(V)$ at all. It feels like minimizing one of the conditional entropies should suffice.
To put it another way, as far as I understand from the equations, a high entropy of the true or predicted cluster assignments in itself also results in higher mutual information. Does this mean that mutual information favors equally-sized clusters?
Thanks in advance.
Topic mutual-information information-theory evaluation clustering
Category Data Science