How to estimate the mutual information numerically?

Suppose I have a sample $\{z_i\}_{i\in[0,N]} = \{(x_i,y_i)\}_{i\in[0,N]}$ that comes from a probability distribution $p_z(z)$. How can I use it to estimate the mutual information between $X$ and $Y$?

$MI(X,Y) = \int_Y \int_X p_z(x,y) \log{ \left(\frac{p_z(x,y)}{p_x(x)\,p_y(y)} \right) } \, dx \, dy$

where $p_x$ and $p_y$ are the marginal distributions of X and Y:

$p_x(x) = \int_Y p_z(x,y) \, dy$

$p_y(y) = \int_X p_z(x,y) \, dx$.
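
(For a concrete reference point: if $(X,Y)$ is bivariate Gaussian with correlation coefficient $\rho$, this integral has the closed form $MI(X,Y) = -\frac{1}{2}\log(1-\rho^2)$ in nats, which is handy for checking any numerical estimator on simulated data.)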

Binning

One easy way to do such an estimate is to put the continuous values into bins and obtain a discrete problem. Split up the domains of $X$ and $Y$ into bins and count the number of points that fall within each bin to obtain a density. So, the calculation would be:

$$ \sum_{b_x \in Bins_x} \sum_{b_y \in Bins_y} \frac {\#(b_x, b_y)} N \log \frac {\frac {\#(b_x, b_y)} N} {{\frac {\#b_x} N} {\frac {\#b_y} N}} $$

where $\#(b_x, b_y)$ is the number of samples where $X \in b_x$ and $Y \in b_y$, $\#b_y$ is the number of samples where $Y \in b_y$, and $\#b_x$ is the number of samples where $X \in b_x$.
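
As a rough sketch, the sum above can be computed with NumPy's 2-D histogram. The number of bins is an arbitrary choice here (it is not prescribed by the formula), and the estimate is fairly sensitive to it.

```python
import numpy as np

def mi_binned(x, y, bins=20):
    """Binned (histogram) estimate of I(X;Y) in nats.

    x, y : 1-D arrays of paired samples.
    bins : bins per axis -- an arbitrary choice, not prescribed above.
    """
    # Joint counts #(b_x, b_y) on a bins-by-bins grid.
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = counts / counts.sum()           # #(b_x, b_y) / N
    p_x = p_xy.sum(axis=1, keepdims=True)  # #b_x / N, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)  # #b_y / N, shape (1, bins)

    nz = p_xy > 0                          # treat 0 * log(0) as 0
    return np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x * p_y)[nz]))


# Sanity check on correlated Gaussian data, where the exact value
# is -0.5 * log(1 - rho^2):
rng = np.random.default_rng(0)
rho = 0.8
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=100_000)
print(mi_binned(z[:, 0], z[:, 1]))   # should land near the exact value below
print(-0.5 * np.log(1 - rho**2))     # exact: about 0.511 nats
```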

Entropy and Density Estimation

Another method is to note that the mutual information can be represented as

$$ I(X, Y) = H(X) + H(Y) - H(X,Y) $$

so if you can estimate entropy you can estimate information. Now, in order to find the expectation of a function over your data, you can just use the plug-in estimator and do

$$ E[g(X)] = {\frac 1 n} \sum_i g(X_i) $$

The problem here is that the function we need the expectation of, $\log p(x)$, is not known: $p$ itself has to be estimated from the data.

So one can estimate the density $p$ at each sample point and then apply the plug-in estimator. Kernel density estimators are one approach and nearest-neighbor estimators are another; both methods are non-parametric.
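
As a rough illustration of the kernel-density route, here is a minimal sketch assuming SciPy's `gaussian_kde` with its default bandwidth is an acceptable density estimate; evaluating each fit on its own training points (resubstitution) keeps the code short but adds some bias.

```python
import numpy as np
from scipy.stats import gaussian_kde

def mi_kde_plugin(x, y):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in nats.

    Each entropy is estimated as -1/N * sum_i log p_hat(sample_i), where
    p_hat is a Gaussian kernel density estimate fitted to the same sample
    (a resubstitution estimate, so it carries some bias).
    """
    xy = np.vstack([x, y])  # gaussian_kde expects shape (n_dims, n_samples)
    h_x = -np.mean(np.log(gaussian_kde(x)(x)))
    h_y = -np.mean(np.log(gaussian_kde(y)(y)))
    h_xy = -np.mean(np.log(gaussian_kde(xy)(xy)))
    return h_x + h_y - h_xy


# The nearest-neighbor route is available off the shelf in scikit-learn
# (a Kraskov-style k-NN estimator):
# from sklearn.feature_selection import mutual_info_regression
# mi = mutual_info_regression(x.reshape(-1, 1), y, n_neighbors=3)[0]
```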
