Mutual Information in sklearn

I expected sklearn's mutual_info_classif to give a value of 1 for the mutual information of a series of values with itself but instead I'm seeing results ranging between about 1.0 and 1.5. What am I doing wrong?

This video on mutual information (from 4:56 to 6:53) says that when one variable perfectly predicts another, the mutual information score should be log_2(2) = 1. However, I do not get that result:

import pandas as pd
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import confusion_matrix

y = [1,1,1,1,1,0,0,0,0,0]
print("Confusion matrix:")
print(confusion_matrix(y, y))

print("Mutual information:")
result = mutual_info_classif(pd.DataFrame(y), y)
print(result)

which gives:

Confusion matrix:
[[5 0]
 [0 5]]
Mutual information:
[1.28730159]
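For comparison, computing the discrete mutual information by hand from the joint distribution (a sketch I added to check the video's claim; the `p_xy` matrix is just the normalized confusion matrix above) gives exactly 1 bit for a balanced binary variable paired with itself:

```python
import numpy as np

# Joint distribution of a balanced binary variable with itself:
# P(0,0) = P(1,1) = 0.5, off-diagonal entries are zero.
p_xy = np.array([[0.5, 0.0],
                 [0.0, 0.5]])
p_x = p_xy.sum(axis=1)  # marginal of the first variable
p_y = p_xy.sum(axis=0)  # marginal of the second variable

# I(X;Y) = sum_ij p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), skipping zeros.
mi_bits = sum(
    p_xy[i, j] * np.log2(p_xy[i, j] / (p_x[i] * p_y[j]))
    for i in range(2) for j in range(2)
    if p_xy[i, j] > 0
)
print(mi_bits)  # 1.0
```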

When the two variables are independent, I do however see the expected value of zero:

x = [1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1]
y = [1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0]

print("Confusion matrix:")
print(confusion_matrix(x, y))

print("Mutual information:")
result = mutual_info_classif(pd.DataFrame(x), y)
print(result)

which gives:

Confusion matrix:
[[5 5]
 [5 5]]
Mutual information:
[0.]

Why am I not seeing a value of 1 for the first case?

Topic mutual-information scikit-learn python

Category Data Science


Sklearn has several functions dealing with mutual information, and they measure it in different units and ways.

What you are looking for is normalized_mutual_info_score, which rescales the score into the [0, 1] range.

Both mutual_info_score and mutual_info_classif return the unnormalized mutual information in nats (natural-log units): mutual_info_score computes it exactly from the contingency table, while mutual_info_classif estimates it, by default treating a dense feature as continuous and using a nearest-neighbor estimator, which is why you see noisy values around 1.0–1.5 instead of an exact result.
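As a side note (a sketch based on mutual_info_classif's documented `discrete_features` parameter, whose default `'auto'` treats a dense feature matrix as continuous): declaring the feature discrete makes mutual_info_classif compute the exact discrete mutual information instead of the noisy nearest-neighbor estimate.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

y = [1,1,1,1,1,0,0,0,0,0]

# Default discrete_features='auto': a dense column is treated as
# continuous and MI is estimated with a k-nearest-neighbor method,
# giving noisy values.
noisy = mutual_info_classif(pd.DataFrame(y), y)

# Declaring the feature discrete computes the exact discrete MI,
# in nats: ln(2) ~= 0.6931.
exact = mutual_info_classif(pd.DataFrame(y), y, discrete_features=True)
print(noisy, exact)
```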

Your code becomes

import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.metrics import normalized_mutual_info_score, mutual_info_score

y = [1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0]
print("Confusion matrix:")
print(confusion_matrix(y,y))

print("Mutual information:")
result = normalized_mutual_info_score(y, y)
result_not_normalised = mutual_info_score(y,y)
print('norm: ', result)
print('not-norm: ', result_not_normalised)

giving,

Confusion matrix:
[[10  0]
 [ 0 10]]
Mutual information:
norm:  1.0
not-norm:  0.6920129648318737

Note also that $\ln(2) \simeq 0.6931$: sklearn computes mutual information using the natural logarithm, so a perfectly predictive balanced binary pair yields $\ln(2)$ nats rather than $\log_2(2) = 1$ bit.
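To confirm where the 0.6931 comes from, here is a quick check (my own sketch, not part of the original answer) comparing mutual_info_score against a direct computation of the mutual information in nats:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

y = [1]*5 + [0]*5 + [1]*5 + [0]*5

# Direct computation from the joint distribution P(0,0) = P(1,1) = 0.5:
# I = sum_ij p(i,j) * ln( p(i,j) / (p(i)*p(j)) ) = 2 * 0.5 * ln(2) nats.
manual = 2 * 0.5 * np.log(0.5 / (0.5 * 0.5))
print(manual, mutual_info_score(y, y))
```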
