Does Sample Size Affect Mutual Information for Feature Selection?

Fernando

June 14, 2021, 00:27

Given a dataset with n rows (samples) and p columns (variables/features), the objective is to predict a target variable y. Should the sample size n affect the results of pairwise mutual information tests between each feature and y? In other words, if n is too small or too large, can the results no longer be trusted? My intuition says no, but I'm not fully confident.
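One way to probe the sample-size question empirically is a small simulation. The sketch below is only an illustration under stated assumptions: the features are discrete, the estimator is the simple plug-in (empirical-frequency) estimate, and the helper names `plugin_mutual_information` and `mean_mi_of_independent_pair` are hypothetical. It draws a feature and a target that are independent by construction, so the true mutual information is 0, and averages the estimate over repeated draws for several values of n.

```python
import math
import random
from collections import Counter


def plugin_mutual_information(xs, ys):
    """Plug-in MI estimate (in nats) for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), with counts in place of probabilities
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def mean_mi_of_independent_pair(n, levels=4, repeats=200, seed=0):
    """Average plug-in MI over `repeats` draws of two independent uniform variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        xs = [rng.randrange(levels) for _ in range(n)]
        ys = [rng.randrange(levels) for _ in range(n)]  # independent of xs by construction
        total += plugin_mutual_information(xs, ys)
    return total / repeats


if __name__ == "__main__":
    # Assumed sample sizes, chosen only for illustration.
    for n in (20, 200, 2000):
        print(n, round(mean_mi_of_independent_pair(n), 4))
```

With this particular estimator, the average estimate for an independent pair sits above 0 and shrinks as n grows, which suggests that small-sample MI scores are better compared against a baseline (for example, scores obtained after shuffling y) than read as absolute values. Other estimators, such as nearest-neighbor ones for continuous features, may behave differently.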

And is there a good reason, besides domain knowledge, not to exclude a variable whose mutual information with y came out as 0 in this test?
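A toy case that bears on this second question, sketched under the assumption of two binary features and a target defined as their XOR (the data and the helper name `plugin_mi` are made up for illustration): each feature taken alone has zero mutual information with y, yet the two features together determine y completely. A purely pairwise test cannot see this kind of interaction.

```python
import math
from collections import Counter
from itertools import product


def plugin_mi(xs, ys):
    """Plug-in MI estimate (in nats) for two equal-length discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Assumed toy data: all four equally likely combinations of two binary features.
rows = list(product([0, 1], repeat=2))
x1 = [a for a, b in rows]
x2 = [b for a, b in rows]
y = [a ^ b for a, b in rows]  # target is the XOR of the two features

mi_x1 = plugin_mi(x1, y)                    # feature 1 alone vs. y
mi_x2 = plugin_mi(x2, y)                    # feature 2 alone vs. y
mi_pair = plugin_mi(list(zip(x1, x2)), y)   # both features jointly vs. y

print(mi_x1, mi_x2, mi_pair)
```

Here the pairwise scores are 0 while the joint score equals log 2 nats (one bit), so dropping either feature based on its individual score would discard information the model needs.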

Topic mutual-information feature-selection

Category Data Science
