For feature selection, do we use Chi-squared and Mutual Information together?
Or do we choose only one of the two for categorical data?
Usually feature selection is done with mutual information, correlation, or conditional entropy. I'm not aware of statistical tests like chi-squared being commonly used for this, mainly because the goal is usually to obtain a score representing each feature's importance, not a yes/no answer for every feature.
But in theory one can use whichever method they want. In general, individual (univariate) feature selection is a rough approximation anyway, since it doesn't take into account the joint contribution of subsets of features.
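As a minimal sketch of the scoring idea above (assuming scikit-learn and synthetic data, neither of which is in the question): score each categorical feature individually with mutual information, and compare with a chi-squared statistic on the same features. Both produce a per-feature ranking rather than a single yes/no answer.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, chi2

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)            # binary target
informative = y ^ (rng.random(500) < 0.1)   # mostly tracks y (10% flipped)
noise = rng.integers(0, 3, size=500)        # categorical feature unrelated to y
X = np.column_stack([informative, noise])

# Mutual information: one importance score per feature
mi = mutual_info_classif(X, y, discrete_features=True, random_state=0)

# Chi-squared: a test statistic (and p-value) per feature
chi2_stat, p_values = chi2(X, y)

# Either score can be used to rank features; here feature 0 should rank first
ranking = np.argsort(mi)[::-1]
```

In practice you would pick one scoring function and feed it to a selector such as `SelectKBest`; running both, as here, is mainly useful to check that the two rankings agree.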