Heat map and correlation among variables

I have a question about heat maps and correlation among variables. I created this heat map to look at possible correlations between the variables and the target, and I got very small values. I wanted to set a small threshold, e.g., 0.05, for selecting features. Do you think that makes sense, or should I exclude all of them?

Topic heatmap correlation feature-selection machine-learning

Category Data Science


From the information you provide, it seems you are carrying out feature selection based on the correlation between your predictor variables and the target.
This is a valid type of feature selection (see here) in the family of univariate filter methods, although it is not the only one. It is fast and intuitive, but it is worth having a look at other methods too. You might also be interested in:

  • variance threshold selection (also a univariate filter method, applied per input feature): it assumes that a feature with higher variance may have more predictive power
  • sequential backward selection (look here): it has a higher computational cost, but features are judged in subsets (not independently as above), and it is fine if you don't have many features (as seems to be the case)
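
To make the two univariate filters concrete, here is a minimal sketch in plain Python. The function names (`pearson`, `select_by_correlation`, `select_by_variance`), the toy data, and the choice to treat the correlation of a constant column as 0 are all assumptions made for illustration; the 0.05 threshold is the one proposed in the question.

```python
from math import sqrt
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; this sketch
        # treats it as 0 so the column is filtered out (an assumption).
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_by_correlation(features, target, threshold=0.05):
    """Keep features whose |correlation with the target| is >= threshold."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


def select_by_variance(features, threshold=0.0):
    """Keep features whose population variance is strictly above threshold."""
    return [name for name, values in features.items() if pvariance(values) > threshold]


# Toy data, made up for illustration only.
features = {
    "linear": [1, 2, 3, 4, 5, 6],      # perfectly linear in the target
    "noise": [1, -1, 0, 0, -1, 1],     # no linear relation to the target
    "constant": [7, 7, 7, 7, 7, 7],    # zero variance
}
target = [2, 4, 6, 8, 10, 12]

kept_by_corr, corr_scores = select_by_correlation(features, target, threshold=0.05)
kept_by_var = select_by_variance(features)
```

Note that the variance filter never looks at the target, so it can keep a feature such as `noise` that the correlation filter drops. Pearson correlation also only measures linear association, so a very small value does not by itself prove that a feature is useless.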

There are many other strategies for feature selection (you might want to check this source).
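
As an illustration of judging features in subsets, the sequential backward selection mentioned above could be sketched roughly as follows. `backward_selection` and `toy_score` are hypothetical names; in practice the scoring function would typically be something like the cross-validated score of a model trained on the given subset, which is replaced here by a made-up function so the example stays self-contained.

```python
def backward_selection(feature_names, score_fn, n_keep):
    """Greedy sequential backward selection.

    Start from all features and repeatedly drop the one whose removal
    leaves the highest-scoring subset, until n_keep features remain.
    score_fn is supplied by the caller: tuple of names -> float,
    where a higher value means a better subset.
    """
    selected = list(feature_names)
    while len(selected) > n_keep:
        # Every subset obtained by removing exactly one remaining feature.
        candidates = [
            [f for f in selected if f != dropped] for dropped in selected
        ]
        selected = max(candidates, key=lambda subset: score_fn(tuple(subset)))
    return selected


def toy_score(subset):
    """Made-up score standing in for a cross-validated model score."""
    chosen = set(subset)
    score = 0.0
    if "a" in chosen:
        score += 0.5                 # "a" is useful on its own
    if "b" in chosen and "c" in chosen:
        score += 0.4                 # "b" and "c" only help together
    score -= 0.01 * len(chosen)      # small penalty per extra feature
    return score


selected = backward_selection(["a", "b", "c", "d"], toy_score, n_keep=3)
```

In this toy setup `b` and `c` contribute nothing individually, so a univariate filter would discard them, while the subset-based search keeps them and drops `d`. The price is many more score evaluations, which is why the approach suits cases with few features.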
