How to choose Recursive Feature Elimination parameters

In my project I have 900 features, and I thought of using the Recursive Feature Elimination (RFE) algorithm to reduce the dimensionality of my problem (in order to improve accuracy).

But I can't figure out how to choose the RFE parameters (the estimator and the number of features to select).

Should I use model selection techniques in this case as well? Do you have any advice?

Topic rfe model-selection dimensionality-reduction

Category Data Science


In practice there are two options for choosing feature selection parameters:

  • Select the number of features and the other parameters arbitrarily and/or based on external constraints. It's common to choose these parameters simply based on how much memory is available and how long the training process might take.
  • Run a full hyper-parameter tuning process: try many different values, training a model for each one, then evaluate each on a validation set or using cross-validation. At the end of the process the parameters that achieved the highest performance are picked, and the final model is re-trained and then evaluated on a different (fresh) test set.
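As a rough illustration of the first option, here is a minimal sketch assuming scikit-learn's `RFE` class is the implementation in use (the question does not say). The synthetic data, the logistic regression estimator, and the budget of 10 features are all placeholders chosen for the example, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the real data (assumption): 200 samples, 50 features.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=5, random_state=0
)

# Fixed budget picked up front: keep 10 features, dropping 5 per iteration.
# A larger step means fewer model fits, which matters with many features.
selector = RFE(
    estimator=LogisticRegression(max_iter=1000),
    n_features_to_select=10,
    step=5,
)
selector.fit(X, y)

# Keep only the columns that RFE retained.
X_reduced = selector.transform(X)
print(X_reduced.shape)  # (200, 10)
```

With 900 features, a `step` greater than 1 is one way to keep the running time within the external constraints mentioned above, at the cost of a coarser elimination.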

Needless to say, the second option requires a lot more time and/or computing power. Keep in mind that as soon as multiple trials are made with different parameters, the score used to pick the best one is optimistically biased, so in theory a fresh test set is required for the final evaluation.
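The second option could look roughly like the sketch below, again assuming scikit-learn. RFE is placed inside a `Pipeline` so that the selection is redone within each cross-validation fold, the number of features is searched with `GridSearchCV`, and a held-out test set is scored only once at the end. The data, the candidate values, and the estimator are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the real data (assumption).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=5, random_state=0
)

# Hold out a fresh test set that the parameter search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Feature selection followed by the final classifier.
pipeline = Pipeline([
    ("rfe", RFE(estimator=LogisticRegression(max_iter=1000), step=5)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Candidate numbers of features to keep (illustrative values).
param_grid = {"rfe__n_features_to_select": [5, 10, 20]}

# 5-fold cross-validation on the training data only; with the default
# refit=True the best pipeline is re-trained on the whole training set.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_n = search.best_params_["rfe__n_features_to_select"]

# Single final evaluation on the untouched test set.
test_accuracy = search.score(X_test, y_test)
print(best_n, round(test_accuracy, 3))
```

The estimator used inside RFE can be tuned the same way by adding an `rfe__estimator` entry to the grid. scikit-learn also provides `RFECV`, which chooses the number of features by cross-validation on its own, though a separate test set is still needed for the final estimate.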
