Feature selection algorithm for psychometrics when there are several predicted variables

I'm working on a psychometric study, run as a survey.

All variables are measured on a 7-point scale, so we treat them as continuous.

I have this dataset:

  • 600 features
  • 100 predicted variables
  • 100 survey answers so far

We are stuck running the survey because 700 questions is far too many for respondents. No surprise there.

We would like to select 100 features out of the 600.

We already removed problematic features using Cronbach's alpha, a low-variance filter, and a check for highly correlated variables. There are still too many features left.
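To make that pre-filtering step concrete, here is a minimal sketch of this kind of screening using NumPy. The function names (`cronbach_alpha`, `prefilter`), the thresholds (`min_var`, `max_corr`), and the synthetic data are illustrative assumptions, not the exact procedure or values used in the study.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) array of one scale."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the summed score
    return k / (k - 1) * (1 - sum_item_var / total_var)

def prefilter(X, min_var=0.5, max_corr=0.9):
    """Return indices of columns that survive a variance and a correlation filter.

    min_var and max_corr are hypothetical thresholds chosen for illustration.
    """
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0, ddof=1)
    # Step 1: drop near-constant items.
    keep = [j for j in range(X.shape[1]) if variances[j] >= min_var]
    if not keep:
        return []
    # Step 2: greedily drop any item too correlated with one already selected.
    corr = np.atleast_2d(np.abs(np.corrcoef(X[:, keep], rowvar=False)))
    selected = []  # list of (position in keep, original column index)
    for pos, j in enumerate(keep):
        if all(corr[pos, prev_pos] < max_corr for prev_pos, _ in selected):
            selected.append((pos, j))
    return [j for _, j in selected]

# Synthetic 7-point answers: 100 respondents, 6 items (made-up data).
rng = np.random.default_rng(0)
X = rng.integers(1, 8, size=(100, 6)).astype(float)
X[:, 1] = X[:, 0]   # exact duplicate of item 0, should be dropped
X[:, 2] = 4.0       # constant item, should be dropped
kept = prefilter(X)
```

The correlation filter is greedy, so which item of a correlated pair survives depends on column order. Cronbach's alpha is meant to be computed per scale (a group of items intended to measure the same construct), not across all 600 features at once.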

We would like to use a feature selection algorithm, something like chi-square, Lasso, ANOVA, or random forest feature importance.

However, I don't really know which of these suits this use case, given that there are 100 predicted variables rather than one.

What would you choose?

Thanks

Topic multi-output feature-engineering feature-extraction feature-selection algorithms

Category Data Science
