At which stage should correlation analysis be done?
I have been thinking about this, but I couldn't find a logical explanation.
Once the data is ready, I mostly follow the steps below:
- Correlation analysis and elimination
- Create dummy variables if categorical variables exist
- Balance the data if the classes are imbalanced
- Scale data
- Feature selection (backward, stepwise, etc.)
- Train model
Given this workflow, at which point would it make the most sense to apply the correlation analysis? After the data is balanced? After scaling? Or at the very beginning?
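To make "correlation analysis and elimination" concrete, here is a minimal sketch of the kind of step I mean, assuming pandas and NumPy. The function name `drop_correlated`, the 0.9 threshold, and the toy data are placeholders I made up for illustration, not a recommended setting or a definitive implementation.

```python
import numpy as np
import pandas as pd


def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of every pair whose absolute Pearson
    correlation exceeds `threshold` (hypothetical helper, arbitrary cutoff)."""
    # Only numeric columns have a Pearson correlation.
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep the strict upper triangle so each pair is inspected exactly once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Toy data: "b" is almost a rescaled copy of "a", "c" is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
demo = pd.DataFrame({
    "a": x,
    "b": 2 * x + rng.normal(scale=0.01, size=200),
    "c": rng.normal(size=200),
})

reduced = drop_correlated(demo)
print(list(reduced.columns))
```

In this sketch the first column of each highly correlated pair is kept and the later one is dropped; which of the two to keep is a choice I have not thought through carefully.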
Topic correlation feature-selection
Category Data Science