Association between Categorical Variables and regression

We perform data analysis and build models. Say, for example, I have built a regression model with more than one predictor (multiple regression). We then check a number of things: normality of the residuals, multicollinearity, and so on. For multicollinearity among numeric/continuous variables, we compute VIFs (Variance Inflation Factors) and similar diagnostics. If we find multicollinearity, we drop one of the highly correlated features.

My question is: what can be done with categorical variables? If two categorical variables are correlated/associated, does that mean I have to drop one of them? I am not clear on how to handle associations between categorical variables the way we handle correlations between continuous variables.

What does it mean for two factor variables to be correlated, dependent, or independent? What if there is collinearity between them? How do you identify that collinearity, and how do you deal with it?

Topic pearsons-correlation-coefficient data regression categorical-data

Category Data Science


Spearman's rank correlation coefficient only applies when both variables are *ordinal*, i.e. their categories have a natural ordering. For nominal categorical variables, the usual tools are the chi-squared test of independence (does knowing one variable tell you anything about the other?) and Cramér's V, which rescales the chi-squared statistic into a 0-to-1 measure of association strength. If two categorical predictors turn out to be very strongly associated, one of them can be dropped, just as you would drop one of a pair of highly correlated continuous predictors. For dummy-coded categorical predictors inside a regression, the generalized VIF (GVIF) plays the role that the ordinary VIF plays for continuous predictors.
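A minimal sketch of the chi-squared test plus (bias-uncorrected) Cramér's V for two nominal variables, using scipy; the variable names and the toy data are made up for illustration:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical data where "region" strongly predicts "channel"
df = pd.DataFrame({
    "region":  ["north"] * 50 + ["south"] * 50,
    "channel": ["web"] * 45 + ["store"] * 5 + ["web"] * 10 + ["store"] * 40,
})

# Contingency table of observed counts
table = pd.crosstab(df["region"], df["channel"])
chi2, p, dof, expected = chi2_contingency(table)

# Cramér's V rescales chi-squared to [0, 1]: 0 = independent, 1 = perfect
n = table.to_numpy().sum()
r, k = table.shape
cramers_v = np.sqrt((chi2 / n) / min(r - 1, k - 1))

print(f"p-value = {p:.4g}, Cramér's V = {cramers_v:.3f}")
```

A tiny p-value only says the variables are not independent; Cramér's V tells you how strong the association actually is, which is the more useful number when deciding whether a predictor is redundant.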
