One-hot encoding and interaction features for multiple categorical variables

Is there any value in creating combined features from multiple categorical variables when the individual variables are already one-hot encoded?

Simple example: there is a variable P with categories {X, Y} and a variable Q with categories {Z, W}. After one-hot, we would have 4 variables: P.X, P.Y, Q.Z, and Q.W.

In this scenario, would the algorithm (XGBoost or a deep neural network) learn the interaction effects between these well enough on its own, or is there further value in creating the variables X.Z, X.W, Y.Z, and Y.W, one for each unique combination of P and Q?

I am asking in order to decide whether to create these interaction variables in my real-world scenario, where I have 7 such categorical features with 6-15 categories each, which would mean thousands of new variables to cover all possible combinations of levels.
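As a rough check on that count, here is a small sketch. The cardinalities are hypothetical (the question only says 7 features with 6-15 categories each), so the exact numbers are illustrative only.

```python
from itertools import combinations
from math import prod

# Hypothetical cardinalities for the 7 features (assumption: the question
# only states that each feature has between 6 and 15 categories).
cardinalities = [6, 8, 9, 10, 12, 14, 15]

# Columns produced by plain one-hot encoding: one per category.
one_hot_columns = sum(cardinalities)

# Columns needed for every pairwise (two-feature) combination.
pairwise_columns = sum(a * b for a, b in combinations(cardinalities, 2))

# Columns needed for every full seven-way combination.
full_columns = prod(cardinalities)

print(one_hot_columns, pairwise_columns, full_columns)
```

With these assumed cardinalities, pairwise combinations alone already add a couple of thousand columns, while the full seven-way cross runs into the millions.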



Tree models can pick up the interaction whether the features are label encoded or one-hot encoded (OHE).
If you explicitly create every combination of category values as a new feature, there is no interaction left for the model to learn (at least at the first level, i.e. between pairs of features).

For example, take the features:
P - {X,Y}
Q - {Z,W}


Assume P and Q interact at P = X and Q = W.
If you label encode, you will see consecutive splits on P == X and Q == W.
If you one-hot encode, you will see consecutive splits on Is_P_X and Is_Q_W (since both are now separate features).
In both cases, you have to inspect the splits to see the interaction, or use a partial dependence plot or a similar approach.
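A minimal sketch of the one-hot case, assuming scikit-learn's DecisionTreeRegressor as a stand-in for XGBoost. The data is synthetic: the target is nonzero only when P is X and Q is W, and the effect size, noise level, and sample count are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 2000

# Synthetic data (assumption): two categorical features as in the example.
df = pd.DataFrame({
    "P": rng.choice(["X", "Y"], size=n),
    "Q": rng.choice(["Z", "W"], size=n),
})

# Pure interaction: the target jumps by 5 only when P == X and Q == W.
interaction = ((df["P"] == "X") & (df["Q"] == "W")).astype(float)
y = 5.0 * interaction + rng.normal(0.0, 0.1, size=n)

# Plain one-hot encoding, with no combined features.
X_ohe = pd.get_dummies(df, dtype=float)

# A shallow tree is enough to isolate the (X, W) cell: it splits on a
# P column and then on a Q column (or the other way round).
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X_ohe, y)
print(export_text(tree, feature_names=list(X_ohe.columns)))
```

The printed rules should show a split on one of the P columns followed by a split on one of the Q columns along the same path, which is the "consecutive splits" pattern described above.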

If you create features for all combinations:
In this case, you will see a very high feature importance for the feature X_W. That importance is really due to the interaction, but it shows up as a split on the single feature X_W rather than as consecutive splits on two features.
Here you can observe the interaction through feature importance.
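A minimal sketch of the combined-feature case, again assuming scikit-learn's DecisionTreeRegressor as a stand-in for XGBoost and a synthetic target that is nonzero only when P is X and Q is W. Tree-based feature importance here is scikit-learn's impurity-based measure, which may differ from the importance types XGBoost reports.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 2000

# Synthetic data (assumption): two categorical features as in the example.
df = pd.DataFrame({
    "P": rng.choice(["X", "Y"], size=n),
    "Q": rng.choice(["Z", "W"], size=n),
})

# Pure interaction: the target jumps by 5 only when P == X and Q == W.
interaction = ((df["P"] == "X") & (df["Q"] == "W")).astype(float)
y = 5.0 * interaction + rng.normal(0.0, 0.1, size=n)

# One-hot columns for P and Q on their own ...
base = pd.get_dummies(df[["P", "Q"]], dtype=float)
# ... plus one column per combination: X_W, X_Z, Y_W, Y_Z.
combo = pd.get_dummies(df["P"] + "_" + df["Q"], dtype=float)
X_all = pd.concat([base, combo], axis=1)

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_all, y)

# Impurity-based importances, largest first.
importances = pd.Series(tree.feature_importances_, index=X_all.columns)
print(importances.sort_values(ascending=False))
```

On this data the tree splits on X_W at the root, and nearly all of the importance lands on that single column instead of being spread across P and Q.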

Below is an example on dummy data where I added an interaction effect on the output for Sunday and breezy weather.

Image 01 - Showing Interaction in the splits

Image 02 - Simply splitting on Sun_Breezy when every combination is a separate feature
