Is XGBoost regression scale-invariant? Depth-0 trees for a target variable with small (1e-7) values

I thought the consensus was that XGBoost is largely scale-invariant and that scaling of features isn't really necessary, but something is going wrong and I don't understand what.

I have a range of features on different scales, and I'm trying to run a regression against a target variable on the order of 1e-7 (i.e. the target will be somewhere between 1e-7 and 9e-7).

When I run XGBoost on this, I get warnings about depth-0 trees, and every prediction is the same value regardless of the input features.

If I scale my target variable by 1e9 (so it's on the order of 100 rather than 1e-7), then the regression works perfectly fine and I get a decent model.
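
For reference, here's a minimal sketch of the kind of setup I mean (the data is synthetic and just for illustration; everything else is left at XGBoost's defaults, i.e. the reg:squarederror objective):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (np.abs(X[:, 0]) + 1.0) * 1e-7       # target roughly between 1e-7 and 4e-7

model = xgb.XGBRegressor(n_estimators=50)
model.fit(X, y)                           # depth-0 trees, constant predictions

SCALE = 1e9
model_scaled = xgb.XGBRegressor(n_estimators=50)
model_scaled.fit(X, y * SCALE)            # works fine on the rescaled target
pred = model_scaled.predict(X) / SCALE    # undo the scaling afterwards
```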

Can anyone shed any light on this? Is there a particular hyperparameter that's sensitive to the scale of the target variable?

Topic: gradient-boosting-decision-trees, xgboost, python

Category: Data Science


It may not be related to hyperparameters per se. I think it has more to do with how XGBoost is trained: for regression it tries to reduce the variance (the squared-error loss) at every node. When the target values are this small, the computed loss reductions effectively round to zero, so the trees have nothing to split on. It most likely comes down to the floating-point precision XGBoost works with: it accumulates gradient statistics in single precision, and with the default base_score of 0.5 the initial residuals for targets around 1e-7 are all approximately 0.5, differing only by a few 32-bit float steps.
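
A quick sketch of the precision argument (this assumes the internals I described above: single-precision gradient statistics and the default base_score of 0.5):

```python
import numpy as np

# With base_score = 0.5, the first-round residual for squared error
# is grad = 0.5 - y. Represent everything in float32, as XGBoost does.
y = np.array([1e-7, 5e-7, 9e-7], dtype=np.float32)
grad = np.float32(0.5) - y

print(grad)                           # all values within a few float32 steps of 0.5
print(np.spacing(np.float32(0.5)))   # ~5.96e-08: one float32 step at 0.5 is the
                                      # same order as the entire target range
```

Because the gradients are nearly indistinguishable at that precision, no split yields a measurable gain, which matches the depth-0 trees you're seeing. That also explains why rescaling the target by 1e9 fixes it: the residual differences become large relative to float32 resolution.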
