GridSearch multiplying the number of trees in XGBoost?

I'm having an issue: after running XGBoost inside a HalvingGridSearchCV, the search returns a certain number of estimators (50, for example), but the number of trees in the booster is always three times that. I don't understand why.

Here is the code:


model = XGBClassifier(objective='multi:softprob', subsample = 0.9, colsample_bytree=0.5, num_class= 3)

md = [3, 6, 10, 15]
lr = [0.1, 0.5, 1]
g = [0, 0.25, 1]
rl = [0, 1, 10]
spw = [1, 3, 5]
ns = [5, 10, 20]

param_grid = {'max_depth': md, 'learning_rate':lr, 'gamma':g, 'reg_lambda':rl, 'scale_pos_weight':spw, 'n_estimators':ns}

sh = HalvingGridSearchCV(model, param_grid,  cv=10, factor=2, resource='n_samples', n_jobs=-1,
                         max_resources=60, min_resources = 30, 
                         error_score='raise', verbose=1).fit(X_train,y_train)

best_estimator = sh.best_estimator_

Then I print the number of trees:

dump_list = best_estimator.get_booster().get_dump()
num_trees = len(dump_list)

print('number of trees:', num_trees)

print(best_estimator)

And I get:

number of trees: 150
XGBClassifier(colsample_bytree=0.5, gamma=1, max_depth=10, n_estimators=50,
              num_class=3, objective='multi:softprob', reg_lambda=10,
              subsample=0.9)

As you can see, I have three times more trees than I'm supposed to. I've been looking into it for hours, and I have no idea why.

Topic gradient-boosting-decision-trees xgboost decision-trees classification machine-learning

Category Data Science


You have three classes; with objective='multi:softprob', XGBoost builds one tree per class per boosting round, so the booster dump contains n_estimators × n_classes = 50 × 3 = 150 trees.

https://github.com/dmlc/xgboost/issues/806
