XGBoost Log Loss different from GridSearchCV Log Loss

I have a classification problem where I am trying to predict whether the data returns a 1 or a 0, so your classic binary classification. I have split my data into the independent variables (the features I am training on) and the dependent variable (the target I am predicting, either a 0 or a 1). I am using log loss as the scoring metric for my model.

Firstly, I am using the cv function in xgboost to figure out the number of estimators I need, as it stops when the log loss has not improved for 50 rounds. I then train my model and predict. My code is below:

import xgboost as xgb
from sklearn import metrics


def modelfit(alg, dtrain, dtarget, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):
    
    if useTrainCV:
        # gets the xgb parameters specifically.
        xgb_param = alg.get_xgb_params()
        
        # this is the internal xgb data frame that is for efficiency. We map the training data to the labels.
        xgtrain = xgb.DMatrix(dtrain.values, label=dtarget)
        
        # this performs cross-validation on the dataset. As our data is not really time dependent we can afford to
        # cross-validate. It stops when the log loss hasn't improved for 50 rounds. This is only for determining n_estimators.
        cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=alg.get_params()['n_estimators'], nfold=cv_folds,
            metrics='logloss', early_stopping_rounds=early_stopping_rounds)
                
        print(f'Optimal n_estimators - {cvresult.shape[0]}')
        
        # this sets the most optimal n_estimators parameter into the booster.
        alg.set_params(n_estimators=cvresult.shape[0])
            
    # fit the algorithm on the data and set evaluation metric
    alg.fit(dtrain.values, dtarget, eval_metric='logloss', eval_set=[(dtrain.values, dtarget)])
    
    print(alg.evals_result())
        
    # predict training set:
    dtrain_predictions = alg.predict(dtrain.values)
    print(dtrain_predictions)
    dtrain_predprob = alg.predict_proba(dtrain.values)[:,1]
        
    # print model report:
    print("\nModel Report")
    print("Log Loss Score (Train): %f" % metrics.log_loss(dtarget, dtrain_predprob))

I then run this function on this particular XGBClassifier:

# Choose all predictors
xgb1 = XGBClassifier(
    learning_rate=0.1,
    n_estimators=1000,
    max_depth=5,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    scale_pos_weight=1,
    nthread=-1,
    seed=27)

modelfit(xgb1, X, y)

The log loss value that is returned is 0.577496 and the number of estimators is 65.

I then turn to GridSearchCV to tune the other parameters and I start with:

param_test1 = {
 'max_depth' : range(1,10),
 'min_child_weight' : range(1,6)
}

Note that the original max_depth and min_child_weight values used in the xgb1 classifier fall within these ranges.

xgb2 = XGBClassifier(
        learning_rate=0.1,
        n_estimators=65,
        max_depth=5,
        min_child_weight=1,
        gamma=0,
        subsample=0.8,
        colsample_bytree=0.8,
        objective='binary:logistic',
        nthread=-1,
        scale_pos_weight=1,
        seed=27
)

gsearch1 = GridSearchCV(
    estimator=xgb2,
    param_grid=param_test1, scoring='neg_log_loss', n_jobs=-1, cv=5
)

gsearch1.fit(X, y)
gsearch1.best_params_, gsearch1.best_score_

However, this returns:

({'max_depth': 1, 'min_child_weight': 1}, -0.6275341839742403)

So my question is: how has the grid search decided that the best parameters are max_depth = 1 and min_child_weight = 1, with a log loss of 0.628, when before using GridSearchCV my model returned a better log loss of 0.577 with max_depth = 5 and min_child_weight = 1?

Any help would be appreciated, please. Thanks.

Topic grid-search xgboost ensemble-modeling classification machine-learning

Category Data Science


Your modelfit prints the training score, but GridSearchCV bases its decisions on the out-of-fold average; in particular, best_score_ is an average score over the held-out validation folds. That makes the comparison unfair: your 0.577 is computed on the same data the model was fit on, so it is almost certainly optimistically biased. Note that the xgb.cv call inside modelfit already produces an out-of-fold estimate (the test-logloss-mean column of cvresult), and that is the fairer number to compare against -best_score_.
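As a sanity check, here is a minimal sketch (assuming X and y are the same feature matrix and target you passed to modelfit and gsearch1) that scores your original max_depth=5 configuration with the same kind of 5-fold cross-validation GridSearchCV uses, rather than on the training data it was fit on. The estimator name xgb_check is just illustrative.

from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Hypothetical re-check: same settings as xgb1/xgb2 with max_depth=5,
# but scored out-of-fold instead of on the training data.
xgb_check = XGBClassifier(
    learning_rate=0.1,
    n_estimators=65,
    max_depth=5,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    scale_pos_weight=1,
    nthread=-1,
    seed=27)

# X and y are assumed to be the same data used above.
cv_scores = cross_val_score(xgb_check, X, y, scoring='neg_log_loss', cv=5)
print('Out-of-fold log loss (max_depth=5): %f' % -cv_scores.mean())
# Compare this value with -gsearch1.best_score_, not with the training
# log loss printed by modelfit.

You should expect this number to land in the same ballpark as the grid search scores rather than near 0.577, which would confirm that the discrepancy comes from comparing a training score with a validation score.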
