Why is a model I train myself on the full training data better than the best_estimator_ returned by GridSearchCV with refit=True?

I am using an XGBoost model to classify some data. I have cv splits (train, val) and a separate test set that I never use until the end.

I used GridSearchCV to find the best hyperparameters, feeding it my 5 CV folds and setting refit=True, so that once it identifies the best combination it retrains on the full training data (all 5 folds rather than 4/5) and returns that model as best_estimator_. I then evaluate this best model on my test set at the end.
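A minimal sketch of this setup (using scikit-learn's GradientBoostingClassifier as a stand-in for XGBClassifier; the parameter grid and data are illustrative, not from the original post):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for XGBClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data; the test set is held out until the very end.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

param_grid = {"max_depth": [2, 3], "learning_rate": [0.1, 0.3]}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=5,
    refit=True,  # after the search, retrain the best combination on all of X_train
)
search.fit(X_train, y_train)

# best_estimator_ was refit on the full training data with best_params_
print(search.best_params_)
print(search.best_estimator_.score(X_test, y_test))
```

With refit=True (the default), `search.best_estimator_` is the model retrained on all of `X_train`, and `search` itself delegates `predict`/`score` to it.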

I then compare the results of this model with a model that I train separately myself using the same best hyperparameters, and my own model performs better. Why is that?

Does GridSearchCV still use cross validation when it's training on the full data with the best hyperparameters? Is it the case that GridSearchCV is doing something extra that's hurting the model?

Topic hyperparameter-tuning data-science-model gridsearchcv xgboost

Category Data Science


I then compare the results of this model with a model that I train on my own separately with the best hyperparameters, and I get better results with my own model. Why is that?

I would say that the most likely explanation is overfitting in the model selection done by the grid search: when trying the different combinations of hyperparameters, it's possible that one combination happens to perform very well by chance on the CV splits. Cross-validation is meant to counter this effect of chance, but it cannot eliminate it entirely, especially when many combinations are tried and/or the dataset is small. As a result, that combination gets selected even though it is not truly better in general, and the refit model then shows sub-optimal performance on the test set.
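One standard way to measure this selection effect is nested cross-validation: an outer CV loop scores the whole "grid search + refit" procedure, so the resulting estimate is not inflated by the search itself. A hedged sketch (GradientBoostingClassifier as a stand-in for XGBoost; grid and fold counts are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)

# Inner loop: the grid search selects hyperparameters on each outer training portion.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"max_depth": [2, 3], "learning_rate": [0.1, 0.3]},
    cv=3,
)

# Outer loop: evaluates the full selection procedure on held-out folds.
nested_scores = cross_val_score(search, X, y, cv=5)
print(nested_scores.mean())
```

The gap between `search.best_score_` and the nested scores gives a rough idea of how optimistic the inner CV score of the selected combination is.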

Does GridSearchCV still use cross validation when it's training on the full data with the best hyperparameters?

No. Cross-validation is only used during the search to score each hyperparameter combination; with refit=True, the final step is a single ordinary fit of the best combination on the whole training data, with no cross-validation involved.
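You can check this directly: refitting the best hyperparameters by hand on the same training data, with the same random seed, yields a model with identical predictions to best_estimator_. A sketch (GradientBoostingClassifier as a stand-in for XGBoost, since its fits are deterministic given random_state; all names are illustrative):

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"max_depth": [2, 3]},
    cv=5,
    refit=True,  # final fit: one plain fit on all of X_train
)
search.fit(X_train, y_train)

# Refit the winning configuration by hand on the same data.
manual = clone(GradientBoostingClassifier(random_state=0))
manual.set_params(**search.best_params_)
manual.fit(X_train, y_train)

# Same data, same hyperparameters, same seed -> identical predictions.
same = np.allclose(manual.predict_proba(X_test),
                   search.best_estimator_.predict_proba(X_test))
print(same)
```

If your manually trained model differs from best_estimator_, the usual suspects are a different random seed, different data (e.g. training on train+test, or different preprocessing), or hyperparameters that don't exactly match best_params_.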

Is it the case that GridSearchCV is doing something extra that's hurting the model?

I don't think so, but the process of tuning hyperparameters is itself a kind of training, so it can also lead to overfitting: the selected configuration is partly adapted to the particular CV splits used.
