cross validation on whole data set or training data?

My cross-validation score is always lower than my training score, and I am running cross-validation on just the training data. Is that normal? K-fold = 5.

Tags: score, cross-validation, machine-learning

Category: Data Science


Yes, and it's called overfitting. Your model is beginning to memorize the training set rather than learning patterns that generalize to a validation or test set. If you're asking why this happens, I'd refer you to another answer I wrote explaining the phenomenon in more detail.
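To see the effect concretely, here is a minimal sketch with hypothetical toy data (two overlapping Gaussian classes, not the asker's actual dataset) and a 1-nearest-neighbour classifier, which memorizes the training set perfectly. The training score is 100% by construction, while the 5-fold cross-validation score, measured on held-out points, is noticeably lower:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two noisy, overlapping Gaussian classes.
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(1.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

def predict_1nn(X_train, y_train, X_query):
    """1-nearest-neighbour prediction: label of the closest training point."""
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    return y_train[np.argmin(d, axis=1)]

# Training score: the model has memorized every point, so this is 1.00.
train_acc = np.mean(predict_1nn(X, y, X) == y)

# 5-fold cross-validation on the same data: each fold is scored on
# points the model has never seen, so the score drops.
k = 5
idx = rng.permutation(len(y))
folds = np.array_split(idx, k)
cv_scores = []
for i in range(k):
    val = folds[i]
    trn = np.concatenate([folds[j] for j in range(k) if j != i])
    preds = predict_1nn(X[trn], y[trn], X[val])
    cv_scores.append(np.mean(preds == y[val]))

print(f"training accuracy: {train_acc:.2f}")  # 1.00 by construction
print(f"mean 5-fold CV accuracy: {np.mean(cv_scores):.2f}")
```

The gap between the two numbers is exactly the pattern described in the question: the model scores higher on data it has already seen than on held-out folds.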

One interesting follow-up question is: why is the performance on the cross-validation folds worse than on the test set?

This is harder to answer without all the details. One possible explanation is that, since the full training set is larger than the training portion of each fold, the final model was trained on more data and therefore trained better; another is simply that the test set examples happened to be easier.
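The first explanation can be illustrated with a small learning-curve sketch. This again uses hypothetical toy data and a nearest-centroid classifier (an assumption for illustration, not the asker's model): held-out accuracy, averaged over many random draws, tends to improve as the training set grows, which is why each CV fold, trained on only (k-1)/k of the data, can lag behind a model trained on the full training set.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data: two overlapping Gaussian classes.
X = np.vstack([rng.normal(0.0, 1.0, (500, 2)),
               rng.normal(1.0, 1.0, (500, 2))])
y = np.array([0] * 500 + [1] * 500)

def nearest_centroid_score(n_train, n_trials=200):
    """Mean held-out accuracy of a nearest-centroid classifier
    trained on n_train randomly drawn points."""
    accs = []
    for _ in range(n_trials):
        idx = rng.permutation(len(y))
        trn, tst = idx[:n_train], idx[n_train:n_train + 200]
        # Estimate one centroid per class from the training draw.
        c0 = X[trn][y[trn] == 0].mean(axis=0)
        c1 = X[trn][y[trn] == 1].mean(axis=0)
        # Predict the class whose centroid is nearer.
        d0 = np.linalg.norm(X[tst] - c0, axis=1)
        d1 = np.linalg.norm(X[tst] - c1, axis=1)
        preds = (d1 < d0).astype(int)
        accs.append(np.mean(preds == y[tst]))
    return float(np.mean(accs))

# With few training points the centroids are noisy estimates;
# with many points they are close to the true class means.
small = nearest_centroid_score(20)
large = nearest_centroid_score(800)
print(f"held-out accuracy, 20 training points:  {small:.3f}")
print(f"held-out accuracy, 800 training points: {large:.3f}")
```

The size effect here is small because the model is simple; how much it matters in practice depends on the model and on how steep its learning curve still is at the size of one fold.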
