Minimum number of samples to train XGBoost without overfitting

When using Neural Networks for image processing, I learned a rule of thumb: to avoid overfitting, supply at least 10 training examples for every neuron.

Is there a similar rule of thumb for classifiers such as XGBoost, presumably taking into account the number of features and estimators?

And, considering the 'curse of dimensionality', shouldn't the rule of thumb be that n_training grows geometrically in n_dimensions rather than linearly?
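To make the contrast concrete, here is a small illustrative sketch (my own placeholder constants, not an established rule) comparing a linear heuristic of c samples per feature with a geometric one of k resolved values per dimension:

```python
# Illustrative only: comparing a linear heuristic (c samples per feature)
# with a geometric one (k resolved values per dimension, i.e. k**d samples).
# The constants c and k are placeholders, not established recommendations.
def linear_rule(n_features, c=10):
    return c * n_features

def geometric_rule(n_features, k=10):
    return k ** n_features

for d in (2, 5, 10):
    print(f"d={d:2d}  linear={linear_rule(d):6d}  geometric={geometric_rule(d):,}")
```

With k = 10 the geometric estimate already demands 10^10 samples at d = 10, which is exactly the 'curse of dimensionality' concern behind the question.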

Topic overfitting xgboost neural-network classification

Category Data Science


This is not only a question of the number of samples; it is also a question of tree depth.

The greater the depth, the more likely you are to overfit.

You can reduce overfitting by using a large number of trees, which helps "steady" the algorithm.
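As a rough sketch of that trade-off (the hyperparameters are the standard xgboost scikit-learn ones; the dataset and the specific values are placeholders, not recommendations), you can compare cross-validated scores across depths and tree counts:

```python
# Sketch: how max_depth and n_estimators interact with overfitting.
# Uses the scikit-learn API of xgboost; the dataset is synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

for max_depth in (2, 6, 10):
    for n_estimators in (50, 500):
        model = XGBClassifier(
            max_depth=max_depth,
            n_estimators=n_estimators,
            learning_rate=0.1,
            eval_metric="logloss",
        )
        score = cross_val_score(model, X, y, cv=5).mean()
        print(f"max_depth={max_depth:2d}  n_estimators={n_estimators:3d}  cv_accuracy={score:.3f}")
```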


It's true that the number of examples should be related to the number of features. But it is not only the number of features that matters: the range of each feature (max minus min, and the number of distinct values) is also important. On the other hand, if your data are noisy you need more examples, so the answer also depends on your particular dataset; a learning curve is one way to check this empirically, as sketched below.
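A minimal sketch of such a check on synthetic data (flip_y adds label noise to mimic a noisy dataset; all values are placeholders): if the validation score is still rising at the largest training size, more samples would likely help, while a plateau suggests extra data adds little.

```python
# Sketch: empirical check for "enough data" via a learning curve.
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from xgboost import XGBClassifier

# flip_y introduces label noise to mimic a noisy real-world dataset
X, y = make_classification(n_samples=5000, n_features=30, flip_y=0.05, random_state=0)

sizes, train_scores, val_scores = learning_curve(
    XGBClassifier(max_depth=4, n_estimators=200, eval_metric="logloss"),
    X, y, cv=5, train_sizes=[0.1, 0.25, 0.5, 1.0],
)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n_train={n:5d}  train={tr:.3f}  validation={va:.3f}")
```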
