How do classification algorithms such as CatBoost and Random Forest parse test data?

Question

Nathan

March 18, 2021, 14:23

I would like to know how classification works with the algorithms listed above. My specific question is this: say I have a high-signal continuous feature with a certain distribution, and I train a model on some training data so that it finds the best split for that feature. When I use the model on test data, does it split on a specific number or by distribution? For example, suppose the number 10 provides the best split on the training data and happens to be the 75th percentile. On test data, will the model split at the 75th percentile or at the number 10? I hope that is clear.

Topic catboost training random-forest machine-learning

Category Data Science

10xAI · Accepted Answer · March 18, 2021, 14:23

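Both models, Random Forest and CatBoost, are built on decision trees.
A tree splits on feature values, not on percentiles. The threshold learned from the training data (the number 10 in your example) is stored in the tree and applied unchanged to test data, whatever percentile it corresponds to there. For an illustration, see the decision tree for the Iris dataset in the scikit-learn documentation.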
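
To make this concrete, here is a minimal sketch of a single split (a decision stump) in plain Python. It is a simplified stand-in and not how Random Forest or CatBoost are actually implemented; the helper names (`gini`, `best_threshold`, `predict`) and the toy data are made up for illustration. It learns a threshold from training data and then applies that same number to test data drawn from a shifted distribution.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(xs, ys):
    """Return the threshold on one feature that minimises weighted Gini impurity."""
    values = sorted(set(xs))
    best_t, best_score = None, float("inf")
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def predict(xs, threshold):
    """Apply the stored threshold as a fixed number; no percentile is recomputed."""
    return [1 if x > threshold else 0 for x in xs]


# Hypothetical training data: the classes separate cleanly around 10.
train_x = [1, 2, 3, 4, 5, 6, 9, 11, 12, 13]
train_y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
threshold = best_threshold(train_x, train_y)

# Hypothetical test data with a shifted distribution (most values above 10).
test_x = [8, 9.5, 10.5, 11, 12, 14, 15, 20]
test_pred = predict(test_x, threshold)

train_share = sum(x <= threshold for x in train_x) / len(train_x)
test_share = sum(x <= threshold for x in test_x) / len(test_x)
print("learned threshold:", threshold)
print("share of training values at or below it:", train_share)
print("share of test values at or below it:", test_share)
print("test predictions:", test_pred)
```

The learned threshold sits at a different percentile in the training and test sets, yet the prediction rule stays the same: compare each value against the stored number.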
