How are identical bootstrap samples treated in Random Forests?
Let's assume an extremely small dataset with only 4 observations, and suppose I fit a Random Forest with a fairly large number of trees, say 200. In that case, some of the bootstrap samples used to fit the trees will be identical to one another, right? Is that OK?
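To make the scenario concrete, here is a minimal sketch of what I mean. It is my own illustration, not how any particular library draws its samples: the function name `count_bootstrap_duplicates` and the seed are hypothetical, and I treat two samples as "the same" when they contain the same observations with the same multiplicities, ignoring order.

```python
import random
from collections import Counter


def count_bootstrap_duplicates(n_obs=4, n_trees=200, seed=0):
    """Draw one bootstrap sample per tree and count how many are distinct.

    Hypothetical helper for illustration only. Each sample draws n_obs
    indices with replacement; sorting makes the comparison order-free.
    """
    rng = random.Random(seed)
    samples = [
        tuple(sorted(rng.choices(range(n_obs), k=n_obs)))
        for _ in range(n_trees)
    ]
    n_distinct = len(Counter(samples))
    # Every sample beyond the distinct ones repeats an earlier sample.
    return n_distinct, n_trees - n_distinct


distinct, repeats = count_bootstrap_duplicates()
print(f"distinct samples: {distinct}, repeated samples: {repeats}")
```

With 4 observations there are only 35 possible samples in this order-free sense, so with 200 trees at least 165 of them must reuse a sample that another tree already has.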
Even when the dataset is large, identical bootstrap samples can theoretically be drawn. Do bootstrapping and the Random Forest method simply not care, or is there a step that avoids such duplicates?
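For a sense of scale, the number of distinct bootstrap samples of size n drawn from n observations (ignoring order) is the multiset count C(2n - 1, n). The sketch below just evaluates that formula. The function name `distinct_bootstrap_samples` is my own, and note that these samples are not equally likely, so this is only a rough indication of how quickly exact duplicates become improbable as n grows.

```python
import math


def distinct_bootstrap_samples(n):
    """Number of order-free samples of size n drawn with replacement from n items."""
    return math.comb(2 * n - 1, n)


for n in (4, 10, 50):
    print(n, distinct_bootstrap_samples(n))
```

For n = 4 this gives 35, and for n = 10 it is already 92378, which is why the duplicates in my 4-observation example are unavoidable while in a large dataset they are only a theoretical possibility.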
Topic bootstraping random-forest
Category Data Science