Averaging multiple train-test splits to estimate performance with lower variability?
I have a small data set and I want to assess how a certain type of case affects overall model performance. For example, is the model biased against people in a certain age group?
With a single train-test split, the number of test cases of that particular type becomes quite small, and I suspect any findings could be due to random chance.
In this scenario, would it make sense to use multiple train-test splits, compute the average performance across them, and then assess the effect of that type of case afterwards? In other words, would this reduce the variability of the result, since more cases of the particular type are evaluated overall?
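To make the idea concrete, here is a minimal sketch of the procedure I have in mind. Everything in it is an assumption for illustration: the data are synthetic, the "model" is a toy one-feature threshold classifier standing in for the real one, and the names `fit_threshold`, `predict_threshold`, and `repeated_split_gaps` are hypothetical. It repeats a random train-test split many times and records, for each split, the accuracy gap between the cases of interest and all other cases.

```python
import random
import statistics


def fit_threshold(xs, ys):
    """Toy stand-in model: threshold at the midpoint of the two class means."""
    mean0 = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2, mean1 >= mean0


def predict_threshold(model, x):
    threshold, higher_is_one = model
    return int((x > threshold) == higher_is_one)


def repeated_split_gaps(records, n_splits=200, test_frac=0.3, seed=0):
    """Return one accuracy gap (other cases minus subgroup) per random split.

    Each record is a tuple (x, y, in_group).
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_splits):
        shuffled = records[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_frac)
        test, train = shuffled[:n_test], shuffled[n_test:]

        model = fit_threshold([r[0] for r in train], [r[1] for r in train])

        # Correctness (True/False) of each test prediction, split by subgroup.
        group_hits = [predict_threshold(model, x) == y for x, y, g in test if g]
        other_hits = [predict_threshold(model, x) == y for x, y, g in test if not g]
        if not group_hits or not other_hits:
            continue  # this split has no test cases for one side; skip it

        gaps.append(statistics.mean(other_hits) - statistics.mean(group_hits))
    return gaps


# Synthetic data (assumption): about 15% of cases belong to the subgroup,
# and their feature is noisier, so the toy model should do worse on them.
data_rng = random.Random(42)
records = []
for _ in range(200):
    in_group = data_rng.random() < 0.15
    y = data_rng.randint(0, 1)
    x = y + data_rng.gauss(0, 1.5 if in_group else 0.4)
    records.append((x, y, in_group))

gaps = repeated_split_gaps(records)
print(f"splits used: {len(gaps)}")
print(f"mean accuracy gap: {statistics.mean(gaps):.3f}")
print(f"std of gap across splits: {statistics.stdev(gaps):.3f}")
```

One caveat with this sketch: the splits are drawn from the same small data set and overlap heavily, so the per-split gaps are correlated. The spread across splits is a descriptive measure of how sensitive the result is to the split, not a formal confidence interval.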
Topic cnn data neural-network machine-learning
Category Data Science