Why does rpart not split this data even though the split yields a gain in Gini impurity?
library(tibble)
library(rpart)
df <- tibble(x1 = factor(c("S1", "S1", "S2", "S2")), y = factor(c(1, 1, 0, 1)))
md <- rpart(formula = y ~ ., data = df, method = "class", control = rpart.control(minsplit = 2, cp = 0))
nrow(md$frame) # outputs 1, i.e. the tree is only the root node
Consider the split on x1 (rows shown as x1, y):
Left child node:
S1, 1
S1, 1
Right child node:
S2, 0
S2, 1
Here the gain in Gini impurity would be $\frac{1}{8} = 0.125$.
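Worked out by hand, this assumes the usual Gini impurity $G = 1 - \sum_k p_k^2$ and children weighted by their share of the parent's rows. The symbols $G$, $p_k$, and $\Delta$ are my own notation for this check, not anything taken from rpart:

$$
\begin{aligned}
G_{\text{root}}  &= 1 - \left(\tfrac{3}{4}\right)^2 - \left(\tfrac{1}{4}\right)^2 = \tfrac{3}{8} \\
G_{\text{left}}  &= 1 - 1^2 - 0^2 = 0 \\
G_{\text{right}} &= 1 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2 = \tfrac{1}{2} \\
\Delta &= G_{\text{root}} - \left(\tfrac{2}{4}\, G_{\text{left}} + \tfrac{2}{4}\, G_{\text{right}}\right) = \tfrac{3}{8} - \tfrac{1}{4} = \tfrac{1}{8}
\end{aligned}
$$

So the split does reduce the weighted impurity, even though both child nodes would still predict class 1.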
Why does rpart not make this split?
Topic decision-trees r
Category Data Science