How does the construction of a decision tree differ for different optimization metrics?
I understand how a decision tree is constructed (in the ID3 algorithm) using criteria such as entropy, Gini index, and variance reduction. But the formulae for these criteria do not involve optimization metrics such as accuracy, recall, AUC, kappa, F1-score, and others.
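To make concrete what I mean: here is a minimal sketch of the two classification criteria, computed from nothing but the class proportions at a node (the function names are my own for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list: -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: 1 - sum(p_i ** 2)."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

# Both criteria depend only on class proportions at the node --
# nothing here references accuracy, recall, AUC, or F1.
pure = ["a"] * 10
mixed = ["a"] * 5 + ["b"] * 5
print(entropy(mixed), gini(mixed))  # maximal impurity for a 50/50 node
print(entropy(pure), gini(pure))    # zero impurity for a pure node
```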
R and Python packages allow me to optimize for such metrics when I construct a decision tree. What do they do differently for each of these metrics? Where does the change happen?
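For concreteness, this is the kind of usage I have in mind in scikit-learn (a sketch on synthetic data; the metric is supplied via the `scoring` argument while the tree itself is still grown with Gini or entropy splits):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data for illustration only.
X, y = make_classification(n_samples=200, random_state=0)

# The chosen optimization metric (here F1) is passed as `scoring`;
# the split criterion remains gini/entropy inside the tree itself.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8], "criterion": ["gini", "entropy"]},
    scoring="f1",
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```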
Is there a pattern to how these changes are done for different classification/regression algorithms?
Tags: decision-trees, optimization, algorithms, machine-learning
Category: Data Science