I have a DecisionTreeClassifier built with sklearn (criterion="gini"), and I need to explain how each particular prediction was made. Similar to what is done in that sklearn example, I walk through each node on the decision path and extract the feature, the sample's current value, the sign, and the threshold, but I also need some measure of how important each step is. My intuition is: each node has a Gini impurity value. This shows how "impure" the data is …
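To make this concrete, here is a minimal sketch of the kind of thing I have in mind, loosely following the sklearn decision-path example. The iris data, the choice of sample, and using the parent-to-child impurity drop as the "importance" of a step are just my assumptions, not something I am sure is the right measure:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

tree = clf.tree_
sample_id = 0  # arbitrary sample to explain

# Nodes visited by this sample, root to leaf (node ids increase along a path).
path = clf.decision_path(X[sample_id:sample_id + 1]).indices

for parent, child in zip(path[:-1], path[1:]):
    feature = tree.feature[parent]
    threshold = tree.threshold[parent]
    value = X[sample_id, feature]
    sign = "<=" if value <= threshold else ">"
    # Candidate "importance" of this step: how much the Gini impurity drops
    # when moving from the parent node to the child branch actually taken.
    drop = tree.impurity[parent] - tree.impurity[child]
    print(f"node {parent}: X[{feature}] = {value:.2f} {sign} {threshold:.2f}, "
          f"gini {tree.impurity[parent]:.3f} -> {tree.impurity[child]:.3f} "
          f"(drop {drop:.3f})")
```

Is a per-step impurity drop like this a sensible notion of "how important each step is", or should it be weighted by the number of samples in each node, as in the usual feature-importance computation?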
I would like to understand why the "Gini index operates on the categorical target variables in terms of “success” or “failure” and performs only binary split". Why would it not be possible to have three branches after a split when we are using the Gini impurity to select an attribute? Source: https://medium.com/analytics-steps/understanding-the-gini-index-and-information-gain-in-decision-trees-ab4720518ba8 — and this is not the only resource saying that.
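To illustrate what puzzles me: nothing in the Gini formula itself seems to require two children. Here is a small sketch (the labels and the particular partitions are made up) where a three-way split is scored exactly the same way as a binary one, by taking the sample-weighted average impurity of the children:

```python
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def weighted_gini(partition):
    """Sample-weighted average Gini impurity over any number of child groups."""
    n_total = sum(len(group) for group in partition)
    return sum(len(group) / n_total * gini(group) for group in partition)

y = np.array([0, 0, 1, 1, 2, 2, 2, 1, 0])

# The formula scores a three-way partition just as easily as a two-way one.
three_way = [y[:3], y[3:6], y[6:]]
two_way = [y[:4], y[4:]]
print(weighted_gini(three_way), weighted_gini(two_way))
```

So is the "binary split only" statement a property of the Gini index itself, or just a design choice of CART-style implementations?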
I'm studying random forest models, but I don't understand what the Gini index is or what it's for. Does anyone have any material on this, or can anyone give me an explanation? Thanks!