Could Gini impurity rise as we go through a decision tree?
I have a DecisionTreeClassifier built with sklearn (criterion="gini"), for which I need to explain how each particular prediction has been made.
Similar to what is done in that sklearn example, I go through each node involved in the decision path and extract the feature, the current value, the sign and the threshold, but I also need some measure of how important each step is.
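For clarity, my extraction looks roughly like this; the load_iris toy model here is just a stand-in for my real classifier and data:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for my real setup (my actual clf and X are different)
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini").fit(X, y)

sample_id = 0
# Nodes visited by this single sample, in order from the root
node_ids = clf.decision_path(X[sample_id : sample_id + 1]).indices

feature = clf.tree_.feature
threshold = clf.tree_.threshold

for node_id in node_ids:
    # Leaf nodes have no split to report
    if clf.tree_.children_left[node_id] == clf.tree_.children_right[node_id]:
        continue
    f = feature[node_id]
    value = X[sample_id, f]
    sign = "<=" if value <= threshold[node_id] else ">"
    print(f"node {node_id}: X[{f}] = {value} {sign} {threshold[node_id]}")
```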
My intuition is:
Each node has a value of Gini impurity, which shows how impure the data is at that particular step. If we take the Gini impurity of the next node and subtract it from the current node's Gini impurity, the difference shows how much information we gained by performing that particular step. In other words, it is a measure of the importance of each decision step.
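In code, the per-step delta I am computing is roughly this (again on a toy model standing in for my real one):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for my real setup (my actual clf and X are different)
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini").fit(X, y)

sample_id = 0
node_ids = clf.decision_path(X[sample_id : sample_id + 1]).indices  # root first

# Gini impurity at each node on the path, then parent-minus-child deltas
impurities = clf.tree_.impurity[node_ids]
deltas = impurities[:-1] - impurities[1:]  # my "importance" of each step

for step, (node_id, delta) in enumerate(zip(node_ids[:-1], deltas)):
    print(f"step {step}: node {node_id}, gini {impurities[step]:.3f} -> "
          f"{impurities[step + 1]:.3f}, delta = {delta:+.3f}")
```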
So the first question is: am I correct with that so far?
The issue I have with this approach is that sometimes a step has a negative delta of Gini impurity. That would mean performing such a step reduces our certainty, which contradicts my understanding of decision trees. Up to 15% of the steps have negative deltas.
So the next questions are: is this possible? Is there an explanation of why this might be the case?
I did suspect there might be a bug in my code, but I double-checked and haven't found one so far. One possible way to solve the mystery is to visualize the tree (for example, with sklearn.tree.export_text), but the tree is huge, and inspecting it manually is extremely hard.
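For reference, this is roughly what I tried, capping max_depth just to keep the dump readable (again on a toy model; on my real tree the full dump is unmanageable):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy stand-in for my real setup (my actual clf and X are different)
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini").fit(X, y)

# Dump only the top few levels of the tree as text rules
print(export_text(clf, max_depth=3, show_weights=True))
```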
TL;DR: Are there scenarios in which Gini impurity rises as we move further away from the root of a decision tree?
Topic gini-index decision-trees scikit-learn
Category Data Science