Feature importance difference in two similar machine learning models

Situation 1:

I have trained a text classification model (Model 1) that outputs the probability of the true class, X. I have also trained a classification model (Model 2) using only the categorical and numeric data. Both models predict the same true class; only the features differ. I then trained a random forest classifier on the probabilities returned by Model 1 and Model 2 (taking them as input features) and got similar performance metrics (accuracy, precision, recall). The feature importance was 49% for Model 1 and 51% for Model 2.

Situation 2:

I used the probability X from the text classification model as an input feature in Model 2 (which contained the categorical and numeric features). The performance was almost identical to Situation 1, but here the feature importance of the final model indicated that the text model probability had a much higher importance, around 68%, while the rest of the features mattered much less.

I want to understand the difference in feature importance between the two situations.

Tags: features, stacking, xgboost, random-forest, machine-learning

Category: Data Science


In the second situation, you are not comparing apples to apples.

Let's say we have 4 features, all equally good [also, no interactions between them].

Case I -
We create two models using 2 features each:
Model I - F1/F2
Model II - F3/F4

Comparing these two models as features gives you an idea of how F1/F2 combined performs against F3/F4 combined. This is your Situation 1; see the sketch below.
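For instance, here is a minimal sketch of Situation 1 (the data and model choices are all hypothetical): two sub-models are each fit on half of four equally informative synthetic features, and a random forest meta-model sees only their out-of-fold probabilities. Its two importances come out roughly equal, mirroring the 49%/51% split you observed.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict

    # Four equally informative features, no redundancy.
    X, y = make_classification(n_samples=5000, n_features=4,
                               n_informative=4, n_redundant=0,
                               random_state=0)

    # Out-of-fold probabilities, so the meta-model is not fit on leaked predictions.
    p1 = cross_val_predict(LogisticRegression(), X[:, :2], y,
                           cv=5, method="predict_proba")[:, 1]  # Model I: F1/F2
    p2 = cross_val_predict(LogisticRegression(), X[:, 2:], y,
                           cv=5, method="predict_proba")[:, 1]  # Model II: F3/F4

    # Situation 1: the meta-model only sees the two probabilities.
    meta = RandomForestClassifier(random_state=0)
    meta.fit(np.column_stack([p1, p2]), y)
    print(meta.feature_importances_)  # roughly [0.5, 0.5]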

Case II -
If you compare F1/F2 combined against F3 (alone) and F4 (alone), F1/F2 combined will certainly come out with higher importance. The output probability of a model built on two features carries the information of both features together, so in Situation 2 it competes against each remaining feature individually rather than against an equally sized combination.
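Continuing the sketch above (same hypothetical data and variables): in Situation 2 the meta-model mixes the combined F1/F2 probability with the raw features F3 and F4. The single probability column carries two features' worth of signal, so its importance tends to dominate, much like the 68% you saw.

    # Situation 2: the F1/F2 probability competes with F3 and F4 individually.
    meta2 = RandomForestClassifier(random_state=0)
    meta2.fit(np.column_stack([p1, X[:, 2], X[:, 3]]), y)
    print(meta2.feature_importances_)  # p1 typically near ~0.5; F3 and F4 near ~0.25 each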
