Aggregate SHAP importances from different models

I have a couple of questions about using SHAP values to estimate feature importance.

I would like to use random forest, logistic regression, SVM, and kNN to train four classification models on a dataset. Hyperparameters for each model are tuned to give the best accuracy and precision. A given feature has a different magnitude of SHAP values in each model.

  1. Are these differences meaningful, i.e., does the feature indeed have different importance depending on the algorithm (RF vs. SVM vs. kNN...)?
  2. Can I aggregate feature importances from these four models? For example, can I sum up the SHAP values of each feature across all four models to obtain a summary importance over all models?
  3. If 2 is valid, do I need to use the same feature set for every model?
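To make question 2 concrete, here is a minimal sketch of the kind of aggregation I have in mind. The SHAP matrices below are random placeholders standing in for the outputs of `shap.TreeExplainer` / `shap.KernelExplainer`; the per-model normalization step is an assumption on my part, intended to compensate for the fact that different model types produce SHAP values on different output scales (e.g. log-odds vs. probabilities):

```python
import numpy as np

# Hypothetical per-model SHAP matrices: rows = samples, columns = features.
# In practice these would come from shap explainers; here they are random
# placeholders with deliberately different magnitudes per model.
rng = np.random.default_rng(0)
shap_values = {
    "rf": rng.normal(size=(100, 5)),
    "logreg": rng.normal(size=(100, 5)) * 10.0,
    "svm": rng.normal(size=(100, 5)),
    "knn": rng.normal(size=(100, 5)) * 0.1,
}

def normalized_importance(sv):
    """Mean |SHAP| per feature, rescaled so the importances sum to 1.

    Normalizing removes the model-specific output scale, which is what
    makes naive summation of raw SHAP values across models questionable.
    """
    imp = np.abs(sv).mean(axis=0)
    return imp / imp.sum()

per_model = {name: normalized_importance(sv) for name, sv in shap_values.items()}
# Simple unweighted average of the normalized importances across the four models.
aggregate = np.mean(list(per_model.values()), axis=0)
print(aggregate)  # one importance score per feature, summing to 1
```

Whether this normalize-then-average scheme is statistically justified is exactly what I am asking in questions 1 and 2.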

Please also provide related references if you can.

A related previous post, "Is it valid to compare SHAP values across models?", unfortunately got no answers.

Topic shap explainable-ai predictor-importance decision-trees

Category Data Science
