Aggregate SHAP importances from different models
A couple of questions on using SHAP values to estimate feature importance.
I would like to train four classification models (random forest, logistic regression, SVM, and kNN) on the same dataset, tuning each model's hyperparameters for the best accuracy and precision. The same feature then ends up with a different magnitude of SHAP values in each model.
1. Are these differences meaningful, i.e., does the feature indeed have different importance depending on the algorithm (RF vs. SVM vs. kNN, ...)?
2. Can I aggregate feature importances from these four models? For example, can I sum the SHAP values of each feature across all four models to obtain a summary importance over all models?
3. If 2 is valid, do I need to use the same feature set for every model?
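To make question 2 concrete, here is one aggregation scheme I could imagine; this is purely an assumption on my part, not an established method. Since the raw SHAP magnitudes differ across models, it normalizes each model's mean |SHAP| importances to sum to 1 before averaging them (the random matrices stand in for real SHAP outputs):

```python
# Hypothetical aggregation sketch: normalize per-model mean |SHAP| importances,
# then average across models. The input matrices are synthetic stand-ins for
# real (n_samples x n_features) SHAP value matrices from four models.
import numpy as np

rng = np.random.default_rng(0)
shap_matrices = {
    name: rng.normal(scale=s, size=(100, 5))  # different scales per model
    for name, s in [("rf", 1.0), ("logreg", 0.2), ("svm", 0.5), ("knn", 0.1)]
}

per_model = {}
for name, sv in shap_matrices.items():
    imp = np.abs(sv).mean(axis=0)      # mean |SHAP| per feature in this model
    per_model[name] = imp / imp.sum()  # rescale so each model contributes equally

# Summary importance: average of the normalized per-model vectors.
aggregate = np.mean(list(per_model.values()), axis=0)
```

Whether such a normalize-then-average scheme is statistically justified is exactly what I am asking.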
Please also provide related references if you can.
A related previous post, "Is it valid to compare SHAP values across models?", unfortunately got no answers.
Topic shap explainable-ai predictor-importance decision-trees
Category Data Science