Sample size for SHAP explainer and range of a SHAP value

Question


The Great

March 25, 2022, 15:23

I am working on a binary classification problem with 977 records and a 77:23 class split. I used a random forest model.

After running the SHAP package on my model, I got the plots below.

I also see that SHAP requires us to choose a sample size in order to compute the SHAP values, as shown in this post.

If SHAP does not rely on the same neighborhood assumption as LIME, why does it require a sample size to be specified?

To summarize, my questions are as follows:

a) Based on my plot above, are my feature contributions very small? Does the contribution magnitude need to be greater than 1 for a feature to be considered important (that is, to have some predictive power), or does the scale of the x-axis differ from project to project? The plot shows values such as 0.20, 0.12, 0.11, 0.5, etc. How do I know whether these indicate sufficient predictive power? On the SHAP website I see examples with values above 0.5. Is that range specific to that problem, or do SHAP values usually share a common range, which would mean my features contribute very little?

b) Additionally, why does a sample size need to be given to compute the SHAP values?
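
To make both questions concrete, here is a minimal sketch of an exact Shapley computation in plain Python. It is not the shap package's implementation, and everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a classifier's positive-class probability, the data are random numbers, and the background size of 100 is arbitrary. It shows where a background sample enters (features left out of a coalition are filled in from background rows) and why the values live on the scale of the model output (they sum to the prediction minus the average prediction over the background).

```python
from itertools import combinations
from math import factorial
import random


def toy_model(x):
    """Hypothetical stand-in for a positive-class probability; output is in [0, 0.9]."""
    return 0.6 * x[0] + 0.3 * x[1] * x[2]


def coalition_value(model, x, background, subset):
    """Average model output when features in `subset` keep x's values
    and all other features are taken from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition (feasible only for few features)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


random.seed(0)
# Assumed toy data: 977 rows of 3 random features, mirroring only the record count above.
data = [[random.random() for _ in range(3)] for _ in range(977)]
background = random.sample(data, 100)  # the "sample size" choice

x = data[0]
phi = shapley_values(toy_model, x, background)
base_value = coalition_value(toy_model, x, background, set())  # mean output over background
prediction = toy_model(x)

print("SHAP-style values:", phi)
print("sum of values:", sum(phi))
print("prediction - base value:", prediction - base_value)
```

Under these assumptions, the contributions for one row add up to the gap between that row's prediction and the background average, so when the output is a probability no single value can exceed 1 in magnitude, and the x-axis scale depends on the model output rather than on a universal threshold. A different or larger background sample changes `base_value` and therefore shifts the individual values somewhat.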

Topic shap random-forest classification predictive-modeling machine-learning

Category Data Science
