Feature importance of a linear regression

What is the easiest, and easiest to explain, feature importance calculation for linear regression? I know I can use SHAP to compute feature importances, but I find it difficult to explain to stakeholders, and the raw coefficient is not a good measure of feature importance since it depends on the scale of the feature. Some have suggested (standard deviation of feature) × (feature coefficient) as a good measure of feature importance, as sketched below.
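
A minimal sketch of that (standard deviation × coefficient) measure, assuming a scikit-learn linear model on synthetic data (the dataset and names here are illustrative, not from the original question):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Synthetic regression data, purely for illustration.
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)

model = LinearRegression().fit(X, y)

# Scale each coefficient by its feature's standard deviation so the
# importances are comparable across features measured in different units.
importance = np.abs(model.coef_) * X.std(axis=0)
for i, imp in enumerate(importance):
    print(f"feature {i}: {imp:.3f}")
```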

Topic feature-importances linear-models linear-regression statistics machine-learning

Category Data Science


The easiest way is probably to look at the quantity $$ \frac{\beta_i}{\hat{se}(\beta_i)}$$ where $ \beta_i $ is the coefficient of the $i$-th feature and $ \hat{se}(\beta_i) $ is the standard error of that coefficient. This ratio is the usual $t$-statistic, which most regression software reports directly.

The standard error of the coefficient is the square root of the entry in the $i$-th row, $i$-th column of the matrix $$(X^TX)^{-1}\frac{\sum_i{(\hat{y}_i-y_i)^2}}{n-p}$$

where $ X $ is your feature matrix, $ n $ is the number of observations, and $ p $ is the number of parameters in your model.
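
For concreteness, here is a minimal NumPy sketch of that formula on synthetic data (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
# Design matrix with an intercept column; p counts all parameters.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)     # OLS estimate
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - p)     # residual variance estimate
cov_beta = np.linalg.inv(X.T @ X) * sigma2_hat   # covariance matrix of beta_hat
se = np.sqrt(np.diag(cov_beta))                  # standard errors (sqrt of diagonal)

t_values = beta_hat / se                         # the importance measure above
print(t_values)
```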

On the coding side, you can use OLS from statsmodels in Python. The summary output already contains all the necessary statistics.
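
For example, a minimal sketch with synthetic data (the data and variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=200)

X_const = sm.add_constant(X)      # add the intercept term
results = sm.OLS(y, X_const).fit()

print(results.summary())          # table of coef, std err, t, p-values
print(results.tvalues)            # the beta_i / se(beta_i) ratios
```

The `t` column of the summary table, also available as `results.tvalues`, is exactly the $ \frac{\beta_i}{\hat{se}(\beta_i)} $ ratio above.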
