Feature importance of a linear regression

What is the easiest, and easiest to explain, feature importance calculation for linear regression? I know I can use SHAP to compute feature importances, but I find it difficult to explain to stakeholders, and the raw coefficient is not a good measure of feature importance since it depends on the scale of the feature. Some have suggested (standard deviation of feature) × (feature coefficient) as a good measure of feature importance, as sketched below.
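
A minimal sketch of that (standard deviation × coefficient) measure, assuming a scikit-learn linear model on synthetic data (the dataset and names here are illustrative, not from the original question):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Synthetic regression data, purely for illustration.
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)

model = LinearRegression().fit(X, y)

# Scale each coefficient by its feature's standard deviation so the
# importances are comparable across features measured in different units.
importance = np.abs(model.coef_) * X.std(axis=0)
for i, imp in enumerate(importance):
    print(f"feature {i}: {imp:.3f}")
```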

Topic feature-importances linear-models linear-regression statistics machine-learning

Category Data Science


The easiest way is probably to look at the quantity $$ \frac{\beta_i}{\hat{se}(\beta_i)}$$ where $ \beta_i $ is the coefficient of the $i$-th feature and $ \hat{se}(\beta_i) $ is the standard error of that coefficient. This ratio is the usual $t$-statistic, which most regression software reports directly.

The standard error of the coefficient is the square root of the entry in the $i$-th row, $i$-th column of the matrix $$(X^TX)^{-1}\frac{\sum_i{(\hat{y}_i-y_i)^2}}{n-p}$$

where $ X $ is your feature matrix, $ n $ is the number of observations, and $ p $ is the number of parameters in your model.
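
For concreteness, here is a minimal NumPy sketch of that formula on synthetic data (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
# Design matrix with an intercept column; p counts all parameters.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=1.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)     # OLS estimate
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - p)     # residual variance estimate
cov_beta = np.linalg.inv(X.T @ X) * sigma2_hat   # covariance matrix of beta_hat
se = np.sqrt(np.diag(cov_beta))                  # standard errors (sqrt of diagonal)

t_values = beta_hat / se                         # the importance measure above
print(t_values)
```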

On the coding side, you can use OLS from statsmodels in Python. The summary output already contains all the necessary statistics.
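
For example, a minimal sketch with synthetic data (the data and variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=200)

X_const = sm.add_constant(X)      # add the intercept term
results = sm.OLS(y, X_const).fit()

print(results.summary())          # table of coef, std err, t, p-values
print(results.tvalues)            # the beta_i / se(beta_i) ratios
```

The `t` column of the summary table, also available as `results.tvalues`, is exactly the $ \frac{\beta_i}{\hat{se}(\beta_i)} $ ratio above.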
