Identify causal feature in a classification model

Question

CutePoison

October 14, 2021, 12:04

Assume I have a model $f(x;b_1,b_2,b_3,b_4)$ that maps a 4-dimensional feature vector $x$ to a binary class, e.g. a logistic regression with 4 parameters used as a churn classifier.

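Say, for instance, that $x_1 = \text{time spent on site (in minutes)}$ and $b_1 = 0.3$ (with no intercept). That means that when time on site increases by 1 minute, the odds of churning are multiplied by $e^{0.3} \approx 1.35$, keeping all other variables fixed. (The value $\sigma(0.3) \approx 0.57$ is the predicted churn probability at $x_1 = 1$ with all other terms at zero, not the size of the increase.)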
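To make the coefficient interpretation concrete, here is a minimal numeric sketch, assuming the standard logistic link $p = 1/(1+e^{-z})$. The baseline logits are hypothetical values chosen only for illustration; the only number taken from the example is the coefficient 0.3.

```python
import math

def sigmoid(z):
    """Standard logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

b1 = 0.3  # coefficient on time on site, taken from the example

# One extra minute multiplies the odds by exp(b1), whatever the baseline is.
odds_ratio = math.exp(b1)
print(f"odds ratio per extra minute: {odds_ratio:.3f}")

# The change in probability, by contrast, depends on where the user starts.
# These baseline logits are hypothetical, chosen only for illustration.
baseline_logits = [-3.0, 0.0, 3.0]
for z in baseline_logits:
    p_before = sigmoid(z)
    p_after = sigmoid(z + b1)
    print(f"logit {z:+.1f}: p = {p_before:.3f} -> {p_after:.3f} "
          f"(change {p_after - p_before:+.3f})")

# sigmoid(b1) is the predicted probability at one minute on site with
# every other term equal to zero; it is a level, not an increase.
p_at_one_minute = sigmoid(b1)
print(f"p at x_1 = 1, other terms zero: {p_at_one_minute:.3f}")
```

The odds ratio is constant, while the probability change is largest near $p = 0.5$ and shrinks toward the extremes, so a single "increase in probability" number cannot be read off the coefficient alone.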

But that does not mean that we can, conversely, reduce the chance of people churning just by changing how long they stay on the site.
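A toy simulation can illustrate why the association need not carry over to an intervention. Everything here is a made-up assumption for the sake of the sketch: a hidden binary trait (`dissatisfied`) makes users both stay longer and churn more, while time on site itself has no effect on churn at all. The probabilities and the `simulate` and `gap` helpers are hypothetical.

```python
import random

random.seed(0)
N = 100_000

def simulate(randomize_time):
    """Return (long_visit, churned) pairs for N simulated users."""
    rows = []
    for _ in range(N):
        # Hidden confounder (hypothetical): half of the users are dissatisfied.
        dissatisfied = random.random() < 0.5
        if randomize_time:
            # Intervention: visit length is set by a coin flip,
            # independently of anything about the user.
            long_visit = random.random() < 0.5
        else:
            # Observation: dissatisfied users tend to stay longer.
            long_visit = random.random() < (0.8 if dissatisfied else 0.2)
        # Churn depends only on the confounder, never on visit length.
        churned = random.random() < (0.6 if dissatisfied else 0.1)
        rows.append((long_visit, churned))
    return rows

def gap(rows):
    """Churn rate of long visits minus churn rate of short visits."""
    long_churn = [churned for long_visit, churned in rows if long_visit]
    short_churn = [churned for long_visit, churned in rows if not long_visit]
    return sum(long_churn) / len(long_churn) - sum(short_churn) / len(short_churn)

observational_gap = gap(simulate(randomize_time=False))
interventional_gap = gap(simulate(randomize_time=True))

print(f"observed gap in churn rate:     {observational_gap:+.3f}")
print(f"gap when time on site is set:   {interventional_gap:+.3f}")
```

Under these assumptions, a classifier fitted to the observational data would assign time on site a clearly positive weight, yet forcing visits to be longer or shorter would leave churn unchanged. The observational data alone cannot tell this world apart from one where time on site really does drive churn.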

So my question is: when we have a model in which we can interpret how a feature changes the probability/odds/score, how can we act on those features? When can we, in the example above, conclude that we just need to change how long users stay on the page to reduce their churn probability? I.e., how can we determine whether a feature is causal without running some kind of additional experiment?

Topic causalimpact probability

Category Data Science
