Logistic Regression for prediction

Question

Ledian K.

May 23, 2022, 07:08

I would like to ask about the theoretical approach to using logistic regression on customer data, specifically for churn prediction (in BigQuery and Python).

I have customer data for an online shop and would like to predict whether a customer will churn based on some characteristics. I have created my dataset and the churn label. Since this is a non-contractual setting, the label rests on the hypothesis that a customer who hasn't bought anything in the last year has churned.

I am using 3 years of data (2019-2021), covering ~3M customers and 43 features. As I said, a customer is considered churned if they didn't place an order in 2021.
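The labeling rule described above can be sketched in pandas. This is only an illustration under assumptions: the orders table, its column names (customer_id, order_date), and the sample rows are hypothetical and not taken from the question.

```python
import pandas as pd

# Hypothetical order log: one row per order (illustrative data only).
orders = pd.DataFrame(
    {
        "customer_id": [1, 1, 2, 3, 3],
        "order_date": pd.to_datetime(
            ["2019-03-01", "2021-06-15", "2020-11-30", "2019-01-10", "2021-12-24"]
        ),
    }
)

# Customers with at least one order in the label year (2021).
active_2021 = set(orders.loc[orders["order_date"].dt.year == 2021, "customer_id"])

# One row per customer; churn = 1 if no order was placed in 2021.
labels = pd.DataFrame({"customer_id": sorted(orders["customer_id"].unique())})
labels["churn"] = (~labels["customer_id"].isin(active_2021)).astype(int)
```

Here customers 1 and 3 ordered in 2021 and get churn = 0, while customer 2 did not and gets churn = 1.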

I checked the distribution of my label which is ~balanced.
I checked for some Logistic Regression assumptions such as multicollinearity, outlier influence etc.
I split the data into 80% training data, 10% evaluation data, 10% prediction data.
I checked the model's performance by looking at the classification metrics (Accuracy, Recall etc.)
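For readers who want to reproduce the split and the metric check, here is a minimal sketch using scikit-learn. The data is synthetic (make_classification stands in for the real customer table), and the variable names and the max_iter setting are assumptions for illustration, not the setup used in the question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the customer table: 43 features, roughly balanced label.
X, y = make_classification(n_samples=2000, n_features=43, random_state=0)

# Split off 20% first, then halve it: 80% training, 10% evaluation, 10% prediction.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
X_eval, X_pred, y_eval, y_pred_true = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

# Fit on the training part only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Classification metrics on the evaluation part.
y_eval_hat = model.predict(X_eval)
accuracy = accuracy_score(y_eval, y_eval_hat)
recall = recall_score(y_eval, y_eval_hat)
```

The two-step call is one common way to get three parts out of train_test_split, which only produces two parts per call.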

My question would be:

We have predictions for 10% of the data (i.e. the probabilities that a customer will churn). Could we also get the probabilities for all the other customers, those in the training and evaluation datasets?

In other words, once the model is trained and we have confirmed that it is usable, what are the next steps if the final goal is to have a churn probability for every customer?

Thank you in advance for your help!

Topic prediction churn logistic-regression

Category Data Science

Alex Serra Marrugat · Accepted Answer · April 19, 2022, 07:56

Your trained model has a method that returns the predicted probabilities for any feature matrix X, including the training and evaluation rows:

model.predict_proba(X)

Check the reference for more information and examples.
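As a rough sketch of how this could look end to end, assuming the model is a scikit-learn LogisticRegression and using synthetic data in place of the real customers (the names X_all, churn_col, and churn_probability are made up for the example):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for all customers (training + evaluation + prediction rows).
X_all, y_all = make_classification(n_samples=1000, n_features=43, random_state=0)
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X_all, y_all, test_size=0.2, random_state=0
)

# Fit on the training rows only.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# predict_proba accepts any rows with the same feature layout as the training
# data, so it can score every customer, not just the held-out 10%.
proba_all = model.predict_proba(X_all)

# Columns follow model.classes_; pick the column for the churn class (label 1).
churn_col = list(model.classes_).index(1)
churn_probability = proba_all[:, churn_col]
```

Each row of proba_all sums to 1, with one column per class in the order given by model.classes_. Scores for rows the model was fitted on may look more confident than scores for unseen rows, so the held-out metrics remain the better guide to model quality.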
