How to define churn prediction for a future period (for example, 4 months)?
The task is to predict churn over the next 4 months for customers who pay a subscription for a service.
Customers can pay for the subscription on a monthly or yearly basis. If a customer doesn't pay when the subscription period ends (after one month for monthly plans, after 12 months for yearly plans), he receives a warning in the following month, then a second warning a month after that, and is then assigned the status “churned”.
The inputs come from a data warehouse, with one row per month for each customer. I use the last 5 years, which means I have at most 60 rows per customer. Of course, some customers have fewer rows in the data warehouse (they joined after the start of the dataset period).
In the data warehouse, the column 'CHURN' indicates whether the customer is active or churned in the current month, so the definition of churn itself is quite clear.
To illustrate, this part of the dataset looks like the following:
ID | 01/2019 | 02/2019 | 03/2019 | 04/2019 | … | 06/2020 | 07/2020 | 08/2020 | Note |
---|---|---|---|---|---|---|---|---|---|
110 | 0 | 0 | 0 | 0 | … | 0 | 0 | 0 | Active the whole time |
111 | 0 | 1 | 1 | 1 | … | 1 | 1 | 1 | Churned after 1 month |
112 | 0 | 0 | 0 | 0 | … | 0 | 1 | 1 | Churned in 07/2020 |
0 = active, 1 = churned
What would be the best approach to prepare the data for this churn prediction problem: a time window (for example, 24 months) or something else?
What is the best way to define churn over a future period based on this data?
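To make the two questions concrete, here is a minimal sketch of one setup I am considering. It is an assumption on my side, not a validated approach: for each customer and each cutoff month, the features would come from an observation window of up to 24 months ending at the cutoff, and the label would be 1 if the 'CHURN' flag becomes 1 in any of the following 4 months. The toy histories, the function name `make_example`, and the parameters `lookback` and `horizon` are all hypothetical.

```python
from typing import Dict, List, Optional

# Toy status histories: one flag per consecutive month (0 = active, 1 = churned).
# The values are made up for illustration; the real data has up to 60 months.
history: Dict[int, List[int]] = {
    110: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # active the whole time
    111: [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # churned after 1 month
    112: [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # churned at month index 7
}


def make_example(
    flags: List[int], cutoff: int, lookback: int = 24, horizon: int = 4
) -> Optional[dict]:
    """Build one training example for a customer at month index `cutoff`.

    Returns None when the outcome window is not fully observed, or when
    the customer has already churned at the cutoff (nothing to predict).
    """
    if cutoff < 0 or cutoff + horizon >= len(flags):
        return None  # the next `horizon` months are not all available
    if flags[cutoff] == 1:
        return None  # already churned at the cutoff
    # Observation window: up to `lookback` months ending at the cutoff.
    start = max(0, cutoff - lookback + 1)
    observed = flags[start:cutoff + 1]
    # Outcome window: the `horizon` months right after the cutoff.
    future = flags[cutoff + 1:cutoff + 1 + horizon]
    return {"n_observed": len(observed), "label": int(any(future))}


# One row per (customer, cutoff) pair that yields a valid example.
rows = []
for customer_id, flags in history.items():
    for cutoff in range(len(flags)):
        example = make_example(flags, cutoff)
        if example is not None:
            rows.append((customer_id, cutoff, example["label"]))

for row in rows:
    print(row)
```

With this setup each customer contributes several rows (one per cutoff), so rows from the same customer are correlated and any train/test split would probably have to be done by time or by customer rather than at random.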
Topic churn predictive-modeling
Category Data Science