Churn Prediction Training Set
I don't understand how to form my dataset from raw user data, which consists of activity (logins etc.) and characteristics (location, age etc.).
Ultimately, each row of the training set will have N activity features for a certain period, M characteristic features, and a binary outcome: whether the user churned after the end of this period.
My problem is how to define the period and how many rows to create per user.
The options I see are the following:
- Define the period from the start of the user's lifetime, for example 1 week. Then each row is 1 user (activity and characteristics), and the outcome is whether they churned in week 2 or not.
- Break a user's lifetime down into periods, and score every user each day using the data from their last week. Say a user has a 2-week lifetime. Their training data will be:
data_week_1, not churn
data_week_2, churn
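To make the two options concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy `users` and `logins` data, the function names, the single activity feature (login count), and above all the churn definition (no logins at all in the week following the feature week). It also uses non-overlapping weeks counted from signup rather than a window that slides daily.

```python
from collections import Counter
from datetime import date

# Hypothetical raw data: characteristics per user, plus (user_id, login date) events.
users = {
    1: {"signup": date(2024, 1, 1), "age": 30},
    2: {"signup": date(2024, 1, 2), "age": 41},
}
logins = [
    (1, date(2024, 1, 1)), (1, date(2024, 1, 3)), (1, date(2024, 1, 9)),
    (2, date(2024, 1, 2)), (2, date(2024, 1, 4)),
]


def weekly_login_counts(users, logins):
    """Count logins per (user_id, lifetime week); week 0 starts at signup."""
    counts = Counter()
    for user_id, day in logins:
        week = (day - users[user_id]["signup"]).days // 7
        counts[(user_id, week)] += 1
    return counts


def first_week_rows(users, logins):
    """Option 1: one row per user; features from week 0, label from week 1."""
    counts = weekly_login_counts(users, logins)
    return [
        {
            "user_id": uid,
            "age": u["age"],
            "logins": counts[(uid, 0)],
            # Assumed churn definition: no logins in the following week.
            "churn": int(counts[(uid, 1)] == 0),
        }
        for uid, u in users.items()
    ]


def rolling_week_rows(users, logins, observed_weeks):
    """Option 2: one row per user per active week; label from the week after."""
    counts = weekly_login_counts(users, logins)
    rows = []
    for uid, u in users.items():
        # Stop one week early so every row has a fully observed label week.
        for week in range(observed_weeks - 1):
            if counts[(uid, week)] == 0:
                continue  # user was already inactive, so there is nothing to predict
            rows.append({
                "user_id": uid,
                "week": week,
                "age": u["age"],
                "logins": counts[(uid, week)],
                "churn": int(counts[(uid, week + 1)] == 0),
            })
    return rows
```

With this toy data, `rolling_week_rows(users, logins, observed_weeks=3)` gives user 1 two rows (week 0 labelled not churn, week 1 labelled churn), which is the shape of the example above, while `first_week_rows` gives exactly one row per user.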
Looking for any advice or links related to the viability of these or other methods of dataset formation.
Topic churn classification
Category Data Science