What to do when one feature has very large importance/weight?

I am new to data science and am trying to predict customer churn for a company that offers subscription-based booking management software. Its customers are gyms. I have a small, unbalanced historical dataset (False: 670, True: 230) with two numerical predictors, age (days since subscription) and number of active days in the last month (days on which a customer (gym) had bookings), and one categorical predictor, logo (boolean: whether the customer uploaded a logo in the software).

The predictors are negatively correlated with churn, with the following magnitudes:

  • logo: 0.65
  • num_active_days_last_month: 0.40
  • age: 0.3

Feature importances look similar, with logo having the most weight.

When I predict, the model (logistic regression) classifies customers without a logo as churners, even though they are quite active.

For example the following two customers have almost the same probability to churn:

Customer 1:

  • logo: True
  • num_active_days_last_month: 1
  • age: 30 days

Customer 2:

  • logo: False
  • num_active_days_last_month: 22
  • age: 250 days
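
To illustrate how this can happen, here is a minimal sketch of the logistic regression arithmetic. The intercept and coefficients are made up for illustration (assumptions, not the fitted values from my model), and logo is encoded as 1/0:

```python
import math

# Hypothetical intercept and coefficients, chosen only for illustration.
# They are NOT the fitted values from the model described in this post.
INTERCEPT = 2.0
COEF = {"logo": -3.0, "num_active_days_last_month": -0.1, "age": -0.004}


def churn_probability(customer):
    """Logistic regression prediction: sigmoid of the linear score."""
    score = INTERCEPT + sum(COEF[name] * customer[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-score))


# The two example customers from above, with logo encoded as 1 (True) / 0 (False).
customer_1 = {"logo": 1, "num_active_days_last_month": 1, "age": 30}
customer_2 = {"logo": 0, "num_active_days_last_month": 22, "age": 250}

p1 = churn_probability(customer_1)
p2 = churn_probability(customer_2)
print(f"customer 1: {p1:.3f}, customer 2: {p2:.3f}")
```

With these made-up numbers, the logo term alone (-3.0) is worth roughly as much as 21 extra active days plus 220 extra days of age, which is why the two probabilities end up close.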

I understand that this is what the model learned from the dataset, but it doesn’t make sense to me that something like logo gets such strong importance. Is there any way to avoid excluding logo from the predictors completely, for example by somehow decreasing its importance?

Thank you in advance for any help or suggestions.

Topic data-science-model churn logistic-regression classification

Category Data Science


I don't understand why the logo is taken into account in your algorithm.

Generally speaking, you should only include variables that make sense, either because their relevance is obvious (which seems to be your case) or because you checked, for example with a correlation analysis, that they are not correlated with your other data.

My suggestion is to remove logo from your model first. The two remaining variables might then not be enough to make predictions with a data science algorithm. Or perhaps the number of active days in the last month is enough on its own?
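
One way to check this is to compare cross-validated scores with and without the logo column. The following is a minimal sketch, assuming scikit-learn is available. The synthetic data, the column order, and the helper names `make_synthetic_data` and `compare_with_and_without_logo` are all assumptions for illustration, not your actual dataset or pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_data(n=900, seed=0):
    """Synthetic stand-in for the real data (an assumption, not the asker's dataset)."""
    rng = np.random.default_rng(seed)
    logo = rng.integers(0, 2, n)      # 1 = logo uploaded
    active = rng.integers(0, 31, n)   # active days in the last month
    age = rng.integers(1, 721, n)     # days since subscription
    # Made-up relationship: all three features lower the churn probability.
    score = 1.0 - 2.0 * logo - 0.08 * active - 0.003 * age
    churn = rng.random(n) < 1.0 / (1.0 + np.exp(-score))
    X = np.column_stack([logo, active, age]).astype(float)
    return X, churn.astype(int)


def compare_with_and_without_logo(X, y, logo_col=0):
    """Mean 5-fold ROC AUC with all features and with the logo column dropped."""
    # class_weight="balanced" compensates for the unbalanced classes.
    model = make_pipeline(StandardScaler(), LogisticRegression(class_weight="balanced"))
    with_logo = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    X_no_logo = np.delete(X, logo_col, axis=1)
    without_logo = cross_val_score(model, X_no_logo, y, cv=5, scoring="roc_auc").mean()
    return {"with_logo": with_logo, "without_logo": without_logo}


X, y = make_synthetic_data()
scores = compare_with_and_without_logo(X, y)
print(scores)
```

If the score barely changes without logo, the other two variables carry most of the signal. If it drops sharply, logo is adding information that the other predictors do not capture.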

Of course, customers who have a high age and were active in the last month are less likely to churn.

What could be interesting in your case is predicting when a customer is most likely to churn, using a model that recognizes time-series patterns.

However, I'm afraid there are not enough variables to reach interesting results, nor enough data, because 1000 rows may not cover most scenarios and statistical subgroups.
