Classification or Regression approach?

I have a dataset with feature variables x and a target y that is a percentage between 0 and 100%, i.e. a value between 0 and 1.

My goal is to predict which group of y a sample falls into: [0, 0.25), [0.25, 0.50) or [0.50, 1.00].

I am wondering whether I should use a classification model and encode these groups as three labels [0, 1, 2], or fit a regression to obtain a specific value (e.g. 0.18, or 18%) and assign the group afterwards. Which approach should be used, or which yields better results, and why?

Topic learning regression classification

Category Data Science


Since you want to predict which range the outcome variable falls into, I think classification is the better fit. If you want a point estimate instead, regression is the better choice, provided your data meet the assumptions of the regression model.
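Either route needs the same mapping from a value of y to one of the three groups: for classification it turns the observed targets into labels before training, and for regression it turns the predicted value into a group afterwards. A minimal sketch of that mapping, assuming the bin edges from the question (the names EDGES and to_group are made up for illustration):

```python
from bisect import bisect_right

# Upper edges of the first two groups: [0, 0.25), [0.25, 0.50), [0.50, 1.00]
EDGES = [0.25, 0.50]

def to_group(y):
    """Map a value in [0, 1] to the group label 0, 1 or 2."""
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie between 0 and 1")
    # bisect_right counts how many edges are <= y, which is the group index
    return bisect_right(EDGES, y)

# Classification route: convert the observed targets to labels, then train on them.
targets = [0.18, 0.25, 0.49, 0.50, 1.00]
labels = [to_group(y) for y in targets]

# Regression route: predict a value first, then look up its group.
predicted_value = 0.18  # stand-in for the output of some regression model
predicted_group = to_group(predicted_value)
```

Note that a regression prediction close to an edge (say 0.24 versus 0.26) can flip the group, which is one reason to compare both routes on held-out data rather than assume one is better.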

Linear regression rests on fairly strict assumptions. If you want to relax them, you can also look at quantile regression, which makes no assumptions about the distribution of the error terms. With this method you can estimate any conditional quantile of the outcome (for example the median or the 90th percentile) rather than only the mean, which might fit nicely with the ranges you want to predict.
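To make the idea concrete, here is a small sketch of the pinball (quantile) loss that quantile regression minimizes. It is not a full regression: it only fits a single constant to some made-up y values, to show that minimizing this loss recovers the requested quantile. The function names and the data are assumptions for illustration.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball loss for quantile level q (0 < q < 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-predictions are weighted by q, over-predictions by (1 - q)
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def best_constant(y, q):
    """Pick the observed value that minimizes the pinball loss when used as a constant prediction."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), q))

y = [0.05, 0.10, 0.18, 0.30, 0.90]  # made-up targets between 0 and 1
median_fit = best_constant(y, 0.5)  # minimizer for the 50% quantile
upper_fit = best_constant(y, 0.9)   # minimizer for the 90% quantile
```

In an actual quantile regression the constant is replaced by a function of x, and the same loss is minimized over that function's parameters.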


This sounds like a case for beta regression, which predicts a continuous value bounded between 0 and 1.
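As a rough sketch of what such a model looks like, the code below writes out the beta log-density in a mean/precision form and the negative log-likelihood for one feature. The logit link, the single feature, and all function names are assumptions made for this illustration; fitting would mean minimizing the negative log-likelihood with a numerical optimizer, which is not shown. Note that the beta density is only defined for y strictly between 0 and 1, so targets of exactly 0 or 1 need separate handling.

```python
import math

def inverse_logit(eta):
    """Map a linear predictor on the real line to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_log_density(y, mu, phi):
    """Log-density of a beta distribution with mean mu and precision phi."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    a = mu * phi          # first shape parameter
    b = (1.0 - mu) * phi  # second shape parameter
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def negative_log_likelihood(intercept, slope, phi, xs, ys):
    """Quantity an optimizer would minimize over intercept, slope and phi."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu = inverse_logit(intercept + slope * x)
        total -= beta_log_density(y, mu, phi)
    return total
```

Once fitted, the predicted mean inverse_logit(intercept + slope * x) always lies between 0 and 1 and can then be assigned to one of the three groups.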
