Classification or Regression approach?

Question

Mechamod

May 3, 2022, 20:13

I have a dataset with predictor variables x and a target y that is a percentage (between 0% and 100%, i.e. between 0 and 1).

My goal is to predict whether a sample falls into one of three groups of y: [0, 0.25), [0.25, 0.50) or [0.50, 1].

I am wondering whether I should use a classification model and encode these groups as 3 labels [0, 1, 2], or perform a regression to obtain a specific value (e.g. 0.18, or 18%) and derive the group afterwards. Which approach should be used or would yield better results, and why?

Topic learning regression classification

Category Data Science

Ralph Winters · Accepted Answer · May 3, 2022, 18:56

Since you said that you want to predict ranges of the outcome variable, I think classification is the best fit. If you want a point estimate instead, regression is the better choice, provided your data meet the assumptions of the regression model.
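Either route ends in the same three labels, so the mapping from a value of y to a group can be shared. Here is a minimal sketch in plain Python, assuming the groups are [0, 0.25), [0.25, 0.50) and [0.50, 1]; the names `EDGES` and `to_group` and the sample values are made up for illustration.

```python
from bisect import bisect_right

# Assumed group boundaries: [0, 0.25) -> 0, [0.25, 0.50) -> 1, [0.50, 1] -> 2
EDGES = [0.25, 0.50]

def to_group(y):
    """Map a value in [0, 1] to its group label 0, 1 or 2."""
    if not 0.0 <= y <= 1.0:
        raise ValueError(f"expected a value in [0, 1], got {y!r}")
    # bisect_right counts how many edges are <= y, which is the group index
    return bisect_right(EDGES, y)

# Classification route: turn the training targets into labels, then fit a classifier
y_train = [0.05, 0.18, 0.25, 0.49, 0.50, 1.00]  # illustrative values
labels = [to_group(v) for v in y_train]

# Regression route: predict a value first, then bin the prediction
y_pred = [0.18, 0.31, 0.77]  # illustrative regression outputs
groups = [to_group(v) for v in y_pred]
```

An unconstrained regression can predict values below 0 or above 1, so in the regression route it is probably worth clipping predictions to [0, 1] before calling `to_group`.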

Linear regression rests on fairly strict assumptions. If you want to relax them, you can also look at quantile regression, which makes no assumptions about the distribution of the error terms. With this method you can estimate any conditional quantile of y (for example the 10th or 90th percentile), which might fit nicely with the ranges you want to predict.
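To make the idea behind quantile regression a bit more concrete: it fits a model by minimizing the so-called pinball (quantile) loss instead of the squared error. The sketch below is not a full quantile regression, only the loss plus the simplest possible "model" (a single constant), with made-up data and hypothetical function names. It shows that minimizing this loss recovers a quantile of the data without any distributional assumption.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss for quantile level q (0 < q < 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # under-predictions are weighted by q, over-predictions by (1 - q)
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)

def best_constant(y_true, q):
    """Constant prediction minimizing the pinball loss.

    The minimum is attained at one of the observed values, so a search
    over those candidates is enough for this illustration.
    """
    candidates = sorted(set(y_true))
    return min(
        candidates,
        key=lambda c: pinball_loss(y_true, [c] * len(y_true), q),
    )

y = [0.05, 0.10, 0.18, 0.30, 0.62]  # illustrative targets in [0, 1]
median_fit = best_constant(y, 0.5)  # the sample median
upper_fit = best_constant(y, 0.9)   # an upper quantile of the sample
```

A real quantile regression replaces the constant with a function of x and minimizes the same loss. If, say, the fitted 0.9 quantile for a sample lies below 0.25, that suggests the sample belongs to the lowest group with roughly 90% confidence.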

Brian Spiering · Accepted Answer · May 3, 2022, 18:34

This sounds like a case for beta regression, which models a continuous target bounded between 0 and 1.
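For readers who have not met beta regression before, the sketch below shows the log-likelihood it maximizes, assuming the common mean/precision parameterization with a logit link: the mean is mu = inv_logit(x · coefs) and y follows Beta(mu * phi, (1 - mu) * phi). The function names are hypothetical and no fitting routine is included; in practice a statistics library or a generic optimizer would maximize this quantity.

```python
import math

def inv_logit(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def beta_log_density(y, mu, phi):
    """Log density of Beta(mu * phi, (1 - mu) * phi) at y, for 0 < y < 1."""
    a = mu * phi
    b = (1.0 - mu) * phi
    return (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + (a - 1.0) * math.log(y)
        + (b - 1.0) * math.log(1.0 - y)
    )

def log_likelihood(coefs, phi, X, y):
    """Beta-regression log-likelihood for rows X and targets y."""
    total = 0.0
    for row, target in zip(X, y):
        # linear predictor, pushed through the logit link to get the mean
        mu = inv_logit(sum(c * v for c, v in zip(coefs, row)))
        total += beta_log_density(target, mu, phi)
    return total

# Illustrative data: first column is an intercept term
X = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7]]
y = [0.18, 0.64, 0.09]
ll = log_likelihood([0.0, 1.0], 5.0, X, y)
```

The beta density is only defined strictly inside (0, 1), so targets that are exactly 0 or 1 would need to be handled separately before this likelihood can be evaluated.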
