How to weight the loss in regression

I've got a regression problem where a model is required to predict a value in the range [0, 1].

I've looked at the distribution of the data, and it seems there are many more examples with low-value labels ([0, 0.2]) than with high-value labels ((0.2, 1]).

When I train the model with MAE as the loss, it converges to a very low loss, but it ends up predicting a low value for many of the high-value examples.

So my assumption is that the data is imbalanced and that I should weight the loss of each example according to its label.

Question: what is the best way to weight the loss in this configuration?

Should I weight each example by the value of its label using some function f(x), where f(x) is low when x is low and high when x is high?

Or should I split the label values into bins ([0, 0.1), [0.1, 0.2), ..., [0.9, 1]) and weight each bin, similar to class weights in a categorical loss?
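For concreteness, here is a minimal sketch of both options using NumPy and scikit-learn (≥ 1.0 naming). The synthetic data, the linear ramp f(y), and the ten equal-width bins are illustrative assumptions, not recommendations; any estimator that accepts `sample_weight` would work the same way.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the imbalanced data: most labels fall near 0.
X = rng.normal(size=(1000, 5))
y = rng.beta(1.5, 6.0, size=1000)

# Option 1: a continuous weight f(y), low for small labels and high for large ones.
def continuous_weights(y, floor=0.2):
    return floor + (1.0 - floor) * y  # simple linear ramp; any monotone f works

# Option 2: bin the labels and weight each bin by its inverse frequency,
# the regression analogue of per-class weights.
def binned_weights(y, n_bins=10):
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(y, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    return (len(y) / (n_bins * np.maximum(counts, 1)))[idx]

w = binned_weights(y)  # or continuous_weights(y)

# Estimators that accept sample_weight apply the weights inside their loss.
model = GradientBoostingRegressor(loss="absolute_error")
model.fit(X, y, sample_weight=w)

# The weighted MAE that the weights effectively make the model optimise:
weighted_mae = np.average(np.abs(y - model.predict(X)), weights=w)
print(weighted_mae)
```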

Tags: weighted-data, regression, class-imbalance, machine-learning

Category: Data Science


If you are predicting values between 0 and 1, you should use beta regression.

Beta regression naturally handles the heteroskedasticity and skewness that are commonly observed in rates and proportions.
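For context, beta regression models the target with a Beta distribution whose mean depends on the features through a link function (usually the logit). Ready-made implementations exist (e.g. the betareg package in R, or BetaModel in recent statsmodels versions); the sketch below fits one by maximum likelihood with NumPy/SciPy, assuming a logit link, a single shared precision parameter, and synthetic data purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def beta_nll(params, X, y):
    # Mean via logit link, mu = sigmoid(X @ coef); one shared precision phi = exp(log_phi).
    coef, log_phi = params[:-1], params[-1]
    mu = expit(X @ coef)
    phi = np.exp(log_phi)
    a, b = mu * phi, (1.0 - mu) * phi
    # Negative log-likelihood of Beta(a, b) evaluated at y.
    return -np.sum(gammaln(a + b) - gammaln(a) - gammaln(b)
                   + (a - 1.0) * np.log(y) + (b - 1.0) * np.log(1.0 - y))

def fit_beta_regression(X, y, eps=1e-4):
    # The beta likelihood needs y strictly inside (0, 1), so squeeze the endpoints.
    y = np.clip(y, eps, 1.0 - eps)
    Xd = np.column_stack([np.ones(len(X)), X])   # add an intercept column
    x0 = np.zeros(Xd.shape[1] + 1)               # coefficients + log(phi)
    res = minimize(beta_nll, x0, args=(Xd, y), method="L-BFGS-B")
    coef = res.x[:-1]
    predict = lambda Xn: expit(np.column_stack([np.ones(len(Xn)), Xn]) @ coef)
    return coef, np.exp(res.x[-1]), predict

# Illustrative usage on synthetic data whose mean truly follows a logit-linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
mu_true = expit(0.3 + X @ np.array([1.0, -1.0, 0.5]))
y = rng.beta(mu_true * 10.0, (1.0 - mu_true) * 10.0)

coef, phi, predict = fit_beta_regression(X, y)
print(coef, phi)
```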
