Loss function to prevent estimator bias
I have a regression problem I'm trying to build a model for: predicting sales per person (>= 0) from a set of variables. I'm running different model types and gave deep neural networks a try. The loss functions I'm using are mean squared error and mean absolute error (or sometimes a mix).
I often run into this issue, though: even though MSE and MAE are being optimized, I end up with a very strong bias in the aggregate prediction, e.g. sum(training_all_predictions) / sum(training_all_real) = 0.76.
Looking at this from a small example point of view, I can't blame the model:
real <- c(10, 30, 100)
pred1 <- c(4, 14, 122)
pred2 <- c(16, 46, 122)
## mean absolute error
mean(abs(pred1 - real))
# 14.66667
mean(abs(pred2 - real))
# 14.66667
## mean squared error
mean((pred1 - real)^2)
# 258.6667
mean((pred2 - real)^2)
# 258.6667
So from a model loss point of view, these are identical solutions. However, if I were to sum up multiple predictions, I would clearly prefer pred1:
sum(pred2) / sum(real)
# 1.314286
sum(pred1) / sum(real)
# 1
So if I take the whole example, pred2 is off by 31%, while pred1 nails it. On an individual level, both predictions are equally good.
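A minimal sketch of why the two losses cannot tell the predictions apart, using the same vectors as above: MAE and MSE only see the magnitude of each residual, whereas the ratio of sums depends on the signed residuals. The helper name `mean_bias` is illustrative and not part of any library.

```r
real  <- c(10, 30, 100)
pred1 <- c(4, 14, 122)
pred2 <- c(16, 46, 122)

# Signed residuals: identical magnitudes, different signs
res1 <- pred1 - real   # -6 -16  22
res2 <- pred2 - real   #  6  16  22

# MAE and MSE only use abs(res) or res^2, so they coincide
all(abs(res1) == abs(res2))
# TRUE

# Mean signed error (illustrative helper): this is what the sum ratio reacts to
mean_bias <- function(pred, real) mean(pred - real)
mean_bias(pred1, real)
# 0
mean_bias(pred2, real)
# 14.66667
```

In pred1 the under- and over-predictions cancel, so the totals match; in pred2 all residuals point the same way, so they accumulate.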
All other common regression loss functions I found suffer from the same problem. (Using Keras: https://keras.io/api/losses/)
Questions:
- Can I solve this with a custom loss function?
- I tried (cumsum(y_pred) - cumsum(y_test))^2, but although this loss declined over epochs, I ended up even further off (~0.6).
- Am I attacking my problem from the wrong angle?
- I could try to build a model on cohorts, but this just feels very off, as I would have to aggregate information and would introduce cohort size as another variable.
- Multiplying everything by a factor also sounds off, as this will likely heavily increase MSE / MAE again.
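One possible shape for the custom loss asked about above, as a rough sketch in plain R on ordinary vectors rather than Keras tensors: MSE plus a penalty on the squared mean signed error. The function name `mse_with_bias_penalty` and the weight `lambda` are hypothetical names introduced for illustration; whether this actually fixes the ratio during training is not established here.

```r
real  <- c(10, 30, 100)
pred1 <- c(4, 14, 122)
pred2 <- c(16, 46, 122)

# Hypothetical loss: MSE plus a penalty on the squared mean signed error.
# `lambda` (illustrative) trades per-observation fit against aggregate bias.
mse_with_bias_penalty <- function(y_true, y_pred, lambda = 1) {
  mse  <- mean((y_pred - y_true)^2)   # usual per-observation term
  bias <- mean(y_pred - y_true)       # signed, so errors can cancel
  mse + lambda * bias^2
}

loss1 <- mse_with_bias_penalty(real, pred1)
loss2 <- mse_with_bias_penalty(real, pred2)

# Unlike plain MSE, this loss separates the two predictions
loss1 < loss2
# TRUE
```

Inside Keras the same expression would have to be written with tensor operations, and the penalty would be evaluated per mini-batch, so it would only approximate the bias over the full training set. Unlike the cumsum-based attempt, this term does not depend on the order of observations within a batch.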
Edit: Specified why pred1 is better than pred2.
Edit2: Removed the reference to Estimator bias to avoid confusion.
Edit3: Increased the numbers in the example to make it more obvious.