How to weight imbalanced soft labels?
The target is a probability distribution over N classes; I don't want the model to predict only the class with the highest probability, but rather the 'actual' probability of each class.
For example:
| Sample | Class 1 | Class 2 | Class 3 | |
|--------|---------|---------|---------|---|
| 1 | 0.9 | 0.05 | 0.05 | |
| 2 | 0.2 | 0.8 | 0 | |
| 3 | 0.3 | 0.3 | 0.4 | |
| 4 | 0.7 | 0 | 0.3 | |
| Σ (sum) | 2.1 | 1.15 | 0.75 | correct this imbalance? |
| count > 0 | 4 | 3 | 3 | or this one? |
Some classes have 'more' samples in the sense that the sum of their target probabilities is higher than that of other classes. Do I have to balance this out with weights in the loss function, or should I only correct for the imbalance in the count of nonzero (> 0) entries, as is normally done?
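To make the question concrete, here is a minimal sketch of what I mean by "weights in the loss function", assuming PyTorch and a cross-entropy against the soft targets. The inverse-probability-mass weighting below is just one candidate (the other candidate would be built from the nonzero counts instead); neither is a prescribed recipe.

```python
import torch
import torch.nn.functional as F

# Soft targets from the example table (4 samples, 3 classes).
targets = torch.tensor([
    [0.9, 0.05, 0.05],
    [0.2, 0.8,  0.0 ],
    [0.3, 0.3,  0.4 ],
    [0.7, 0.0,  0.3 ],
])

# The two candidate "class frequency" statistics from the table above:
prob_mass = targets.sum(dim=0)        # [2.1, 1.15, 0.75]  (sum of probabilities)
nonzero   = (targets > 0).sum(dim=0)  # [4, 3, 3]          (count of > 0 entries)

# One possible weighting: inverse of the probability mass, normalised so the
# weights average to 1 (an illustrative choice, not a recommendation).
weights = prob_mass.sum() / (prob_mass * len(prob_mass))

def weighted_soft_ce(logits, targets, weights):
    """Cross-entropy against soft targets with per-class weights."""
    log_probs = F.log_softmax(logits, dim=1)
    # Scale each class's contribution by its weight before summing over classes.
    per_sample = -(weights * targets * log_probs).sum(dim=1)
    return per_sample.mean()

# Usage with random logits standing in for a hypothetical model's output.
logits = torch.randn(4, 3, requires_grad=True)
loss = weighted_soft_ce(logits, targets, weights)
loss.backward()
print(loss.item())
```

Swapping `prob_mass` for `nonzero` in the weight computation would give the count-based correction instead; my question is which of the two (if either) is the right quantity to balance.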
Topic labels class-imbalance
Category Data Science