Account for imbalanced data in a Neural Network using prior distribution
I have a dataset with 4 classes, say their distribution in the training set is
$P_{prior}(C1) = 60\% $
$P_{prior}(C2) = 25\% $
$P_{prior}(C3) = 10\% $
$P_{prior}(C4) = 5\% $
After training a Neural Network (on a balanced dataset, i.e. after undersampling), I get the output for a new sample as
$P(C1) = 50\%$
$P(C2) = 10\%$
$P(C3) = 10\%$
$P(C4) = 30\%$
Usually we would just assign the sample to class 1, since it has the highest predicted probability. But if we compare the output to the prior distribution, we get the following ratios
$\tilde{P}(C1)=P(C1)/P_{prior}(C1) = 0.83$
$\tilde{P}(C2)=P(C2)/P_{prior}(C2) = 0.4$
$\tilde{P}(C3)=P(C3)/P_{prior}(C3) = 1 $
$\tilde{P}(C4)=P(C4)/P_{prior}(C4) = 6 $
Thus the probability of class 4 is six times greater than it was before we saw the data, while for class 1 we are actually less confident than we were before seeing the sample. I would therefore argue that we should assign the new sample to class 4 instead of class 1.
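For concreteness, here is a minimal sketch of the correction I have in mind (NumPy, using the example numbers above; the variable names are just placeholders):

```python
import numpy as np

# Class priors estimated from the (imbalanced) training set: C1..C4
prior = np.array([0.60, 0.25, 0.10, 0.05])

# Softmax output of the network (trained on the balanced, undersampled data)
# for a single new sample
output = np.array([0.50, 0.10, 0.10, 0.30])

# Plain argmax: pick the class with the highest predicted probability
plain_pred = np.argmax(output)      # -> 0 (class 1)

# Proposed correction: divide each output by the corresponding prior
ratio = output / prior              # -> [0.83, 0.4, 1.0, 6.0]
corrected_pred = np.argmax(ratio)   # -> 3 (class 4)

print(plain_pred, corrected_pred, ratio)
```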
Is that approach/thought wrong? And if it is, how would we account for class imbalance in a Neural Network on the prediction side (not in the network structure or training, e.g. by dropout etc.)?
Tags: probability-calibration, class-imbalance, neural-network
Category: Data Science