Positive/negative training sample imbalance in multi-label image classifiers

I'm trying to train VGG-16 on the Pascal VOC 2012 dataset, which has images annotated with 20 labels (and a given image can have multiple classes present). The classes are highly imbalanced, so I've rebalanced the training set so that each label is roughly equally represented.

But this still means that for each individual label, only about 5% of the images are positive examples and 95% are negative. With 20 labels there is no way to achieve a 50/50 split for every class simultaneously.

I'm using binary cross-entropy loss with a sigmoid activation on the final VGG layer, since this is a multi-label problem. Binary accuracy looks great, but the per-class results are actually dismal (~15% recall). The classifier is not fitting the positive examples; it is biased toward predicting a negative result because that matches the data distribution (very few positive samples per label).
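To make the accuracy/recall gap concrete, here is a small illustrative sketch (numbers are hypothetical, matching the 5% positive rate described above): a degenerate classifier that always predicts "absent" already scores 95% binary accuracy while achieving 0% recall.

```python
import torch

# One label with a 5% positive rate, as in the setup described above.
targets = torch.zeros(100)
targets[:5] = 1.0            # 5 positive examples out of 100

# Degenerate "always negative" classifier.
preds = torch.zeros(100)

accuracy = (preds == targets).float().mean()       # 0.95 -- looks great
recall = (preds * targets).sum() / targets.sum()   # 0.0  -- catches no positives
```

This is why binary accuracy is a misleading metric here: it is dominated by the easy negatives.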

What is typically done in this scenario? The original VGG paper appears to train on mutually exclusive (single-label) classes. Should I be using a custom loss function?

Topic multilabel-classification image-classification neural-network

Category Data Science


There are other loss functions that are better suited to imbalanced multi-label datasets. One example is the focal loss from FAIR's paper Focal Loss for Dense Object Detection. Focal loss rescales each example's contribution dynamically based on the predicted probability: well-classified (easy) examples are down-weighted, which reduces the dominance of the abundant easy negatives in the total loss and lets the rare positives drive the gradient.
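A minimal sketch of the binary (multi-label) focal loss, written in PyTorch as a drop-in replacement for `BCEWithLogitsLoss`. The `gamma` and `alpha` defaults follow the values reported in the paper; the function name and tensor shapes are illustrative assumptions, not part of any library API:

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Multi-label focal loss (Lin et al., 2017).

    logits:  raw scores of shape (batch, num_labels)
    targets: 0/1 labels of the same shape
    Down-weights easy examples by (1 - p_t)^gamma so that the many
    confident negatives contribute little and rare positives dominate.
    """
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    # p_t is the model's probability for the *true* class of each entry.
    p_t = p * targets + (1 - p) * (1 - targets)
    # alpha balances positives vs. negatives on top of the focusing term.
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```

For an example the model already classifies confidently, `(1 - p_t)^gamma` is close to zero, so the loss is far smaller than plain BCE; for a misclassified positive it stays close to the full BCE value. Note that `torchvision.ops.sigmoid_focal_loss` provides an off-the-shelf implementation of the same idea.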
