Class weights for imbalanced data in multilabel problems
I am trying to train a CNN for a multiclass, multilabel classification task (20 classes, where each sample can belong to one or more classes), and the dataset is highly imbalanced. In the single-label case I would use the compute_class_weight function from sklearn to calculate class weights, so that the loss accounts for the minority classes. For the multilabel case, however, I feel it is not working as intended: it takes the number of samples to be the total number of label occurrences across all classes, whereas the actual number of samples is smaller (since each sample can carry several labels). Is anyone familiar with a function, or even a formula, to calculate the class weights in this case?
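To make the question concrete, here is a minimal sketch of the kind of per-label weighting I have in mind. It assumes the labels are available as a binary indicator matrix Y of shape (n_samples, n_labels) and treats each label as its own binary problem, so the denominator is the real number of samples rather than the total number of label occurrences. The function name multilabel_label_weights and the toy matrix are made up for illustration, and the n_samples / (2 * count) rule is just the usual "balanced" heuristic applied per label, not an established answer for the multilabel case.

```python
import numpy as np


def multilabel_label_weights(Y):
    """Per-label weights from a binary indicator matrix Y (n_samples, n_labels).

    Hypothetical helper: each label is treated as a separate binary problem.
    """
    Y = np.asarray(Y, dtype=float)
    n_samples = Y.shape[0]

    pos = Y.sum(axis=0)      # number of samples carrying each label
    neg = n_samples - pos    # number of samples not carrying it

    # Avoid division by zero for labels that never (or always) occur.
    pos_safe = np.maximum(pos, 1.0)
    neg_safe = np.maximum(neg, 1.0)

    # "Balanced" heuristic per label: n_samples / (2 * count).
    w_pos = n_samples / (2.0 * pos_safe)
    w_neg = n_samples / (2.0 * neg_safe)

    # Ratio form: how much more a positive should count than a negative.
    pos_weight = neg / pos_safe
    return w_pos, w_neg, pos_weight


# Toy data (assumed): 4 samples, 3 labels, label 0 is the majority label.
Y = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
])

w_pos, w_neg, pos_weight = multilabel_label_weights(Y)
print("w_pos     :", w_pos)
print("w_neg     :", w_neg)
print("pos_weight:", pos_weight)
```

If the network is trained with a per-label sigmoid and binary cross-entropy, the ratio form could presumably be passed as a per-label positive weight (for example, the pos_weight argument of PyTorch's BCEWithLogitsLoss), but I am not sure whether this is the right way to handle the imbalance.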
Thanks
Topic weighted-data class-imbalance
Category Data Science