AUC-ROC for Multi-Label Classification
Hey guys, I'm currently reading about AUC-ROC. I have understood the binary case, and I think I understand the multi-class case. Now I'm a bit confused about how to generalize it to the multi-label case, and I can't find any intuitive explanatory texts on the matter.
I want to check whether my intuition is correct with an example. Let's assume that we have some scenario with three classes (c1, c2, c3).
Let's start with multi-class classification:
When we're considering the multi-class setting, we look at each class separately (one-vs-rest).
So if we're looking at the ROC for class c1, we can bunch together c2 and c3 as "negatives".
I.e., when we have a sample that belongs to c1, we only look at the predicted score for c1, and build a distribution of predicted scores for the positive samples. Then we look at the samples that belong to either c2 or c3, i.e. the bunched-together negative samples, take their predicted scores for c1, and build a distribution of those scores as well. This results in two score distributions, one for the positives and one for the negatives.
Based on those distributions, we can get the TPRs and FPRs at a range of thresholds and calculate the ROC for c1. Then we can do the same for c2 and c3 and, if we want, average over the three ROC curves to get an aggregated score for the problem.
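To make the one-vs-rest idea concrete, here is a minimal sketch in plain Python. It is only an illustration under some assumptions: the toy labels and scores are made up, the helper names `binary_auc` and `one_vs_rest_auc` are hypothetical, and the AUC is computed as the probability that a random positive outscores a random negative (ties count as 0.5), which is equivalent to the area under the ROC curve. It averages the per-class AUC values (macro averaging) rather than the curves themselves.

```python
def binary_auc(pos_scores, neg_scores):
    """AUC as P(positive score > negative score), ties counted as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def one_vs_rest_auc(y_true, scores, n_classes):
    """y_true: one class index per sample; scores: one score per class per sample."""
    aucs = []
    for c in range(n_classes):
        # Only the score for class c is used; every other class is a negative.
        pos = [s[c] for y, s in zip(y_true, scores) if y == c]
        neg = [s[c] for y, s in zip(y_true, scores) if y != c]
        aucs.append(binary_auc(pos, neg))
    return aucs


# Made-up toy data: 6 samples, classes c1, c2, c3 encoded as 0, 1, 2.
y_true = [0, 0, 1, 1, 2, 2]
scores = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]

per_class = one_vs_rest_auc(y_true, scores, 3)
macro_auc = sum(per_class) / len(per_class)
print(per_class, macro_auc)
```

The nested loop is quadratic in the number of samples, which is fine for a toy example but not for real data; a library routine would normally be used instead.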
That's my intuition so far anyways.
But what about the multi-label scenario?
Here's where things get confusing for me. Do we calculate it in the exact same manner? I understand that we still calculate the ROC for every class individually, but I'm not sure how to think about it. Let's say that we're looking at it from the perspective of class c1. For every sample that is labelled c1 (and possibly c2 and c3 as well), we add the model's predicted score for c1 to the positive distribution. But what if we encounter, for example, a sample that is labelled c2 AND c3 (this can't happen in the multi-class scenario)? Do we think of this as TWO negative samples and add two predicted scores to the negative distribution?
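One way to make the question concrete is the sketch below. It assumes the common per-label convention, in which each class is treated as its own binary problem over a 0/1 indicator matrix (as far as I know, this is what scikit-learn's `roc_auc_score` does for multi-label indicator input). Under that assumption, every sample contributes exactly one score per class, namely its score for that class, so a sample labelled c2 AND c3 is a single negative for c1. The data and the helper names `binary_auc` and `per_label_auc` are made up for illustration.

```python
def binary_auc(pos_scores, neg_scores):
    """AUC as P(positive score > negative score), ties counted as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def per_label_auc(Y, S):
    """Y: 0/1 indicator rows (one column per class); S: score rows of the same shape."""
    aucs = []
    for c in range(len(Y[0])):
        # Each sample contributes exactly one score for class c,
        # no matter how many other labels it carries.
        pos = [s[c] for y, s in zip(Y, S) if y[c] == 1]
        neg = [s[c] for y, s in zip(Y, S) if y[c] == 0]
        aucs.append(binary_auc(pos, neg))
    return aucs


# Made-up toy data: columns are c1, c2, c3.
Y = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],  # labelled c2 AND c3: a single negative for c1
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
S = [
    [0.9, 0.2, 0.1],
    [0.6, 0.7, 0.3],
    [0.4, 0.8, 0.6],
    [0.2, 0.1, 0.9],
    [0.7, 0.6, 0.5],
    [0.3, 0.5, 0.2],
]

per_label = per_label_auc(Y, S)
macro_auc = sum(per_label) / len(per_label)
print(per_label, macro_auc)
```

In this sketch the negative distribution for c1 has three entries, one per sample without the c1 label, so the c2-and-c3 sample is not counted twice.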
Am I on the right track here?
Topic metric auc multilabel-classification multiclass-classification
Category Data Science