AUC-ROC for Multi-Label Classification

Hey guys, I'm currently reading about AUC-ROC. I understand the binary case, and I think I understand the multiclass case. Now I'm a bit confused about how to generalize it to the multi-label case, and I can't find any intuitive explanatory texts on the matter.

I want to check whether my intuition is correct with an example. Let's assume we have a scenario with three classes (c1, c2, c3).

Let's start with the multiclass case:

In the multiclass setting we look at each class separately.

So if we're looking at the ROC for class c1, we can bunch c2 and c3 together as "negatives".

I.e., when we have a sample that belongs to c1, we only look at the predicted score for c1, and build a score distribution of the positive samples. Then we look at the samples that belong to either c2 or c3, i.e. the bunched-together negative samples, and build a distribution of their c1 scores as well. This gives us two score distributions, one for positives and one for negatives.

From those distributions we can compute TPRs and FPRs at various thresholds and trace the ROC curve for c1. Then we can do the same for c2 and c3, and if we want we can average the three per-class AUCs to get an aggregated score for the problem.
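To make my intuition concrete, here's a small sketch of the one-vs-rest construction (the labels and scores are toy values I made up just for illustration):

```python
# One-vs-rest ROC AUC for a 3-class multiclass problem (toy data).
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1, 2, 2])  # class indices for c1, c2, c3
y_score = np.array([                   # per-class predicted scores
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.2, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
])

# Binarize: column k is 1 where the sample belongs to class k, 0 otherwise,
# i.e. the other two classes are "bunched together" as negatives.
y_bin = label_binarize(y_true, classes=[0, 1, 2])

# One ROC AUC per class, then an unweighted (macro) average over classes.
per_class_auc = [roc_auc_score(y_bin[:, k], y_score[:, k]) for k in range(3)]
macro_auc = np.mean(per_class_auc)
print(per_class_auc, macro_auc)
```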

That's my intuition so far anyways.

But what about the multi-label scenario?

Here's where things get confusing for me. Do we calculate it in exactly the same manner? I understand that we still calculate the ROC for every class individually, but I'm not sure how to think about it. Say we're looking at it from the perspective of class c1. For every sample that is labeled c1 (and possibly c2 and c3 as well), we add the model's predicted score for c1 to the positive distribution. But what if we encounter, for example, a sample that is labeled c2 AND c3 (which can't happen in the multiclass scenario)? Do we treat this as TWO negative samples and add two predicted scores to the negative distribution?

Am I thinking along the right lines here?



I am not sure if I understand your thinking, but my understanding is the following; maybe it can help you see it from another perspective.

A multi-label classification problem with n possible classes can be seen as n binary classifiers. If so, we can simply calculate the ROC AUC for each binary classifier and average them. This is a bit tricky, since there are several ways of averaging, in particular:

'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).

as explained here.
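As a small illustration of the difference between the two (toy multi-label data, invented for this example):

```python
# Macro vs. weighted averaging of per-label ROC AUCs (toy data).
import numpy as np
from sklearn.metrics import roc_auc_score

# Rows = samples, columns = labels; a row may carry several 1s.
y_true = np.array([
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 1],
])
y_score = np.array([
    [0.9, 0.3],
    [0.8, 0.2],
    [0.4, 0.6],
    [0.5, 0.9],
])

per_label = [roc_auc_score(y_true[:, k], y_score[:, k]) for k in range(2)]
support = y_true.sum(axis=0)  # number of true instances per label

macro = np.mean(per_label)                         # unweighted mean
weighted = np.average(per_label, weights=support)  # weighted by support

# sklearn computes the same aggregates directly on the indicator matrix:
assert np.isclose(macro, roc_auc_score(y_true, y_score, average="macro"))
assert np.isclose(weighted, roc_auc_score(y_true, y_score, average="weighted"))
print(macro, weighted)
```

With imbalanced labels the two can differ noticeably; here label 0 has more positives, so its (lower) AUC pulls the weighted average below the macro one.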

There is another option (which may be the closest to how you see it):

'micro': Calculate metrics globally by considering each element of the label indicator matrix as a label.

which is implemented here.
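Concretely, micro averaging flattens the whole label indicator matrix and treats every (sample, label) cell as one binary decision (again with made-up toy scores):

```python
# Micro-averaged ROC AUC: every cell of the indicator matrix is one
# binary decision (toy data for illustration).
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([[1, 0, 0],
                   [0, 1, 1]])
y_score = np.array([[0.9, 0.4, 0.2],
                    [0.3, 0.8, 0.35]])

# Flattening by hand gives the same result as average="micro".
micro_manual = roc_auc_score(y_true.ravel(), y_score.ravel())
micro_sklearn = roc_auc_score(y_true, y_score, average="micro")
assert np.isclose(micro_manual, micro_sklearn)
print(micro_manual)
```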

There is also a fourth option, which I don't really understand:

'samples': Calculate metrics for each instance, and find their average.

I myself have been using weighted averaging.


Disclaimer: I'm not familiar with AUC/ROC with multiclass or multi-label tasks myself.

  • According to this question and its answers, the case of multiclass classification doesn't seem that simple. I would be very cautious about simply averaging values across classes, because the properties of AUC/ROC would probably not hold in general.
  • That being said, if the method mentioned for multiclass is considered sufficient, then there's no reason not to use the same approach for multi-label. Counting each instance N times for N classes is already what you do (or should do) in the multiclass case: the instances not relevant for a class are simply counted as true negatives. ROC curves are by nature built for a binary classification task, which means that every instance falls into exactly one of the four cells true/false positive/negative.
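To illustrate the "counted once per class" point with a toy multi-label example (the scores are invented): a sample labeled both c2 and c3 contributes exactly one score to c1's negative distribution, not two.

```python
# Each sample contributes exactly ONE score per class, regardless of how
# many labels it carries (toy multi-label data).
import numpy as np
from sklearn.metrics import roc_auc_score

# Rows = samples, columns = (c1, c2, c3).
y_true = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],   # labeled c2 AND c3 -> ONE negative for c1, not two
    [0, 0, 1],
])
y_score = np.array([
    [0.9, 0.2, 0.1],
    [0.7, 0.6, 0.3],
    [0.4, 0.8, 0.7],
    [0.2, 0.1, 0.9],
])

# For c1, positives are rows 0 and 1; rows 2 and 3 are negatives, and the
# doubly-labeled row 2 appears once in the negative distribution (score 0.4).
auc_c1 = roc_auc_score(y_true[:, 0], y_score[:, 0])
print(auc_c1)  # 1.0: both positive scores (0.9, 0.7) exceed both negatives
```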
