Are more target labels in a multi-label classification always better?

Context

We work on medical image segmentation. There are many potential labels for one and the same region we segment: medically defined labels such as anatomical regions, biological labels such as tissue types, and spatial labels such as left/right. Many labels can be further differentiated into (hierarchical) sub-labels.

Clarification

The question concerns the number of classes / target labels used in a multi-label classification/segmentation task. It is not about the number of samples, and not about the number of input features.

Hypothesis

Are more target labels in a multi-label classification always better?

  1. Yes: even if these labels are not used for the final task, they act as additional features / more knowledge during training.

     1.1. Yes, but only if the labels are good / not misleading.

     1.2. Yes, even subpar labels can act as noisy labels that still support the training.

  2. No: more specialized models, including only the really relevant labels, work better and are easier to train.

A colleague and I had a disagreement on this, and I was sure that someone had already done research on it. Surprisingly, my short literature search did not turn up anything useful. Most results are about more samples or more features. For more samples, my understanding is that more is generally better, but with diminishing returns at some point. For more features, the current state seems to be "it depends". I would guess that both of these results also apply to more labels, but I would be interested in further insights.

Are you aware of any research covering this question? Any personal experience in multi-label classification/segmentation?


In response to @Erwan's reply, let's assume the example of 2 classes vs. 100 classes.

The point about overall performance undoubtedly being lower makes sense (the random-baseline example). But that's not really a metric one cares about; what we care about is the per-label performance. Our scenario is that mainly these 2 classes are important, but we also have knowledge about the other 98 classes. The question then becomes whether we should include all that information during training (in the form of labels) or not, and how that will affect the performance on the 2 classes. My thought process is rooted in multitask learning, which to my knowledge has been shown to work well.
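To make that multitask framing concrete, here is a minimal PyTorch sketch of one way it could look: a shared backbone feeds a main head for the 2 classes of interest and an auxiliary head for the other 98 labels, with the auxiliary loss down-weighted so it only shapes the shared features. All names, the toy backbone, and the 0.3 weight are hypothetical placeholders, not an established recipe.

```python
import torch
import torch.nn as nn

class MultiTaskSegNet(nn.Module):
    """Shared encoder with a main head (2 classes) and an auxiliary head (98 labels)."""

    def __init__(self, in_channels=1, n_main=2, n_aux=98):
        super().__init__()
        # Tiny stand-in for a real segmentation backbone (e.g. a U-Net encoder-decoder).
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.main_head = nn.Conv2d(32, n_main, 1)  # per-pixel logits for the 2 classes we care about
        self.aux_head = nn.Conv2d(32, n_aux, 1)    # per-pixel logits for the other 98 labels

    def forward(self, x):
        feats = self.backbone(x)
        return self.main_head(feats), self.aux_head(feats)


def multitask_loss(main_logits, aux_logits, main_target, aux_target, aux_weight=0.3):
    # Multi-label setting: each label is an independent sigmoid, hence BCE-with-logits.
    bce = nn.BCEWithLogitsLoss()
    # The auxiliary term is down-weighted so the extra 98 labels regularize the
    # shared features without dominating the objective for the 2 main classes.
    return bce(main_logits, main_target) + aux_weight * bce(aux_logits, aux_target)


# Toy usage: batch of 4 single-channel 64x64 images with per-pixel binary targets.
model = MultiTaskSegNet()
x = torch.randn(4, 1, 64, 64)
y_main = torch.randint(0, 2, (4, 2, 64, 64)).float()
y_aux = torch.randint(0, 2, (4, 98, 64, 64)).float()
main_out, aux_out = model(x)
loss = multitask_loss(main_out, aux_out, y_main, y_aux)
loss.backward()
```

Dropping the auxiliary head and its loss term recovers the "specialized model" alternative from hypothesis 2, so the two positions can be compared in an ablation.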

This could again hint towards an "it depends".

For example, consider a hierarchy of labels, say classes 1, 2 and 3, where classes 2 and 3 can only occur within class 1's area.

  1. If we are only interested in classes 2 and 3, would including class 1 help? My assumption is that class 1 adds context and spatial limits for classes 2 and 3.
  2. If we are only interested in class 1, would including classes 2 and 3 help? Here I'm unsure. Classes 2 and 3 could provide additional structure / information about the expected content of class 1, which could make the model more robust. (One way such a hierarchy could be encoded is sketched below.)
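As referenced in the list above, one hedged way to act on such a hierarchy, assuming the subset relation from the example (classes 2 and 3 only inside class 1), is a soft consistency penalty on the predicted probabilities. The function below is an illustrative sketch with hypothetical names, not an established method:

```python
import torch

def hierarchy_penalty(logits):
    """Soft penalty for pixels where a child class is predicted outside its parent.

    logits: tensor of shape (batch, 3, H, W), channels 0/1/2 = classes 1/2/3.
    """
    probs = torch.sigmoid(logits)
    parent = probs[:, 0:1]       # class 1 (the enclosing region)
    children = probs[:, 1:3]     # classes 2 and 3 (allowed only inside class 1)
    # clamp(child - parent, min=0) is zero wherever the hierarchy holds,
    # and grows with the size of the violation otherwise.
    violation = torch.clamp(children - parent, min=0)
    return violation.mean()

# Usage: add it to the segmentation loss with a small weight, e.g.
# loss = seg_loss + 0.1 * hierarchy_penalty(logits)
```

This makes the extra label an explicit constraint rather than just an extra training signal, which is one concrete mechanism by which "more labels" could help case 1 above.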

(disclaimer: I'm biased towards more labels being better)

Topic: image-segmentation, labels, multilabel-classification, multitask-learning, deep-learning

Category: Data Science
