What's the difference between multi-label classification and fuzzy classification?

Is it just a difference in term usage between academics and practitioners?

Or is it a theoretical difference in how we consider each sample: as belonging to multiple classes at once, or to one fuzzy class?

Or does this distinction have some practical meaning for how we build a classification model?

Topic fuzzy-logic multilabel-classification fuzzy-classification classification

Category Data Science


Multi-label classification (Wiki):

Given $K$ classes, find a map $f:X \rightarrow \{0, 1\}^K$.

Fuzzy classification (a good citation is needed!):

Given $K$ classes, find a map $p: X \rightarrow [0, 1]^K$ where $\sum_{k=1}^{K} p(k)=1$.

In multi-label classification, as defined, there is no "resource limit" on classes compared to fuzzy classification.

For example, a neural network with a softmax layer does fuzzy classification (soft classification). If we select only the class with the highest score, it becomes single-label classification (hard classification), and if we select the top $k$ classes, it becomes multi-label classification (again hard classification).

Fuzzy classification:        [0.5, 0.2, 0.3, 0, 0]
Single-label classification: [1,   0,   0,   0, 0]
Multi-label classification:  [1,   0,   1,   0, 0]
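
The three rows above can be reproduced with a minimal sketch in plain Python. The helper names (`softmax`, `hard_single_label`, `hard_top_k`) and the example logits are hypothetical, chosen only for illustration; the score vector is the one from the example.

```python
import math

def softmax(logits):
    # Scores are non-negative and sum to 1 (the "resource limit").
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_single_label(scores):
    # Keep only the class with the highest score.
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1 if i == best else 0 for i in range(len(scores))]

def hard_top_k(scores, k):
    # Keep the k classes with the highest scores.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [1 if i in top else 0 for i in range(len(scores))]

# Hypothetical logits, only to show that softmax outputs sum to 1.
soft = softmax([2.0, 1.0, 0.1])

# Score vector taken from the example above.
scores = [0.5, 0.2, 0.3, 0.0, 0.0]
single = hard_single_label(scores)  # [1, 0, 0, 0, 0]
multi = hard_top_k(scores, 2)       # [1, 0, 1, 0, 0]
```

Here $k = 2$ is an assumption made to match the multi-label row; any other $k$ would simply mark more or fewer classes.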

As another example for multi-label classification, we could have $K$ neural networks for $K$ classes with sigmoid outputs, and assign a point to class $k$ if the output of network $k$ is higher than 0.5.

Outputs:                     [0.6, 0.1, 0.6, 0.9, 0.2]
Multi-label classification:  [1,   0,   1,   1,   0]
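
A minimal sketch of this thresholding step, assuming the $K$ sigmoid outputs have already been computed. The function name `threshold_labels` is hypothetical; the output values are the ones from the example.

```python
import math

def sigmoid(z):
    # Each network's output is squashed independently into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def threshold_labels(outputs, threshold=0.5):
    # Assign class k whenever its own output exceeds the threshold.
    return [1 if o > threshold else 0 for o in outputs]

# Outputs taken from the example above; they do not have to sum to 1.
outputs = [0.6, 0.1, 0.6, 0.9, 0.2]
labels = threshold_labels(outputs)  # [1, 0, 1, 1, 0]
```

Because each output is thresholded on its own, any number of classes, from none to all $K$, can be assigned to the same point.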

Practical considerations

As demonstrated in the examples, the key difference is the "resource limit" that exists in fuzzy classification but not in multi-label classification. Including the limit (in the first example), or ignoring it (in the second example) depends on the task. For example, in a classification task that has mutually exclusive labels, we want to include the "resource limit" to impose the "mutually exclusive" assumption on the model.

Note that the $\sum_{k=1}^{K} p(k)=1$ restriction in fuzzy classification is merely a "definition", and there is no point in arguing about a definition. We can either propose another classification, or argue when to use - and when not to use - such a classification.


A multi-label classifier learns to predict class labels from training data using some algorithm. It learns to associate an object's labels with a vector of feature values, and it typically estimates, for each class, the probability that a sample belongs to it.

Fuzzy classifiers do exactly the same thing, except that they use fuzzy logic to determine which class a sample belongs to. The data would need to be described using linguistic rules, unlike the data used by a conventional classifier. When classifying a sample, a fuzzy classifier returns a "degree of membership" for each class.
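
As a rough illustration of "degree of membership", here is a minimal sketch that assumes triangular membership functions over a single feature. The linguistic terms ("cold", "warm", "hot"), their breakpoints, and the function names are all hypothetical and not taken from any particular fuzzy-logic library.

```python
def triangular(x, a, b, c):
    # Membership rises linearly from a to the peak b, then falls to c.
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical linguistic terms for a temperature in degrees Celsius.
TERMS = {
    "cold": (0.0, 10.0, 20.0),
    "warm": (10.0, 20.0, 30.0),
    "hot": (20.0, 30.0, 40.0),
}

def memberships(x):
    # Degree of membership of x in each linguistic class.
    return {name: triangular(x, *abc) for name, abc in TERMS.items()}

degrees = memberships(15.0)  # partly "cold", partly "warm", not "hot"
```

With these particular breakpoints, a temperature of 15 belongs equally to "cold" and "warm" rather than being forced into a single class.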
