Which (probabilistic) models output decisions only when they are certain?
I'm looking for approaches, models, or algorithms for the following situation (a fault diagnosis problem):
I have an input set $\{x_i\}_{i \in \{1, \dots, m\}}$ of cases, each described by $n$ binary features (think of "faults" or "alarms" that fired), and $k$ classes. Each case $x_i$ belongs to at least one and at most $k$ classes (so I'm dealing with multi-label classification).
Now, some relations in the data set are utterly boring/uninformative (say, feature $a$ says "Mechanical Error occurred" and label $b$ means "Mechanical Error fixed"). More generally, whenever $x_a = 1$, I see all kinds of labels, i.e., $a$ is not predictive; put differently, the relation is not "functional".
Other input features $c$ might have a much more "functional" nature: whenever $x_c = 1$, I can confidently deduce that label $d$ is active, i.e., $y_d = 1$.
For instance, my training set could look something like this:
$[0, 1, 0] \mapsto \{4, 1\}$
$[0, 1, 0] \mapsto \{2\}$
$[1, 0, 0] \mapsto \{1\}$
$[1, 0, 0] \mapsto \{1\}$
So, observing $[0, 1, 0]$ is not really informative, whereas $[1, 0, 0]$ tells me (with high confidence) that label $1$ is active.
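To make "informative" concrete, here is a rough sketch of the kind of scoring I have in mind: estimating the empirical mutual information between each binary feature and each label indicator on the toy data above. This is just an illustration of the idea, not an approach I've settled on:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

# Toy data from the example above: m = 4 cases, n = 3 features, k = 4 labels.
X = np.array([[0, 1, 0],
              [0, 1, 0],
              [1, 0, 0],
              [1, 0, 0]])
# Binary indicator matrix: Y[i, d] = 1 iff case i carries label d + 1.
Y = np.array([[1, 0, 0, 1],   # {4, 1}
              [0, 1, 0, 0],   # {2}
              [1, 0, 0, 0],   # {1}
              [1, 0, 0, 0]])  # {1}

# Score every (feature, label) pair by empirical mutual information;
# a high score marks a candidate "functional" pair like (c, d).
scores = np.array([[mutual_info_score(X[:, a], Y[:, d])
                    for d in range(Y.shape[1])]
                   for a in range(X.shape[1])])
print(np.round(scores, 3))
```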
I'm looking for the latter kind of pair: a classifier that extracts only the meaningful feature-label pairs and ignores the uninformative inputs.
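In other words, something like a per-label classifier with a reject option: emit a label only when the predicted probability clears a high threshold, and stay silent otherwise. A minimal sketch of what I mean, assuming scikit-learn; the threshold of 0.95 is arbitrary, and logistic regression is just a stand-in for any probabilistic base model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def predict_with_reject(X_train, Y_train, X_new, threshold=0.95):
    """Fit one binary classifier per label and, for each new case,
    emit a label only when its predicted probability clears the
    threshold; otherwise the model abstains for that label."""
    clf = OneVsRestClassifier(LogisticRegression())
    clf.fit(X_train, Y_train)         # Y_train: (m, k) 0/1 indicator matrix
    proba = clf.predict_proba(X_new)  # shape (m_new, k)
    return [set(np.flatnonzero(p >= threshold) + 1)  # 1-based label ids
            for p in proba]
```

An empty returned set would then mean "no confident prediction" rather than "no label", which is exactly the behavior I want for uninformative inputs like $[0, 1, 0]$.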
Could you point me to relevant techniques / keywords? Thanks a lot!
Topic: information-theory, multilabel-classification
Category: Data Science