Tuning a classifier for high precision, with no regard for recall

I understand this falls under the decision-making side of things rather than the probabilistic side, but for the work I am doing I need the classifier to have very high precision, because I cannot afford a false positive. I do not care about false negatives, and consequently do not care about recall. Since the classifier is currently binary, some might suggest adjusting the decision probability threshold from its current value of 0.5 (something like the sketch below), but I will eventually need to add a third class, and will therefore have to switch to three outputs with softmax. I am not aware of established methods for steering a pipeline towards high precision, and I am looking for ways to achieve this.
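(For concreteness, a minimal sketch of the thresholding idea with a softmax output; the function name and the 0.95 threshold are hypothetical. The rule is to accept the top class only when its probability clears a high bar, and abstain otherwise.)

    import numpy as np

    def predict_high_precision(probs, threshold=0.95, abstain=-1):
        # probs: (n_samples, n_classes) array of softmax outputs.
        # Accept the argmax class only when the model is confident enough;
        # otherwise return the abstain label, trading recall for precision.
        best = probs.argmax(axis=1)
        confident = probs.max(axis=1) >= threshold
        return np.where(confident, best, abstain)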

If it is any help, the problem is the classification of 256x256 grayscale images in a domain that is regarded as very difficult to classify, according to recent papers in the computer vision literature.

Topic: finetuning, multiclass-classification, image-classification

Category: Data Science


Since, in your comment to Eugen's answer, you say your data is imbalanced, you might find the focal loss function useful. From the abstract of the paper (Lin et al., "Focal Loss for Dense Object Detection"):

Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.

There is code available on GitHub.
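For reference, here is a minimal PyTorch-style sketch of binary focal loss (the defaults alpha=0.25 and gamma=2 follow the paper; the function name is mine):

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
        # Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).
        # The (1 - p_t)**gamma factor down-weights easy, well-classified
        # examples so training focuses on the hard ones.
        bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p = torch.sigmoid(logits)
        p_t = p * targets + (1 - p) * (1 - targets)      # prob. of the true class
        alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
        return (alpha_t * (1 - p_t) ** gamma * bce).mean()

With gamma set to 0 this reduces to ordinary alpha-weighted cross-entropy, which is a useful sanity check.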


To make your learner cost-sensitive, you can augment your training data with additional "no" instances. If there are 10 times more "no" instances in your training data, errors on the "no"s will hurt much more, and your learner will come up with a decision scheme that is biased in that direction. On the other hand, the variance on the "yes"es will be lowered (watch out for overfitting here). After training, evaluate on your original, unmodified data, and you should get good results. A sketch of this oversampling step follows below.
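A minimal sketch of that oversampling step, assuming NumPy arrays for the data (the function name and the factor of 10 are illustrative):

    import numpy as np

    def oversample_negatives(X, y, factor=10, neg_label=0, seed=0):
        # Duplicate every "no" instance so it appears `factor` times in total,
        # making errors on the negative class dominate the training loss.
        rng = np.random.default_rng(seed)
        neg_idx = np.where(y == neg_label)[0]
        extra = np.repeat(neg_idx, factor - 1)   # (factor - 1) extra copies each
        idx = rng.permutation(np.concatenate([np.arange(len(y)), extra]))
        return X[idx], y[idx]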
