How can I improve the recall of one class in a multiclass classification result?

I am working on a multiclass classification task that assigns medical web search queries to hospital departments. My classifier is based on fastText.

For most departments the results are good enough; for example, recall is 0.8 for Nephrology. However, for one department, Dermatology, recall is quite low, around 0.5. Unfortunately, this label has the most samples in the test data.

How can I improve the recall of one class while maintaining the performance of the other classes? Would an ensemble method work?

Topic search-engine multiclass-classification nlp

Category Data Science


A few options:

  • Boost the confidence score of your class-of-interest (equivalently, lower the score it needs in order to be predicted) until you reach the desired recall. Expect some loss of precision on that class in exchange.
  • Upsample the class whose recall you want to improve in the training set.
  • Use class-sensitive weighting: make the loss for misclassifying your class-of-interest higher than the loss for the other classes.
  • Create two models: a binary model for your class-of-interest versus "other", and a second model that predicts every class except that one. Tune the decision threshold of the binary classifier until you reach the desired recall. Any text the binary classifier does not assign to your class-of-interest is then classified by the second model.
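The first two options can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes you already have per-class probabilities for each query (for example, the probabilities a fastText model returns for its top-k labels) and that the training file uses fastText's default `__label__` prefix. The helper names `predict_with_boost`, `recall`, and `upsample`, the boost factor, and the toy data are all hypothetical.

```python
import random


def predict_with_boost(probs, target, boost=1.0):
    """Pick the highest-scoring label after multiplying the
    target class's probability by `boost` (1.0 = no change)."""
    adjusted = {label: p * (boost if label == target else 1.0)
                for label, p in probs.items()}
    return max(adjusted, key=adjusted.get)


def recall(y_true, y_pred, label):
    """Fraction of samples truly in `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    if not relevant:
        return 0.0
    return sum(p == label for p in relevant) / len(relevant)


def upsample(lines, target_label, factor, seed=0):
    """Return the training lines plus randomly repeated target-class lines,
    so the target class appears roughly `factor` times as often."""
    rng = random.Random(seed)
    target = [ln for ln in lines if ln.startswith(target_label + " ")]
    extra = [rng.choice(target) for _ in range(int(len(target) * (factor - 1)))]
    out = lines + extra
    rng.shuffle(out)
    return out


# Option 1: boost the class score. Toy (probabilities, gold label) pairs.
samples = [
    ({"Dermatology": 0.40, "Nephrology": 0.60}, "Dermatology"),
    ({"Dermatology": 0.70, "Nephrology": 0.30}, "Dermatology"),
    ({"Dermatology": 0.20, "Nephrology": 0.80}, "Nephrology"),
    ({"Dermatology": 0.45, "Nephrology": 0.55}, "Nephrology"),
]
y_true = [gold for _, gold in samples]
for boost in (1.0, 1.6):
    y_pred = [predict_with_boost(p, "Dermatology", boost) for p, _ in samples]
    print(boost,
          recall(y_true, y_pred, "Dermatology"),
          recall(y_true, y_pred, "Nephrology"))

# Option 2: upsample Dermatology in a fastText-style training file.
train = [
    "__label__Dermatology itchy rash",
    "__label__Nephrology kidney pain",
    "__label__Dermatology dry skin",
]
print(len(upsample(train, "__label__Dermatology", factor=3)))
```

On this toy data, a boost of 1.6 raises Dermatology recall from 0.5 to 1.0 while Nephrology recall drops from 1.0 to 0.5, which is the trade-off to watch for: pick the boost (or the upsampling factor) on a validation set and check the other classes each time.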

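Recall is the fraction of the items that truly belong to a class (or are relevant to a given query) that the model actually finds. A recall of 0.5 for Dermatology means that only half of the queries that really belong to Dermatology are assigned to it.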

One common way to improve results for a specific class is to create a custom model. Create a model just for "Dermatology". That model can be tuned to increase recall without affecting other queries, which can keep using the generic model.
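A rough sketch of this two-stage setup follows. Both models are stand-ins: `derm_score` and `general_predict` are hypothetical callables, implemented here with simple keyword rules only so that the example runs on its own. In practice they would wrap trained classifiers. The threshold values and the "General Medicine" label are likewise assumptions for illustration.

```python
def cascade_predict(text, derm_score, general_predict, threshold=0.5):
    """Assign `text` to Dermatology if the dedicated binary model scores it
    at or above `threshold`; otherwise defer to the general model."""
    if derm_score(text) >= threshold:
        return "Dermatology"
    return general_predict(text)


# Hypothetical stand-in for a trained Dermatology-vs-other model:
# the score is the share of words that are skin-related terms.
def derm_score(text):
    skin_terms = {"rash", "acne", "itchy", "eczema", "skin"}
    words = text.lower().split()
    return sum(w in skin_terms for w in words) / max(len(words), 1)


# Hypothetical stand-in for the generic multiclass model.
def general_predict(text):
    return "Nephrology" if "kidney" in text.lower() else "General Medicine"


queries = ["itchy red rash on arm", "kidney stone pain", "dry skin in winter"]
for threshold in (0.5, 0.2):
    print(threshold,
          [cascade_predict(q, derm_score, general_predict, threshold)
           for q in queries])
```

Lowering the threshold sends more queries to Dermatology, which raises its recall; the generic model is untouched, so its behavior on the remaining queries does not change. The threshold would normally be chosen on held-out data rather than by hand.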
