Making an ensemble model for high F1 score

I currently have two algorithms that each produce a numerical score. Thresholding that score at 0.9 gives the classification output. Let's say they are:

  1. P (high precision, low recall)
  2. R (high recall, low precision)

Individually, they have poor F1 scores. Is the naive approach of creating a classifier C as

C(*) = x * P(*) + (1 - x) * R(*)

and optimizing x and the threshold a good way to improve the F1 score? Or is there an alternative approach I should try? Note: I can't modify the functions P() and R(); their outputs come from a black box.

Topic f1score ensemble-modeling binary classification

Category Data Science


High precision with low recall generally means that P scores only a small number of instances above 0.9, whereas high recall with low precision means that R scores most instances above 0.9. A weighted average of the two scores will therefore fall somewhere in between, likely resulting in moderate precision and moderate recall.

The weighted average can give significantly better results, but only if the two classifiers are complementary, i.e. they score instances differently enough from each other. Otherwise it's equivalent to tuning the threshold on a single classifier.
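
One way to check whether the blend actually helps is to grid-search x and the threshold, then compare the result against the best threshold for P alone and for R alone. The following is a minimal sketch under stated assumptions, not a definitive implementation: the scores and labels are made-up toy data, the helper names (f1_score, best_threshold, tune_blend) are hypothetical, and both scores are assumed to lie in [0, 1]. In practice the tuned x and threshold should be evaluated on held-out data, since tuning and scoring on the same set will overstate the F1.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels (1 = positive); 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(scores, y_true, thresholds):
    """Return (best F1, threshold) for a single score vector."""
    best_f1, best_t = -1.0, None
    for t in thresholds:
        y_pred = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, y_pred)
        if f1 > best_f1:
            best_f1, best_t = f1, t
    return best_f1, best_t


def tune_blend(p_scores, r_scores, y_true, steps=21):
    """Grid-search x and the threshold for C = x * P + (1 - x) * R.

    Returns (best F1, x, threshold). The grid includes x = 0 and x = 1,
    so the result is never worse than tuning R or P alone on this data.
    """
    grid = [i / (steps - 1) for i in range(steps)]
    best = (-1.0, None, None)
    for x in grid:
        blended = [x * p + (1 - x) * r for p, r in zip(p_scores, r_scores)]
        f1, t = best_threshold(blended, y_true, grid)
        if f1 > best[0]:
            best = (f1, x, t)
    return best


# Toy data (assumed for illustration only): P is conservative, R is generous.
y_true   = [1,    1,    1,    1,    0,    0,    0,    0]
p_scores = [0.95, 0.60, 0.50, 0.20, 0.30, 0.10, 0.55, 0.05]
r_scores = [0.95, 0.97, 0.92, 0.99, 0.96, 0.93, 0.50, 0.91]

grid = [i / 20 for i in range(21)]
print("P alone:", best_threshold(p_scores, y_true, grid))
print("R alone:", best_threshold(r_scores, y_true, grid))
print("Blend (F1, x, threshold):", tune_blend(p_scores, r_scores, y_true))
```

If the blend's best F1 is no higher than the better of the two single-classifier baselines, the classifiers are probably not complementary enough for the weighted average to add anything beyond threshold tuning.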
