Making an ensemble model for high F1 score

Question

Kanishk Mair

April 6, 2022, 09:06

I currently have two algorithms that each produce a numerical output. Applying a threshold of 0.9 to that output gives the classification. Let's say they are:

P (high precision, low recall)
R (high recall, low precision)

Individually, they have poor F1 scores. Is the naive way of creating a classifier C as:

C(*) = x * P(*) + (1 - x) * R(*)

and then optimizing x and the threshold a good approach to improving the F1 score? Or is there an alternative approach I should try? Note: I can't modify the functions P() and R(); their outputs come from a black box.

Topic f1score ensemble-modeling binary classification

Category Data Science

Erwan · Accepted Answer · May 13, 2021, 15:08

With a threshold of 0.9, this setup generally means that P scores only a small number of instances above 0.9, whereas R scores most instances above 0.9. A weighted average of the two scores will therefore fall somewhere in between, likely resulting in moderate precision and moderate recall.

Averaging the scores can give significantly better results, but only if the two classifiers are complementary, i.e. their predictions differ enough from each other. Otherwise it's equivalent to tuning the threshold on a single classifier.
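To make the weighted-average idea concrete, here is a minimal sketch of a grid search over the weight x and the threshold for C = x * P + (1 - x) * R. It is only an illustration under stated assumptions: the helper names `f1_score` and `tune_blend`, the grid resolution, and the toy scores and labels are all made up for this example and are not from the question. It also assumes both black-box outputs lie in [0, 1].

```python
from itertools import product


def f1_score(y_true, y_pred):
    """F1 from binary labels and binary predictions (0.0 if no true positives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def tune_blend(p_scores, r_scores, y_true, steps=21):
    """Grid-search weight x and threshold t for C = x * P + (1 - x) * R.

    Returns (best_f1, best_x, best_t). Assumes scores are in [0, 1].
    """
    grid = [i / (steps - 1) for i in range(steps)]
    best = (-1.0, None, None)
    for x, t in product(grid, grid):
        blended = [x * p + (1 - x) * r for p, r in zip(p_scores, r_scores)]
        y_pred = [1 if s >= t else 0 for s in blended]
        f1 = f1_score(y_true, y_pred)
        if f1 > best[0]:
            best = (f1, x, t)
    return best


# Hypothetical toy data: P rarely exceeds 0.9, R almost always does.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
p_scores = [0.95, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
r_scores = [0.99, 0.97, 0.95, 0.92, 0.96, 0.93, 0.91, 0.50]

# Baselines: each classifier alone at the fixed 0.9 threshold.
f1_p = f1_score(y_true, [1 if s >= 0.9 else 0 for s in p_scores])
f1_r = f1_score(y_true, [1 if s >= 0.9 else 0 for s in r_scores])

best_f1, best_x, best_t = tune_blend(p_scores, r_scores, y_true)
print(f"P alone: {f1_p:.3f}, R alone: {f1_r:.3f}")
print(f"blend: F1={best_f1:.3f} at x={best_x:.2f}, threshold={best_t:.2f}")
```

Because the grid includes x = 0 and x = 1, the search also covers plain threshold tuning on each classifier alone, so comparing the best blend against those endpoints is one way to see whether the two are actually complementary. In practice x and the threshold should be tuned on a validation set and the F1 reported on separate held-out data, since tuning and evaluating on the same instances gives an optimistic score.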
