Can the F1 score be equal to zero?

As mentioned in the Wikipedia article on the F1 score, the 'F1 score reaches its best value at 1 (perfect precision and recall) and worst at 0'.

What is the worst-case condition mentioned there?

Even if we consider the case where either precision or recall is 0, the whole F1 score becomes undefined: for either precision or recall to be 0, the true positives must be 0, and when the true positives are 0, both precision and recall become 0, so F1 is a 0/0 expression.
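Spelling that reasoning out with the standard definitions of precision and recall:

$$\text{precision} = \frac{TP}{TP+FP} = 0 \;\Longrightarrow\; TP = 0 \;\Longrightarrow\; \text{recall} = \frac{TP}{TP+FN} = \frac{0}{FN},$$

so for $FN > 0$ both are $0$ and $F_1 = \frac{2 \cdot 0 \cdot 0}{0 + 0} = \frac{0}{0}$.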



If we go by the formula, it can actually be zero when at least one of precision or recall is zero (regardless of the other one being zero or undefined). Look at the formulas for precision, recall, and F1:

$$\text{Prec} = \frac{TP}{TP+FP}, \qquad \text{Rec} = \frac{TP}{TP+FN}, \qquad F_1 = \frac{2\,\text{Prec}\cdot\text{Rec}}{\text{Prec}+\text{Rec}} = \frac{2TP}{2TP+FP+FN}$$

By looking at the F1 formula, F1 can be zero when TP is zero (causing Prec and Rec to be either 0 or undefined) and FP + FN > 0. Since both FP and FN are non-negative, this means that F1 can be zero in three scenarios:

1. $TP = FP = 0$ and $FN > 0$;

2. $TP = FN = 0$ and $FP > 0$;

3. $TP = 0$, $FP > 0$, and $FN > 0$.

In the first scenario, Prec is undefined and Rec is zero. In the second scenario, Prec is zero and Rec is undefined, and in the last scenario, both Prec and Rec are zero.
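
As a minimal sketch, the three scenarios can be checked numerically with the simplified form $F_1 = \frac{2TP}{2TP+FP+FN}$ (the counts below are arbitrary made-up examples):

# Hypothetical (TP, FP, FN) counts for the three scenarios above
scenarios = [
    ("TP = FP = 0, FN > 0", 0, 0, 3),
    ("TP = FN = 0, FP > 0", 0, 3, 0),
    ("TP = 0, FP > 0, FN > 0", 0, 2, 2),
]
for name, tp, fp, fn in scenarios:
    f1 = 2 * tp / (2 * tp + fp + fn)  # well-defined whenever FP + FN > 0
    print(name, "->", f1)  # prints 0.0 in all three cases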


F1 will never be exactly zero, but it can be very near zero for a bad classifier. If TP or TN is zero, then there isn't any need to check F1 anyway.


It can't be exactly zero. We would need exactly one (and only one) of precision or recall to be zero to make F1 zero, but both have "tp" as the numerator, so they become zero together.

#### Will be NaN
import numpy as np
from sklearn.metrics import confusion_matrix

y_test = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 0, 0])

# Unpack the confusion matrix counts
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
precision = tp / (tp + fp)  # fixed: precision uses FP, not TN
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)  # 0/0 -> nan (with a RuntimeWarning)
print(f1)

nan
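
For comparison, scikit-learn's own f1_score avoids the NaN, returning 0.0 by convention when the harmonic mean would be 0/0:

from sklearn.metrics import f1_score
# sklearn defines the 0/0 case as 0.0 instead of NaN
print(f1_score(y_test, y_pred))  # 0.0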

#### Can be ~0 in a specific case, with this data
y_test = np.hstack((np.zeros((1, 2)), np.ones((1, 1000000))))
y_pred = np.hstack((np.ones((1, 1)), np.zeros((1, 1000000)), np.ones((1, 1))))

# hstack produced arrays of shape (1, N), so index row 0
tn, fp, fn, tp = confusion_matrix(y_test[0], y_pred[0]).ravel()
precision = tp / (tp + fp)  # fixed: precision uses FP, not TN
recall = tp / (tp + fn)     # 1 / 1,000,000
f1 = 2 * precision * recall / (precision + recall)
print('{0:1.6f}'.format(f1))

0.000002


It's a mistake on Wikipedia.

$F_{1}$, as a harmonic mean, is defined only for positive real numbers. $PRE$ or $REC$ can equal 0 when $TP=0$, which leads to the undefined result $F_1=\frac{0}{0}$.
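
For instance, substituting $PRE = REC = 0$ directly into the harmonic-mean form gives:

$$F_1 = \frac{2 \cdot PRE \cdot REC}{PRE + REC} = \frac{2 \cdot 0 \cdot 0}{0 + 0} = \frac{0}{0}$$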
