Which F1-score is used for semantic segmentation tasks?

I have read several papers on state-of-the-art semantic segmentation models, and all of them use the F1-score for comparison, but none of them states whether it is the micro- or macro-averaged version.

Does anyone know which F1-score is used to report segmentation results, and why it is apparently so obvious that authors do not define it in their papers?

Sample papers:

https://arxiv.org/pdf/1709.00201.pdf

https://arxiv.org/pdf/1511.00561.pdf



I looked very quickly, and only at the first paper, so I may have missed something, but it looks to me like the task is a binary classification problem. If that is correct, there is no need to average the F1-score: it is simply computed for the positive class.
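To illustrate what "no averaging" means in the binary case, here is a minimal sketch that computes a pixel-wise F1-score from two binary masks. The function name `binary_f1`, the toy masks, and the choice to return 0.0 when there are no positive pixels at all are assumptions made for this example; they are not taken from either paper.

```python
import numpy as np

def binary_f1(pred, target):
    """Pixel-wise F1 for the positive class of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()   # predicted 1, truly 1
    fp = np.logical_and(pred, ~target).sum()  # predicted 1, truly 0
    fn = np.logical_and(~pred, target).sum()  # predicted 0, truly 1
    denom = 2 * tp + fp + fn
    # Assumed convention: score 0.0 if neither mask contains a positive pixel.
    return float(2 * tp / denom) if denom > 0 else 0.0

# Toy 2x3 masks (hypothetical data): 2 true positives, 1 false positive, 1 false negative.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = binary_f1(pred, target)
```

Only one class is scored here, so the micro/macro question does not come up; it only matters once per-class scores for several classes have to be combined.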

Also, in this paper the authors even give the formula for the F1-score (!), so I'd say they are quite thorough in describing the evaluation measures they use. I would take this as an additional indication that there is no averaging, since they would most likely have mentioned it otherwise.
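For reference, the usual single-class definition, written in terms of true positives (TP), false positives (FP) and false negatives (FN), is given below. This is the standard textbook form; the paper's own notation may differ.

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
    = \frac{2\,TP}{2\,TP + FP + FN}
```

Nothing in this expression involves a sum over classes, which is why the micro/macro distinction only arises in the multi-class setting.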
