YOLO Dense Prediction

Question

kevin

February 20, 2022 17:07

I have two questions about dense prediction in the YOLOv4 paper:

What does it mean that the (hard negative, online hard) example mining methods are not applicable to one-stage object detectors because this kind of detector belongs to the dense prediction architecture?
Why does dense prediction not belong to two-stage detectors?

tricostume · Accepted Answer · May 8, 2021 13:02

Unfortunately, the paper gives no further explanation of this. In my opinion, hard negative mining is actively used by one-stage architectures like SSD, where this kind of selective training actually boosts performance. I can only speculate about what the authors meant, namely that a one-stage detector has, by construction, no component dedicated to hard negative mining, whereas Faster R-CNN (a sparse predictor) does, because it relies on two separate stages (RoI generation, then classification and bounding-box regression) to actively discriminate among the results.
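To make the SSD point concrete, here is a minimal sketch of hard negative mining in the SSD style: keep every positive box and only the highest-loss negatives, at a fixed negative-to-positive ratio. The function name `hard_negative_mask`, the toy loss values, and the 3:1 default are my own illustrative assumptions, not code from SSD or from the YOLOv4 paper.

```python
def hard_negative_mask(losses, is_positive, neg_pos_ratio=3):
    """Return a keep-mask: all positives plus the highest-loss negatives."""
    num_pos = sum(is_positive)
    # Cap the number of negatives at neg_pos_ratio times the positives.
    num_neg = min(neg_pos_ratio * num_pos, len(losses) - num_pos)
    # Rank the negative boxes by loss, hardest (largest loss) first.
    neg_indices = [i for i, pos in enumerate(is_positive) if not pos]
    neg_indices.sort(key=lambda i: losses[i], reverse=True)
    hard = set(neg_indices[:num_neg])
    return [pos or (i in hard) for i, pos in enumerate(is_positive)]


# Toy example (made-up losses): 8 candidate boxes, only box 1 matches an object.
losses = [0.1, 2.0, 0.05, 0.9, 1.5, 0.02, 0.3, 0.7]
is_positive = [False, True, False, False, False, False, False, False]
keep = hard_negative_mask(losses, is_positive)
```

Only the boxes where `keep` is true would contribute to the classification loss, so the many easy background boxes do not swamp the few positives.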
Dense prediction means that the RoIs are a dense sampling of possible boxes around points of interest (in the image or in the feature maps). Object detectors whose output is the subset of the most relevant RoIs (and their output features) drawn from this dense set are dense predictors. They are designed as a self-contained framework that runs in a single stage. Dense prediction is also part of two-stage detectors, as you can see in the image, but there only specific boxes are passed on for further processing (e.g. a box refinement method).
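As a rough illustration of what "dense" means here, the sketch below places a few candidate boxes at every cell of a small feature map and then keeps only the top-scoring ones, which is roughly what a two-stage detector hands to its second stage. The helper names `dense_anchors` and `top_k`, the box sizes, and the placeholder scores are all hypothetical; a real detector would predict the scores with a network.

```python
def dense_anchors(feat_h, feat_w, stride, sizes):
    """One (cx, cy, w, h) box per size at every feature-map cell."""
    boxes = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this cell in input-image pixel coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for w, h in sizes:
                boxes.append((cx, cy, w, h))
    return boxes


def top_k(boxes, scores, k):
    """Keep the k highest-scoring boxes (the sparse subset)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    return [boxes[i] for i in order[:k]]


# 4x4 feature map, 3 box shapes per cell: 48 dense candidates.
anchors = dense_anchors(4, 4, stride=8, sizes=[(16, 16), (32, 16), (16, 32)])
# Placeholder scores standing in for a network's objectness output.
scores = [(i * 7) % 48 for i in range(len(anchors))]
proposals = top_k(anchors, scores, k=5)
```

A one-stage detector would classify and regress all 48 candidates directly, while a two-stage detector would continue with only the 5 `proposals`.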