Training object detection without bounding-box annotations
From what I can see, most object detection networks (Fast(er) R-CNN, YOLO, etc.) are trained on data that includes bounding boxes indicating where in the picture the objects are located.
Are there algorithms that take only the full picture plus image-level label annotations, and then, on top of determining whether an image contains certain object(s), also indirectly learn to infer the appropriate bounding box(es) for those objects?
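To make concrete what I mean by "indirectly learn" localisation, here is a toy sketch (pure NumPy, made-up feature maps and weights) of the kind of mechanism I have in mind: weight a classifier's final convolutional feature maps by its class weights to get a heatmap, then threshold that heatmap into a box. I believe this resembles the Class Activation Mapping idea, but I'm not sure what the established approaches are:

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Weighted sum of final conv feature maps (C, H, W) using the
    classifier weights (C,) for one class -> (H, W) heatmap."""
    return np.tensordot(class_weights, features, axes=1)

def box_from_heatmap(cam, thresh=0.5):
    """Derive a rough bounding box from the normalised, thresholded heatmap."""
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    ys, xs = np.where(cam >= thresh)
    if len(xs) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Fake feature maps with a "hot" region around rows 2-4, cols 5-7,
# standing in for what a trained classifier might produce.
features = np.zeros((3, 8, 8))
features[:, 2:5, 5:8] = 1.0
weights = np.array([0.5, 0.3, 0.2])

cam = class_activation_map(features, weights)
print(box_from_heatmap(cam))  # (5, 2, 7, 4) as (x0, y0, x1, y1)
```

So the question is whether there are trainable end-to-end methods that get boxes out of image-level labels along these lines.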