What is the difference in computational cost at inference time between object detection and semantic segmentation?

I am aware that the YOLO family (v1-v5) consists of real-time object detection models with reasonably good prediction performance, and that UNet and its variants are efficient semantic segmentation models that are also fast and predict well.

I cannot find any resources comparing the inference speed of these two approaches. Intuitively, semantic segmentation, which classifies every pixel in the image, seems like a clearly harder problem than object detection, which only draws bounding boxes around objects. The sketch below shows where this intuition of mine comes from.
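
To make that intuition concrete, here is a back-of-the-envelope comparison of raw output sizes. The numbers are my own toy choices (a 512x512 image with 21 classes for segmentation, and the original YOLOv1 head layout of a 7x7 grid with 2 boxes per cell and 20 classes for detection):

```python
# Back-of-the-envelope output sizes (my own toy numbers, not from a paper).
H, W, C = 512, 512, 21                 # image resolution and number of classes
seg_outputs = H * W * C                # segmentation: one class score per pixel
S, B = 7, 2                            # YOLOv1-style grid and boxes per cell
det_outputs = S * S * (B * 5 + 20)     # YOLOv1 head: 7x7x(2*5+20) values
print(f"segmentation output values: {seg_outputs:,}")   # 5,505,024
print(f"detection output values:    {det_outputs:,}")   # 1,470
```

Of course, the output layer is only a small part of the network, and both architectures spend most of their compute in a convolutional backbone, which is exactly why I am unsure whether this intuition actually translates into a real difference in inference cost.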

Does anyone have good resources for this comparison, or a solid explanation of why one approach is computationally more demanding than the other? For concreteness, the sketch below shows the kind of timing measurement I have in mind.
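
This is a minimal timing sketch of my own, assuming PyTorch and torchvision are available. Since torchvision ships neither YOLO nor UNet, its FCN and Faster R-CNN models stand in purely to illustrate the measurement; the actual models would be swapped in for a real comparison:

```python
# Rough per-image latency harness (a sketch of mine, not a published benchmark).
import time
import torch
import torchvision

@torch.no_grad()
def mean_latency_ms(model, run_once, warmup=5, runs=20):
    model.eval()
    for _ in range(warmup):            # warm-up iterations stabilise timings
        run_once(model)
    start = time.perf_counter()        # on GPU, add torch.cuda.synchronize()
    for _ in range(runs):
        run_once(model)
    return (time.perf_counter() - start) / runs * 1000

img = torch.rand(1, 3, 512, 512)       # dummy input image

seg = torchvision.models.segmentation.fcn_resnet50(weights=None)
det = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, weights_backbone=None)

print(f"segmentation: {mean_latency_ms(seg, lambda m: m(img)):.1f} ms")
print(f"detection:    {mean_latency_ms(det, lambda m: m([img[0]])):.1f} ms")
```

I realise that wall-clock numbers like these depend heavily on hardware, resolution, and batch size, which is partly why I am hoping for a principled explanation rather than a single benchmark.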

Tags: semantic-segmentation, object-detection, convolutional-neural-network, computer-vision, efficiency
