What is Deep supervision?
I'm interested in segmentation models for medical imaging. While looking at the state of the art, I came across a paper on a new architecture, UNet++:
UNet++: A Nested U-Net Architecture for Medical Image Segmentation from Zongwei Zhou, Md Mahfuzur Rahman Siddiquee, Nima Tajbakhsh, and Jianming Liang at Arizona State University
Like U-Net, it has an encoder/decoder architecture with skip connections (which pass fine-grained feature maps from the encoder to the decoder). However, in UNet++, the skip connections are nested and dense so that the model can better capture fine-grained details.
The second difference from U-Net is the use of deep supervision. The paper says that deep supervision enables:
the model to operate in two modes: 1) accurate mode wherein the outputs from all segmentation branches are averaged; 2) fast mode wherein the final segmentation map is selected from only one of the segmentation branches, the choice of which determines the extent of model pruning and speed gain
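As far as I can tell, the two inference modes in that quote amount to something like the sketch below. This is only my reading, not the paper's code: the function names, the NumPy representation, and the made-up branch outputs are all assumptions for illustration.

```python
import numpy as np

def accurate_mode(branch_probs):
    """Average the probability maps produced by all segmentation branches."""
    return np.mean(np.stack(branch_probs, axis=0), axis=0)

def fast_mode(branch_probs, branch_index):
    """Keep the map from a single branch; per the quote, the chosen branch
    determines how much of the model can be pruned."""
    return branch_probs[branch_index]

# Hypothetical per-pixel foreground probabilities from 4 branches (2x2 image).
branch_probs = [np.full((2, 2), p) for p in (0.2, 0.4, 0.6, 0.8)]

avg_map = accurate_mode(branch_probs)   # uses every branch
fast_map = fast_mode(branch_probs, 0)   # uses only the first branch
```

What this sketch does not show is how the branches are trained, which is the part I am missing.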
I don't understand how deep supervision works or what its benefits are in the case of UNet++.
Can you explain how it works? Thank you in advance for your help.
Topic semantic-segmentation deep-learning
Category Data Science