What is Deep supervision?

I'm interested in segmentation models for medical imaging. While looking at the state of the art, I came across a paper on a new architecture, UNet++:

UNet++: A Nested U-Net Architecture for Medical Image Segmentation from Zongwei Zhou, Md Mahfuzur Rahman Siddiquee, Nima Tajbakhsh, and Jianming Liang at Arizona State University

Like U-Net, it has an encoder/decoder architecture with skip connections (which pass fine-grained feature maps from the encoder to the decoder). However, in UNet++ the skip connections are nested and dense, so that the model can better capture fine-grained details.

The second difference from U-Net is the use of deep supervision. The paper says that deep supervision enables:

the model to operate in two modes: 1) accurate mode wherein the outputs from all segmentation branches are averaged; 2) fast mode wherein the final segmentation map is selected from only one of the segmentation branches, the choice of which determines the extent of model pruning and speed gain

I didn't understand how deep supervision works or what its benefits are in the case of UNet++.

Can you explain how it works? Thank you in advance for your help.



The idea of deep supervision is to add so-called companion objective functions at each hidden layer of a network, and then compute the final loss as the output loss plus the sum of the companion losses. The idea was introduced in this paper:

@inproceedings{lee2015deeply,
  title={Deeply-supervised nets},
  author={Lee, Chen-Yu and Xie, Saining and Gallagher, Patrick and Zhang, Zhengyou and Tu, Zhuowen},
  booktitle={Artificial intelligence and statistics},
  pages={562--570},
  year={2015},
  organization={PMLR}
}
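
To make the "output loss plus the sum of the companion losses" idea concrete, here is a minimal sketch in plain Python. It assumes a binary segmentation task with a per-pixel cross-entropy loss; the function names, the `companion_weight` parameter, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def deeply_supervised_loss(final_probs, companion_probs, targets, companion_weight=1.0):
    """Output loss plus the weighted sum of the companion losses.

    final_probs: predictions of the network's final output layer.
    companion_probs: one list of predictions per supervised hidden layer.
    companion_weight: hypothetical scalar balancing the companion terms.
    """
    output_loss = binary_cross_entropy(final_probs, targets)
    companion_loss = sum(binary_cross_entropy(p, targets) for p in companion_probs)
    return output_loss + companion_weight * companion_loss


# Toy example: 4 "pixels", one final output and two auxiliary (companion) outputs.
targets = [1, 0, 1, 1]
final_probs = [0.9, 0.2, 0.8, 0.7]
companion_probs = [
    [0.6, 0.4, 0.6, 0.5],  # prediction read off an early hidden layer
    [0.8, 0.3, 0.7, 0.6],  # prediction read off a later hidden layer
]
loss = deeply_supervised_loss(final_probs, companion_probs, targets)
```

Because every companion term depends on the same targets, gradients reach the early layers directly through their own loss rather than only through the final output.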

Partial answer: Quoting a paper

"The advantage of such deep supervision is evident: (1) for small training data and relatively shallower networks, deep supervision functions as a strong “regularization” for classification accuracy and learned features; (2) for large training data and deeper networks deep supervision makes it convenient to exploit the significant performance gains that extremely deep networks can bring by improving otherwise problematic convergence behavior"
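
As for the two UNet++ modes quoted in the question, they follow from every segmentation branch having been trained against the ground truth, so each branch produces a usable map on its own. The sketch below is only an illustration under that assumption: the function names and the toy probability maps are hypothetical, and each "map" is flattened to a short list of per-pixel probabilities.

```python
def accurate_mode(branch_outputs):
    """Average the per-pixel probabilities over all segmentation branches."""
    n = len(branch_outputs)
    return [sum(pixels) / n for pixels in zip(*branch_outputs)]


def fast_mode(branch_outputs, branch_index):
    """Use the map of a single branch only.

    Picking an earlier branch means the parts of the network that feed
    only the later branches are not needed at inference time.
    """
    return list(branch_outputs[branch_index])


# Toy example: 4 supervised branches, each predicting 3 "pixels".
branch_outputs = [
    [0.6, 0.3, 0.7],
    [0.7, 0.2, 0.8],
    [0.8, 0.2, 0.9],
    [0.9, 0.1, 0.8],
]
averaged = accurate_mode(branch_outputs)  # uses all branches
pruned = fast_mode(branch_outputs, 1)     # uses the second branch only
```

In fast mode the choice of `branch_index` is the trade-off the paper describes: an earlier branch allows more pruning and a larger speed gain, at a possible cost in accuracy.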
