Fine tune the RetinaNet model in PyTorch

Question

xcsob

December 1, 2021, 13:01

I would like to fine-tune the pre-trained RetinaNet model available in torchvision in order to build my own object detector.

I'm trying to replicate what is done for Faster R-CNN at this link: https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html#finetuning-from-a-pretrained-model

What I have done is the following:

import torchvision
from torchvision.models.detection.retinanet import RetinaNetHead

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
num_classes = 2

# get the number of input features and anchor boxes for the classifier
in_features = model.head.classification_head.conv[0].in_channels
num_anchors = model.head.classification_head.num_anchors

# replace the pre-trained head with a new one
model.head = RetinaNetHead(in_features, num_anchors, num_classes)

The model builds and training runs without errors. However, the performance is so poor that even the most trivial detections fail.

My question is: is the code I wrote a correct way to retrain the RetinaNet model?

Topic torchvision finetuning

Category Data Science

calveeen · Accepted Answer · June 18, 2021, 15:07

I am also trying to do something similar. The code below should work. After loading the weights pretrained on the COCO dataset, we need to replace the classification layer with our own.

import math

import torch
import torchvision

num_classes = 2  # number of object classes to identify + background class (2 is the value from the question)

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
# replace the classification layer
in_features = model.head.classification_head.conv[0].in_channels
num_anchors = model.head.classification_head.num_anchors
model.head.classification_head.num_classes = num_classes

# the new layer takes the same number of input channels as the layer it replaces
cls_logits = torch.nn.Conv2d(in_features, num_anchors * num_classes, kernel_size=3, stride=1, padding=1)
torch.nn.init.normal_(cls_logits.weight, std=0.01)  # as per PyTorch code
torch.nn.init.constant_(cls_logits.bias, -math.log((1 - 0.01) / 0.01))  # as per PyTorch code
# assign the new classification layer to the model
model.head.classification_head.cls_logits = cls_logits
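
The bias constant in the snippet above may look arbitrary, so here is a small worked example of what it does. It is only a sketch using the standard library: the name `prior` and the `sigmoid` helper are introduced here for illustration and are not part of torchvision. The idea is that, with near-zero weights, the layer's output is roughly its bias, so the initial predicted score for every anchor is roughly 0.01.

```python
import math

prior = 0.01  # the constant that appears in the bias expression above

# Same expression as in the snippet: -log((1 - prior) / prior)
bias = -math.log((1 - prior) / prior)


def sigmoid(x: float) -> float:
    """Logistic function, which maps a logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# With weights drawn from a narrow normal (std=0.01), the logit is close to
# the bias, so the initial score is close to the prior.
initial_score = sigmoid(bias)
print(f"bias = {bias:.4f}, initial score = {initial_score:.4f}")
```

Starting every anchor at a low score keeps the large number of background anchors from dominating the loss in the first training steps.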

No change is needed for the box regression head, because the number of anchor boxes per spatial location is the same as when the model was pretrained on the COCO dataset. The code is a little cumbersome compared to Faster R-CNN. I would hope to see a more elegant solution.
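
To make the reasoning about the regression head concrete, the output channel counts can be worked out by hand. This is a sketch under assumptions that are not stated in the post: 9 anchors per location (3 scales times 3 aspect ratios, which I believe is the torchvision default) and 91 classes for the COCO-pretrained model. The helper `head_out_channels` is a hypothetical name used only for this illustration.

```python
def head_out_channels(num_anchors: int, num_classes: int) -> dict:
    """Output channels of the two final convolutions in a RetinaNet head."""
    return {
        "cls_logits": num_anchors * num_classes,  # one score per anchor per class
        "bbox_reg": num_anchors * 4,  # four box offsets per anchor
    }


NUM_ANCHORS = 9  # assumption: 3 scales x 3 aspect ratios per location

coco = head_out_channels(NUM_ANCHORS, num_classes=91)  # pretrained on COCO
custom = head_out_channels(NUM_ANCHORS, num_classes=2)  # 1 object class + background

print(coco)    # {'cls_logits': 819, 'bbox_reg': 36}
print(custom)  # {'cls_logits': 18, 'bbox_reg': 36}
```

Only the `cls_logits` count depends on the number of classes, which is why the pretrained regression layer can be kept as it is.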

Hope this helps.
