Image resizing and padding for CNN

I want to train a CNN for image recognition. The training images do not have a fixed size. I want the CNN input size to be, for example, 50x100 (height x width). When I resize small images (for example 32x32) to the input size, the content is stretched horizontally too much, although medium-sized images look okay.

What is the proper way to resize images without distorting their content?

(I am thinking about resizing the images to some degree while keeping the width-to-height ratio, and then padding them with 0s to reach the full input size. Would this method be okay?)

Topic image-recognition preprocessing image-classification deep-learning machine-learning

Category Data Science


This question on Stack Overflow might help you. To sum up, some deep learning researchers think that padding a large part of the image is not good practice, because the neural network has to learn that the padded area is irrelevant for classification, which it does not have to learn if you use interpolation instead.

[Update April 11, 2022] Based on newer research (2019), it seems that zero-padding does not affect the CNN model's accuracy: zero-valued inputs contribute nothing to the convolution sums, so they do not change the synaptic weights during forward or backward propagation.

paper reference: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0263-7
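
As a rough illustration of that argument (my own reading of it, not code from the paper), consider a single output unit of a convolution, y = sum(w * x) + b. The derivative of y with respect to each weight is the pixel it multiplies, so a zero pixel yields a zero gradient for that weight. The function name below is hypothetical, and note that the bias term still contributes even over a fully padded patch.

```python
def conv_unit_and_weight_grads(patch, kernel, bias=0.0):
    """Compute one convolution output unit and dy/dw for every kernel weight.

    y = sum(w[i][j] * x[i][j]) + bias, hence dy/dw[i][j] = x[i][j].
    """
    y = bias + sum(
        w * x
        for kernel_row, patch_row in zip(kernel, patch)
        for w, x in zip(kernel_row, patch_row)
    )
    # The gradient of y with respect to each weight is just the input pixel.
    grads = [list(patch_row) for patch_row in patch]
    return y, grads


kernel = [[1.0, 2.0], [3.0, 4.0]]

# A patch lying entirely in the zero-padded area: only the bias survives.
y_pad, grads_pad = conv_unit_and_weight_grads([[0.0, 0.0], [0.0, 0.0]], kernel, bias=0.5)

# A patch on the border of the real image: only the non-zero pixel drives a weight.
y_edge, grads_edge = conv_unit_and_weight_grads([[0.0, 2.0], [0.0, 0.0]], kernel)
```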


You can do the following: first resize the image up to a certain extent while keeping its aspect ratio, and then pad it on all sides to reach the input size. This could help preserve the features in the image.
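
A minimal sketch of this resize-then-pad idea, assuming NumPy arrays in height x width (optionally x channels) layout and the 50x100 input size from the question. The function name `letterbox` and the nearest-neighbour resize are my own choices for illustration; in practice you would probably use your image library's interpolation instead.

```python
import numpy as np


def letterbox(image, target_h=50, target_w=100):
    """Resize keeping the aspect ratio, then zero-pad to (target_h, target_w)."""
    h, w = image.shape[:2]
    # Largest scale at which the whole image still fits inside the target.
    scale = min(target_h / h, target_w / w)
    new_h = min(target_h, max(1, round(h * scale)))
    new_w = min(target_w, max(1, round(w * scale)))

    # Nearest-neighbour resize via index lookup (keeps the sketch dependency-free).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]

    # Zero canvas of the target size; paste the resized image in the centre.
    canvas = np.zeros((target_h, target_w) + image.shape[2:], dtype=image.dtype)
    top = (target_h - new_h) // 2
    left = (target_w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas


# A 32x32 image becomes 50x50 content with 25 zero columns on each side.
small = np.full((32, 32), 255, dtype=np.uint8)
padded = letterbox(small)
```

With a 32x32 input the scale factor is 50/32, so the content fills the full height and only the left and right sides are padded.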


You have a few options:

For Small Images:

  • upsample through interpolation
  • pad the image using zeros

If upsampling straight to the input size would distort the aspect ratio, you can upsample with the ratio preserved and crop the excess pixels along the dimension that overshoots. Of course, this loses data, but you can repeatedly shift the center of your crop, which would help your model be more robust.
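
The upsample-and-crop variant could look roughly like this, again assuming NumPy arrays and the 50x100 input size from the question. The function name and the `shift` parameter are hypothetical: `shift` is a value in [0, 1] that places the crop window along the overflowing axis (0.5 is centered), so drawing it at random per training sample moves the crop around.

```python
import numpy as np


def resize_cover_and_crop(image, target_h=50, target_w=100, shift=0.5):
    """Scale so the image covers the target, then crop the overflow."""
    h, w = image.shape[:2]
    # Smallest scale at which the image covers the whole target area.
    scale = max(target_h / h, target_w / w)
    new_h = max(target_h, round(h * scale))
    new_w = max(target_w, round(w * scale))

    # Nearest-neighbour resize via index lookup (keeps the sketch dependency-free).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]

    # Position the crop window; only the overflowing axis has room to move.
    top = int(round((new_h - target_h) * shift))
    left = int(round((new_w - target_w) * shift))
    return resized[top:top + target_h, left:left + target_w]


image = np.arange(32 * 32).reshape(32, 32)
top_crop = resize_cover_and_crop(image, shift=0.0)     # keeps the upper part
bottom_crop = resize_cover_and_crop(image, shift=1.0)  # keeps the lower part
```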


For Large Images:

  • downsample
  • crop down to your input size

Lastly, if you are using a Fully Convolutional Network (FCN), you do not need to resize your images.

TL;DR:

Yes, padding with zeros is a valid option.
