How to approach different image resolutions in deep learning for a regression problem?

I have an image dataset with various resolutions, and I'm using a regression DNN model with a fixed n×n input resolution. Since the model learns certain positions in the image, I've been zero-padding the images to fit the input resolution while keeping a 1:1 aspect ratio.

Is there a better way to preprocess images?

Without zero padding I get worse results, and I suspect maintaining the aspect ratio is necessary to avoid distorting objects' shapes in the DNN input.

Tags: image-preprocessing, regression, deep-learning



In a fully convolutional neural network, i.e., a network containing only convolutional and pooling layers, you can get away with input images of different sizes. That only works, however, if you don't need the output of the network to have a fixed size. In segmentation tasks, for example, the output has the same spatial size as the input, so you can design a fully convolutional network that accepts images of different resolutions. This is basically one of the ideas behind FCN.
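To make this concrete, here is a minimal sketch (in PyTorch, my assumption since the question names no framework): a stack containing only convolutional layers has no fixed input size, and its output spatial size simply tracks the input's.

```python
import torch
import torch.nn as nn

# A conv-only "network": no flatten, no fully connected layer,
# so nothing in it is tied to one specific input resolution.
fcn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),  # 1-channel map, e.g. segmentation
)

for h, w in [(64, 64), (100, 80), (37, 121)]:
    x = torch.randn(1, 3, h, w)
    print(fcn(x).shape)  # torch.Size([1, 1, h, w]) -- output size follows input size
```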

Aside from that, in classification or regression the output has a fixed size, so each layer's size is fixed, which in turn means the input must have a fixed size as well. Moreover, classification and regression networks typically use fully connected layers at the end, and a fully connected layer's weight matrix is sized for exactly one input dimension, so for this reason too you have to have fixed-size inputs.
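A quick sketch of why the fully connected head pins down the input resolution (same assumed PyTorch setup; the layer sizes are illustrative): the Linear layer is built for the flattened feature count of one specific resolution, so any other resolution breaks at the flatten step.

```python
import torch
import torch.nn as nn

# Regression head: conv features are flattened into a Linear layer whose
# weight matrix is sized for 64x64 inputs and nothing else.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.Flatten(),
    nn.Linear(8 * 64 * 64, 1),  # fixed to 64x64 inputs
)

print(net(torch.randn(1, 3, 64, 64)).shape)  # works: torch.Size([1, 1])
# net(torch.randn(1, 3, 72, 72))  # would raise a shape-mismatch RuntimeError
```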

Long story short: in your case there is nothing you can do architecture-wise to get around this, so you have to handle it in your data.

What is usually done is to transform your dataset by zero-padding all the images to a fixed resolution, and then train on that.
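A minimal sketch of that preprocessing step (again assuming PyTorch tensors; `zero_pad_to` is a hypothetical helper, and it assumes the image already fits within the target size, since the point is to pad rather than rescale):

```python
import torch
import torch.nn.functional as F

def zero_pad_to(img: torch.Tensor, size: int) -> torch.Tensor:
    """Zero-pad a CxHxW image to C x size x size, centering the original
    pixels. Assumes H, W <= size -- no rescaling, per the advice below."""
    _, h, w = img.shape
    pad_h, pad_w = size - h, size - w
    # F.pad takes (left, right, top, bottom) for the last two dimensions
    return F.pad(img, (pad_w // 2, pad_w - pad_w // 2,
                       pad_h // 2, pad_h - pad_h // 2))

padded = zero_pad_to(torch.rand(3, 100, 80), 128)
print(padded.shape)  # torch.Size([3, 128, 128])
```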

Don't rescale your images to the target resolution: convolutional layers are not scale invariant, and the distortion will mess up your test phase.
