Invalid shape (4, 460, 513) for image data

I am using read_image to read the image.

from torchvision.io import read_image
image = read_image("/content/train/000001-11.jpg")

Now, when I try to find the shape of the image, I get $(4, 460, 513)$ as the image shape.

But, when I use opencv to read the image, I get $(460, 513, 3)$ as the image shape.

import cv2
img = cv2.imread("/content/train/000001-11.jpg")

Could anyone explain to me why this happens? Why are there 4 channels instead of three?

I printed the 4 channels returned by read_image and found that the last channel has the value 255 in every cell. I need to plot the image in both cases, but I am unable to plot the one loaded with read_image. How can I plot it?



read_image from torchvision.io reads images with mode ImageReadMode.UNCHANGED by default, meaning it returns the image as it is stored on disk (see the documentation). In this case the file has four channels, (R, G, B, A), where A is the alpha channel; an alpha value of 255 everywhere, as you observed, means the image is fully opaque. cv2.imread, however, uses the flag cv.IMREAD_COLOR by default (see the documentation), which always converts the image to three channels and drops the alpha channel. Note that OpenCV stores those channels in (B, G, R) order, not (R, G, B). The two libraries also use different layouts: read_image returns a tensor of shape (channels, height, width), while OpenCV returns an array of shape (height, width, channels), which is why you see (4, 460, 513) versus (460, 513, 3). If you only want three channels, pass a different mode to read_image, for example mode=ImageReadMode.RGB.

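There are several ways to plot an image with four channels; matplotlib is one of them. Its imshow function accepts RGBA images as per the documentation, but it expects the channel axis last, i.e. shape (height, width, 4). Passing the (4, 460, 513) tensor directly is what raises "Invalid shape (4, 460, 513) for image data", so move the channel axis to the end first, e.g. image.permute(1, 2, 0). For the OpenCV array, convert from BGR to RGB before plotting, e.g. with cv2.cvtColor(img, cv2.COLOR_BGR2RGB); otherwise the red and blue channels appear swapped.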
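A rough sketch of the plotting step for the read_image case. A random tensor with the same shape as in the question stands in for the loaded image (an assumption, so the example runs without the file); with real data you would use the tensor returned by read_image instead.

```python
import matplotlib.pyplot as plt
import torch

# Synthetic stand-in for the tensor returned by read_image:
# uint8, shape (channels, height, width) = (4, 460, 513), alpha fixed at 255.
image = torch.randint(0, 256, (4, 460, 513), dtype=torch.uint8)
image[3] = 255

# imshow expects (height, width, channels), so move the channel axis last.
hwc = image.permute(1, 2, 0).numpy()

fig, ax = plt.subplots()
ax.imshow(hwc)  # RGBA array of shape (460, 513, 4)
ax.axis("off")
# In a notebook the figure renders when the cell finishes;
# in a script, call plt.show() here.
```

The permute call only reorders the axes and does not change the pixel values, so the same line works for a three-channel tensor read with ImageReadMode.RGB.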
