Invalid shape (4, 460, 513) for image data

Question


Amit

August 11, 2021, 18:53

I am using read_image to read the image.

from torchvision.io import read_image
image = read_image("/content/train/000001-11.jpg")

Now, when I try to find the shape of the image, I get $(4, 460, 513)$ as the image shape.

But when I use OpenCV to read the image, I get $(460, 513, 3)$ as the image shape.

import cv2
img = cv2.imread("/content/train/000001-11.jpg")

Could anyone explain to me why this happens? Why are there 4 channels instead of three?

I printed the four channels in the read_image case and found that the last channel has the value 255 in every cell. I need to plot the image in both cases, but I am unable to plot it in the read_image case. How can I plot an image loaded with read_image?

Topic torchvision pytorch python

Category Data Science

Oxbowerce · Accepted Answer · August 11, 2021, 13:01

read_image from torchvision.io reads the image with ImageReadMode.UNCHANGED by default, meaning that it reads the image as it is stored on disk (see the documentation). In this case, that gives you an image with four channels, (R, G, B, A), where A is the alpha channel. cv2.imread, however, uses the flag cv.IMREAD_COLOR by default (see the documentation), which always returns three channels and discards any alpha channel; note that OpenCV stores them in (B, G, R) order. The difference in the number of channels is simply caused by how the images are read. If you want three channels from read_image as well, pass a different mode, for example mode=ImageReadMode.RGB.
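As a minimal sketch of the two options for ending up with three channels: either request them at load time, or drop the alpha channel from an image that is already loaded. The helper names read_rgb and drop_alpha are hypothetical, and the NumPy array is only a stand-in for the tensor from the question.

```python
import numpy as np


def read_rgb(path):
    """Read an image as a 3-channel (C, H, W) uint8 tensor."""
    # Imported lazily so drop_alpha below also works without torchvision.
    from torchvision.io import read_image, ImageReadMode

    return read_image(path, mode=ImageReadMode.RGB)


def drop_alpha(image_chw):
    """Keep only the first three channels of a channels-first image.

    Works on NumPy arrays and torch tensors, since both support slicing.
    """
    return image_chw[:3]


# Stand-in for the question's image: 4 channels with a fully opaque alpha.
rgba = np.zeros((4, 460, 513), dtype=np.uint8)
rgba[3] = 255

rgb = drop_alpha(rgba)
print(rgb.shape)  # (3, 460, 513)
```

Dropping the alpha channel by slicing is only safe when it is 255 everywhere, as it is in the question; otherwise transparency information is lost.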

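There are several ways to plot an image with four channels; matplotlib is one of them. The imshow function of matplotlib can plot RGBA images, as per the documentation, but it expects the channels in the last dimension, i.e. shape (460, 513, 4). read_image returns the channels first, (4, 460, 513), which is what triggers the "Invalid shape" error, so move the channel dimension to the end before plotting, for example with image.permute(1, 2, 0).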
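The plotting step can be sketched as follows. The helpers chw_to_hwc and show are hypothetical names, and the NumPy array again stands in for the tensor returned by read_image; the conversion goes through np.asarray, which accepts both NumPy arrays and CPU torch tensors.

```python
import numpy as np


def chw_to_hwc(image_chw):
    """Convert a (C, H, W) image to the (H, W, C) layout used by imshow."""
    # np.asarray accepts NumPy arrays and CPU torch tensors alike.
    return np.transpose(np.asarray(image_chw), (1, 2, 0))


def show(image_chw):
    """Plot a channels-first RGB or RGBA image with matplotlib."""
    # Imported lazily so chw_to_hwc works without matplotlib installed.
    import matplotlib.pyplot as plt

    plt.imshow(chw_to_hwc(image_chw))
    plt.axis("off")
    plt.show()


# Stand-in for the question's image: 4 channels with a fully opaque alpha.
rgba = np.zeros((4, 460, 513), dtype=np.uint8)
rgba[3] = 255

print(chw_to_hwc(rgba).shape)  # (460, 513, 4)
```

Calling show(image) on the tensor from the question should then display it. The array returned by cv2.imread is already in (H, W, C) layout, but its channels are in BGR order, so convert it first, for example with cv2.cvtColor(img, cv2.COLOR_BGR2RGB), before passing it to imshow.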
