How important is the channel order in deep-learning computer vision tasks?

I stumbled across this question while working with OpenCV, which stores color images in BGR order in memory, while most other libraries I know of use RGB order.
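For concreteness, this is the conversion I would otherwise have to apply before feeding an OpenCV image to an RGB pipeline (a minimal sketch assuming OpenCV and NumPy; the file path is just a placeholder):

    import cv2
    import numpy as np

    img_bgr = cv2.imread("example.jpg")                 # placeholder path; OpenCV loads as BGR
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # explicit conversion to RGB
    img_rgb_alt = img_bgr[..., ::-1]                    # equivalent: reverse the channel axis
    assert np.array_equal(img_rgb, img_rgb_alt)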

How important is this difference? Would I be able to use BGR images for an RGB-trained network? Obviously, a fire truck is red in most regions, and not blue. But don't CNNs look for texture rather than simple colors?

Even assuming there is a difference: could it still make sense to apply channel permutations as a kind of data augmentation during training? This would spread the spectrum of textures from each original channel to the other two; again, the assumption is that colors are not all that important.
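Something along these lines is what I have in mind, as a rough NumPy sketch (the function name and dummy image are just for illustration):

    import numpy as np

    def random_channel_permutation(img, rng):
        """Randomly permute the channel axis of an HWC image (one of 3! = 6 orderings)."""
        perm = rng.permutation(img.shape[-1])
        return img[..., perm]

    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)  # dummy HWC image
    augmented = random_channel_permutation(image, rng)  # applied on the fly during training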

I would be curious to know whether research like this has already been done; I would be surprised if it had not.

Topic: image, cnn, data-augmentation, deep-learning

Category: Data Science


It is task dependent, but it can be important.

Let's find a representation that separates luminance from chrominance (the latter represented as a 2D color vector).
That way you detect bright objects independently of color, and colorful objects with less dependence on brightness, obviously under the assumption that you are color agnostic.
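As a sketch of what I mean (assuming OpenCV; YCrCb is just one possible representation of this kind, and the image here is a dummy array):

    import cv2
    import numpy as np

    img_bgr = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)  # dummy BGR image
    ycrcb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2YCrCb)
    luminance = ycrcb[..., 0]     # Y: brightness, independent of color
    chrominance = ycrcb[..., 1:]  # (Cr, Cb): a 2D color vector, less dependent on brightness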
Check the paper below, where they applied exactly this permutation augmentation.


Would I be able to use BGR images for an RGB-trained network?

I think the performance will be much worse than with RGB input.
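A quick way to check this yourself, as a rough sketch assuming torchvision's pretrained ResNet-50 (weights API of torchvision >= 0.13) and a hypothetical image path:

    import cv2
    import numpy as np
    import torch
    from torchvision.models import resnet50

    model = resnet50(weights="IMAGENET1K_V1").eval()  # trained on RGB inputs
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

    def predict(img_hwc):
        """Return the top-1 ImageNet class index for one HWC uint8 image."""
        x = cv2.resize(img_hwc, (224, 224)).astype(np.float32) / 255.0
        x = (x - mean) / std
        x = torch.from_numpy(x).permute(2, 0, 1).unsqueeze(0)  # NCHW
        with torch.no_grad():
            return model(x).argmax(dim=1).item()

    img_bgr = cv2.imread("fire_truck.jpg")               # hypothetical image, loaded as BGR
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)   # the ordering the model expects
    print("RGB top-1:", predict(img_rgb))
    print("BGR top-1:", predict(img_bgr))                # often a different, wrong class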

Color Permutations as augmentation

From the paper Rethinking Data Augmentation: Self-Supervision and Self-Distillation:

if the augmentation results in large distributional discrepancy among pictures (e.g., rotations), forcing their label invariance may be too difficult to solve and often hurts the performance.
To tackle this challenge, we suggest an idea of learning the joint distribution of the original and self-supervised labels of augmented samples.

Augmentation: two transformations that use the entire input image
1. Rotation (0°, 90°, 180°, 270°). This transformation is widely used for self-supervision due to its simplicity.
2. Color permutation. This constructs M = 3! = 6 different images by swapping the RGB channels. It can be useful when color information is important, such as in fine-grained classification datasets.

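For concreteness, the M = 3! = 6 permuted copies and their permutation labels (used as the self-supervised target) can be generated along these lines; this is a NumPy sketch, not the authors' code:

    import itertools
    import numpy as np

    def all_channel_permutations(img):
        """Return the 3! = 6 channel-permuted copies of an HWC image plus permutation labels."""
        perms = list(itertools.permutations(range(3)))
        images = [img[..., list(p)] for p in perms]
        labels = list(range(len(perms)))  # self-supervised label = which permutation was applied
        return images, labels

    img = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)  # dummy RGB image
    images, labels = all_channel_permutations(img)
    assert len(images) == 6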
