Batch normalization for image CNN - Why not use the mean of the entire batch?
For a CNN that recognizes images, why does Batch Normalization calculate the mean per feature instead of over all the data in the batch?
When the features are independent, per-feature statistics are needed. However, the features (pixels) of images fed to a CNN, with RGB channels of 8-bit color, are related. If the R channel of an image has 256 pixels, a value of 255 at pixel i and a value of 255 at pixel j both mean the same intensity(?) in the R channel.
Then why not use the mean of all the data in a batch? If pixel i happens to take values in (0, 127) and pixel j takes values in (128, 255), per-feature normalization loses two things: the fact that (0, 127) is only the lower part of the full range [0, 255], and the relation between i and j, namely that the intensity of pixel i is lower than that of pixel j.
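To make the scenario concrete, here is a minimal sketch in plain Python. The toy batch, the helper `normalize`, and the `eps` value are my own assumptions for illustration and are not taken from any library. It normalizes two pixel features once with per-feature statistics and once with a single mean and standard deviation computed over the whole batch.

```python
from statistics import fmean, pstdev

# Hypothetical toy batch: 4 samples, 2 features (pixel i, pixel j).
# Pixel i stays within 0..127 and pixel j within 128..255, as in the question.
batch = [
    [0.0, 128.0],
    [40.0, 170.0],
    [90.0, 210.0],
    [127.0, 255.0],
]

def normalize(values, mean, std, eps=1e-5):
    """Standardize values with the given mean and (population) std."""
    return [(v - mean) / (std ** 2 + eps) ** 0.5 for v in values]

col_i = [row[0] for row in batch]
col_j = [row[1] for row in batch]

# Per-feature statistics: each pixel gets its own mean and std.
per_feature_i = normalize(col_i, fmean(col_i), pstdev(col_i))
per_feature_j = normalize(col_j, fmean(col_j), pstdev(col_j))

# Whole-batch statistics: one mean and std shared by every value.
all_values = col_i + col_j
shared_mean, shared_std = fmean(all_values), pstdev(all_values)
whole_batch_i = normalize(col_i, shared_mean, shared_std)
whole_batch_j = normalize(col_j, shared_mean, shared_std)

print("per-feature means:", fmean(per_feature_i), fmean(per_feature_j))
print("whole-batch means:", fmean(whole_batch_i), fmean(whole_batch_j))
```

With per-feature statistics both pixels end up centered near zero, so the offset between the two ranges is no longer visible in the normalized values. With shared statistics, pixel i stays below pixel j in every sample, which is the relation I would expect to be worth keeping.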
Topic batch-normalization
Category Data Science