Why doesn't batch normalization 'zero out' a batch of size one?

I'm using TensorFlow. Consider the example below:

>>> x
<tf.Tensor: shape=(1,), dtype=float32, numpy=array([-0.22630838], dtype=float32)>
>>> tf.keras.layers.BatchNormalization()(x)
<tf.Tensor: shape=(1,), dtype=float32, numpy=array([-0.22619529], dtype=float32)>

There doesn't seem to be any change at all, besides maybe some perturbation due to epsilon. Shouldn't a normalized sample of size one just be the zero tensor?
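
For reference, here is the computation I had in mind, done by hand (1e-3 is just the layer's default epsilon):

import tensorflow as tf

x = tf.constant([-0.22630838], dtype=tf.float32)

# Normalize the batch with its own statistics: for a single sample the mean
# is the sample itself and the variance is zero, so the result is zero.
mean, var = tf.nn.moments(x, axes=[0])
eps = 1e-3   # BatchNormalization's default epsilon
print((x - mean) / tf.sqrt(var + eps))   # -> [0.]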

I figured the issue might be that the batch size is 1 (the variance of a single sample is zero, so how would you scale it to variance 1?). But I've also tried other simple examples with different shapes and with the axis parameter set to 0, 1, etc., and none of them produce any change at all.

Am I simply using the API incorrectly?

Topic: batch-normalization, keras, tensorflow

Category: Data Science


Maybe you are running the batch norm with its running (moving) mean and variance, so the mean and std of your single sample are being blended with the initial values of the moving mean and variance rather than used on their own. Try setting momentum to 0 (I think; also try 1, the point is to turn off the running calculation) and see whether that fixes the problem.
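
One way to check this (a minimal sketch; note that Keras only uses the batch's own statistics when the layer is called with training=True, so that is combined with the momentum=0 suggestion here):

import tensorflow as tf

x = tf.constant([-0.22630838], dtype=tf.float32)

# momentum=0 makes each update replace the moving statistics with the batch
# statistics outright instead of blending them with the previous values.
bn = tf.keras.layers.BatchNormalization(momentum=0.0)

# Inference mode (the default when you call the layer directly, as in the
# question): normalizes with the moving mean (init 0) and variance (init 1),
# so the sample comes back essentially unchanged.
print(bn(x, training=False))   # -> approx. [-0.2262]

# Training mode: normalizes with the batch's own mean and (zero) variance,
# which zeroes out a single-sample batch.
print(bn(x, training=True))    # -> [0.]

If the moving statistics are indeed the culprit, the training=False call should reproduce the nearly unchanged output from the question, while the training=True call gives the zero tensor you expected.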
