Batch normalization

Part 1

I'm going through this article and wanted to try to calculate a forward and a backward pass with batch normalization. When doing the steps after the first layer I get a batch norm output that is equal for all features. Here is the code (I have on purpose broken it into very small steps):

import numpy as np

w1 = np.array([[0.3, 0.4],[0.5,0.1],[0.2,0.3]])
X = np.array([[0.7,0.1],[0.3,0.8],[0.4,0.6]])

def mu(x,axis=0):
  return np.mean(x,axis=axis)
def sigma(z, mu):
  Ai = np.sum(z,axis=0)
  return np.sqrt((1/len(Ai)) * (Ai-mu)**2)
def Ai(z):
  return np.sum(z,axis=0)

def norm(Ai,mu,sigma):
  return (Ai-mu)/sigma

z1 = np.dot(w1,X.T)
mu1 = mu(z1)
A1 = Ai(z1)
sigma1 = sigma(z1,mu1)
gamma1 = np.ones(len(A1))
beta1 = np.zeros(len(A1))
Ahat = norm(A1,mu1,sigma1) # since gamma is just ones it doesn't change anything here

The output I get from this is:

[1.73205081 1.73205081 1.73205081]

Part 2

In this image: should the sigma_mov and mu_mov be set to zero for the first layer?
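
For context, this is roughly how I understand the running statistics are maintained during training. This is just a sketch: mu_mov and the variance counterpart are named after the image, and the momentum value is my own assumption:

import numpy as np

n_units = 3
momentum = 0.9  # assumed value, a common default

# before the first batch is seen
mu_mov = np.zeros(n_units)   # running mean starts at zero
var_mov = np.ones(n_units)   # running variance starts at one

def update_running_stats(z, mu_mov, var_mov, momentum=0.9):
  # exponential moving average of the batch statistics,
  # with z laid out as (batch, units)
  mu_mov = momentum * mu_mov + (1 - momentum) * z.mean(axis=0)
  var_mov = momentum * var_mov + (1 - momentum) * z.var(axis=0)
  return mu_mov, var_mov

At inference time these running values would replace the batch mean and variance in the normalization step.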

EDIT: I think I found what I did wrong. In the normalization step I used A1 and not z1; since my sigma1 works out to |A1 - mu1| / sqrt(3), dividing A1 - mu1 by it gives sqrt(3) ≈ 1.732 for every feature, which is exactly the output above (see the sketch below). I also found that it seems to be normal to initialize the moving average with zeros for the mean and ones for the variance. It would be nice if someone could confirm this.
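
Here is a sketch of what I now believe the corrected step looks like. I lay z1 out as (batch, units) via X @ w1.T so that the batch statistics run over axis 0 (equivalently one could keep np.dot(w1, X.T) and normalize along axis 1); the eps constant is my own addition for numerical stability:

import numpy as np

w1 = np.array([[0.3, 0.4],[0.5,0.1],[0.2,0.3]])
X = np.array([[0.7,0.1],[0.3,0.8],[0.4,0.6]])
eps = 1e-5

z1 = X @ w1.T             # shape (3, 3): rows are samples, columns are units

mu1 = z1.mean(axis=0)     # per-unit mean over the batch
var1 = z1.var(axis=0)     # per-unit variance over the batch

gamma1 = np.ones(z1.shape[1])
beta1 = np.zeros(z1.shape[1])

# normalize z1 itself, not its column sums A1
z1_hat = (z1 - mu1) / np.sqrt(var1 + eps)
out1 = gamma1 * z1_hat + beta1

print(out1)               # each column is now approximately zero-mean, unit-variance

With this, the normalized outputs differ across features instead of collapsing to 1.73205081 everywhere.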

Topic: batch-normalization, machine-learning

Category: Data Science
