"Invalid value" in RMSprop implementation from scratch in Python
Edit 2: The regularization term (reg_term) is sometimes negative because some parameters are negative, so S[f"dW{l}"] ends up containing negative values. I realize that reg_term has to be added to the gradient before squaring, like this:
S[f"dW{l}"] = beta2 * S[f"dW{l}"] + (1 - beta2) * np.square(gradients[f"dW{l}"] + reg_term)
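For reference, here is a minimal sketch of a single-matrix RMSprop step with the fix from Edit 2 applied. The helper name rmsprop_step and its default values are hypothetical, not part of my network code. It also makes two assumptions beyond Edit 2: the regularized gradient is used in the update itself, and epsilon is placed inside the denominator, which is the usual RMSprop form.

```python
import numpy as np

def rmsprop_step(W, dW, S, learning_rate=0.001, beta2=0.999,
                 epsilon=1e-8, reg_param=0.0, batch_size=1):
    """One RMSprop update for a single weight matrix (hypothetical helper)."""
    # Fold the L2 term into the gradient first, so the quantity that gets
    # squared below can never produce a negative accumulator entry.
    grad = dW + (reg_param / batch_size) * W
    # Running average of squared gradients; every entry is >= 0.
    S = beta2 * S + (1 - beta2) * np.square(grad)
    # Assumption: epsilon belongs inside the denominator to avoid dividing by zero.
    W = W - learning_rate * grad / (np.sqrt(S) + epsilon)
    return W, S

# Usage with random data: the accumulator stays non-negative.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))
S = np.zeros_like(W)
for _ in range(5):
    dW = rng.standard_normal(W.shape)
    W, S = rmsprop_step(W, dW, S, reg_param=0.1, batch_size=32)
print(np.all(S >= 0), np.all(np.isfinite(W)))
```

Because S only ever accumulates squared values here, np.sqrt(S) is always defined, regardless of the sign of the weights.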
Edit 1: I see that S[f"dW{l}"] contains some negative values. How is this possible when np.square(gradients[f"dW{l}"]) always returns non-negative values?
I have implemented a neural network from scratch which uses mini-batch gradient descent. The network works well. Unfortunately, I can't get my RMSprop implementation to work. I have verified that the network works well with momentum.
I get a RuntimeWarning when training the network with RMSprop: invalid value encountered in sqrt. This happens in the RMSprop update step.
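The warning can be reproduced in isolation. The sketch below uses made-up values (reg_param=0.1, batch_size=32, and a 1x2 weight matrix are assumptions chosen for illustration) and the same accumulator form as my update step, where the regularization term is added outside np.square:

```python
import warnings
import numpy as np

beta2 = 0.999
W = np.array([[-0.5, 0.5]])      # one negative and one positive weight
dW = np.array([[0.01, 0.01]])    # small gradients, so their squares are tiny
S = np.zeros_like(W)             # accumulator starts at zero
reg_term = (0.1 / 32) * W        # hypothetical reg_param=0.1, batch_size=32

# reg_term is added outside np.square, so a negative weight can pull
# the corresponding entry of S below zero.
S = beta2 * S + (1 - beta2) * (np.square(dW) + reg_term)

with warnings.catch_warnings():
    # Silence "invalid value encountered in sqrt" for this demonstration only.
    warnings.simplefilter("ignore", RuntimeWarning)
    root = np.sqrt(S)

print(S < 0)            # which accumulator entries went negative
print(np.isnan(root))   # the same entries become NaN after the square root
```

The entry belonging to the negative weight goes below zero, and its square root is NaN, which then propagates into the parameters.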
My implementation of update parameters:
def update_parameters(parameters, gradients, V, S, batch_size, t, learning_rate, reg_param):
    L = len(parameters) // 2
    beta1 = 0.9
    beta2 = 0.999
    epsilon = 1e-8
    for l in range(1, L+1):
        reg_term = (reg_param / batch_size) * parameters[f"W{l}"]
        # RMSprop gradients
        S[f"dW{l}"] = beta2 * S[f"dW{l}"] + (1 - beta2) * (np.square(gradients[f"dW{l}"]) + reg_term)
        S[f"db{l}"] = beta2 * S[f"db{l}"] + (1 - beta2) * np.square(gradients[f"db{l}"])
        # RMSprop update
        parameters[f"W{l}"] -= learning_rate * (gradients[f"dW{l}"] / (np.sqrt(S[f"dW{l}"])) + epsilon)
        parameters[f"b{l}"] -= learning_rate * (gradients[f"db{l}"] / (np.sqrt(S[f"db{l}"])) + epsilon)
This is how I initialize the parameters:
def init_params_V_and_S(activation_layers):
    params = {}
    V = {}
    S = {}
    L = len(activation_layers)
    for l in range(1, L):
        params[f"W{l}"] = np.random.randn(activation_layers[l], activation_layers[l-1]) * np.sqrt(2 / activation_layers[l-1])
        params[f"b{l}"] = np.zeros((activation_layers[l], 1))
        # RMSprop params
        S[f"dW{l}"] = np.zeros((activation_layers[l], activation_layers[l-1]))
        S[f"db{l}"] = np.zeros((activation_layers[l], 1))
    return params, V, S
Any ideas what's causing this?
Topic optimization python machine-learning
Category Data Science