Keras models break when I add batch normalization

I'm creating the models for a DDPG agent (keras-rl version), but I'm running into errors whenever I try to add batch normalization to the first of the two networks.

Here is the creation function as I'd like it to be:

from keras.layers import (Input, Dense, Activation, Flatten,
                          Concatenate, BatchNormalization)
from keras.models import Model

def buildDDPGNets(actNum, obsSpace):
    actorObsInput = Input(shape = (1,) + obsSpace, name = "actor_obs_input")
    a = Flatten()(actorObsInput)
    a = Dense(600, use_bias = False)(a)
    a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(300, use_bias = False)(a)
    a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(actNum)(a)
    a = Activation("tanh")(a)   # Bipedal walker applies torque in (-1, 1).
    actor = Model(inputs = [actorObsInput], outputs = a)
    criticActInput = Input(shape = (actNum,), name = "critic_action_input")
    criticObsInput = Input(shape = (1,) + obsSpace, name = "critic_obs_input")
    c = Flatten()(criticObsInput)
    c = Dense(600, use_bias = False)(c)
    c = BatchNormalization()(c)
    c = Activation("relu")(c)
    c = Concatenate()([c, criticActInput])
    c = Dense(300, use_bias = False)(c)
    c = BatchNormalization()(c)
    c = Activation("relu")(c)
    c = Dense(1)(c)
    c = Activation("linear")(c)
    critic = Model(inputs = [criticActInput, criticObsInput], outputs = c)
    return (criticActInput, actor, critic)

This causes me to get the following error:

    InvalidArgumentError: You must feed a value for placeholder tensor 'actor_obs_input' with dtype float and shape [?,1,24]
     [[{{node actor_obs_input}}]]

However, if I remove the batch normalization from the first network (a) and leave the second network (c) unchanged, it runs as it should:

    a = Flatten()(actorObsInput)
    a = Dense(600, use_bias = False)(a)
    #a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(300, use_bias = False)(a)
    #a = BatchNormalization()(a)
    a = Activation("relu")(a)
    a = Dense(actNum)(a)

It's also notable that if I do it the other way around (remove batch norm from c and leave it in a), the error still occurs. Any idea why that's happening? It's odd, because it runs fine with batch norm in the critic, just not in the actor. The models are being used by a keras-rl DDPG agent, by the way.

Update: I've tried rewriting it with the Sequential model instead of the functional API. It didn't help; I still got the same error. I'm beginning to think this is some sort of problem with Keras's BatchNormalization layer when it's applied to systems of multiple models.

Topic keras-rl batch-normalization keras neural-network

Category Data Science


When you wrote this:

a = BatchNormalization()(a)

you assigned the object BatchNormalization() to a. The following layer:

a = Activation("relu")(a)

is supposed to receive data as a NumPy array, not a BatchNormalization layer. You should rewrite your actor code like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

actor = Sequential([
    Dense(600, input_shape = your_input_shape, activation = 'relu', use_bias = False),
    BatchNormalization(),
    Dense(300, activation = 'relu', use_bias = False),
    BatchNormalization(),
    Dense(actNum, activation = 'tanh')
])

Please note that I didn't fully specify the input shape: since I don't know the nature of your data, I left it as your_input_shape. You can modify it according to your needs.

The other model, the critic, should follow the same structure, IMHO. It's also more readable (and easier to debug). Once both the actor and the critic are defined, you can assemble them with Model(), as you did above.
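For background (this is not part of your original code), batch normalization itself is just a per-feature standardization with a learned scale and shift; a minimal NumPy sketch of the training-time computation, with hypothetical variable names, looks like this:

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Training-mode batch norm: standardize each feature over the
    batch axis, then apply the learned scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)                      # per-feature batch mean
    var = x.var(axis=0)                        # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)    # standardized activations
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))  # batch of 32 samples, 4 features
out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))   # ~0 per feature
print(out.std(axis=0))    # ~1 per feature
```

This is why the layer sits between the Dense layer and the activation in your code: it re-centers and re-scales the pre-activations before the nonlinearity is applied.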


This link describes the cause of the problem: it arises because we try to feed a placeholder defined as an Input layer using another Input layer. You can modify your code as follows:

def buildDDPGNets(actNum, obsSpace):

  actorObsInput = Input(shape = (1,) + obsSpace, name = "actor_obs_input")
  criticActInput = Input(shape = (actNum,), name = "critic_action_input")

  flattened_obs = Flatten()(actorObsInput)
  a = Dense(600, use_bias = False)(flattened_obs)
  a = BatchNormalization()(a)
  a = Activation("relu")(a)
  a = Dense(300, use_bias = False)(a)
  a = BatchNormalization()(a)
  a = Activation("relu")(a)
  a = Dense(actNum)(a)
  a = Activation("tanh")(a)   # Bipedal walker applies torque in (-1, 1).
  actor = Model(inputs = [actorObsInput], outputs = a)

  # The critic reuses the same flattened observation tensor instead of
  # defining its own observation Input.
  c = Dense(600, use_bias = False)(flattened_obs)
  c = BatchNormalization()(c)
  c = Activation("relu")(c)
  c = Concatenate()([c, criticActInput])
  c = Dense(300, use_bias = False)(c)
  c = BatchNormalization()(c)
  c = Activation("relu")(c)
  c = Dense(1)(c)
  c = Activation("linear")(c)
  critic = Model(inputs = [criticActInput, actorObsInput], outputs = c)

  return (criticActInput, actor, critic)
