Variational AutoEncoder giving negative loss
I'm learning about variational autoencoders and I've implemented a simple example in Keras; the model summary is below. I copied the loss function from one of Francois Chollet's blog posts, but the loss is hugely negative and keeps growing more negative every epoch. What am I missing here?
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 224)] 0
__________________________________________________________________________________________________
encoding_flatten (Flatten) (None, 224) 0 input_1[0][0]
__________________________________________________________________________________________________
encoding_layer_2 (Dense) (None, 256) 57600 encoding_flatten[0][0]
__________________________________________________________________________________________________
encoding_layer_3 (Dense) (None, 128) 32896 encoding_layer_2[0][0]
__________________________________________________________________________________________________
encoding_layer_4 (Dense) (None, 64) 8256 encoding_layer_3[0][0]
__________________________________________________________________________________________________
encoding_layer_5 (Dense) (None, 32) 2080 encoding_layer_4[0][0]
__________________________________________________________________________________________________
encoding_layer_6 (Dense) (None, 16) 528 encoding_layer_5[0][0]
__________________________________________________________________________________________________
encoder_mean (Dense) (None, 16) 272 encoding_layer_6[0][0]
__________________________________________________________________________________________________
encoder_sigma (Dense) (None, 16) 272 encoding_layer_6[0][0]
__________________________________________________________________________________________________
lambda (Lambda) (None, 16) 0 encoder_mean[0][0]
encoder_sigma[0][0]
__________________________________________________________________________________________________
decoder_layer_1 (Dense) (None, 16) 272 lambda[0][0]
__________________________________________________________________________________________________
decoder_layer_2 (Dense) (None, 32) 544 decoder_layer_1[0][0]
__________________________________________________________________________________________________
decoder_layer_3 (Dense) (None, 64) 2112 decoder_layer_2[0][0]
__________________________________________________________________________________________________
decoder_layer_4 (Dense) (None, 128) 8320 decoder_layer_3[0][0]
__________________________________________________________________________________________________
decoder_layer_5 (Dense) (None, 256) 33024 decoder_layer_4[0][0]
__________________________________________________________________________________________________
decoder_mean (Dense) (None, 224) 57568 decoder_layer_5[0][0]
==================================================================================================
Total params: 203,744
Trainable params: 203,744
Non-trainable params: 0
__________________________________________________________________________________________________
Train on 3974 samples, validate on 994 samples
Epoch 1/10
3974/3974 [==============================] - 3s 677us/sample - loss: -28.1519 - val_loss: -33.5864
Epoch 2/10
3974/3974 [==============================] - 1s 346us/sample - loss: -137258.8175 - val_loss: -3683802.1489
Epoch 3/10
3974/3974 [==============================] - 1s 344us/sample - loss: -14543022903.6056 - val_loss: -107811177469.9396
Epoch 4/10
3974/3974 [==============================] - 1s 363us/sample - loss: -3011718676570.7012 - val_loss: -13131454938476.6816
Epoch 5/10
3974/3974 [==============================] - 1s 350us/sample - loss: -101442605943572.4844 - val_loss: -322685056398605.9375
Epoch 6/10
3974/3974 [==============================] - 1s 344us/sample - loss: -1417424385529640.5000 - val_loss: -3687688508198145.5000
Epoch 7/10
3974/3974 [==============================] - 1s 358us/sample - loss: -11794297368126698.0000 - val_loss: -26632844827070784.0000
Epoch 8/10
3974/3974 [==============================] - 1s 339us/sample - loss: -69508229806130784.0000 - val_loss: -141312065640756336.0000
Epoch 9/10
3974/3974 [==============================] - 1s 345us/sample - loss: -319838384005810432.0000 - val_loss: -599553350073361152.0000
Epoch 10/10
3974/3974 [==============================] - 1s 342us/sample - loss: -1221653451351326464.0000 - val_loss: -2147128507956525312.0000
Latent sampling function:
def sampling(self, args):
    """Reparameterization trick by sampling from an isotropic unit Gaussian.

    # Arguments
        args (tensor): mean and log of variance of Q(z|X)

    # Returns
        z (tensor): sampled latent vector
    """
    z_mean, z_log_var = args
    batch = tf.shape(z_mean)[0]   # batch size
    dim = tf.shape(z_mean)[-1]    # latent dimension
    # by default, tf.random.normal has mean=0 and std=1.0
    epsilon = tf.random.normal(shape=(batch, dim))
    return z_mean + (z_log_var * epsilon)
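For reference, this is how I understand the corresponding sampling step in the Keras VAE example I based this on (just a sketch; it assumes the second encoder output is the log-variance, and the function name is my own):

import tensorflow as tf

def sampling_reference(args):
    """Reparameterization trick: z = mean + std * eps, with std = exp(0.5 * log_var)."""
    z_mean, z_log_var = args
    batch = tf.shape(z_mean)[0]
    dim = tf.shape(z_mean)[-1]
    epsilon = tf.random.normal(shape=(batch, dim))  # eps ~ N(0, 1)
    # exp(0.5 * log_var) turns the log-variance into a standard deviation
    return z_mean + tf.exp(0.5 * z_log_var) * epsilon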
Loss function:
def vae_loss(self, input, x_decoded_mean):
    xent_loss = tf.reduce_mean(tf.keras.backend.binary_crossentropy(input, x_decoded_mean))
    kl_loss = -0.5 * tf.reduce_sum(tf.square(self.encoded_mean) + tf.square(self.encoded_sigma) - tf.math.log(tf.square(self.encoded_sigma)) - 1, -1)
    return xent_loss + kl_loss
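For comparison, the closed-form KL term these expressions are meant to compute, for a diagonal Gaussian $q(z|x)=\mathcal{N}(\mu,\sigma^2)$ against a standard normal prior, is

$$D_{KL} = \frac{1}{2}\sum_i \left(\mu_i^2 + \sigma_i^2 - \log\sigma_i^2 - 1\right),$$

which, written in terms of a log-variance output $t = \log\sigma^2$, becomes $\frac{1}{2}\sum_i\left(\mu_i^2 + e^{t_i} - t_i - 1\right) = -\frac{1}{2}\sum_i\left(1 + t_i - \mu_i^2 - e^{t_i}\right)$.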
Another vae_loss implementation:
def vae_loss(self, input, x_decoded_mean):
    gen_loss = tf.reduce_sum(tf.keras.backend.binary_crossentropy(input, x_decoded_mean))
    # gen_loss = tf.losses.mean_squared_error(input, x_decoded_mean)
    kl_loss = -0.5 * tf.reduce_sum(1 + self.encoded_sigma - tf.square(self.encoded_mean) - tf.exp(self.encoded_sigma), -1)
    return tf.reduce_mean(gen_loss + kl_loss)
And a kl_loss variant where encoded_sigma is treated as log_sigma:
kl_loss = 0.5 * tf.reduce_sum(tf.square(self.encoded_mean) + tf.square(tf.exp(self.encoded_sigma)) - self.encoded_sigma - 1, axis=-1)
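For reference, here is a sketch of how I understand the full loss is assembled in the Keras VAE example (the names original_dim, z_mean and z_log_var are my placeholders, with z_log_var taken to be the log-variance):

import tensorflow as tf

def vae_loss_reference(inputs, x_decoded_mean, z_mean, z_log_var, original_dim=224):
    # Reconstruction term: element-wise binary cross-entropy averaged over the
    # features, then scaled back up to a per-sample sum over the 224 inputs.
    xent_loss = original_dim * tf.reduce_mean(
        tf.keras.backend.binary_crossentropy(inputs, x_decoded_mean), axis=-1)
    # KL term, with z_log_var interpreted as log(sigma^2).
    kl_loss = -0.5 * tf.reduce_sum(
        1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
    # Average the per-sample losses over the batch.
    return tf.reduce_mean(xent_loss + kl_loss)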
Topic: keras, tensorflow, autoencoder, loss-function, python
Category: Data Science