Regression sequence output loss function
I am fairly new to deep learning and I have the following task: based on an audio sequence of shape (200, 1024), I have to predict two sequences of shape (200, 1) of continuous values (e.g. 0.5687) that represent the emotion at each timestep (valence v and arousal a). So I've created the following LSTM:
from keras.layers import Input, LSTM, Dropout, Dense
from keras.models import Model
from keras.optimizers import adam_v2

inputs_audio = Input(shape=(200, 1024))
audio_net = LSTM(256, return_sequences=True)(inputs_audio)
audio_net = LSTM(256, return_sequences=True)(audio_net)
audio_net = LSTM(256, return_sequences=False)(audio_net)
audio_net = Dropout(0.3)(audio_net)
final_model = audio_net
target_names = ('v', 'a')
model_combined = [Dense(1, name=name)(final_model) for name in target_names]
model = Model(inputs_audio, model_combined)
opt = adam_v2.Adam(learning_rate=0.0006, decay=1e-6)
model.compile(loss=ccc_loss_3, optimizer=opt, metrics=[ccc_v])
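For concreteness, dummy inputs and targets with the shapes described above could look like this (a sketch; the batch size of 8 is an assumption, not something stated in the question):

```python
import numpy as np

batch = 8  # assumed batch size; the question does not specify one
X = np.random.rand(batch, 200, 1024).astype(np.float32)  # audio features
y_v = np.random.rand(batch, 200, 1).astype(np.float32)   # valence per timestep
y_a = np.random.rand(batch, 200, 1).astype(np.float32)   # arousal per timestep
print(X.shape, y_v.shape, y_a.shape)  # (8, 200, 1024) (8, 200, 1) (8, 200, 1)
```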
I am using the Concordance Correlation Coefficient (CCC) as the loss function; the code looks like this:
import numpy as np
import tensorflow as tf
from keras import backend as K

def ccc_loss_3(y_true, y_pred):
    y_true = tf.cast(y_true, dtype=np.float32)
    y_pred = tf.cast(y_pred, dtype=np.float32)
    return 1 - ccc_v(y_true, y_pred)

def ccc_v(y_true, y_pred):
    """Concordance Correlation Coefficient"""
    x = y_true[:, 0]
    y = y_pred[:, 0]
    mx = K.mean(x, axis=0)
    my = K.mean(y, axis=0)
    xm, ym = x - mx, y - my
    rho = K.sum(xm * ym) / (K.sqrt(K.sum(xm ** 2)) * K.sqrt(K.sum(ym ** 2)))
    x_s = K.std(x)
    y_s = K.std(y)
    ccc = 2 * rho * x_s * y_s / (x_s ** 2 + y_s ** 2 + (mx - my) ** 2)
    return ccc
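As a sanity check, the same CCC formula can be written in plain NumPy (a sketch; the `eps` stabilizer is an assumption I added, not part of the original code). Note that when either sequence has zero variance, the denominator of `rho` in the original version is exactly zero, which is one common way a CCC loss produces NaN:

```python
import numpy as np

def ccc_numpy(x, y, eps=1e-8):
    """Concordance Correlation Coefficient between two 1-D arrays.

    `eps` is an assumed small constant added to the denominators to
    avoid division by zero (not present in the original Keras code).
    """
    mx, my = x.mean(), y.mean()
    xm, ym = x - mx, y - my
    rho = np.sum(xm * ym) / (np.sqrt(np.sum(xm ** 2)) * np.sqrt(np.sum(ym ** 2)) + eps)
    x_s, y_s = x.std(), y.std()
    return 2 * rho * x_s * y_s / (x_s ** 2 + y_s ** 2 + (mx - my) ** 2 + eps)

rng = np.random.default_rng(0)
a = rng.normal(size=200)
print(round(ccc_numpy(a, a), 4))      # perfect agreement -> 1.0
print(round(1 - ccc_numpy(a, a), 4))  # loss = 1 - CCC -> 0.0
```

Without `eps`, calling this on a constant sequence (zero standard deviation) divides by zero, mirroring the NaN behavior described below.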
I'm pretty sure something is wrong in my implementation of this function, since the network is stuck with an infinite val_loss and a NaN ccc_loss for both valence and arousal. Can someone explain what needs to be modified to fix this?
Topic lstm keras loss-function sequence regression
Category Data Science