How to implement early stopping in Neural Machine Translation with attention or Transformers?
I am trying to add early stopping to a model that performs machine translation using Seq2Seq with attention. I am used to writing my own models step by step, something like this:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

for activation in activations:
    for layer1 in layers1:
        for optimizer in optimizers:
            # define model
            model_vanilla_lstm = Sequential()
            model_vanilla_lstm.add(LSTM(layer1, activation=activation, input_shape=(n_step, n_features)))
            model_vanilla_lstm.add(Dense(1))
            # compile model
            model_vanilla_lstm.compile(optimizer=optimizer, loss='mse')
            # early stopping on validation loss
            earlyStop = EarlyStopping(monitor='val_loss', mode='min', patience=5)
            # fit model
            history = model_vanilla_lstm.fit(X, y, epochs=epoch, validation_data=(X_test, dataset_test['Close']), verbose=1, callbacks=[earlyStop])
            # summary of the model
            print(model_vanilla_lstm.summary())
So here I know exactly where to put the early stopping. Currently, however, I am following the TensorFlow NMT with attention tutorial: https://colab.research.google.com/github/tensorflow/tensorflow/blob/r1.9/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb
That tutorial trains the model with a custom training loop (GradientTape and a per-batch train step) rather than model.fit(), so there is no callbacks argument to attach EarlyStopping to, and it is not clear to me how to implement early stopping there. I would appreciate suggestions.
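My rough idea so far is to track the validation loss myself inside the epoch loop and stop when it has not improved for a few epochs. Below is a minimal sketch of that idea; train_one_epoch and evaluate_val_loss are hypothetical helpers (not from the tutorial) that would wrap the tutorial's per-batch training step and a forward-pass-only loss computation on a held-out set, and checkpoint / checkpoint_prefix are meant to be the tutorial's tf.train.Checkpoint objects. Is this the right approach, or is there a cleaner way?

# sketch of manual early stopping for a custom training loop
# (train_one_epoch and evaluate_val_loss are hypothetical helpers)
best_val_loss = float('inf')
patience = 5
wait = 0

for epoch in range(EPOCHS):
    train_loss = train_one_epoch(dataset)        # hypothetical: runs the tutorial's train_step over all batches
    val_loss = evaluate_val_loss(val_dataset)    # hypothetical: forward pass only, no gradient updates

    print('Epoch {}: train loss {:.4f}, val loss {:.4f}'.format(epoch + 1, train_loss, val_loss))

    if val_loss < best_val_loss:
        best_val_loss = val_loss
        wait = 0
        # keep the best weights so far (checkpoint objects as defined in the tutorial)
        checkpoint.save(file_prefix=checkpoint_prefix)
    else:
        wait += 1
        if wait >= patience:
            print('Early stopping triggered at epoch {}'.format(epoch + 1))
            break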