How to implement early stopping in Neural Machine Translation with attention or Transformers?

I am trying to add early stopping to a machine translation model that uses Seq2Seq with attention. I usually build my models step by step with the Keras API, something like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping

for activation in activations:
  for layer1 in layers1:
    for optimizer in optimizers:
      # define model
      model_vanilla_lstm = Sequential()
      model_vanilla_lstm.add(LSTM(layer1, activation=activation, input_shape=(n_step, n_features)))
      model_vanilla_lstm.add(Dense(1))

      # compile model
      model_vanilla_lstm.compile(optimizer=optimizer, loss='mse')

      # early stopping: monitor takes the metric name as a string
      earlyStop = EarlyStopping(monitor='val_loss', mode='min', patience=5)

      # fit model with the callback
      history = model_vanilla_lstm.fit(X, y, epochs=epoch, validation_data=(X_test, dataset_test['Close']), verbose=1, callbacks=[earlyStop])

      # summary of the model
      print(model_vanilla_lstm.summary())

In that setup I know exactly where the early stopping goes, because EarlyStopping is simply a callback passed to model.fit. Right now, however, I am following TensorFlow's NMT with attention tutorial, which trains with a custom loop (tf.GradientTape) rather than model.fit: https://colab.research.google.com/github/tensorflow/tensorflow/blob/r1.9/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb

Since there is no fit() call and no callbacks argument in that loop, it is not clear to me where early stopping should be implemented, and I would appreciate any suggestions.
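My current idea is to track the validation loss manually at the end of each epoch and stop once it has not improved for a few epochs. Below is a rough sketch of what I mean; it reuses the tutorial's encoder, decoder, optimizer, dataset and EPOCHS, and it assumes a train_step function wrapping the tutorial's gradient-tape update plus a compute_val_loss helper and val_dataset, which I would still have to write. I am not sure this is the right way to do it.

import tensorflow as tf

patience = 5                  # epochs to wait for an improvement in validation loss
best_val_loss = float('inf')
wait = 0

# checkpoint the encoder/decoder so the best weights can be restored later
checkpoint_dir = './training_checkpoints'
checkpoint = tf.train.Checkpoint(optimizer=optimizer, encoder=encoder, decoder=decoder)

for epoch in range(EPOCHS):
    enc_hidden = encoder.initialize_hidden_state()
    total_loss = 0
    num_batches = 0

    for inp, targ in dataset:
        total_loss += train_step(inp, targ, enc_hidden)   # per-batch update from the tutorial
        num_batches += 1

    val_loss = compute_val_loss(val_dataset)              # placeholder helper I still need to write
    print(f'Epoch {epoch + 1}: train loss {total_loss / num_batches:.4f}, val loss {val_loss:.4f}')

    if val_loss < best_val_loss:
        best_val_loss = val_loss
        wait = 0
        checkpoint.save(file_prefix=checkpoint_dir + '/ckpt')   # keep the best weights so far
    else:
        wait += 1
        if wait >= patience:
            print(f'Stopping early at epoch {epoch + 1}')
            # restore the weights from the best epoch before stopping
            checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
            break

Is manually saving and restoring the best checkpoint like this a reasonable substitute for restore_best_weights=True in the EarlyStopping callback, or is there a cleaner way to do it with a custom training loop?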

Topic early-stopping attention-mechanism sequence-to-sequence tensorflow machine-translation

Category Data Science
