Sequence-to-Sequence Transformer for Neural Machine Translation

I am following the sequence-to-sequence Transformer tutorial in the Keras documentation here. I am new to deep learning. I am training on a different dataset, the Menyo-20k dataset, with about 10,071 total pairs: 7,051 training pairs, 1,510 validation pairs, and 1,510 test pairs. The highest validation and test accuracy I have gotten is approximately 0.26. I have tried the things listed below:

  1. Using the following optimizers: SGD, Adam, and RMSprop
  2. Trying different learning rates
  3. Trying dropout rates of 0.4 and 0.1
  4. Trying different embedding dimensions and feed-forward network dimensions
  5. Using early stopping with patience = 3; the model never trains past the 13th epoch (see the sketch after this list)

I also ran the model as-is, without changing any parameters, and the validation accuracy never reached 0.3. I then varied the parameters above to figure out what I am doing wrong, but I can't. What am I doing wrong? Thank you in advance for your guidance.
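For context, a minimal sketch of the training setup described above, assuming the `transformer` model and the `train_ds`/`val_ds` datasets built in the Keras tutorial are in scope; the specific optimizer, learning rate, and epoch count are only examples of values I tried:

```python
import keras

# Compile with one of the optimizers tried (SGD, Adam, RMSprop);
# learning_rate=1e-4 is just one of the values experimented with.
transformer.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Early stopping with patience = 3, as described above; training
# halts around epoch 13 because validation accuracy plateaus.
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_accuracy",
    patience=3,
    restore_best_weights=True,
)

transformer.fit(
    train_ds,
    epochs=30,
    validation_data=val_ds,
    callbacks=[early_stopping],
)
```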

Topic: transformer, keras, deep-learning, language-model, nlp



A few things can be tried:

  1. Increase the number of training iterations (epochs) substantially
  2. Add more Transformer (encoder/decoder) layers, as sketched after this list
  3. Use the BLEU score metric instead of accuracy, since token-level accuracy is not very meaningful for machine translation (see the evaluation sketch below)
  4. You may need a larger dataset
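On point 2, a minimal sketch of stacking several encoder blocks, assuming the `PositionalEmbedding` and `TransformerEncoder` classes defined in the Keras tutorial are in scope (the tutorial uses a single block; `num_layers` is an assumed hyperparameter). The decoder can be deepened the same way:

```python
# Stack several encoder blocks instead of the tutorial's single one.
num_layers = 4  # assumption: try somewhere in the range of 2-6 blocks

x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
for _ in range(num_layers):
    x = TransformerEncoder(embed_dim, latent_dim, num_heads)(x)
encoder_outputs = x
```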

It is hard to say in advance what will finally work; you will have to experiment.
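On point 3, a minimal sketch of corpus-level BLEU evaluation with NLTK, assuming the `decode_sequence` function and `test_pairs` list from the Keras tutorial (the `[start]`/`[end]` tokens the tutorial adds are stripped before scoring):

```python
from nltk.translate.bleu_score import corpus_bleu

references, hypotheses = [], []
for source, target in test_pairs:  # (source sentence, reference translation)
    # decode_sequence is assumed to return the translated string.
    prediction = decode_sequence(source)
    prediction = prediction.replace("[start]", "").replace("[end]", "").strip()
    target = target.replace("[start]", "").replace("[end]", "").strip()
    references.append([target.split()])  # each hypothesis may have several references
    hypotheses.append(prediction.split())

print("Corpus BLEU:", corpus_bleu(references, hypotheses))
```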
