Sequence-to-Sequence Transformer for Neural Machine Translation
I am using the tutorial in the Keras documentation here. I am new to deep learning. I am working on a different dataset, the Menyo-20k dataset, with about 10,071 total pairs: 7,051 training pairs, 1,510 validation pairs, and 1,510 test pairs. The highest validation accuracy and test accuracy I have gotten is approximately 0.26. I have tried the following:
- Using different optimizers: SGD, Adam, and RMSprop
- Using different learning rates
- Using dropout rates of 0.4 and 0.1
- Using different embedding dimensions and feed-forward network dimensions
- Using early stopping with patience = 3; the model never trains past the 13th epoch
I also ran the model as-is, without changing any parameters, and the validation accuracy never reached 0.3. I then changed the various parameters to try to work out what I am doing wrong, but I can't figure it out. What am I doing wrong? Thank you in advance for your guidance.
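To clarify what I mean by early stopping with patience = 3: as I understand it, Keras's `EarlyStopping` stops training once the monitored metric has not improved for `patience` consecutive epochs. A plain-Python sketch of that stopping rule (the accuracy values below are made up for illustration, not my actual training history):

```python
def early_stop_epoch(val_accuracies, patience=3):
    """Return the 1-based epoch at which training stops: the first
    epoch after `patience` consecutive epochs with no improvement."""
    best = float("-inf")
    wait = 0  # epochs since the last improvement
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_accuracies)  # never triggered; ran all epochs

# Hypothetical validation accuracies: the best value is at epoch 10,
# then the metric plateaus, so training stops at epoch 13.
history = [0.10, 0.15, 0.18, 0.20, 0.21, 0.22, 0.24, 0.25, 0.255, 0.26,
           0.258, 0.259, 0.257]
print(early_stop_epoch(history, patience=3))  # -> 13
```

This matches what I see: the best validation accuracy (about 0.26) is reached a few epochs before the end, and training always halts around epoch 13.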
Topic transformer keras deep-learning language-model nlp
Category Data Science