Training Loss increases, but Validation Loss decreases
I am fine-tuning a T5 transformer model on a sequence-to-sequence task. My program outputs the training and validation loss every 500 optimization steps.
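For reference, a stripped-down sketch of my setup looks roughly like this (using the Hugging Face Seq2SeqTrainer; the checkpoint name, hyperparameters, and the train_dataset / val_dataset variables are simplified placeholders for my actual, already-tokenized data, not my exact values):

```python
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

# Placeholder checkpoint; I am fine-tuning a T5 variant.
checkpoint = "t5-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Pads inputs and labels per batch for the seq2seq objective.
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-finetuned",
    evaluation_strategy="steps",  # run validation every eval_steps
    eval_steps=500,               # validation loss every 500 optimization steps
    logging_steps=500,            # training loss every 500 optimization steps
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=3e-4,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # ~85,000 samples (placeholder variable)
    eval_dataset=val_dataset,     # ~10,000 samples (placeholder variable)
    data_collator=data_collator,
    tokenizer=tokenizer,
)

trainer.train()
```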
However, when I first started training the model, the training loss steeply increased while my validation loss decreased (my training dataset has about 85,000 samples and my validation dataset has about 10,000 samples). Does anyone know why this might be happening? Is this a sign that my model is not learning properly?
Also, does anyone know why my training loss is so much higher than my validation loss, and why its curve looks completely different?
Tags: loss, huggingface, deep-learning, nlp, machine-learning