Model stalls and learning slows down after the first few epochs
I have an issue with a model I'm working on. I can't show the architecture because it's confidential research.
The model combines graph convolutional networks and Transformer encoders to process and classify both sequences and graphs, with dropout, LayerNorm, and skip connections throughout.
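To give a rough idea of the structure (this is a simplified stand-in, not the actual model, and the dimensions and dropout rate are placeholders), each block looks something like this:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv  # assumes PyTorch Geometric is installed


class HybridBlock(nn.Module):
    """Simplified stand-in: a GCN branch for graph inputs and a Transformer
    encoder layer for sequence inputs, each with dropout, LayerNorm, and a
    skip connection."""

    def __init__(self, dim: int, heads: int = 4, p_drop: float = 0.1):
        super().__init__()
        self.gcn = GCNConv(dim, dim)
        self.enc = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dropout=p_drop, batch_first=True
        )
        self.norm = nn.LayerNorm(dim)
        self.drop = nn.Dropout(p_drop)

    def forward_graph(self, x, edge_index):
        # skip connection + LayerNorm around the graph convolution
        return self.norm(x + self.drop(self.gcn(x, edge_index).relu()))

    def forward_seq(self, x):
        # the encoder layer already applies its own residuals and LayerNorm
        return self.enc(x)
```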
Optimizer: RAdam with a ReduceLROnPlateau scheduler.
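For reference, the training setup is roughly this (a minimal sketch: the learning rate, patience, and factor are placeholders, and `model`, `train_one_epoch`, `evaluate`, and `num_epochs` are hypothetical stand-ins for my actual loop):

```python
import torch

# RAdam ships with recent PyTorch versions under torch.optim
optimizer = torch.optim.RAdam(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=3  # monitoring val F1
)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)  # hypothetical helper
    val_f1 = evaluate(model)           # hypothetical helper
    scheduler.step(val_f1)             # mode="max" since higher F1 is better
```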
The issue is that the model learns quickly and naturally for the first 10 to 15 epochs, but then it stalls: validation accuracy and F1 barely improve after that (only by small margins, around 0.001), and the model eventually overfits.
What could be the issue? I think it's one of these:
- The data is too simple for the architecture (or equivalently, the architecture is too complex for the data).
- I need more data (results improve with data augmentation, but training eventually stalls and overfits again).
Topic: graph-neural-network, transformer
Category: Data Science