Model stalls and learning slows down after the first epochs

I have an issue with a model I'm working on. I can't show the exact architecture because it's confidential research, but it combines graph convolutional networks and Transformer encoders to process and classify both sequences and graphs, with dropout, LayerNorm, and skip connections throughout.
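
To give a rough idea without revealing anything, here is a generic sketch of that kind of hybrid. This is not my actual model; the layer sizes, names, and branch structure are placeholders, assuming PyTorch and torch_geometric:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool

class HybridClassifier(nn.Module):
    """Generic sketch: a GCN branch for graphs, a Transformer branch for sequences."""
    def __init__(self, node_dim, seq_dim, hidden=128, n_classes=2):
        super().__init__()
        # Graph branch: two GCN layers with a residual connection and LayerNorm.
        self.gcn1 = GCNConv(node_dim, hidden)
        self.gcn2 = GCNConv(hidden, hidden)
        self.norm_g = nn.LayerNorm(hidden)
        # Sequence branch: a standard Transformer encoder.
        self.proj = nn.Linear(seq_dim, hidden)
        enc_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                               dropout=0.1, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.dropout = nn.Dropout(0.1)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x, edge_index, batch, seq):
        # Graph side: GCN -> ReLU -> GCN, residual + LayerNorm, then pooling.
        h = torch.relu(self.gcn1(x, edge_index))
        h = self.norm_g(h + self.gcn2(h, edge_index))
        g = global_mean_pool(h, batch)            # one vector per graph
        # Sequence side: project, encode, mean-pool over time steps.
        s = self.encoder(self.proj(seq)).mean(dim=1)
        return self.head(self.dropout(torch.cat([g, s], dim=-1)))
```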

Optimizer: RAdam with a ReduceLROnPlateau scheduler
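
For reference, the training setup looks roughly like this. A minimal sketch: `model`, `train_one_epoch`, `evaluate`, `num_epochs`, and the hyperparameters are placeholders, not my real values:

```python
import torch

# Optimizer and scheduler as described above (PyTorch built-ins).
optimizer = torch.optim.RAdam(model.parameters(), lr=1e-3, weight_decay=1e-5)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)   # placeholder training loop
    val_loss = evaluate(model)          # placeholder validation pass
    scheduler.step(val_loss)            # scheduler reduces LR when val loss plateaus
```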

The issue is that the model learns quickly and progresses naturally over the first 10 to 15 epochs, but then it stalls: validation accuracy and F1 barely improve after that (only by tiny margins, around 0.001), and the model eventually overfits.

What could be the issue? I think it's one of these:

  • The data is too simple for the architecture, or the architecture is too complex for the data.
  • I need more data (results improve with data augmentation, but training still eventually stalls and overfits).

Topic: graph-neural-network, transformer

Category: Data Science
