How to improve a CNN without changing the architecture?
I'm currently using a convolutional autoencoder built on the VGG-16 architecture, designed by someone else. I want to replicate their results on their dataset first, but I'm finding that:
- Validation loss diverges from training loss fairly early on (by around 10 epochs it already looks like it's overfitting).
- Even at its best, the validation loss is nowhere near as low as the training loss.
- Overall, the accuracy is still worse than reported in their paper.
I'm new to machine learning and want to know which hyperparameters I should try changing, or what else I can tinker with, without modifying the architecture.
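For context, one knob I've read about that needs no architectural change is early stopping on the validation loss. A minimal sketch of the logic in plain Python (a hypothetical helper I wrote to illustrate, not the paper's code; `patience` and `min_delta` are assumed names):

```python
class EarlyStopping:
    """Stop training once validation loss stops improving (illustrative helper)."""

    def __init__(self, patience=5, min_delta=1e-4):
        self.patience = patience    # epochs to wait after the last improvement
        self.min_delta = min_delta  # minimum decrease that counts as improvement
        self.best = float("inf")    # best validation loss seen so far
        self.bad_epochs = 0         # consecutive epochs without improvement

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Simulated validation losses that improve, then plateau and rise (made-up numbers).
stopper = EarlyStopping(patience=3)
for epoch, loss in enumerate([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.65]):
    if stopper.step(loss):
        break  # stops at epoch 5, after 3 epochs without improvement
```

Is something like this, together with things like learning rate, weight decay, or data augmentation, the right kind of lever to reach for?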
Topic finetuning hyperparameter-tuning cnn training neural-network
Category Data Science