Do smaller neural nets always converge faster than larger ones?
In your experience, do smaller CNN models (fewer params) converge faster than larger models?
I would have thought yes, since there are fewer parameters to optimize. However, I am training a custom MobileNetV2-based UNet (2.9k parameters) for image segmentation, and it is taking longer to converge than a model with a greater number of parameters (5k). Is this convergence behavior expected, or does it likely indicate a bug in my architecture?
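For context, the kind of comparison I have in mind can be sketched with a toy experiment (pure NumPy, a hypothetical setup unrelated to my actual UNet): train the same one-hidden-layer architecture at two different widths on identical data, with the same seed and learning rate, and compare the loss curves step by step.

```python
import numpy as np

def train_mlp(width, steps=300, lr=0.05, seed=0):
    """Train a tiny one-hidden-layer MLP on a fixed toy regression
    task and return its loss curve. Hypothetical setup for comparing
    convergence at different parameter counts."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(64, 2))
    y = np.sin(X[:, :1]) + 0.5 * X[:, 1:]      # fixed target function
    W1 = rng.normal(scale=0.5, size=(2, width))
    b1 = np.zeros(width)
    # scale output init by 1/sqrt(width) so wider nets start comparably
    W2 = rng.normal(scale=0.5 / np.sqrt(width), size=(width, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)               # forward pass
        pred = h @ W2 + b2
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # backprop for mean-squared-error loss
        g_pred = 2 * err / len(X)
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(0)
        g_h = g_pred @ W2.T
        g_pre = g_h * (1 - h ** 2)             # tanh derivative
        g_W1 = X.T @ g_pre
        g_b1 = g_pre.sum(0)
        W1 -= lr * g_W1; b1 -= lr * g_b1       # gradient-descent update
        W2 -= lr * g_W2; b2 -= lr * g_b2
    return losses

small = train_mlp(width=4)    # fewer parameters
large = train_mlp(width=64)   # more parameters, same data and steps
```

In runs like this, the wider model is not reliably slower to converge, which is what prompted my question: parameter count alone does not seem to determine convergence speed.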
Tags: training, convergence, convolutional-neural-network, neural-network
Category: Data Science