Is it advisable to use a model which is underfit but gives very high accuracy?
I am training a model for a single-label image-classification task in computer vision. For training, I oversample all the classes and use MixUp augmentation, along with rotation and dihedral transformations, to augment the data. A minimal sketch of such a setup is shown below.
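For reference, here is roughly how this kind of setup looks in fastai v2. The dataset path, architecture, and hyperparameters are placeholders, and the class-oversampling step is assumed to be handled separately (it is omitted here); this is a sketch of the described setup, not my exact code.

```python
from fastai.vision.all import *

# Hypothetical dataset layout: data/images/<class_name>/<image>.jpg
path = Path('data/images')

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),   # single-label classification
    get_items=get_image_files,
    get_y=parent_label,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    item_tfms=Resize(224),
    # flip_vert=True makes aug_transforms use the Dihedral transform;
    # max_rotate adds random rotations on top of it
    batch_tfms=aug_transforms(flip_vert=True, max_rotate=30.0),
)
dls = dblock.dataloaders(path, bs=64)

# MixUp is applied as a training callback
learn = vision_learner(dls, resnet34, metrics=accuracy, cbs=MixUp())
learn.fit_one_cycle(20)
```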
After being trained for 20 epochs, the model achieves a validation (cross-entropy) loss of about $0.08$ and $98\%$ accuracy in predicting the labels of the images in the validation set.
The problem is that the model underfits: while the validation accuracy is extremely high and the validation loss is extremely low, the training loss is still quite high ($\approx 0.76$).
Should I use this model in production? Although the model underfits the training data, it achieves very high accuracy in predicting the labels in the validation set, and the validation loss is also extremely low.
Should I work with an underfit model in production?
Here's what the last two epochs look like:
| epoch | train_loss | valid_loss | accuracy |
|---|---|---|---|
| 18 | 0.764258 | 0.150605 | 0.963151 |
| 19 | 0.763108 | 0.152006 | 0.961245 |
You might ask why I am adding augmentations at all if the model is underfitting: if I don't add them, the model starts to overfit, and the validation loss and accuracy get worse.
I am not asking how to make the underfitting go away; I can do that by training for, say, 20 more epochs. I am asking whether it is okay to use such a model in production.
Topic fastai loss-function image-classification computer-vision classification
Category Data Science