Is it advisable to use a model which is underfit but gives very high accuracy?
I am training a model for a single-label image-classification task in computer vision. For training, I oversample all the classes and use MixUp augmentation, along with rotation and dihedral transformations, to augment the data. A minimal sketch of such a setup is shown below.
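For reference, here is roughly how this kind of setup looks in fastai v2. The dataset path, architecture, and hyperparameters are placeholders, and the class-oversampling step is assumed to be handled separately (it is omitted here); this is a sketch of the described setup, not my exact code.

```python
from fastai.vision.all import *

# Hypothetical dataset layout: data/images/<class_name>/<image>.jpg
path = Path('data/images')

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),   # single-label classification
    get_items=get_image_files,
    get_y=parent_label,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    item_tfms=Resize(224),
    # flip_vert=True makes aug_transforms use the Dihedral transform;
    # max_rotate adds random rotations on top of it
    batch_tfms=aug_transforms(flip_vert=True, max_rotate=30.0),
)
dls = dblock.dataloaders(path, bs=64)

# MixUp is applied as a training callback
learn = vision_learner(dls, resnet34, metrics=accuracy, cbs=MixUp())
learn.fit_one_cycle(20)
```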
After being trained for 20 epochs, the model achieves a validation (cross-entropy) loss of about $0.08$ and $98\%$ accuracy in predicting the labels of the images in the validation set.
The problem is that the model underfits: while the validation accuracy is extremely high and the validation loss is extremely low, the training loss is still quite high ($\approx 0.76$).
Should I use this model in production? Although the model underfits the training data, it achieves very high accuracy in predicting the labels in the validation set, and the validation loss is also extremely low.
Should I work with an underfit model in production?
Here's what the last two epochs look like:
| epoch | train_loss | valid_loss | accuracy |
|---|---|---|---|
| 18 | 0.764258 | 0.150605 | 0.963151 |
| 19 | 0.763108 | 0.152006 | 0.961245 |
You might ask why I am adding augmentations at all if the model is underfitting: if I don't add them, the model starts to overfit, and the validation loss and accuracy get worse.
I am not asking how to make the underfitting go away; I can do that by training for, say, 20 more epochs. I am asking whether it is okay to use such a model in production.
Topic fastai loss-function image-classification computer-vision classification
Category Data Science