Training loss = 0, training accuracy = 1, validation and test accuracy around 85%

I have created several different CNNs for image classification. The dataset is this one: https://www.kaggle.com/crowww/a-large-scale-fish-dataset There are 9 classes, and each class contains 1000 images of fish. I split each class into training (800 images), validation (100 images), and test (100 images) sets.
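
For reference, here is a minimal sketch of such a stratified 800/100/100-per-class split using scikit-learn. The folder layout, file extension, and variable names are illustrative assumptions, not the actual code used:

```python
# Illustrative split sketch: the dataset root and folder layout are assumptions.
from pathlib import Path
from sklearn.model_selection import train_test_split

root = Path("Fish_Dataset")                      # hypothetical dataset root
paths = sorted(root.glob("*/*.png"))             # assumes one sub-folder per class
labels = [p.parent.name for p in paths]          # class name = folder name

# Hold out 200 images per class (stratified), then split that hold-out
# in half to get 100 validation and 100 test images per class.
train_paths, rest_paths, train_labels, rest_labels = train_test_split(
    paths, labels, test_size=200 * 9, stratify=labels, random_state=0)
val_paths, test_paths, val_labels, test_labels = train_test_split(
    rest_paths, rest_labels, test_size=0.5, stratify=rest_labels, random_state=0)
```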

I created different CNNs with the following layers:

1) 1 convolutional layer (conv, ReLU, batch norm) + 2 fully connected layers + output

2) 2 convolutional layers (conv, ReLU, batch norm, max pooling) + 2 fully connected layers + output (see the sketch after this list)

3) 4 convolutional layers (conv, ReLU, batch norm, max pooling) + 2 fully connected layers + output
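
For context, a minimal PyTorch sketch of what configuration 2) describes (the framework, channel widths, and the 128x128 input size are illustrative assumptions, not the actual implementation):

```python
import torch.nn as nn

# Sketch of architecture 2): two conv blocks (conv -> ReLU -> batch norm -> max pool),
# followed by two fully connected layers and a 9-way output layer.
class FishCNN2(nn.Module):
    def __init__(self, num_classes=9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.MaxPool2d(2),                      # 128 -> 64 (assumed input size)
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.MaxPool2d(2),                      # 64 -> 32
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 32 * 32, 256),         # fully connected layer 1
            nn.ReLU(),
            nn.Linear(256, 64),                   # fully connected layer 2
            nn.ReLU(),
            nn.Linear(64, num_classes),           # output logits for 9 classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```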

These are the per-epoch outputs of each model (TrL/TrA = training loss/accuracy, VL/VA = validation loss/accuracy, TeL/TeA = test loss/accuracy):

1):

Epoch 1: TrL=0.6311, TrA=0.8006, VL=0.2925, VA=0.9077, TeL=0.4005, TeA=0.8708,
Epoch 2: TrL=0.0443, TrA=0.9939, VL=0.2072, VA=0.9235, TeL=0.3610, TeA=0.8896,
Epoch 3: TrL=0.0156, TrA=0.9993, VL=0.2128, VA=0.9161, TeL=0.3231, TeA=0.8896,
Epoch 4: TrL=0.0090, TrA=0.9996, VL=0.1883, VA=0.9287, TeL=0.2808, TeA=0.9000,
Epoch 5: TrL=0.0058, TrA=1.0000, VL=0.1663, VA=0.9350, TeL=0.2689, TeA=0.8990,
Epoch 6: TrL=0.0044, TrA=1.0000, VL=0.1628, VA=0.9339, TeL=0.2594, TeA=0.9073,
Epoch 7: TrL=0.0035, TrA=1.0000, VL=0.1675, VA=0.9350, TeL=0.2662, TeA=0.9062,
Epoch 8: TrL=0.0028, TrA=1.0000, VL=0.1608, VA=0.9350, TeL=0.2697, TeA=0.9031,
Epoch 9: TrL=0.0024, TrA=1.0000, VL=0.1645, VA=0.9350, TeL=0.2688, TeA=0.9052,
Epoch 10: TrL=0.0021, TrA=1.0000, VL=0.1556, VA=0.9339, TeL=0.2691, TeA=0.9073

2):

Epoch 1: TrL=0.4261, TrA=0.8676, VL=0.2128, VA=0.9469, TeL=0.3264, TeA=0.8917,
Epoch 2: TrL=0.0193, TrA=0.9982, VL=0.1412, VA=0.9719, TeL=0.2580, TeA=0.9094,
Epoch 3: TrL=0.0060, TrA=1.0000, VL=0.1064, VA=0.9719, TeL=0.2463, TeA=0.9104,
Epoch 4: TrL=0.0037, TrA=1.0000, VL=0.0811, VA=0.9802, TeL=0.2210, TeA=0.9177,
Epoch 5: TrL=0.0025, TrA=1.0000, VL=0.0794, VA=0.9792, TeL=0.2098, TeA=0.9250,
Epoch 6: TrL=0.0020, TrA=1.0000, VL=0.0768, VA=0.9792, TeL=0.2100, TeA=0.9260,
Epoch 7: TrL=0.0016, TrA=1.0000, VL=0.0730, VA=0.9802, TeL=0.2025, TeA=0.9292,
Epoch 8: TrL=0.0014, TrA=1.0000, VL=0.0720, VA=0.9792, TeL=0.2040, TeA=0.9292,
Epoch 9: TrL=0.0012, TrA=1.0000, VL=0.0731, VA=0.9792, TeL=0.1927, TeA=0.9313,
Epoch 10: TrL=0.0011, TrA=1.0000, VL=0.0696, VA=0.9792, TeL=0.2019, TeA=0.9292

3):

Epoch 1: TrL=0.7956, TrA=0.7686, VL=0.8161, VA=0.6991, TeL=0.9512, TeA=0.6719,
Epoch 2: TrL=0.0815, TrA=0.9894, VL=0.4978, VA=0.8254, TeL=0.6932, TeA=0.7417,
Epoch 3: TrL=0.0224, TrA=0.9996, VL=0.4205, VA=0.8494, TeL=0.6169, TeA=0.7854,
Epoch 4: TrL=0.0107, TrA=1.0000, VL=0.4251, VA=0.8463, TeL=0.6164, TeA=0.7760,
Epoch 5: TrL=0.0071, TrA=1.0000, VL=0.3946, VA=0.8536, TeL=0.5990, TeA=0.7865,
Epoch 6: TrL=0.0052, TrA=1.0000, VL=0.4075, VA=0.8515, TeL=0.5714, TeA=0.7906,
Epoch 7: TrL=0.0040, TrA=1.0000, VL=0.3773, VA=0.8609, TeL=0.5512, TeA=0.8010,
Epoch 8: TrL=0.0033, TrA=1.0000, VL=0.3643, VA=0.8661, TeL=0.5491, TeA=0.8052,
Epoch 9: TrL=0.0028, TrA=1.0000, VL=0.3768, VA=0.8598, TeL=0.5377, TeA=0.8042,
Epoch 10: TrL=0.0023, TrA=1.0000, VL=0.3760, VA=0.8640, TeL=0.5380, TeA=0.8031

As you can see, after 2-3 epochs the training accuracy reaches 100% and the training loss drops below 0.01, while validation accuracy stays around 90-95% and test accuracy around 90%. How should I interpret these results? Are my models overfitting, or are they good? For example, model 2) at its best has TrA = 1.0000, VA = 0.9802, and TeA = 0.9292. I think it is not overfitting in this case, because the results are similar.

Last question: I have understood that, among the epochs, I should choose as the best model the one with the highest validation accuracy. Why is that? Why can't I take the epoch with the highest test accuracy?



To answer your last question: think of the model as your brain taking a maths test. The training data is what you encountered during homework and exercises, and the validation/test data is what you encounter in the final examination (most likely unseen data). To consider yourself proficient in mathematics, you'd want your brain to perform at its best on this unseen data.

Following that reasoning, you'd want to keep the snapshot of the model from the epoch at which it performs best on unseen data.

Why validation, you ask? The answer lies in why there are separate validation and test sets at all. The validation set is used as part of training, for model selection, whereas the test set is used only at the very end, for the final assessment of the model before it is deployed on real-world data. Think of it as taking mock tests before the actual examination. You could merge the two and label everything as validation, but then you would have no held-out reference for how the model will perform on real-world data. A minimal sketch of this selection loop is given below.
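
In code, that selection loop might look like this minimal PyTorch-style sketch, assuming the model, optimizer, and the three data loaders already exist; `train_one_epoch` and `evaluate` are hypothetical helpers (the latter returns accuracy on a given loader):

```python
import copy

best_val_acc, best_state = 0.0, None
for epoch in range(num_epochs):
    train_one_epoch(model, train_loader, optimizer)     # fit on training data only
    val_acc = evaluate(model, val_loader)               # model selection uses validation only
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        best_state = copy.deepcopy(model.state_dict())  # keep the best snapshot

model.load_state_dict(best_state)
test_acc = evaluate(model, test_loader)                 # touch the test set once, at the very end
```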

Here's an excellent answer that talks more about the difference between validation and test data.
