Do high accuracy metrics on a small (but equally sampled) dataset mean a good model?

Question


Sangathamilan Ravichandran

June 3, 2022, 16:00

I have been training my CNN with 200 images per class for a binary classification problem. On my test data (25 images per class) I am getting good accuracy, precision, and recall values. Does that mean my model is actually good?

Topic cnn image-classification cross-validation neural-network

Category Data Science

Lana · Accepted Answer · August 9, 2019, 20:09

You could read some papers on the problems of small datasets, such as https://arxiv.org/pdf/1611.03199.pdf, which states:

Recent work has demonstrated that standard machine-learning techniques such as random forests and simple deep-networks are capable of learning meaningful chemical information from only a few hundred compounds

Although this example isn't about images (I recommend looking at medical imaging problems that use CNNs), such challenges are widespread in fields where it is difficult to get a sufficient amount of labeled data, medicine for instance. The idea is that it is possible to build an appropriate model from a small dataset and still judge the quality of its performance. And if the data your algorithm will see in real use has the same representation as your training and test data, it is quite possible that your model is good enough.

Samuel Tap · Accepted Answer · August 9, 2019, 10:12

You can run cross-validation to make sure your test set is not just unusually easy to classify.
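As a rough illustration of that check, here is a minimal sketch of stratified k-fold splitting in plain Python. It assumes the 200 training and 25 test images per class from the question are pooled into 225 per class. The callback `train_and_score` is hypothetical: it stands in for whatever code trains the CNN on the training indices and returns a metric on the held-out indices. In practice a library splitter such as scikit-learn's `StratifiedKFold` does the same job.

```python
import random
from statistics import mean, stdev

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that keep the class balance in every fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal this class's indices round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx

def cross_validate(labels, train_and_score, k=5, seed=0):
    """Return mean and standard deviation of the per-fold scores.

    train_and_score(train_idx, test_idx) is a hypothetical callback that
    trains a fresh model and returns one score (e.g. accuracy) for the fold.
    """
    scores = [train_and_score(tr, te) for tr, te in stratified_kfold(labels, k, seed)]
    return mean(scores), stdev(scores)

# Assumed setup: 225 images per class, two classes.
labels = [0] * 225 + [1] * 225
splits = list(stratified_kfold(labels, k=5))
```

If the scores vary widely from fold to fold, the single 50-image test result probably says more about which images landed in the test set than about the model.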

If possible, you could also enlarge your training set through augmentation, for example by rotating, shifting, or flipping the images. If you are using Keras, you can read this blog.
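To make the idea concrete, here is a framework-free sketch of those three transformations using NumPy rather than Keras. The function name `augment`, the `max_shift` parameter, and the 64 x 64 x 3 image size are assumptions made for illustration. The sketch also assumes square images (so a 90-degree rotation keeps the shape) and uses a wrap-around shift, which is a simplification of the padded shifts that augmentation libraries usually apply.

```python
import numpy as np

def augment(image, rng, max_shift=4):
    """Return a randomly flipped, rotated, and shifted copy of a square H x W x C image."""
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)  # horizontal flip
    # Rotate by a random multiple of 90 degrees in the height/width plane.
    out = np.rot90(out, k=int(rng.integers(0, 4)))
    # Shift by up to max_shift pixels in each direction; pixels wrap around the edges.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(out, shift=(int(dy), int(dx)), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # stand-in for one training image
batch = np.stack([augment(image, rng) for _ in range(8)])
```

Augmentation should be applied to the training images only, so that the test set still reflects unmodified data.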
