Different number of images in classes

I am working on a deep learning CNN project. The dataset contains more than 500 classes, and the classes have different numbers of items (images). For example, some classes have 5 images, some have 10, some have 20, and some have more than 20 images.

Can I use this dataset to create the CNN model?

Should the number of the images in each class be the same number?

Note: I will use VGG to train the model.

Topic keras convolutional-neural-network tensorflow cloud-computing deep-learning

Category Data Science


Frankly, even 50 images will not be sufficient if you are going to create a CNN model from scratch. If you need more images for your model training, then go for data augmentation. It is a process of transforming an image by a small amount (a shift in height or width, a rotation, etc., or any combination of these). In this way, an image and its augmented copy will differ only slightly. You can find a relevant article here:

https://medium.com/nanonets/how-to-use-deep-learning-when-you-have-limited-data-part-2-data-augmentation-c26971dc8ced
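As a minimal sketch of the idea, here is what augmentation can look like using only NumPy (an assumption for illustration; in practice Keras provides `ImageDataGenerator` and TensorFlow provides `tf.image` for this):

```python
import numpy as np

def augment(image, rng):
    """Return a slightly transformed copy of an HxWxC image."""
    out = image
    if rng.random() < 0.5:          # random horizontal flip
        out = out[:, ::-1, :]
    shift = int(rng.integers(-2, 3))  # small random horizontal shift
    out = np.roll(out, shift, axis=1)
    return out

rng = np.random.default_rng(0)
img = np.arange(27, dtype=np.float32).reshape(3, 3, 3)  # toy 3x3 RGB image
augmented = [augment(img, rng) for _ in range(4)]
```

Each augmented copy keeps the original shape and content but is shifted or mirrored slightly, which is exactly what lets a small class look larger to the model.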

To answer whether there should be the same number of images in each class: the counts should be approximately equal. Class imbalance is a common problem in classification tasks, and there are several ways to deal with it, including simulating extra data for the smaller classes (augmentation).
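Another common way to deal with imbalance is to weight the loss per class inversely to its frequency. A hypothetical sketch (class names and counts are illustrative, matching the 5/10/20 example from the question):

```python
# Image counts per class (made-up names for illustration).
counts = {"class_a": 5, "class_b": 10, "class_c": 20}

total = sum(counts.values())
n_classes = len(counts)

# Weight each class inversely to its frequency, normalised so that a
# perfectly balanced dataset would give every class a weight of 1.0.
# A dict like this can be passed as class_weight= to Keras' model.fit().
class_weight = {c: total / (n_classes * n) for c, n in counts.items()}
```

The rarest class gets the largest weight, so each misclassified image from it costs the model more during training.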

I would suggest first setting aside a separate test set, then applying data augmentation to the remaining training set, and finally building the model.
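That ordering matters: the test set must be split off before augmentation, or augmented copies of the same image can leak into both sets. A sketch of the workflow with made-up filenames:

```python
import random

# One class's images (hypothetical filenames for illustration).
images = [f"img_{i:03d}.jpg" for i in range(50)]

# Shuffle reproducibly, then hold out ~20% as the test set FIRST.
random.Random(42).shuffle(images)
n_test = max(1, int(0.2 * len(images)))
test_set = images[:n_test]
train_set = images[n_test:]

# Augmentation would now be applied to train_set only, never test_set,
# so the test set stays free of near-duplicates of training images.
```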

EDIT

Using a pretrained convnet is also an option, as stated in a deep learning book:

A common and highly effective approach to deep learning on small image datasets is to use a pretrained network. A pretrained network is a saved network that was previously trained on a large dataset, typically on a large-scale image-classification task. If this original dataset is large enough and general enough, then the spatial hierarchy of features learned by the pretrained network can effectively act as a generic model of the visual world, and hence its features can prove useful for many different computer vision problems, even though these new problems may involve completely different classes than those of the original task. For instance, you might train a network on ImageNet (where classes are mostly animals and everyday objects) and then repurpose this trained network for something as remote as identifying furniture items in images.


You have very few images per class to train a CNN from scratch. You may be able to train your model via transfer learning from a pretrained model, but your dataset may still be too small.

See something like this: https://www.analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/

Basically, you can take an image model that has been trained on a huge number of images and either use its weights to initialize your own network or (more likely) freeze the convolutional base and retrain only the fully connected dense layers.
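Since the question mentions VGG, a minimal sketch of that freeze-and-retrain approach in Keras might look like this (the ~500-class count and 224x224 input size are assumptions taken from the question and VGG's default):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

num_classes = 500  # assumed from the question

# Load VGG16 pretrained on ImageNet, dropping its fully connected top.
base = VGG16(weights="imagenet", include_top=False,
             input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional base

# Only these new dense layers will be trained on the small dataset.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the base means the generic ImageNet features are reused as-is, so the handful of images per class only has to fit the small classifier head on top.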
