The size of the training data set in the context of computer vision

Generally speaking, when training a machine learning model, the size of the training data set should be larger than the number of predictors. For a neural network, and especially a deep learning model, the number of parameters is usually in the tens of thousands or even millions. Yet in practice the size of the training data set, i.e., the number of images, is often smaller than the number of parameters. How can this be explained? I know we can argue that a pre-trained model may remove the requirement of having that many images. Is this the only reason, or should we instead measure the size of the training data set by the number of pixels multiplied by the number of images?
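For a concrete sense of the numbers involved, here is a back-of-envelope sketch. The model and dataset sizes are illustrative assumptions (roughly in the range of a ResNet-50 trained on ImageNet-1k), not figures from the question:

```python
# Back-of-envelope comparison of parameter count vs. image count
# (illustrative numbers, roughly ResNet-50 on ImageNet-1k).
num_parameters   = 25_600_000       # ~25.6M trainable parameters
num_images       = 1_280_000        # ~1.28M training images
pixels_per_image = 224 * 224 * 3    # RGB images at 224x224

# Counting images, there are far fewer examples than parameters...
print(f"parameters per image : {num_parameters / num_images:.1f}")

# ...but counting pixels, the training set dwarfs the parameter count.
print(f"pixels per parameter : {num_images * pixels_per_image / num_parameters:.0f}")
```

Counting images, there are roughly 20 parameters per training example; counting pixels, there are thousands of input values per parameter.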

Topic image-recognition image-classification computer-vision deep-learning neural-network

Category Data Science


Your second hypothesis is on the right track. Try comparing the information content of the training set with the information content of the network parameters. Of course most images are compressible, but they don't compress down to anywhere near the size of a single floating-point number, which is how each network parameter is typically stored.
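As a rough illustration of that comparison, the sketch below counts the raw bits on each side, using the same illustrative sizes as above (they are assumptions, not figures from the answer). Even after allowing a generous 10:1 compression ratio for the images, the training set still carries far more information than the parameters:

```python
# Rough information-content comparison (all sizes are illustrative assumptions).
num_images        = 1_280_000     # training images
pixels_per_image  = 224 * 224     # spatial resolution
bits_per_pixel    = 8 * 3         # 8-bit RGB
compression_ratio = 10            # assume ~10:1 compression of natural images

num_parameters = 25_600_000
bits_per_param = 32               # float32 weights

data_bits  = num_images * pixels_per_image * bits_per_pixel / compression_ratio
param_bits = num_parameters * bits_per_param

print(f"training set ≈ {data_bits / 8e9:.1f} GB (after assumed compression)")
print(f"parameters   ≈ {param_bits / 8e9:.2f} GB")
print(f"ratio        ≈ {data_bits / param_bits:.0f}x")
```

Under these assumptions the compressed training set is still two orders of magnitude larger than the parameters, so measuring the data by image count alone understates how much information the network is actually fit to.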
