Non-real-time data augmentation for CNN classification: what are the drawbacks?
When people talk about and use data augmentation, are they mostly referring to real-time (on-the-fly) data augmentation? In the case of image classification, that would mean applying random transformations to each image right before it is fed to the model during fitting, so that a new augmented version of every image is used each epoch. In this case the model is trained on augmented images rather than the raw ones (the raw image only shows up if the random transformation happens to leave it unchanged), so the number of training samples per epoch doesn’t actually change.
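To make sure I'm describing real-time augmentation correctly, here is a minimal NumPy sketch of what I have in mind. The function names (`random_flip_and_shift`, `online_epoch`), the flip-plus-shift transform, and the toy random "images" are all my own assumptions for illustration, not any particular library's API.

```python
import numpy as np

def random_flip_and_shift(image, rng, max_shift=2):
    """Hypothetical augmentation: random horizontal flip plus a small horizontal roll."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(image, shift, axis=1)

def online_epoch(images, rng):
    """Return one freshly augmented copy per raw image; the dataset size is unchanged."""
    return np.stack([random_flip_and_shift(img, rng) for img in images])

rng = np.random.default_rng(0)
images = rng.random((8, 16, 16))  # 8 toy grayscale images standing in for real data

# Each epoch draws new random transforms, so the model sees different variants,
# but always exactly as many samples as there are raw images.
epoch_1 = online_epoch(images, rng)
epoch_2 = online_epoch(images, rng)
print(epoch_1.shape, epoch_2.shape)
```

Both epochs have the same shape as the raw data; only the pixel arrangement of each sample changes between epochs.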
But what about non-real-time (offline) data augmentation? By this, I mean augmenting the data in the preprocessing stage, so that you literally expand the sample size of your input. You then feed all of those augmented images, along with the originals, into the CNN, so the model sees the same fixed set of images every epoch. Is this a valid approach? Has it been done, and what are the drawbacks? Are there any logical flaws in it, or objections that data science and machine learning experts would raise?
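Here is a minimal NumPy sketch of the offline variant I'm asking about. Everything in it is an assumption made for illustration: the names `augment` and `expand_dataset`, the flip-plus-shift transform, the choice of 3 copies per image, and the toy random data.

```python
import numpy as np

def augment(image, rng, max_shift=2):
    """Hypothetical augmentation: random horizontal flip plus a small horizontal roll."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(image, shift, axis=1)

def expand_dataset(images, labels, copies_per_image, rng):
    """Build a fixed, enlarged dataset once: the originals plus augmented copies."""
    all_images = [images]
    all_labels = [labels]
    for _ in range(copies_per_image):
        all_images.append(np.stack([augment(img, rng) for img in images]))
        all_labels.append(labels)  # augmentation does not change the class label
    return np.concatenate(all_images), np.concatenate(all_labels)

rng = np.random.default_rng(0)
images = rng.random((8, 16, 16))  # 8 toy grayscale images standing in for real data
labels = np.arange(8) % 2         # toy binary labels

# Done once in preprocessing; the same 32 images are then reused every epoch.
big_images, big_labels = expand_dataset(images, labels, copies_per_image=3, rng=rng)
print(big_images.shape)  # (32, 16, 16)
```

So with 8 raw images and 3 augmented copies each, the training set becomes 32 fixed images, with the originals kept in the first 8 positions.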
Thanks!
Topic cnn data-augmentation machine-learning
Category Data Science