What will happen if we train a model on a dataset sorted by class
Suppose we have a dataset of two classes (0 and 1) divided into over 12k mini-batches where the first half of the dataset (over 6k mini-batches) belong to class 0, and the other half belongs to class 1. What will happen if a model is trained on this dataset without shuffling the samples?
Topic mini-batch-gradient-descent training
Category Data Science