Training a CNN on a large dataset

I am currently trying to build a CNN for around 100,000 images across 42 classes. I have used the default batch size of 32. This is what my model looks like:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(filters = 32, kernel_size = (3, 3), activation = 'relu', input_shape = training_data.image_shape))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(rate = 0.3))

model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(rate = 0.2))

model.add(Conv2D(filters = 126, kernel_size = (3, 3), activation = 'relu'))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(rate = 0.15))

model.add(Flatten())

model.add(Dense(units = 32, activation = 'relu'))
model.add(Dropout(rate = 0.15))

model.add(Dense(units = 64, activation = 'relu'))
model.add(Dropout(rate = 0.1))

model.add(Dense(units = 42, activation = 'softmax'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

However, training takes very long: each epoch runs for around 35 minutes. The accuracy is also very low and increases only slowly.

My JupyterLab session sometimes stops and I have to refresh and restart everything. Is there a way to train in smaller batches, or a way to improve the training speed? Any help is appreciated; it is a very large dataset.

Epoch 1/15
2307/2307 [==============================] - 3999s 2s/step - loss: 3.5377 - accuracy: 0.0687 - val_loss: 3.3247 - val_accuracy: 0.1223
Epoch 2/15
2307/2307 [==============================] - 3764s 2s/step - loss: 3.2884 - accuracy: 0.1239 - val_loss: 3.1065 - val_accuracy: 0.1739
Epoch 3/15
2307/2307 [==============================] - 2204s 955ms/step - loss: 3.1435 - accuracy: 0.1562 - val_loss: 2.9825 - val_accuracy: 0.2069
Epoch 4/15
2307/2307 [==============================] - 2193s 951ms/step - loss: 3.0526 - accuracy: 0.1778 - val_loss: 2.9059 - val_accuracy: 0.2171

Topic cnn dropout convolution deep-learning neural-network

Category Data Science


Changing the batch size will not change the overall training time much, since with any batch size you are still passing almost 80K images through the network every epoch (at batch size 32, that is the ~2,300 steps per epoch shown in your logs).

One approach (and the best one) is to use transfer learning.
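As a minimal sketch of that idea: take a pretrained backbone from tf.keras.applications (MobileNetV2 here, chosen as an illustration), freeze it, and train only a small classification head for the 42 classes. The input size of 96x96 is an assumption for the sketch; in practice you would pass weights='imagenet' (weights=None below just keeps the example self-contained and offline).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 42
INPUT_SHAPE = (96, 96, 3)  # assumption: resize your images to this

# Pretrained backbone; in practice use weights='imagenet'.
# weights=None here only keeps the sketch runnable without a download.
base = tf.keras.applications.MobileNetV2(
    input_shape=INPUT_SHAPE, include_top=False, weights=None)
base.trainable = False  # freeze: only the new head gets trained

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```

Because the frozen backbone only does forward passes and the trainable head is tiny, epochs are far cheaper than training a full CNN from scratch.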

If you have a compelling reason to train from scratch, you will need more powerful, GPU-backed hardware. Google Colab is one option; there are many others available.

Before that, you may want to gauge the model on a sample of ~5k images.


When you call Keras fit, you can pass x as a generator that yields (x, y) batch tuples, so the whole dataset never has to sit in memory at once. To save progress between epochs, use the ModelCheckpoint callback.

https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit
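A minimal sketch of both pieces together: a Python generator that yields batches to fit, plus a ModelCheckpoint callback that saves weights every epoch (note the checkpoint lives in callbacks, not inside the generator). The tiny Dense model and random arrays are stand-ins; your real generator would read image batches from disk.

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins for the real dataset (assumption: yours streams from disk).
X = np.random.rand(64, 8).astype('float32')
y = tf.keras.utils.to_categorical(np.random.randint(0, 4, 64), 4)

def batch_generator(X, y, batch_size=16):
    """Yield (x_batch, y_batch) tuples forever, as fit expects."""
    n = len(X)
    while True:
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            sl = idx[start:start + batch_size]
            yield X[sl], y[sl]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Checkpointing is done with a callback, not inside the generator,
# so an interrupted session can resume from the saved weights.
ckpt = tf.keras.callbacks.ModelCheckpoint(
    'ckpt.weights.h5', save_weights_only=True)

model.fit(batch_generator(X, y), steps_per_epoch=4, epochs=1,
          callbacks=[ckpt], verbose=0)
```

With an infinite generator you must pass steps_per_epoch so fit knows where an epoch ends; after a crash you can rebuild the model and call model.load_weights('ckpt.weights.h5') instead of restarting from scratch.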
