Running out of memory when training Keras LSTM model for binary classification on image sequences
I'm trying to come up with a Keras model based on LSTM layers that would do binary classification on image sequences.
The input data has the following shape: (sample_number, timesteps, width, height, channels), where one example would be (1200, 100, 100, 100, 3). So it's a 5D tensor, equivalent to video data.
- timesteps is equal to 100 -> each sample (image sequence) has 100 frames
- channels is equal to 3 -> RGB data
Here's a minimal workable example:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras import models, layers, optimizers


class TestingStuff():

    def __sequence_image_generator(self, x, y, batch_size, generator, seq_len):
        # Repeat each sequence label once per frame, so flow() gets one label per image
        new_y = np.repeat(y, seq_len)
        # Collapse (samples, timesteps) so flow() receives a 4D batch of single images
        helper_flow = generator.flow(x.reshape(x.shape[0] * seq_len,
                                               x.shape[2],
                                               x.shape[3],
                                               x.shape[4]),
                                     new_y,
                                     batch_size=seq_len * batch_size)
        for x_temp, y_temp in helper_flow:
            # Regroup the frames into sequences and flatten each frame for the LSTM,
            # keeping one label per sequence
            yield x_temp.reshape(x_temp.shape[0] // seq_len,
                                 seq_len,
                                 x.shape[2] * x.shape[3] * x.shape[4]), y_temp[::seq_len]

    def testStuff(self):
        batch_size = 50
        training_epochs = 60

        # Randomly generated, similar to the actual dataset
        train_samples_num = 50
        valid_samples_num = 50
        data_train = np.random.randint(0, 65536, size=(train_samples_num, 100, 100, 100, 3), dtype='uint16')
        data_valid = np.random.randint(0, 65536, size=(valid_samples_num, 100, 100, 100, 3), dtype='uint16')
        labels_train = np.random.randint(0, 2, size=(train_samples_num), dtype='uint8')
        labels_valid = np.random.randint(0, 2, size=(valid_samples_num), dtype='uint8')

        train_data_generator = ImageDataGenerator()
        valid_data_generator = ImageDataGenerator()

        num_frames_per_sample = data_train.shape[1]
        data_dimension = data_train.shape[2] * data_train.shape[3] * data_train.shape[4]  # height * width * channels
        data_train_num_samples = data_train.shape[0]
        data_valid_num_samples = data_valid.shape[0]

        train_generator = self.__sequence_image_generator(x = data_train,
                                                          y = labels_train,
                                                          batch_size = batch_size,
                                                          generator = train_data_generator,
                                                          seq_len = num_frames_per_sample)
        valid_generator = self.__sequence_image_generator(x = data_valid,
                                                          y = labels_valid,
                                                          batch_size = batch_size,
                                                          generator = valid_data_generator,
                                                          seq_len = num_frames_per_sample)

        num_units = 100
        model = models.Sequential()
        model.add(layers.LSTM(num_units, input_shape=(num_frames_per_sample, data_dimension)))
        model.add(layers.Dense(1, activation='sigmoid'))
        model.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy', metrics=['acc'])
        model.summary()

        model.fit_generator(train_generator,
                            steps_per_epoch = data_train_num_samples // batch_size,
                            epochs = training_epochs,
                            validation_data = valid_generator,
                            validation_steps = data_valid_num_samples // batch_size,
                            verbose = 1)


my_class = TestingStuff()
my_class.testStuff()
This example was tested with the following versions:
python 3.6.8
keras 2.2.4
tensorflow 1.13.1
Code explanation:
- data_train is of shape (50, 100, 100, 100, 3) and represents 50 samples of 100 frames of 100x100 images with 3 channels. The images are 16 bit. The same holds for data_valid.
- labels_train and labels_valid are 1D tensors with possible values 1 and 0.
- ImageDataGenerator() is used for data augmentation purposes, but in this example no transformations are specified.
- __sequence_image_generator() is adapted from here and reshapes the initial input data (5D tensor) into the input shape (4D tensor) expected by the flow() method of the ImageDataGenerator class, and further into the input shape expected by the LSTM layer (a 3D tensor with shape (batch_size, timesteps, input_dim)); see the standalone sketch after this list.
- The model architecture is a starting point (to be improved), with only 1 LSTM layer and 1 Dense layer.
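To make the reshaping done by __sequence_image_generator() easier to follow, here is a small standalone sketch of the same 5D -> 4D -> 3D round trip, using tiny dummy dimensions instead of the real ones:

import numpy as np

# Tiny dummy dimensions instead of the real (50, 100, 100, 100, 3)
samples, seq_len, width, height, channels = 10, 4, 8, 8, 3
x = np.zeros((samples, seq_len, width, height, channels))    # 5D input

# Step 1: collapse samples and timesteps so ImageDataGenerator.flow()
# receives an ordinary 4D batch of single images
as_images = x.reshape(samples * seq_len, width, height, channels)
print(as_images.shape)       # (40, 8, 8, 3)

# Step 2: regroup the frames into sequences and flatten each frame,
# giving the 3D (batch_size, timesteps, input_dim) shape the LSTM expects
as_sequences = as_images.reshape(samples, seq_len, width * height * channels)
print(as_sequences.shape)    # (10, 4, 192)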
Issue:
I noticed that the code works fine when train_samples_num and valid_samples_num have values of up to 50. If those variables have larger values (such as 1000), then the memory usage becomes excessive and the whole training seems to block: it never gets past the 1st epoch.
I suspect the issue lies somewhere in __sequence_image_generator(), where the data generation might be inefficient, but I might be wrong.
Changing num_units or batch_size to smaller values does not fix the issue. The excessive memory usage is still there even with num_units = 1 and batch_size = 1.
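For context on the scale involved, here is my back-of-the-envelope arithmetic for the in-memory footprint at 1000 samples (assuming, as I understand it but have not verified in the Keras source, that flow() keeps the whole array it is given in memory and may hold a float copy of it):

import numpy as np

samples, frames, width, height, channels = 1000, 100, 100, 100, 3

# Raw uint16 array, as created by np.random.randint(..., dtype='uint16')
raw_bytes = samples * frames * width * height * channels * np.dtype('uint16').itemsize
print(raw_bytes / 1024 ** 3)    # ~5.6 GiB for the training data alone

# If flow() additionally keeps a float32 copy of the whole array internally
# (my assumption, not verified), that would be roughly twice as much again:
float_bytes = samples * frames * width * height * channels * np.dtype('float32').itemsize
print(float_bytes / 1024 ** 3)  # ~11.2 GiB

# The validation data adds the same amounts on top of this.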
Output with train_samples_num and valid_samples_num equal to 50:
Using TensorFlow backend.
Epoch 1/60
1/1 [==============================] - 16s 16s/step - loss: 0.7258 - acc: 0.5400 - val_loss: 0.7119 - val_acc: 0.6200
Epoch 2/60
1/1 [==============================] - 18s 18s/step - loss: 0.7301 - acc: 0.4800 - val_loss: 0.7445 - val_acc: 0.4000
Epoch 3/60
1/1 [==============================] - 21s 21s/step - loss: 0.7312 - acc: 0.4200 - val_loss: 0.7411 - val_acc: 0.4200
(...training continues...)
Output with train_samples_num and valid_samples_num equal to 1000:
Using TensorFlow backend.
Epoch 1/60
(...never finishes training the 1st epoch and memory usage grows until a MemoryError occurs...)
Question:
How can I modify my code to prevent this excessive memory usage when I use a larger number of samples?
My data has about 5000 samples in the train dataset and fewer than that in the valid and test datasets.
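One direction I have been considering (I'm not sure it's the right fix, and it drops the ImageDataGenerator augmentation entirely) is to stop handing the whole dataset to flow() and instead slice out one batch of sequences at a time; lazy_sequence_generator below is just a hypothetical sketch of that idea:

def lazy_sequence_generator(x, y, batch_size, seq_len):
    # Yields one batch of sequences at a time instead of handing the whole
    # dataset to ImageDataGenerator.flow() up front (note: no augmentation here).
    num_samples = x.shape[0]
    while True:
        for start in range(0, num_samples, batch_size):
            x_batch = x[start:start + batch_size]
            # Flatten each frame to (batch, timesteps, width * height * channels)
            yield (x_batch.reshape(x_batch.shape[0], seq_len, -1).astype('float32'),
                   y[start:start + batch_size])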
Topic lstm keras tensorflow sequence
Category Data Science