Running out of memory when training Keras LSTM model for binary classification on image sequences
I'm trying to come up with a Keras model based on LSTM layers that would do binary classification on image sequences.
The input data has the following shape: (sample_number, timesteps, width, height, channels), where one example would be (1200, 100, 100, 100, 3). So it's a 5D tensor, equivalent to video data.
- timesteps is equal to 100 -> each sample (image sequence) has 100 frames
- channels is equal to 3 -> RGB data
Here's a minimal workable example:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras import models, layers, optimizers


class TestingStuff():

    def __sequence_image_generator(self, x, y, batch_size, generator, seq_len):
        # Repeat each sequence label once per frame, so flow() gets one label per image
        new_y = np.repeat(y, seq_len)
        # Collapse (samples, timesteps) so flow() receives a 4D batch of single images
        helper_flow = generator.flow(x.reshape(x.shape[0] * seq_len,
                                               x.shape[2],
                                               x.shape[3],
                                               x.shape[4]),
                                     new_y,
                                     batch_size=seq_len * batch_size)
        for x_temp, y_temp in helper_flow:
            # Regroup the frames into sequences and flatten each frame for the LSTM,
            # keeping one label per sequence
            yield x_temp.reshape(x_temp.shape[0] // seq_len,
                                 seq_len,
                                 x.shape[2] * x.shape[3] * x.shape[4]), y_temp[::seq_len]

    def testStuff(self):
        batch_size = 50
        training_epochs = 60

        # Randomly generated, similar to the actual dataset
        train_samples_num = 50
        valid_samples_num = 50
        data_train = np.random.randint(0, 65536, size=(train_samples_num, 100, 100, 100, 3), dtype='uint16')
        data_valid = np.random.randint(0, 65536, size=(valid_samples_num, 100, 100, 100, 3), dtype='uint16')
        labels_train = np.random.randint(0, 2, size=(train_samples_num), dtype='uint8')
        labels_valid = np.random.randint(0, 2, size=(valid_samples_num), dtype='uint8')

        train_data_generator = ImageDataGenerator()
        valid_data_generator = ImageDataGenerator()

        num_frames_per_sample = data_train.shape[1]
        data_dimension = data_train.shape[2] * data_train.shape[3] * data_train.shape[4]  # height * width * channels
        data_train_num_samples = data_train.shape[0]
        data_valid_num_samples = data_valid.shape[0]

        train_generator = self.__sequence_image_generator(x = data_train,
                                                          y = labels_train,
                                                          batch_size = batch_size,
                                                          generator = train_data_generator,
                                                          seq_len = num_frames_per_sample)
        valid_generator = self.__sequence_image_generator(x = data_valid,
                                                          y = labels_valid,
                                                          batch_size = batch_size,
                                                          generator = valid_data_generator,
                                                          seq_len = num_frames_per_sample)

        num_units = 100
        model = models.Sequential()
        model.add(layers.LSTM(num_units, input_shape=(num_frames_per_sample, data_dimension)))
        model.add(layers.Dense(1, activation='sigmoid'))
        model.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy', metrics=['acc'])
        model.summary()

        model.fit_generator(train_generator,
                            steps_per_epoch = data_train_num_samples // batch_size,
                            epochs = training_epochs,
                            validation_data = valid_generator,
                            validation_steps = data_valid_num_samples // batch_size,
                            verbose = 1)


my_class = TestingStuff()
my_class.testStuff()
This example was tested with the following versions:
python 3.6.8
keras 2.2.4
tensorflow 1.13.1
Code explanation:
- data_train is of shape (50, 100, 100, 100, 3) and represents 50 samples of 100 frames of 100x100 images with 3 channels. The images are 16 bit. The same holds for data_valid.
- labels_train and labels_valid are 1D tensors with possible values 1 and 0.
- ImageDataGenerator() is used for data augmentation purposes, but in this example no transformations are specified.
- __sequence_image_generator() is adapted from here and reshapes the initial input data (5D tensor) into the input shape (4D tensor) expected by the flow() method of the ImageDataGenerator class, and further into the input shape expected by the LSTM layer (a 3D tensor with shape (batch_size, timesteps, input_dim)); see the standalone sketch after this list.
- The model architecture is a starting point (to be improved), with only 1 LSTM layer and 1 Dense layer.
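To make the reshaping done by __sequence_image_generator() easier to follow, here is a small standalone sketch of the same 5D -> 4D -> 3D round trip, using tiny dummy dimensions instead of the real ones:

import numpy as np

# Tiny dummy dimensions instead of the real (50, 100, 100, 100, 3)
samples, seq_len, width, height, channels = 10, 4, 8, 8, 3
x = np.zeros((samples, seq_len, width, height, channels))    # 5D input

# Step 1: collapse samples and timesteps so ImageDataGenerator.flow()
# receives an ordinary 4D batch of single images
as_images = x.reshape(samples * seq_len, width, height, channels)
print(as_images.shape)       # (40, 8, 8, 3)

# Step 2: regroup the frames into sequences and flatten each frame,
# giving the 3D (batch_size, timesteps, input_dim) shape the LSTM expects
as_sequences = as_images.reshape(samples, seq_len, width * height * channels)
print(as_sequences.shape)    # (10, 4, 192)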
Issue:
I noticed that the code works fine when train_samples_num and valid_samples_num have values of up to 50. If those variables have larger values (such as 1000), then the memory usage becomes excessive and the whole training seems to block: it never gets past the 1st epoch.
I suspect the issue lies somewhere in __sequence_image_generator(), where the data generation might be inefficient, but I might be wrong.
Changing num_units or batch_size to smaller values does not fix the issue. The excessive memory usage is still there even with num_units = 1 and batch_size = 1.
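For context on the scale involved, here is my back-of-the-envelope arithmetic for the in-memory footprint at 1000 samples (assuming, as I understand it but have not verified in the Keras source, that flow() keeps the whole array it is given in memory and may hold a float copy of it):

import numpy as np

samples, frames, width, height, channels = 1000, 100, 100, 100, 3

# Raw uint16 array, as created by np.random.randint(..., dtype='uint16')
raw_bytes = samples * frames * width * height * channels * np.dtype('uint16').itemsize
print(raw_bytes / 1024 ** 3)    # ~5.6 GiB for the training data alone

# If flow() additionally keeps a float32 copy of the whole array internally
# (my assumption, not verified), that would be roughly twice as much again:
float_bytes = samples * frames * width * height * channels * np.dtype('float32').itemsize
print(float_bytes / 1024 ** 3)  # ~11.2 GiB

# The validation data adds the same amounts on top of this.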
Output with train_samples_num and valid_samples_num equal to 50:
Using TensorFlow backend.
Epoch 1/60
1/1 [==============================] - 16s 16s/step - loss: 0.7258 - acc: 0.5400 - val_loss: 0.7119 - val_acc: 0.6200
Epoch 2/60
1/1 [==============================] - 18s 18s/step - loss: 0.7301 - acc: 0.4800 - val_loss: 0.7445 - val_acc: 0.4000
Epoch 3/60
1/1 [==============================] - 21s 21s/step - loss: 0.7312 - acc: 0.4200 - val_loss: 0.7411 - val_acc: 0.4200
(...training continues...)
Output with train_samples_num and valid_samples_num equal to 1000:
Using TensorFlow backend.
Epoch 1/60
(...never finishes training the 1st epoch and memory usage grows until a MemoryError occurs...)
Question:
How can I modify my code to prevent this excessive memory usage when I use a larger number of samples?
My data has about 5000 samples in the train dataset and fewer than that in the valid and test datasets.
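One direction I have been considering (I'm not sure it's the right fix, and it drops the ImageDataGenerator augmentation entirely) is to stop handing the whole dataset to flow() and instead slice out one batch of sequences at a time; lazy_sequence_generator below is just a hypothetical sketch of that idea:

def lazy_sequence_generator(x, y, batch_size, seq_len):
    # Yields one batch of sequences at a time instead of handing the whole
    # dataset to ImageDataGenerator.flow() up front (note: no augmentation here).
    num_samples = x.shape[0]
    while True:
        for start in range(0, num_samples, batch_size):
            x_batch = x[start:start + batch_size]
            # Flatten each frame to (batch, timesteps, width * height * channels)
            yield (x_batch.reshape(x_batch.shape[0], seq_len, -1).astype('float32'),
                   y[start:start + batch_size])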
Topic lstm keras tensorflow sequence
Category Data Science