Error Loading and Training on TensorFlow's 'Speech Commands Dataset'
I am trying to replicate the most basic version of this Google LEAF example, but I am having problems loading the TensorFlow Speech Commands dataset. I load the datasets as TFRecords:
import tensorflow_datasets as tfds
tfds.load('speech_commands', download=True, shuffle_files=False)  # booleans, not the strings 'true'/'false'
I then map the train, test, and eval datasets through this preprocessing function:
def preprocess(sample):
    audio = sample['audio']
    label = sample['label']
    audio = tf.cast(audio, tf.float32) / tf.int16.max
    return audio, label
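To make the effect of this step concrete, here is a minimal NumPy sketch of the same scaling. It is a stand-in for the TensorFlow code above, not part of my script: `preprocess_np` and `fake_sample` are hypothetical names, and the clip is synthetic data with 16000 samples, chosen to match the shape in the error message quoted later in this question.

```python
import numpy as np

# 32767, the same value as tf.int16.max
INT16_MAX = np.iinfo(np.int16).max


def preprocess_np(sample):
    """NumPy stand-in for preprocess(): scale int16 audio to roughly [-1, 1]."""
    audio = sample["audio"].astype(np.float32) / INT16_MAX
    return audio, sample["label"]


# Synthetic clip (assumption): 16000 int16 samples, not real dataset content.
fake_sample = {
    "audio": np.array([0, 16384, -32768, 32767] * 4000, dtype=np.int16),
    "label": 3,
}

audio, label = preprocess_np(fake_sample)
print(audio.shape, audio.dtype, label)
```

Each preprocessed example is still a 1-D waveform, so a batch of them would be 2-D (batch, samples). Note also that dividing by 32767 lets the most negative int16 value land slightly below -1.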
I then create my model and attempt to train on my train dataset:
# Model is from leaf_audio/models
from leaf_audio import models

model = models.AudioClassifier(num_outputs=12)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(loss=loss_fn,
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['sparse_categorical_accuracy'])
model.fit(train_dataset, batch_size=None, epochs=10)
During training, I receive an error from inside the AudioClassifier:
ValueError: Exception encountered when calling layer sequential (type Sequential). Input 0 of layer global_max_pooling2d is incompatible with the layer: expected ndim=4, found ndim=2. Full shape received: (None, 16000)
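As far as I can tell from the message, a 2D global pooling layer expects a 4-D input of the form (batch, height, width, channels), while what reaches it is a 2-D batch of raw waveforms (batch, samples). The NumPy sketch below only mimics that rank check to illustrate the mismatch. It is not the Keras implementation: `global_max_pool_2d` is a hypothetical helper, and the 100 x 40 x 1 feature shape is made up.

```python
import numpy as np


def global_max_pool_2d(x):
    """Max over height and width of a (batch, height, width, channels) array."""
    if x.ndim != 4:
        raise ValueError(
            f"expected ndim=4, found ndim={x.ndim}. Shape received: {x.shape}"
        )
    return x.max(axis=(1, 2))


# (batch, samples): the kind of input reported in the error message.
raw_batch = np.zeros((8, 16000), dtype=np.float32)

# (batch, height, width, channels): hypothetical 4-D feature tensor.
features = np.zeros((8, 100, 40, 1), dtype=np.float32)

print(global_max_pool_2d(features).shape)

try:
    global_max_pool_2d(raw_batch)
except ValueError as err:
    print(err)
```

If this reading is right, something between the raw audio and the pooling layer would have to turn each waveform into a 4-D feature tensor, but I do not know which step I am missing.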
I think this has something to do with how I am loading the data. However, I have followed the example to the letter in each of the loading steps.
For the full code please follow this link.