Testing accuracy very low, while training and validation accuracy ~ 85%

I have a training dataset of 10,000 pictures and a test dataset of 15,000 pictures. There are 23 classes of birds.

First of all, I imported the necessary libraries:

import tensorflow as tf 
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator 
from tensorflow.keras import layers 
from tensorflow.keras import Model 
import matplotlib.pyplot as plt

from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input

batch_size = 32
IM_WIDTH, IM_HEIGHT = 150, 150 # input size fed to InceptionV3 (its native size is 299x299)
nb_epochs = 13

train_dir = '/kaggle/output/working_directory/'

I am using ImageDataGenerator for image augmentation:

#test_datagen = ImageDataGenerator(rescale = 1.0/255.)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

train_datagen = ImageDataGenerator(
            preprocessing_function=preprocess_input,
            rotation_range = 40, 
            width_shift_range = 0.2, 
            height_shift_range = 0.2,
            shear_range = 0.2, 
            zoom_range = 0.2, 
            horizontal_flip = True,
            validation_split=0.2) # set validation split

And I load the data using flow_from_directory:

train_generator = train_datagen.flow_from_directory(train_dir, 
                                                    batch_size = batch_size, 
                                                    class_mode = 'categorical', 
                                                    target_size = (IM_WIDTH, IM_HEIGHT),
                                                    shuffle=True,
                                                    subset='training')

validation_generator = train_datagen.flow_from_directory(train_dir, 
                                                              batch_size = batch_size, 
                                                              class_mode = 'categorical', 
                                                              target_size = (IM_WIDTH, IM_HEIGHT),
                                                              shuffle=True,
                                                              subset='validation')

test_generator = test_datagen.flow_from_directory(
    directory = '/kaggle/input/test/',
    target_size = (IM_WIDTH, IM_HEIGHT),
    color_mode = 'rgb',
    batch_size = 1,
    class_mode = None,
    shuffle = False)

Found 8225 images belonging to 23 classes.

Found 2045 images belonging to 23 classes.

Found 15009 images belonging to 1 classes.

Finally, I loaded the pretrained base model and built the classification head:

from tensorflow.keras.applications.inception_v3 import InceptionV3
base_model = InceptionV3(input_shape = (IM_WIDTH, IM_HEIGHT, 3), include_top = False, weights = 'imagenet')

for layer in base_model.layers:
    layer.trainable = True

from tensorflow.keras.optimizers import Adam

x = layers.Flatten()(base_model.output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.4)(x)
x = layers.Dense(23, activation='softmax')(x)

model = tf.keras.models.Model(base_model.input, x)

model.compile(optimizer = Adam(learning_rate=0.0001), loss = 'categorical_crossentropy', metrics = ['acc'])

from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

filepath = 'best_model.h5'

es = EarlyStopping(monitor='val_acc', 
                   mode='max', 
                   verbose=1, 
                   patience=3)

checkpoint = ModelCheckpoint(filepath,
                             monitor='val_acc',
                             mode='max',
                             save_best_only=True,
                             verbose=1)

callbacks_list = [checkpoint, es]

inception = model.fit(train_generator, 
                      steps_per_epoch = train_generator.samples // batch_size,
                      validation_data = validation_generator,
                      validation_steps = validation_generator.samples// batch_size,
                      epochs = nb_epochs,
                      callbacks = callbacks_list)

Epoch 00012: val_acc did not improve from 0.86210
Epoch 13/13
257/257 [==============================] - 91s 355ms/step - loss: 0.2282 - acc: 0.9288 - val_loss: 0.5141 - val_acc: 0.8676

Epoch 00013: val_acc improved from 0.86210 to 0.86756, saving model to best_model.h5
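For reference, the ~85% figure comes from the training curves; a minimal plotting sketch using the matplotlib import from above, assuming the History object returned by fit is still bound to inception:

# Plot training vs. validation accuracy from the History returned by model.fit
# (assumes the fit call above is bound to `inception` and metrics=['acc'])
acc = inception.history['acc']
val_acc = inception.history['val_acc']
epochs_range = range(1, len(acc) + 1)

plt.plot(epochs_range, acc, label='training acc')
plt.plot(epochs_range, val_acc, label='validation acc')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()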

Now, testing:

import numpy as np
import pandas as pd
from tensorflow.keras.models import load_model

model = load_model('best_model.h5')

test_generator.reset()
STEP_SIZE_TEST=test_generator.n//test_generator.batch_size


y_pred = model.predict(test_generator,
                       steps = STEP_SIZE_TEST)

predictions = [np.argmax(pred) for pred in y_pred]

prediction = pd.DataFrame(predictions, columns=['label'])
prediction.to_csv('prediction.csv', index=True)

After I submit the .csv file, the accuracy is 4.5%. I am very confused, because validation accuracy is approximately 85% and the validation data is not compromised: the model never trains on it. So I cannot understand why the model reaches only 4.5% on the test dataset. I believe there is something wrong with how I run predict and store the predicted values, but I cannot figure it out.
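One check that narrows this down: because the test generator uses shuffle = False, the order of test_generator.filenames matches the order of the predictions, so the two can be written side by side and inspected (the filename column here is only for inspection; I am not assuming it is part of the required submission format):

# Pair each predicted label with the file it was computed for.
# With shuffle=False, test_generator.filenames is in the same order as y_pred.
check = pd.DataFrame({
    'filename': test_generator.filenames,
    'label': predictions
})
print(check.head())
check.to_csv('prediction_with_filenames.csv', index=False)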

Topic cnn inception convolutional-neural-network classification

Category Data Science


I believe this could help someone. The problem was the mapping between my class names and the model's output indices. My classes are named 0, 1, 2, 3, ..., 22, but flow_from_directory indexes the class folders in alphabetical (string) order, so it assigned output index 5 to class '13', output index 7 to class '15', and so on. The predicted indices therefore did not correspond to my numeric labels. The actual mapping is stored in train_generator.class_indices, and it has to be inverted before writing the predictions to the submission file.
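A minimal sketch of the fix, assuming the training folders are named with the string labels '0' ... '22' as above: invert train_generator.class_indices (folder name → output index) and map every predicted index back to its original class name before writing the submission file.

# train_generator.class_indices maps folder name -> output index,
# e.g. {'0': 0, '1': 1, '10': 2, '11': 3, ...} because folder names are sorted as strings.
index_to_class = {v: k for k, v in train_generator.class_indices.items()}

# Map each argmax index back to the real class label before saving.
labels = [index_to_class[i] for i in predictions]

submission = pd.DataFrame({'label': labels})
submission.to_csv('prediction.csv', index=True)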
