Transfer learning: Poor performance with last layer replaced
I am using a transfer learning approach, following the TensorFlow for Poets tutorial. I use a pre-trained InceptionV3 architecture trained on the ImageNet dataset. The last layer and the softmax classification are replaced and retrained on a new set of 7 classes.
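For context, this is roughly the setup I have in mind, written as a minimal tf.keras sketch rather than the tutorial's actual retrain.py script (the input shape and pooling choice are assumptions on my part):

```python
import tensorflow as tf

NUM_CLASSES = 7  # my 7 target classes

# InceptionV3 pre-trained on ImageNet, without its original classification head
base = tf.keras.applications.InceptionV3(
    weights="imagenet",
    include_top=False,
    input_shape=(299, 299, 3),
    pooling="avg",
)
base.trainable = False  # freeze all convolutional layers

# New softmax classification layer, the only part that gets retrained
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
```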
Data
Per class I have around 4,000 to 5,000 images. I tried multiple training configurations with an AdamOptimizer; the settings I ended up with are listed below, and a sketch of the training setup follows the list. The labels are noisy: about 15-20% of them are incorrect. The images show products of a certain category (e.g. cars), and the labels distinguish different types of one feature (e.g. 7 different types of tires/wheels).
Parameters
- learning rate: 0.001
- iterations: 7,000
- batch size: 100
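With those parameters, the training step looks roughly like this. This is a sketch continuing from the model above; the "data/train" and "data/val" directories are placeholders, and the tutorial's real input pipeline (bottleneck caching etc.) works differently:

```python
import tensorflow as tf

# Placeholder directories; adjust to the actual dataset layout.
preprocess = tf.keras.applications.inception_v3.preprocess_input

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=(299, 299), batch_size=100
).map(lambda x, y: (preprocess(x), y))

val_ds = tf.keras.utils.image_dataset_from_directory(
    "data/val", image_size=(299, 299), batch_size=100
).map(lambda x, y: (preprocess(x), y))

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Roughly 7,000 iterations at batch size 100, logged as 10 "epochs"
# of 700 steps each.
history = model.fit(
    train_ds.repeat(),
    steps_per_epoch=700,
    epochs=10,
    validation_data=val_ds,
)
```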
Performance
The test accuracy is 50%, the training accuracy 68%. Looking at the learning curve, the network had already reached 50% after 2,000 iterations. What surprises me is the overall low performance as well as the lack of further improvement over the remaining training time. It is also noteworthy that the network makes errors that are hard to explain: it confuses not only similar classes but also clearly distinguishable ones.
Now I wonder: is this potentially because retraining only a single layer is too limited to pick up the subtle differences in certain parts of the images? How would you go about improving this?
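To make that question concrete, this is the kind of alternative I am considering, again as a sketch based on the tf.keras setup above (the number of layers to unfreeze and the learning rate are arbitrary choices for illustration):

```python
# Unfreeze the top part of the InceptionV3 base for fine-tuning,
# keeping the earlier layers frozen.
base.trainable = True
for layer in base.layers[:-30]:  # 30 is an arbitrary cut-off for illustration
    layer.trainable = False

# Re-compile with a much smaller learning rate so the pre-trained
# weights are not destroyed by large updates.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```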
Topic transfer-learning cnn performance
Category Data Science