What are some general tips to improve my MNIST classifier?
I have built a CNN from scratch in Python using NumPy to tackle the MNIST handwritten-digit recognition problem. It consists of a convolutional layer (3 filters of size 3x3), a max-pooling layer (2x2 pooling), and a 10-label output layer. I'm using softmax as the output activation function and cross-entropy as the loss function.

I've tried running it with a few different sets of hyperparameters, and so far the best accuracy I've gotten is 97%, training on the whole training set (60,000 images) for a single epoch with SGD. The accuracy does vary, though, usually landing around 92-95% under these conditions. I've only tried one epoch on the full dataset because training on all 60,000 images already takes about 15 minutes (on the CPU of my low-end school laptop).

The thing is, I don't have a sense of how good or bad this is, i.e. how much time one would expect such a network to take to train and how accurate it should be. Is this really slow and inaccurate? I'd love some general tips on how I can improve my network, whether through optimization methods or brute force (more layers/neurons).

I've also tried implementing mini-batches, but for some reason (maybe a faulty implementation?) this only seems to decrease the accuracy.
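To clarify what I mean by mini-batches, here is a minimal sketch of the update scheme I'm aiming for, not my actual code: `grad_fn` is a placeholder for my forward + backward pass, and the `params` dictionary stands in for my conv filters and output weights. The idea is to average the per-example gradients over each batch and take a single SGD step per batch.

```python
import numpy as np

def minibatch_sgd_epoch(params, grad_fn, X, y, batch_size=32, lr=0.01, rng=None):
    """One epoch of mini-batch SGD.

    params  : dict of parameter arrays (placeholder for my conv filters / output weights)
    grad_fn : function(params, x, label) -> dict of per-example gradients
              with the same keys and shapes as params (placeholder for my
              forward + backward pass)
    """
    rng = rng or np.random.default_rng()
    idx = rng.permutation(len(X))              # shuffle once per epoch
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        # accumulate gradients over the examples in this batch ...
        grads = {k: np.zeros_like(v) for k, v in params.items()}
        for i in batch:
            g = grad_fn(params, X[i], y[i])
            for k in grads:
                grads[k] += g[k]
        # ... then average them and take a single parameter update
        for k in params:
            params[k] -= lr * grads[k] / len(batch)
    return params
```

Is this the right overall structure, or should the learning rate be adjusted when moving from plain SGD to averaged mini-batch updates?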
Topic mnist cnn neural-network classification python
Category Data Science