Should I put time on my Vanilla ANN for classifying MNIST Dataset

Question

Should I put time on my Vanilla ANN for classifying MNIST Dataset

Amar Parajuli

2022年2月16日 00:03

I am building a Vanilla Neural Network in Python for my Final Year project, just using Numpy and Matplotlib, to classify the MNIST dataset. Here's the specifications of the model:

One Input Layer + One Hidden Layers + One Softmax Layer
Number Nodes in each layer :[784, 800, 10]
Activation function used: ReLU and Softmax.
Also Normalized the Train and Test set, by dividing it by 255
Have used Mini Batch Gradient Descent(Mini Batch Size=4096).
The Model shows very low accuracy on the Test set. Around 8% -11% accuracy.
And it shows an accuracy of around 66% - 72% for the Train set.

I haven't used Regularization until now. And I don't know if that will help.

Now fed up with this, I am thinking of just implementing the model in Tensorflow and see if that works. Or is it necessary to implement it from scratch? Because I think I know how every concept works(although not being able to find the loophole in my model). What do you have to say about it?

If you want to have a look at my code here's the link. It would be great if I can get any suggestions.

P.S.: Does it create any problem if a DL model is not written using OOP?

Topic mnist tensorflow deep-learning python

Category Data Science

Valentin Calomme · Accepted Answer · 2020年5月7日 07:18

Some quick notes:

A mini-batch size of 4096 is rather large, try training with sizes like 16, 32, or even 1 to see what happens.

Your test accuracy seems to be around random since there are 9/10 classes to predict. Does your model predict randomly (i.e. every class gets a similar score) or is it just wrong often?

is it necessary to implement it from scratch?

Necessary for who? Unless it's required for your coursework, nobody is forcing you to implement this from scratch.

P.S.: Does it create any problem if a DL model is not written using OOP?

If an implementation is correct, it's correct. Whether it's spaghetti code, whether it's a single file, whether it's beautiful, tested, and reviewed OOP software.

OOP is there to help human write more maintainable, robust, and readable software, but under the hood, the computer really couldn't care less.

Should I put time on my Vanilla ANN for classifying MNIST Dataset

About