Does neural network accuracy in Torch depend on compute power?

I am new to machine learning but have a fairly good understanding of the basic concepts.

I was implementing a 3-layer neural network on the MNIST dataset, with 784, 100, and 10 neurons in the input, hidden, and output layers respectively. I did not use any regularization here.
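(For reference, a minimal sketch of the kind of network described above, assuming PyTorch and a sigmoid activation; the actual linked implementation is not reproduced here.)

```python
import torch.nn as nn

# Hypothetical reconstruction of the 784-100-10 network described above.
# The activation function is an assumption; the original code is not shown.
model = nn.Sequential(
    nn.Linear(784, 100),  # flattened 28x28 MNIST image -> hidden layer
    nn.Sigmoid(),         # assumed activation
    nn.Linear(100, 10),   # hidden layer -> one logit per digit class
)
```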

First I trained the network on an Intel i5 4th-generation quad-core CPU with 4 GB of RAM, which gave me 64% accuracy. Then I trained the exact same network, with the exact same code, on an Intel i5 7th-generation quad-core CPU with 8 GB of RAM, which gave an accuracy of about 89%. This is the link to the implementation.

My question is: in Torch, does the compute power affect the accuracy of the network? Or is there something else I am missing that has resulted in this huge change?

I did not use any weight initialization method other than the default provided in the Torch libraries, so that is ruled out. I also did not use anything else that might affect the network's accuracy to this extent.

Topic torch jupyter accuracy neural-network

Category Data Science


Available compute power does not directly affect the accuracy of a neural network. If your different runs of the network have:

  • identical architecture and meta-params
  • identical code (including library code)
  • identical training data
  • the same random seed and generator for all stochastic parts of training
  • identical numeric precision throughout (e.g. all vectors and matrices are 32-bit or 64-bit floats)

then the behaviour of neural network training in each run is fully deterministic and repeatable. Having a faster processor will just get you to the result faster*.

The most likely cause of the difference between your tests is that you did not seed the random number generators used in the training process. For you this includes weight initialisation, possibly the train/test split, and possibly the shuffling of training data in each epoch. As you did not use any regularisation, the accuracy of the trained network can vary quite a bit due to over-fitting.
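For example, assuming PyTorch, you can seed every RNG involved before training so that runs on different machines become comparable (a minimal sketch; `seed_everything` is a hypothetical helper name, not part of any library):

```python
import random
import numpy as np
import torch

def seed_everything(seed=42):
    """Seed all RNGs commonly involved in training."""
    random.seed(seed)                 # Python's built-in RNG (e.g. shuffling)
    np.random.seed(seed)              # NumPy RNG (e.g. train/test split)
    torch.manual_seed(seed)           # PyTorch RNG (e.g. weight initialisation)
    torch.cuda.manual_seed_all(seed)  # GPU RNGs, if any are used

seed_everything(42)
# ... now build and train the network as before ...
```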

To verify this, you can train a second or third time on each CPU. I expect you will see a lot of variation in final accuracy, regardless of which machine you run it on.
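Something along these lines would show the effect (a sketch; `train_and_evaluate` is a hypothetical stand-in for the training loop from the question):

```python
def train_and_evaluate():
    """Placeholder for the existing (unseeded) training loop; it should
    return the final test-set accuracy for one complete run."""
    raise NotImplementedError("plug in the training code from the question")

# Run the same unmodified training several times on one machine.
accuracies = [train_and_evaluate() for _ in range(3)]
print("accuracies:", accuracies)
print("spread:", max(accuracies) - min(accuracies))
```

A large spread between runs on the same machine points to unseeded randomness (amplified by the lack of regularisation), not to the CPU.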


* This does mean that having a faster machine can, in practice, lead to a more accurate final network when you are tuning the parameters, because you can try more variations of the meta-params across multiple training sessions.
