Techniques to increase the evaluation speed of a neural network

This is somewhat of an open-ended question and, in some respects, a literature request (I would love to be pointed to a survey paper if one exists).

Suppose I am constructing a neural network to make some arbitrary prediction (categorical or numeric, it doesn't matter). With this network I am concerned primarily with speed of evaluation. Obviously, I want the network to give predictions that are as accurate as possible, but I'm more than willing to sacrifice some accuracy if it will make the network run faster (I only care about evaluation speed; training speed doesn't matter as long as it is reasonable). What are some of the techniques that might be employed to do this?

Broadly, I see two general approaches. First, you could build a network and then prune it. This could be done by removing neurons until the accuracy degrades past the point of acceptability. I could also add exit conditions at each hidden layer, or at some of them, to skip the remaining layers if the result so far is good enough (see the sketch below). The second approach I see is to build the network from the start with speed in mind. I could do this by limiting the number of hidden layers and neurons. I could also use a simple activation function (such as linear or even the identity) instead of the more common sigmoid.
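For concreteness, here is a minimal sketch of the pruning idea using PyTorch's torch.nn.utils.prune utilities. The architecture and the 30% pruning fraction are arbitrary choices for illustration, not a recommendation:

```python
# A minimal sketch of magnitude-based pruning with PyTorch's built-in
# torch.nn.utils.prune. The network and the pruning amount are placeholders.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small feed-forward network standing in for "some arbitrary prediction".
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# Zero out the 30% of weights with the smallest L1 magnitude in each
# Linear layer. In practice you would re-check validation accuracy after
# each pruning step and stop once it degrades past your threshold.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the mask into the weights

# Evaluation pass (gradients disabled for speed).
with torch.no_grad():
    prediction = model(torch.randn(1, 64))
```

As I understand it, unstructured pruning like this only zeroes individual weights, so it speeds up evaluation only on sparse-aware runtimes; removing whole neurons (structured pruning, e.g. prune.ln_structured) actually shrinks the matrix multiplications, which seems closer to what I describe above.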

What other techniques, if any, are there to increase the evaluation speed of a neural network? Also, if anyone is aware of any papers/articles/blogs/etc. on this topic, I would love to be pointed to them!

Tags: reference-request, neural-network, performance, efficiency

Category: Data Science
