Rules, rules of thumb, and intuitions for setting up the best possible hyperparameter search

When I set up my neural networks, I have very little idea in advance what I'm doing. It is mostly educated guesswork: perhaps the problem only needs a few layers, or a particular activation function tends to suit this type of problem.

This kind of intuition can be useful, but it can also lead me astray by baking in a loose framework that isn't actually suitable. For example, I might set up a hyperparameter optimisation that searches for the optimal number of layers between 1 and 20, while deciding not to search over all activation functions and instead limiting myself to a small shortlist.
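To make this concrete, here is a minimal sketch of the kind of pre-constrained search I mean, using Optuna with a small scikit-learn MLP on a toy dataset. The layer range, width range, and activation shortlist are arbitrary guesses on my part, which is exactly the problem I'm asking about:

```python
# A minimal sketch of a pre-constrained search, assuming Optuna and
# scikit-learn; all ranges and the activation shortlist are my guesses.
import optuna
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

def objective(trial):
    # Depth searched over a deliberately limited range, not "everything".
    n_layers = trial.suggest_int("n_layers", 1, 20)
    width = trial.suggest_int("width", 16, 128, log=True)
    # Activations restricted to a hand-picked shortlist.
    activation = trial.suggest_categorical("activation", ["relu", "tanh"])
    clf = MLPClassifier(
        hidden_layer_sizes=(width,) * n_layers,
        activation=activation,
        max_iter=200,
    )
    # 3-fold cross-validated accuracy as the search objective.
    return cross_val_score(clf, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params)
```

Every bound and every entry in that shortlist is a judgment call made before the search even starts, and I have no principled basis for any of them.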

Is this just a sort of 'art' that you get better at with experience? Ideally we would search across the entire possible space, but computing resources are finite.

Are there tests, rules of thumb, or 'metrics' to guide how we should choose which hyperparameters to search in the first place? Perhaps this begins with understanding our data/problem better, or with testing the importance of the hyperparameters themselves on the data, or something else?
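For instance, one concrete approach I'm aware of is estimating hyperparameter importance after a coarse search and then narrowing the space around whatever mattered. A rough sketch, continuing from the study fitted in the snippet above (Optuna's importance API, fANOVA-based by default):

```python
# Continuing from the `study` fitted in the previous snippet.
import optuna

# Estimate how much each hyperparameter contributed to the variation
# in the objective across the coarse trials.
importances = optuna.importance.get_param_importances(study)
print(importances)

# A second, narrower study could then fix or shrink the unimportant
# dimensions and spend the remaining budget on the important ones.
```

But I don't know whether this kind of post-hoc importance analysis is the right way to think about choosing the search space up front, or whether there are better principles.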

Tags: hyperparameter-tuning, hyperparameter, deep-learning, neural-network, machine-learning
