When to tune hyperparameters in deep learning

I am currently playing around with different CNN and LSTM model architectures for my multivariate time series classification problem.

I can achieve a validation accuracy of better than 50%. I would like to lock down an exact architecture at some point instead of experimenting endlessly, and to decide this I also want to tune my hyperparameters.

Question: How do I balance the need to experiment with different models, such as a standalone CNN versus a CNN combined with an LSTM, against hyperparameter tuning? Is there such a thing as premature optimization here?

I am running my training on AWS SageMaker and I can work in parallel if needed.

Cheers.

Topic sagemaker deep-learning aws time-series machine-learning

Category Data Science


For a given model, there are hyperparameter optimization algorithms and toolboxes that automate the search. The best-known (though not optimal) algorithm is grid search. For more info you can visit this page.
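
As a minimal sketch of what grid search does, here is a plain-Python loop over a small hyperparameter grid. The `train_and_evaluate` function is a hypothetical stand-in for your own routine that builds the model (e.g. a Keras CNN), fits it, and returns validation accuracy; here it just returns a random score so the example runs end to end.

```python
import itertools
import random

# Hypothetical stand-in for a real training run: in practice this would
# build the model with the given hyperparameters, train it, and return
# the validation accuracy. Here it returns a random score as a placeholder.
def train_and_evaluate(params):
    return random.random()

# The grid of hyperparameters to search over (example values only).
param_grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64, 128],
    "num_filters": [16, 32],
}

# Exhaustive grid search: evaluate every combination and keep the best.
best_score, best_params = -1.0, None
keys = list(param_grid)
for values in itertools.product(*(param_grid[k] for k in keys)):
    params = dict(zip(keys, values))
    score = train_and_evaluate(params)
    if score > best_score:
        best_score, best_params = score, params

print("Best validation accuracy:", best_score)
print("Best hyperparameters:", best_params)
```

The combinatorial blow-up of this loop is exactly why grid search is well known but not optimal; random search or Bayesian optimization usually finds good settings with far fewer training runs.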

Moreover, you can find good tutorials at page1 or page2.

But for cascading architectures and testing them (e.g., a CNN followed by an LSTM), you must first fix the architecture and then tune it with one of the above-mentioned algorithms. This process may lead you to the optimal model for your task.
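
Since the question mentions SageMaker and parallel runs, here is a hedged sketch of how the SageMaker Python SDK's `HyperparameterTuner` could run such a search once a training script exists. The entry point `train.py`, the instance type, the metric regex, and the S3 path are assumptions to be replaced with your own; the architecture choice is treated as a categorical hyperparameter so that CNN vs. CNN+LSTM can be compared within the same tuning job.

```python
import sagemaker
from sagemaker.tensorflow import TensorFlow
from sagemaker.tuner import (
    HyperparameterTuner,
    ContinuousParameter,
    CategoricalParameter,
)

# Hypothetical training script and instance settings -- replace with your own.
estimator = TensorFlow(
    entry_point="train.py",          # your CNN / CNN+LSTM training script
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.11",
    py_version="py39",
)

# Search space: the architecture choice itself can be a categorical
# hyperparameter alongside the usual knobs such as the learning rate.
hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(1e-4, 1e-2),
    "architecture": CategoricalParameter(["cnn", "cnn_lstm"]),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="val_accuracy",
    objective_type="Maximize",
    hyperparameter_ranges=hyperparameter_ranges,
    # Regex that extracts the validation accuracy from your script's logs;
    # adjust it to match whatever your training script actually prints.
    metric_definitions=[
        {"Name": "val_accuracy", "Regex": r"val_accuracy: ([0-9\.]+)"}
    ],
    max_jobs=20,
    max_parallel_jobs=4,   # run several trials in parallel, as the question suggests
)

tuner.fit({"training": "s3://your-bucket/path/to/training-data"})
```

Running architecture choice and hyperparameter tuning in one job like this is one way to avoid "premature optimization": you commit compute to a single search rather than hand-tuning each candidate architecture in isolation.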
