Tune learning rate while tuning other hyperparameters

When doing hyperparameter optimisation, such as a random search, should you include the learning rate in the search space?

My intuition is that some hyperparameter values might work well with a certain learning rate and be sub-optimal with a lower one. But if I add the learning rate to the search space, I fear the random search will favour only high-learning-rate trials, since they reach a lower loss within the same limited number of epochs.
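For concreteness, here is a minimal sketch of the kind of random search I have in mind, with the learning rate sampled log-uniformly alongside the other hyperparameters; `train_and_evaluate` is a placeholder for the real training loop, and the ranges are only illustrative:

```python
import random

def train_and_evaluate(lr, batch_size, dropout, max_epochs=20):
    """Placeholder: train the model for max_epochs and return the validation loss."""
    return random.random()  # stand-in for the real validation loss

def sample_config():
    return {
        "lr": 10 ** random.uniform(-5, -1),            # log-uniform learning rate
        "batch_size": random.choice([32, 64, 128, 256]),
        "dropout": random.uniform(0.0, 0.5),
    }

best = None
for _ in range(30):                                     # 30 random trials
    cfg = sample_config()
    loss = train_and_evaluate(**cfg)
    if best is None or loss < best[0]:
        best = (loss, cfg)

print("best validation loss:", best[0], "with config:", best[1])
```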

What would be the right way to do this?

Tags: hyperparameter-tuning, learning-rate, machine-learning

Category: Data Science


The learning rate probably should not be treated as an independent hyperparameter, because it is usually a good idea to adjust it in proportion to the batch size.
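As a sketch of that idea, you could tie the sampled learning rate to the sampled batch size rather than searching both independently, for example with the commonly used linear-scaling heuristic. The base values and ranges below are illustrative assumptions, not prescriptions:

```python
import random

BASE_BATCH_SIZE = 64  # reference batch size at which base_lr is defined (assumed)

def sample_config():
    batch_size = random.choice([32, 64, 128, 256, 512])
    base_lr = 10 ** random.uniform(-4, -2)              # searched at the reference batch size
    return {
        "batch_size": batch_size,
        # scale the effective learning rate with the batch size (linear scaling rule)
        "lr": base_lr * batch_size / BASE_BATCH_SIZE,
        "weight_decay": 10 ** random.uniform(-6, -3),
    }

for _ in range(5):
    print(sample_config())
```

This way the search still explores learning rates, but a trial with a large batch size is not handicapped by a learning rate that was only reasonable for a small one.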
