Difference between a log scale and a linear scale: np.random.rand()
How and why does a linear np.random.rand()
(to sample values between 0.0001 and 1 on a linear scale) not give a well-distributed search, while np.power(10, -4 * np.random.rand())
(a log scale) does?
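To make the two schemes concrete, here is a minimal numpy sketch of my own (the variable names are mine, chosen only for illustration):

```python
import numpy as np

np.random.seed(0)  # reproducibility of this illustration only

# Linear scale: uniform draws in [0, 1); most values land in the top decade
linear = np.random.rand(10)

# Log scale: the exponent is uniform in (-4, 0], so the values spread over
# roughly [0.0001, 1] with equal weight on every order of magnitude
log_scale = np.power(10, -4 * np.random.rand(10))

print("linear   :", np.round(np.sort(linear), 5))
print("log scale:", np.round(np.sort(log_scale), 5))
```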
I am currently accepting the explanation below as it is, but I can't understand the reasoning behind it (simple math, I believe, which I am missing). Kindly help with an example.
Reference for my query:
Using an Appropriate Scale to Pick Hyperparameters
To understand this, consider the number of hidden units hyperparameter. The range we are interested in is from 50 to 100. We can use a grid that contains values between 50 and 100 and use that to find the best value.
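For the hidden-units case, I take this to mean a plain uniform draw over the range; here is a small numpy sketch of my own (not from the course) to make that concrete:

```python
import numpy as np

# Uniform random search over the integer range [50, 100]; a plain linear
# scale is fine here because the range spans less than one order of magnitude
hidden_units = np.random.randint(50, 101, size=10)
print(hidden_units)
```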
Now consider the learning rate with a range between 0.0001 and 1. If we draw a number line with these extreme values and sample values uniformly at random, around 90% of the values will fall between 0.1 and 1. In other words, we are spending 90% of our resources searching between 0.1 and 1, and only 10% searching between 0.0001 and 0.1. This does not look correct! Instead, we can use a log scale to choose the values.
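To check the 90% claim for myself, I put together the small experiment below (my own sketch, not from the course); it counts what fraction of samples from each scheme lands in the top decade [0.1, 1]:

```python
import numpy as np

n = 1_000_000
np.random.seed(0)

linear = np.random.rand(n)                        # uniform on [0, 1)
log_scale = np.power(10, -4 * np.random.rand(n))  # uniform in the exponent

for name, samples in [("linear", linear), ("log scale", log_scale)]:
    top_decade = np.mean(samples >= 0.1)  # fraction of samples in [0.1, 1]
    print(f"{name:9s}: {top_decade:.1%} of samples >= 0.1, "
          f"{1 - top_decade:.1%} below 0.1")
```

In expectation the linear scheme puts 90% of its samples at or above 0.1, while the log-scale scheme puts only 25% there, since each decade (0.0001–0.001, 0.001–0.01, 0.01–0.1, 0.1–1) receives the same 25% share of the exponent's range.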
This query is from Andrew Ng's Deep Learning Specialization on Coursera
Course 2 Week 3
Tags: coursera, hyperparameter-tuning, deep-learning
Category: Data Science