Tuning batch size and learning rate in a neural net

Question


Suvra Dutta

November 19, 2021, 16:15

The following multiple-choice question appears in the Exam Readiness: AWS Certified Machine Learning - Specialty document. The document marks the correct answer, but I cannot understand why that option is correct.

Question: A data scientist is working on optimizing a model during the training process by varying multiple parameters. The data scientist observes that, during multiple runs with identical parameters, the loss function converges to different, yet stable, values. What should the data scientist do to improve the training process?

A. Increase the learning rate. Keep the batch size the same. [REALISTIC DISTRACTOR]

B. Reduce the batch size. Decrease the learning rate. [CORRECT]

C. Keep the batch size the same. Decrease the learning rate. [REALISTIC DISTRACTOR]

D. Do not change the learning rate. Increase the batch size. [REALISTIC DISTRACTOR]

My understanding is that on every run the optimizer gets stuck in a different local minimum. In that case, reducing the batch size adds randomness to the gradient estimates, which helps the optimizer escape local minima. But how does decreasing the learning rate help?

Maybe a large learning rate makes the updates wiggle too much (given the small batch size), but decreasing the learning rate would still increase the probability of getting stuck in a local minimum.
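
A minimal sketch of the two effects mentioned above, assuming a toy one-weight linear regression rather than a real neural net. The data, the fixed weight `w = 0.0`, and the function names `minibatch_grad_std` and `update_noise` are illustrative assumptions and do not come from the exam document. It only measures how noisy a single SGD step is: a smaller batch gives a noisier gradient estimate, and the learning rate scales that noise in the actual update. It does not by itself show which exam option is correct.

```python
import random
import statistics


def minibatch_grad_std(batch_size, n_trials=1000, seed=0):
    """Std. dev. of the mini-batch gradient of an MSE loss at a fixed weight."""
    rng = random.Random(seed)
    # Toy data (assumed): y = 3x + noise, model y_hat = w * x
    xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    ys = [3.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    w = 0.0  # fixed point at which the gradient noise is measured
    grads = []
    for _ in range(n_trials):
        # Sample a mini-batch of indices (with replacement, for simplicity)
        batch = [rng.randrange(len(xs)) for _ in range(batch_size)]
        # d/dw of mean((w * x - y)^2) over the batch
        g = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / batch_size
        grads.append(g)
    return statistics.pstdev(grads)


def update_noise(batch_size, learning_rate):
    """Std. dev. of one SGD step (learning_rate * gradient) on the toy problem."""
    return learning_rate * minibatch_grad_std(batch_size)


if __name__ == "__main__":
    for batch_size in (8, 32, 128):
        for learning_rate in (0.1, 0.01):
            noise = update_noise(batch_size, learning_rate)
            print(f"batch={batch_size:4d} lr={learning_rate:5.2f} step noise={noise:.4f}")
```

In this toy setting the gradient noise shrinks roughly with the square root of the batch size, so a smaller batch with a smaller learning rate can keep some randomness in the search direction while limiting how far each noisy step moves the weights.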

Topic hyperparameter-tuning mini-batch-gradient-descent learning-rate deep-learning

Category Data Science
