What does anneal mean in the context of machine learning?

Question

Reuben Walker

November 22, 2020, 16:16

An article released by OpenAI gives an overview of how OpenAI Five works. One paragraph in the article states:

Our agent is trained to maximize the exponentially decayed sum of future rewards, weighted by an exponential decay factor called γ. During the latest training run of OpenAI Five, we annealed γ from 0.998 (valuing future rewards with a half-life of 46 seconds) to 0.9997 (valuing future rewards with a half-life of five minutes).

Does annealing in this context mean the network found through training that γ was better as 0.9997? How would this be determined?

My limited understanding of the topic led me to the following assumption about how γ was annealed: different versions of the network were trained for a given amount of time using different values of γ. Those versions then played against each other, or their TrueSkill scores were compared, to determine the ideal value of γ.

Topic: openai-gym, deep-learning, definitions, machine-learning

Category: Data Science

Brian Spiering · Accepted Answer · November 22, 2020, 16:16

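The term annealing is borrowed from simulated annealing, an optimization method that slowly decreases the probability of accepting worse solutions as the solution space is explored. In machine learning the word is used more loosely for gradually changing a hyperparameter on a schedule over the course of training, rather than having the network learn it. In the quoted training run, γ was slowly raised from 0.998 to 0.9997, so the agent weighted near-term rewards more heavily early on and longer-term rewards later. γ is a hyperparameter, so any hyperparameter search method could be used to choose its start and end values (e.g., manual selection or cross-validation).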
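To make the numbers in the quote concrete, here is a minimal sketch of what annealing γ on a schedule could look like. It rests on assumptions that are not in the article excerpt: the linear schedule and the names `annealed_gamma`, `half_life_steps`, and `STEPS_PER_SECOND` are hypothetical, and the rate of 7.5 agent steps per second is assumed (it is the rate at which the quoted half-lives work out). The actual schedule OpenAI used is not described in the quoted paragraph.

```python
import math

# Assumption: the agent takes about 7.5 decisions per second of game time.
STEPS_PER_SECOND = 7.5


def half_life_steps(gamma: float) -> float:
    """Number of steps n at which a reward's weight gamma**n drops to 0.5."""
    return math.log(0.5) / math.log(gamma)


def annealed_gamma(progress: float,
                   gamma_start: float = 0.998,
                   gamma_end: float = 0.9997) -> float:
    """Hypothetical linear schedule: move gamma from start to end.

    progress is the fraction of training completed, clamped to [0, 1].
    """
    p = min(max(progress, 0.0), 1.0)
    return (1.0 - p) * gamma_start + p * gamma_end


if __name__ == "__main__":
    # Print gamma and its reward half-life at a few points in training.
    for progress in (0.0, 0.25, 0.5, 0.75, 1.0):
        gamma = annealed_gamma(progress)
        seconds = half_life_steps(gamma) / STEPS_PER_SECOND
        print(f"progress={progress:.2f}  gamma={gamma:.5f}  "
              f"half-life={seconds:.0f} s")
```

Under these assumptions the half-life comes out to roughly 46 seconds at γ = 0.998 and roughly five minutes at γ = 0.9997, matching the figures in the quote. The point is that γ is set by the training schedule from outside, not discovered by the network.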
