Temperature variable in Boltzmann exploration in reinforcement learning

cvg

September 26, 2019, 08:01

I have been using the epsilon-greedy action selection strategy and recently came across the Boltzmann (softmax) action selection strategy. One thing I am not clear about in Boltzmann exploration is the temperature variable. How should this variable be defined? Is it a constant, or should it be decreased over the course of training? And how do I decide on the absolute value of this parameter?
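To make the question concrete, here is a minimal sketch of what I understand Boltzmann exploration to be: each action is picked with probability proportional to exp(Q(a) / T), where T is the temperature. The function names, the exponential decay schedule, and the values `t_start`, `t_end`, and `decay` are my own assumptions for illustration, not taken from any particular library or paper.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Return softmax probabilities over Q-values scaled by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [q / temperature for q in q_values]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def annealed_temperature(step, t_start=1.0, t_end=0.05, decay=0.995):
    """Hypothetical exponential decay schedule; the constants are illustrative only."""
    return max(t_end, t_start * decay ** step)


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]
    for step in (0, 200, 1000):
        t = annealed_temperature(step)
        print(step, round(t, 3), [round(p, 3) for p in boltzmann_probs(q, t)])
```

With a high temperature the probabilities are close to uniform, and with a low temperature the choice is close to greedy. Because only the ratio Q(a) / T enters the formula, a sensible absolute value for T seems to depend on the scale of the Q-values, which is part of what I am unsure about.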

Thanks

Topic deepmind softmax ai reinforcement-learning deep-learning

Category Data Science
