How to choose between discounted reward and average reward?
How should I choose between average reward and discounted reward?
When is average reward more effective than discounted reward, and when is the reverse true?
Is it possible to use both of them in one problem? As I understand it, the RL objective is based on either average reward or discounted future reward, but I think this paper uses the discounted and the average reward together. Is it correct that discounted future reward is used for training and average reward is used for testing and evaluation? What is wrong with my understanding?
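For reference, here is a sketch of how the two objectives are usually written. The notation is an assumption and is not taken from the paper's figure: $r_t$ is the reward at step $t$, $\gamma \in [0, 1)$ is the discount factor, and $\pi$ is the policy.

```latex
% Discounted return from time step t (the discounted-reward objective)
G_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1}

% Long-run average reward per step under policy \pi (the average-reward objective)
\rho(\pi) = \lim_{N \to \infty} \frac{1}{N} \, \mathbb{E}_{\pi}\!\left[ \sum_{t=1}^{N} r_t \right]
```

If this is right, the first quantity weights near-term rewards more heavily, while the second treats all time steps equally.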
In Figure 2 of the paper "Playing Atari with Deep Reinforcement Learning", the authors report the "average reward". However, in the same paper, the authors also mention "discounted reward". So I'm confused. What is the difference between discounted reward and average reward?
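To make the question concrete, here is a minimal Python sketch of the two quantities as I currently read them. It assumes that the "average reward" in the figure means the total undiscounted reward per episode, averaged over several episodes, and that "discounted reward" means the discounted sum of rewards within one episode. The function names and the toy reward lists are hypothetical and are not taken from the paper.

```python
def discounted_return(rewards, gamma=0.99):
    """Discounted sum of one episode's rewards: r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    g = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def average_episode_reward(episodes):
    """Mean of the total (undiscounted) reward per episode, over all episodes."""
    totals = [sum(episode) for episode in episodes]
    return sum(totals) / len(totals)


# Toy data (assumed): two short episodes of per-step rewards.
episodes = [[1.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

print(discounted_return(episodes[0], gamma=0.5))  # 1 + 0.5*0 + 0.25*1
print(average_episode_reward(episodes))           # (2 + 1) / 2
```

Under this reading, the first function would be the training target and the second would only be an evaluation metric, which is the part I would like confirmed or corrected.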
Topic discounted-reward dqn reinforcement-learning
Category Data Science