Does an RL agent learn during exploitation?

Question

cvg

April 4, 2022 05:04

I have just started learning RL and have a few questions about it.

Does an RL agent learn during exploitation, or does it only learn during exploration?
Is it possible to train a model only using exploitation (i.e. where exploration is not allowed)?

Topic ai reinforcement-learning machine-learning

Category Data Science

hh32 · Accepted Answer · November 4, 2019 06:11

It depends on how you define learning. Usually, learning in ML means adapting the parameters of a model. In that sense, the agent does learn during exploitation: its updates drive the probability mass toward 1 for the currently best action, unless the policy is otherwise regularized.
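A minimal sketch of that effect, under stated assumptions: a three-armed bandit with made-up rewards, a softmax policy over logits, and a REINFORCE-style update applied to an agent that always picks its currently preferred action (no sampling, no exploration). The names `softmax`, `exploit_and_update`, and `REWARDS` are illustrative and not taken from any library.

```python
import math

def softmax(logits):
    """Turn logits into action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def exploit_and_update(logits, rewards, steps, lr=0.5):
    """Always take the currently most probable action, then update the logits.

    Assumption: the update is the REINFORCE-style rule
    logit[a] += lr * reward * (1[a == action] - prob[a]),
    applied here to a purely greedy agent for illustration.
    """
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        action = max(range(len(probs)), key=probs.__getitem__)  # pure exploitation
        r = rewards[action]
        for a in range(len(logits)):
            indicator = 1.0 if a == action else 0.0
            logits[a] += lr * r * (indicator - probs[a])
    return softmax(logits)

# Hypothetical rewards: action 1 is truly best, but the agent starts out
# slightly preferring action 0.
REWARDS = [0.2, 1.0, 0.5]
before = softmax([0.1, 0.0, 0.0])
after = exploit_and_update([0.1, 0.0, 0.0], REWARDS, steps=5000)

print("before:", [round(p, 3) for p in before])
print("after: ", [round(p, 3) for p in after])
```

The parameters change on every step, so the agent is learning in the parameter-update sense, yet the probability mass piles up on action 0, the action it already preferred, rather than on the action with the highest reward.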

Wilson dos Anjos Junior · Accepted Answer · June 6, 2019 14:08

It depends on the game the agent is playing. If there are rewards all over the environment, the agent learns only when the exploration coefficient is greater than zero. That is, if you only allow it to exploit, the agent may as well just exploit the first reward it meets that ends the game.

In this case, it will find the first reward that ends the game and will not change its behavior (it will not learn any other way of playing). On the other hand, if you allow it to explore, it may eventually find a better strategy (that is, learn).

It is always good to balance exploration and exploitation. The agent should always be able to explore, even if the coefficient is low. That is the whole advantage of reinforcement learning.
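The contrast in this answer can be sketched with an epsilon-greedy agent on a small bandit. This is only an illustration under assumptions: the rewards are made up and deterministic, value estimates start at zero, ties go to the lowest-numbered action, and `run_bandit` is a hypothetical helper, not part of any library.

```python
import random

def run_bandit(rewards, epsilon, steps=1000, seed=0):
    """Epsilon-greedy on a deterministic bandit; returns (Q, counts, total)."""
    rng = random.Random(seed)
    n = len(rewards)
    q = [0.0] * n        # value estimates, all starting at zero
    counts = [0] * n     # how often each action was taken
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)                   # explore
        else:
            action = max(range(n), key=q.__getitem__)   # exploit (ties -> lowest index)
        r = rewards[action]
        counts[action] += 1
        q[action] += (r - q[action]) / counts[action]   # incremental mean
        total += r
    return q, counts, total

# Hypothetical rewards: the first action pays a little, the second pays the most.
REWARDS = [0.2, 1.0, 0.5]

q_greedy, counts_greedy, total_greedy = run_bandit(REWARDS, epsilon=0.0)
q_eps, counts_eps, total_eps = run_bandit(REWARDS, epsilon=0.1)

print("exploit only :", counts_greedy, round(total_greedy, 1))
print("epsilon = 0.1:", counts_eps, round(total_eps, 1))
```

With `epsilon=0.0` the agent takes the first action, sees a positive reward, and never tries anything else. With a small exploration coefficient it eventually tries the second action and switches to it.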
