Does a GPU decrease training time for on-policy RL?
I was wondering whether using a GPU is effective when training an on-policy RL algorithm such as PPO.
That is, how can a GPU be used to decrease training time for an on-policy RL model?
I recently trained a model and GPU utilization was around 2%.
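For context, here is a minimal sketch of what I mean by "training on a GPU". It assumes Stable-Baselines3 and Gymnasium (my actual environment and hyperparameters differ), with the policy placed on the GPU via the `device` argument:

```python
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")

# device="cuda" places the policy/value networks on the GPU,
# but the environment stepping (rollout collection) still runs on the CPU.
model = PPO("MlpPolicy", env, device="cuda", verbose=1)
model.learn(total_timesteps=100_000)
```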
Topic policy-gradients gpu reinforcement-learning
Category Data Science