On what principle did Google's DeepMind learn to walk?

learner

2021年4月9日 15:54

I just saw this video on Youtube.

On what principle did Google's DeepMind learn to walk?

Was it Q-Learning or a Genetic Algorithm or Policy Gradient?

Topic deepmind q-learning genetic-algorithms deep-learning machine-learning

Category Data Science

Neil Slater answered at 2021年3月29日 12:12

The full method is explained in the paper Emergence of Locomotion Behaviours in Rich Environments by the DeepMind team.

Quoting from that paper:

Using a novel scalable variant of policy gradient reinforcement learning, our agents learn to run, jump, crouch and turn as required by the environment . . .

So to answer your question, the researchers used a policy gradient method.

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.