Reinforcement Learning: End Effector Moves Toward Camera and Stops Learning

I am working on training a 3-finger jaw gripper. The environment I set up is this:

  • UR10 arm with a 3-finger gripper
  • PyBullet for simulation
  • Stable Baselines with DDPG
  • Observation space: an RGB image stacked with a depth map and a segmentation mask
  • Action space: dx, dy, dz added to the current end-effector (wrist) position, alpha, beta, gamma as the end-effector orientation angles, and the joint positions of the fingers
  • Reward 1: (1 - (end-effector distance from object) / (some max distance)) * 10
  • Reward 2: while all three fingers stay in contact with the object, the reward is (height of object) * 30
  • Reward 3: once the object reaches a certain height I add another 1000 and end the episode (the full reward logic is sketched just after this list)
  • Termination 2: after 1000 time steps (I will reduce this to 300)
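To make the reward structure concrete, here is a rough sketch of that logic in one function (simplified, not my exact implementation; MAX_DIST and LIFT_HEIGHT stand in for the "some max distance" and "certain height" above):

```python
import numpy as np

# Sketch of the reward logic described above -- constants are placeholders.
MAX_DIST = 1.0      # "some max distance" used to normalize Reward 1
LIFT_HEIGHT = 0.3   # the "certain height" that triggers Reward 3

def compute_reward(ee_pos, obj_pos, n_fingers_in_contact):
    """Return (reward, done) for one simulation step."""
    obj_height = obj_pos[2]

    # Reward 3: object lifted past the target height -> bonus and episode end
    if obj_height >= LIFT_HEIGHT:
        return 1000.0, True

    # Reward 2: all three fingers in contact -> reward scales with object height
    if n_fingers_in_contact == 3:
        return obj_height * 30.0, False

    # Reward 1: shaped distance reward, approaching 10 as the gripper nears the object
    dist = np.linalg.norm(np.asarray(ee_pos) - np.asarray(obj_pos))
    return (1.0 - dist / MAX_DIST) * 10.0, False
```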

My problem: after training for 125,000 timesteps (approx. 10 hours), the robot, instead of maximizing its reward by moving close to the object, moves directly toward the camera (which is away from the object) and stays there, collecting approximately 6.5 reward each step instead of the 10 it could get by moving closer to the object.
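For scale, my rough arithmetic assuming the episode runs the full 1000 steps: parked in front of the camera the agent still collects about 6.5 × 1000 ≈ 6500 per episode, versus about 10 × 1000 = 10000 (plus the contact and lifting bonuses) if it hovered at the object.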

Here is a picture of the behavior: [image: the end effector parked in front of the camera, away from the object]

What could be the issue? This is my first try at reinforcement learning; I spent about two weeks just learning the basics and then setting up the environment. I am fairly clueless here, as the reward function looks good enough to me.

Here is the code for the environment and the training setup.
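In outline, the environment and training code look like this (a simplified sketch: the URDF path, link and joint indices, image size, and noise parameters are placeholders, and I am assuming the Stable Baselines 2 DDPG API; it reuses the compute_reward helper sketched above):

```python
import gym
import numpy as np
import pybullet as p
import pybullet_data
from stable_baselines import DDPG
from stable_baselines.ddpg.policies import CnnPolicy
from stable_baselines.common.noise import OrnsteinUhlenbeckActionNoise

class GraspEnv(gym.Env):
    """Simplified skeleton of the grasping environment (names are placeholders)."""

    IMG_SIZE = 84
    EE_LINK = 7                  # end-effector (wrist) link index -- placeholder
    FINGER_LINKS = [8, 9, 10]    # finger link/joint indices -- placeholder

    def __init__(self):
        # 5 channels: RGB (3) + depth (1) + segmentation mask (1)
        self.observation_space = gym.spaces.Box(
            low=0, high=255, shape=(self.IMG_SIZE, self.IMG_SIZE, 5), dtype=np.uint8)
        # dx, dy, dz, alpha, beta, gamma, and 3 finger joint targets
        self.action_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(9,), dtype=np.float32)
        p.connect(p.DIRECT)
        p.setAdditionalSearchPath(pybullet_data.getDataPath())

    def reset(self):
        p.resetSimulation()
        p.setGravity(0, 0, -9.8)
        self.robot = p.loadURDF("ur10_3finger.urdf")             # placeholder URDF
        self.obj = p.loadURDF("cube_small.urdf", [0.6, 0.0, 0.05])
        self.steps = 0
        return self._get_obs()

    def step(self, action):
        # Position deltas and orientation from the action
        dx, dy, dz = 0.05 * action[:3]
        orn = p.getQuaternionFromEuler(action[3:6] * np.pi)
        ee_pos = np.array(p.getLinkState(self.robot, self.EE_LINK)[0])
        target = (ee_pos + [dx, dy, dz]).tolist()
        # IK to the new wrist pose (assumes the first len(joints) joints are the arm)
        joints = p.calculateInverseKinematics(self.robot, self.EE_LINK, target, orn)
        for i, q in enumerate(joints):
            p.setJointMotorControl2(self.robot, i, p.POSITION_CONTROL, q)
        for j, q in zip(self.FINGER_LINKS, action[6:9]):
            p.setJointMotorControl2(self.robot, j, p.POSITION_CONTROL, q)
        p.stepSimulation()
        self.steps += 1

        # Contacts between finger links and the object
        obj_pos = p.getBasePositionAndOrientation(self.obj)[0]
        contact_links = {c[3] for c in p.getContactPoints(self.robot, self.obj)}
        n_contact = len(contact_links & set(self.FINGER_LINKS))

        ee_now = p.getLinkState(self.robot, self.EE_LINK)[0]
        reward, done = compute_reward(ee_now, obj_pos, n_contact)  # helper sketched above
        done = done or self.steps >= 1000                          # Termination 2
        return self._get_obs(), reward, done, {}

    def _get_obs(self):
        # Stack RGB, depth, and segmentation into one 5-channel image
        w, h, rgb, depth, seg = p.getCameraImage(self.IMG_SIZE, self.IMG_SIZE)
        rgb = np.reshape(rgb, (h, w, 4))[:, :, :3].astype(np.uint8)
        depth = (np.reshape(depth, (h, w, 1)) * 255).astype(np.uint8)
        seg = np.reshape(seg, (h, w, 1)).astype(np.uint8)
        return np.concatenate([rgb, depth, seg], axis=2)

# Training setup (Stable Baselines 2 API)
env = GraspEnv()
n_actions = env.action_space.shape[0]
noise = OrnsteinUhlenbeckActionNoise(mean=np.zeros(n_actions), sigma=0.2 * np.ones(n_actions))
model = DDPG(CnnPolicy, env, action_noise=noise, verbose=1)
model.learn(total_timesteps=125000)
```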

Tags: reward, openai-gym, reinforcement-learning, python

