Understanding the DQN Algorithm
I'm studying the deep Q-learning (DQN) algorithm. You can see it in the picture here: DQN
I have a few questions about it.
What do they mean by row 14: If D_i = 0, set Y_i = ...?
As I understand it, they want the action a' that maximizes Q, which means I have to evaluate Q for every action available in that state.
If I have actions a1 and a2, I would have to insert a1 and then a2 to see which one gives the maximum, right? But the input to my network is only the state. So how do I know which action maximizes the network's output?
Do I have to look at the last layer, where I have Q(s,a1) and Q(s,a2), check which one has the higher value, and take that action?
Like in this architecture
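To make that reading concrete, here is a minimal sketch in plain Python. It rests on several assumptions that are not in the picture: the network takes only the state and has one output unit per action, D_i is a flag marking a terminal transition, and row 14 is the usual DQN target (reward plus discounted maximum Q-value of the next state). The names `q_values`, `greedy_action`, and `td_target`, as well as all the weights, are hypothetical and chosen only for illustration.

```python
# Hypothetical tiny Q-network: 2 state features -> 3 hidden units -> 2 actions.
# The weights are made-up constants for illustration, not trained values.
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]  # one row per hidden unit
B1 = [0.0, 0.1, -0.1]
W2 = [[1.0, -0.5, 0.2], [0.3, 0.7, -0.4]]    # one row per action
B2 = [0.05, -0.05]


def q_values(state):
    """One forward pass returns [Q(s, a1), Q(s, a2)] for the given state."""
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, state)) + b)  # ReLU
        for row, b in zip(W1, B1)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, B2)]


def greedy_action(state):
    """Index of the output unit with the largest Q-value."""
    q = q_values(state)
    return max(range(len(q)), key=q.__getitem__)


def td_target(reward, next_state, done, gamma=0.99):
    """Assumed standard DQN target: bootstrap only on non-terminal steps."""
    if done:
        return reward
    return reward + gamma * max(q_values(next_state))


if __name__ == "__main__":
    s = [1.0, 2.0]
    print("Q-values:", q_values(s))
    print("greedy action index:", greedy_action(s))
    print("target (non-terminal):", td_target(1.0, s, done=False))
```

Under these assumptions the actions are never fed into the network one by one: a single forward pass on the state yields every Q(s, a), and the maximum is read off the output layer.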
Topic q-learning reinforcement-learning neural-network machine-learning
Category Data Science