Comparison of linear Q-learning and DQN
In the 2015 DQN Nature paper (https://www.nature.com/articles/nature14236),
Extended Data Table 4 shows some comparisons between DQN and linear Q-learning.
The reward ratios are very different across games: the ratio of rewards between linear Q-learning and DQN ranges from about 3.8 to over 100 depending on the game. What is the reason for this big difference, and how can we explain it? Is it related to the complexity or sensitivity of each game? I am not very familiar with the complexity and properties of these games.
In addition, Extended Data Table 2 of this paper says that DQN is compared with the "best linear learner".
Could you please explain this sentence: "Best Linear Learner is the best result obtained by a linear function approximator on different types of hand designed features"?
What does "best" mean here? The best result during testing of the linear model? I only know one form of linear regression,

y = ax + b,

and I would simply replace the DNN part of DQN with such a linear model (i.e., just input and output, fully linear, without any hidden layer). So I am confused by the word "best" in this table.
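To be explicit about what I mean by replacing the DNN with a linear model, here is a minimal NumPy sketch (my own illustration, not code from the paper; the feature and action sizes are placeholders). Q(s, a) is just a linear function of the state features, one output per action, trained with a semi-gradient Q-learning update:

```python
import numpy as np

class LinearQ:
    """Linear Q-function: Q(s, a) = W[a] . s + b[a] -- no hidden layers."""

    def __init__(self, n_features, n_actions, lr=0.01, gamma=0.99, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(n_actions, n_features))
        self.b = np.zeros(n_actions)
        self.lr, self.gamma = lr, gamma

    def q_values(self, s):
        # One Q-value per action for state-feature vector s
        return self.W @ s + self.b

    def update(self, s, a, r, s_next, done):
        # Semi-gradient Q-learning: move Q(s, a) toward the bootstrapped target
        target = r if done else r + self.gamma * np.max(self.q_values(s_next))
        td_error = target - self.q_values(s)[a]
        self.W[a] += self.lr * td_error * s
        self.b[a] += self.lr * td_error
        return td_error
```

This is what I have in mind when I say "linear Q-learning": the same DQN training loop, but with this model in place of the convolutional network.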
Also, have you seen any comparison with logistic-regression Q-learning? (By logistic-regression Q-learning I mean replacing the deep-neural-network part of DQN with a sigmoid function between input and output, without any hidden layer.) What is the ratio of results of this approach compared with DQN?
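To make this variant concrete too, here is a sketch (again my own illustration, with placeholder sizes) of what I mean by the logistic version: the same linear model as before, but with each output squashed through a sigmoid, so the Q-values are bounded in (0, 1):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LogisticQ:
    """Logistic Q-function: Q(s, a) = sigmoid(W[a] . s + b[a]).

    Note: the sigmoid bounds every Q-value in (0, 1), unlike the
    unbounded linear model."""

    def __init__(self, n_features, n_actions, lr=0.1, gamma=0.99, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(n_actions, n_features))
        self.b = np.zeros(n_actions)
        self.lr, self.gamma = lr, gamma

    def q_values(self, s):
        return sigmoid(self.W @ s + self.b)

    def update(self, s, a, r, s_next, done):
        target = r if done else r + self.gamma * np.max(self.q_values(s_next))
        q = self.q_values(s)[a]
        td_error = target - q
        # Chain rule through the sigmoid: dQ/dw = Q * (1 - Q) * s
        grad = td_error * q * (1.0 - q)
        self.W[a] += self.lr * grad * s
        self.b[a] += self.lr * grad
        return td_error
```

This is the "sigmoid between input and output, no hidden layer" model I am asking about.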
Topic dqn openai-gym game linear-regression logistic-regression
Category Data Science