Computing the state-value function of a Markov decision process from the classical definition

data_science_learner

January 12, 2022, 17:46

For the Markov decision process above, under the given action policy $a_1$, how can I determine the value of state $s_1$ using the definition of the state-value function

$v(s)=E[G_t \mid S_t=s]$

where $G_t$ is the return? Assume no discounting (i.e., $\gamma=1$).
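
To make concrete what I mean by "using the definition", here is a minimal sketch of the kind of calculation I have in mind: enumerate every possible episode starting from a state, weight its undiscounted return by its probability, and sum. The transition table below is purely hypothetical (it is not the MDP from the question), and the names `TRANSITIONS`, `trajectories`, and `state_value` are my own. It also assumes every episode terminates, so the enumeration is finite.

```python
# Hypothetical chain induced by a fixed policy (NOT the MDP from the question).
# Maps state -> list of (probability, reward, next_state); "end" is terminal.
TRANSITIONS = {
    "s1": [(0.5, 1.0, "s2"), (0.5, 0.0, "end")],
    "s2": [(1.0, 2.0, "end")],
}


def trajectories(state, prob=1.0, ret=0.0):
    """Yield (probability, undiscounted return) for every episode from `state`."""
    if state not in TRANSITIONS:  # terminal state: the episode is over
        yield prob, ret
        return
    for p, r, nxt in TRANSITIONS[state]:
        # Extend the path: multiply probabilities, add rewards (gamma = 1).
        yield from trajectories(nxt, prob * p, ret + r)


def state_value(state):
    """v(s) = E[G_t | S_t = s], computed as a sum over all trajectories."""
    return sum(p * g for p, g in trajectories(state))


print(state_value("s1"))  # 0.5 * (1 + 2) + 0.5 * 0 = 1.5
```

For this toy chain there are only two episodes from `s1`, with returns 3 and 0, each with probability 0.5. If the chain had cycles, the enumeration would not terminate and a different approach would be needed.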