Modify Policy Evaluation Solution.ipynb according to David Silver's slides. #166

QikeLi · 2018-07-05T01:08:19Z

The solution provided for Policy Evaluation does not agree with the equation on page 8 of Dr. David Silver's slides for lecture 3.

amobiny · 2018-11-30T00:43:48Z

What you are saying is correct, but Denny is implementing a more general case.
In David Silver's slides, there is an assumption that taking an action a in state s gives a reward R regardless of which state transition follows. Denny's implementation takes into account that an action could yield different rewards depending on which state the environment puts you in. Since this environment is deterministic, both implementations give the same answer.
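To make the comparison concrete, here is a minimal sketch of the two backups side by side. It is not the notebook's code: the 3-state chain, the `P[s][a] = [(prob, next_state, reward, done), ...]` table layout, and the function names are all assumptions made for illustration. The "slides" form here takes R(s, a) to be the expected immediate reward under the transition table, which is also an assumption.

```python
import numpy as np

# Hypothetical deterministic 3-state chain (state 2 is terminal/absorbing).
# Assumed layout: P[s][a] is a list of (prob, next_state, reward, done) tuples.
P = {
    0: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 1, -1.0, False)]},
    1: {0: [(1.0, 0, -1.0, False)], 1: [(1.0, 2, -1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}
policy = np.full((3, 2), 0.5)  # uniform random policy
gamma = 1.0


def backup_per_transition(s, V):
    """Reward depends on the transition: sum over a, s' of pi * p * (r + gamma * V[s'])."""
    total = 0.0
    for a, transitions in P[s].items():
        for prob, next_state, reward, _done in transitions:
            total += policy[s][a] * prob * (reward + gamma * V[next_state])
    return total


def backup_expected_reward(s, V):
    """Reward depends on (s, a) only: sum over a of pi * (R_sa + gamma * sum_s' p * V[s'])."""
    total = 0.0
    for a, transitions in P[s].items():
        r_sa = sum(prob * reward for prob, _ns, reward, _d in transitions)
        expected_next = sum(prob * V[ns] for prob, ns, _r, _d in transitions)
        total += policy[s][a] * (r_sa + gamma * expected_next)
    return total


def evaluate(backup, sweeps=200):
    """Run a fixed number of synchronous policy-evaluation sweeps."""
    V = np.zeros(len(P))
    for _ in range(sweeps):
        V = np.array([backup(s, V) for s in P])
    return V


V_transition = evaluate(backup_per_transition)
V_expected = evaluate(backup_expected_reward)
print(V_transition, V_expected)
```

On this toy table the two value functions coincide, which matches the point above that the two forms give the same answer for a deterministic environment.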

Modify Policy Evaluation Solution according to David Silver's slides.

c21dec4
