Provided policy_improvement() solution is not guaranteed to terminate #203

Open
link2xt opened this issue Jun 23, 2019 · 1 comment

link2xt commented Jun 23, 2019

To set the policy_stable variable, the provided code checks whether the policy has changed. If there are multiple optimal policies, the policy may keep switching between them indefinitely, even though an optimal policy has already been found.

See Exercise 4.4 in the 2018 edition of the Sutton & Barto book, which explicitly points out this bug in the pseudocode.
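One possible fix, as a minimal sketch rather than the repository's actual code: change the action in a state only when another action is strictly better (by more than a small tolerance), so ties between equally good actions keep the current choice. With exact evaluation, every switch is then a strict improvement, and since there are finitely many deterministic policies the loop should stop. The function names, the `tol` and `theta` parameters, and the transition format `P[s][a] = [(prob, next_state, reward), ...]` are all assumptions made for this illustration.

```python
def q_values(P, V, s, gamma):
    """One-step lookahead: expected return of each action in state s."""
    return [
        sum(prob * (reward + gamma * V[next_s]) for prob, next_s, reward in P[s][a])
        for a in range(len(P[s]))
    ]


def policy_evaluation(P, policy, gamma, theta=1e-10):
    """Iteratively evaluate a deterministic policy until values change by < theta."""
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            v = q_values(P, V, s, gamma)[policy[s]]
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            return V


def policy_iteration(P, gamma=0.9, tol=1e-8, initial_policy=None):
    """Policy iteration that switches actions only on a strict improvement."""
    policy = list(initial_policy) if initial_policy is not None else [0] * len(P)
    while True:
        V = policy_evaluation(P, policy, gamma)
        policy_stable = True
        for s in range(len(P)):
            q = q_values(P, V, s, gamma)
            best = max(range(len(q)), key=q.__getitem__)
            # Compare action values, not action indices: a tie leaves the
            # current action in place, so equally good policies cannot alternate.
            if q[best] > q[policy[s]] + tol:
                policy[s] = best
                policy_stable = False
        if policy_stable:
            return policy, V


# Hypothetical two-state MDP: in state 0 both actions are equally good
# (reward 1, then move to the absorbing state 1, which pays nothing).
P = [
    [[(1.0, 1, 1.0)], [(1.0, 1, 1.0)]],  # state 0: action 0, action 1
    [[(1.0, 1, 0.0)], [(1.0, 1, 0.0)]],  # state 1: absorbing
]
policy, V = policy_iteration(P, gamma=0.9, initial_policy=[1, 1])
```

Here the initial policy already picks one of the tied actions, so the strict comparison reports the policy as stable after the first improvement step instead of treating a switch to the other tied action as a change.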

link2xt commented Jun 23, 2019

Also see the related issue #202 about the naming of the function.
