fix: autoreset wrappers #223
Hi Sasha,
Yup, this is exactly the use case!
There was an issue with the autoreset wrappers: they never showed you the final timestep. This is a problem when the final timestep is a truncation (discount = 1, timestep.last() = true): we need that observation to compute the next value for our value target, but currently there is no way to get it.
Gym fixes this in the same way I am proposing: as can be seen here, they place the final observation in infos. What this PR does is always place either the current observation or the terminal observation in
timestep.extras["real_next_obs"]
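To make the idea concrete, here is a minimal sketch in plain Python (not JAX, and not the code in this PR). `TimeStep`, `CountingEnv`, `AutoResetWrapper` and `td_target` are hypothetical stand-ins I made up for illustration; the only name taken from the PR is the `"real_next_obs"` key. It assumes the wrapper keeps the reward, discount and `last()` flag of the final step and only swaps in the reset observation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class TimeStep:
    # Hypothetical, simplified timestep; not the library's real class.
    observation: int
    reward: float
    discount: float
    is_last: bool
    extras: Dict[str, Any] = field(default_factory=dict)

    def last(self) -> bool:
        return self.is_last


class CountingEnv:
    """Toy env (assumption): the observation is the step counter and the
    episode is truncated after `max_steps`, so discount stays at 1."""

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self.t = 0

    def reset(self) -> TimeStep:
        self.t = 0
        return TimeStep(observation=0, reward=0.0, discount=1.0, is_last=False)

    def step(self, action: int) -> TimeStep:
        self.t += 1
        truncated = self.t >= self.max_steps
        return TimeStep(observation=self.t, reward=1.0, discount=1.0, is_last=truncated)


class AutoResetWrapper:
    """Sketch of an autoreset wrapper that never hides the final observation."""

    def __init__(self, env: CountingEnv) -> None:
        self.env = env

    def reset(self) -> TimeStep:
        ts = self.env.reset()
        ts.extras["real_next_obs"] = ts.observation
        return ts

    def step(self, action: int) -> TimeStep:
        ts = self.env.step(action)
        real_next_obs = ts.observation  # true next observation, even if terminal
        if ts.last():
            # Swap in the first observation of the next episode for acting.
            ts.observation = self.env.reset().observation
        ts.extras["real_next_obs"] = real_next_obs
        return ts


def td_target(ts: TimeStep, value_fn: Callable[[int], float], gamma: float = 0.99) -> float:
    # Bootstrap from the real next observation; on truncation discount is 1,
    # so the value of the final observation still contributes.
    return ts.reward + gamma * ts.discount * value_fn(ts.extras["real_next_obs"])
```

On a truncated step the policy sees the reset observation in `ts.observation`, while the value target is still computed from the observation the episode actually ended on.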
Here's how I'm currently using it for SAC (and it's working well there):