Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for a) Seeding problems from gym env_checker b) infos API change #165

Closed
wants to merge 14 commits into from

Conversation

jjshoots
Copy link
Member

@jjshoots jjshoots commented Jun 2, 2022

There are two issues that the new gym release has brought to Supersuit.

a) Seeding issues

Because of the new env_checker in gym, when env_checker is enabled on gym.make() (which is done by default), the environment gets unintentionally seeded.
As a result, any vec envs derived from single envs end up having the same seed even when the seed given to them was None.
The current fix just reseeds any vec envs that are derived from single envs.

Although this works, I don't quite like this solution as it seems very hacky(?), though I may be educated otherwise.
I think the better solution is for env_checker to reset the env's seed instead of having it be done outside, which is probably the expected behaviour of env_checker (checking without modifying).

b) Infos API change

Because of the recent infos API change on gym 0.24.0, the infos style for supersuit vec env wrappers are now no longer coherent to the infos style of gym vec envs.

If I understand things correctly, there are two wrappers that need to be fixed:

ConcatVecEnv

I've implemented a simple hacky fix for this, though I think my hacky fix is still pretty suboptimal and could take some ideas from people.

ProcConcatVecEnv

I've not touched this wrapper yet mostly because I don't really know what's going on with it yet.

@jjshoots
Copy link
Member Author

jjshoots commented Jun 2, 2022

@benblack769 @jkterry1 @RedTachyon @pseudo-rnd-thoughts I'd appreciate your feedback on this.

@pseudo-rnd-thoughts
Copy link
Member

For the info tests, it looks like you just need to update the tests to reflect the updated info style

@benblack769
Copy link
Contributor

A couple comments:

  1. I took a look at the env_checker, and didn't see anything that should affect seeding. Perhaps this is a broader issue that affects all environments?
  2. I think the solution to seeds being the same is simply seeding with self.seed(None) inside __init__() in ConcatVecEnv and ProcConcatVecEnv

@pseudo-rnd-thoughts
Copy link
Member

The sending issue will be removed in v25.

@RedTachyon
Copy link
Member

@benblack769 The issue was e.g. at https://github.com/openai/gym/blob/master/gym/utils/env_checker.py#L72
But as @pseudo-rnd-thoughts says, it's currently being fixed.

Re 2: self.seed(...) is deprecated so depending on that is not the greatest idea. Nothing should be necessary to do when the env checker is fixed, but in case of a similar issue, it's probably best to just directly delete env._np_random

@jjshoots
Copy link
Member Author

@benblack769 could you perhaps look at updating info styles for procconcatvecenv?

@benblack769
Copy link
Contributor

@jjshoots For the infos, you should be able to basically copy the approached used in gym. https://github.com/openai/gym/blob/master/gym/vector/vector_env.py#L238 https://github.com/openai/gym/blob/master/gym/vector/async_vector_env.py#L376 , basically just transforming the infos after the fact.

The apporpriate place to do this in proccatenvs would be here: https://github.com/Farama-Foundation/SuperSuit/blob/master/supersuit/vector/multiproc_vec.py#L202

To get reset() to return info, you will also need to extract the infos returned here https://github.com/Farama-Foundation/SuperSuit/blob/master/supersuit/vector/multiproc_vec.py#L179

@jjshoots
Copy link
Member Author

Ok I give up. I have no idea what's going on, everything somehow depends on everything, not a single file has comments or docstrings or typehints, and there are cathartic lists of dicts of tuples of lists of stuff that's impossible to interpret because nothing has comments or is even remotely close to being explained.

If anyone wants to have a look and take a whack at it be my guest, otherwise it shall remain broken until we revamp all the wrappers and merge it into PZ.

@benblack769
Copy link
Contributor

@jjshoots Sorry about the lack of comments :( I don't have a lot of excuses for this.

As for these new changes, I don't really understand the new gym infos change either.

However, with my very poor understanding, I feel this is something that you can implement with a simple wrapper (maybe not perfectly efficiently, but functionally), without knowing anything about the implementation of the base vector environments other than their output format (the old list of dicts format)? Isn't it just a few data transformations and concatenations defined here?

@jjshoots
Copy link
Member Author

Hey @benblack769, don't worry about it, it happens. :) Technically yes, it's just a transformation of the data, but as it stands I don't really understand how the code flow works (I haven't spent enough time with it probably), and the infos are quite densely packed (think lists of dicts of lists), which makes it hard to decipher. For now I think we may just leave this until we deprecate the old wrappers and move things into PettingZoo/Gym. Although if you're free to come on a call one day to explain what's happening, that's be great too.

@WillDudley
Copy link
Contributor

#188

Related, I'm pretty sure.

@jjshoots jjshoots closed this Sep 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants