[Question] Does hyperparameter tuning support custom vectorized environments? #439

Closed
antoinedang opened this issue Mar 15, 2024 · 6 comments
Labels: documentation (Improvements or additions to documentation), question (Further information is requested)

Comments

@antoinedang

❓ Question

Hello,
I have implemented a custom Vectorized Environment using Mujoco (which adheres to stable baseline 3's VecEnv standard), but I haven't found any evidence of RL Zoo 3 supporting (or not supporting) vectorized environments. When I pass my environment name in after registering it with OpenAI gym, RL Zoo 3 always tries to put a VecEnv wrapper on it (such as dummy or subprocenv) and it crashes with an error due to the fact that the interface for a normal Env is not the same as a VecEnv. I am wondering if there is a way (an argument I missed or some source code I could modify) such that I can directly pass in the name of a vectorized environment and RL Zoo 3 will skip the step of wrapping it in a DummyVecEnv and/or SubProcEnv wrapper. I've tried vec_env_wrapper argument in my hyperparameters config, setting env_wrapper to None, and many Google and source code searches but haven't found anything. It doesn't sound like RL Zoo 3 supports this out of the box, but I'm wondering if this is by choice, if I missed a section in the documentation or a past issue already raised, or if I can update the source code so it works for me? (I dont know much about the inner workings of RL Zoo 3, but it seems like an additional argument such as "is_env_vectorized" and an if statement would do the trick).

For context, my hyperparameter config is:

from torch import nn  # needed for activation_fn below

default_hyperparams = dict(
    policy = 'MlpPolicy',
    n_timesteps = 1e7,
    batch_size = 256,
    n_steps = 512,
    gamma = 0.95,
    learning_rate = 3.56987e-05,
    ent_coef = 0.00238306,
    clip_range = 0.3,
    n_epochs = 5,
    gae_lambda = 0.9,
    max_grad_norm = 2,
    vf_coef = 0.431892,
    policy_kwargs = dict(
        log_std_init = -2,
        ortho_init = False,
        activation_fn = nn.ReLU,
        net_arch = dict(pi=[256, 256], vf=[256, 256]),
    ),
)


hyperparams = {
    "GPUHumanoid": default_hyperparams
}

And I am calling the train.py script with the hyperparameter-tuning arguments, using this snippet:

import sys
from rl_zoo3.train import train  # RL Zoo 3's train entry point

sys.argv = ["python", "-optimize",
            "--algo", "ppo",
            "--env", "GPUHumanoid",
            "--log-folder", "data/tuning_logs",
            "-n", "50000",
            "--n-trials", "1000",
            "--n-jobs", "2",
            "--sampler", "tpe",
            "--pruner", "median",
            "--env-kwargs", "num_envs:256",
            "--conf-file", "simulation.hyperparam_config"]
train()

The error output I get when I use the above arguments + hyperparameter config is:

/usr/local/lib/python3.10/dist-packages/gymnasium/utils/passive_env_checker.py:189: UserWarning: WARN: The result returned by `env.reset()` was not a tuple of the form `(obs, info)`, where `obs` is a observation and `info` is a dictionary containing additional information. Actual type: `<class 'numpy.ndarray'>`
  logger.warn(
too many values to unpack (expected 2)

It seems like it's expecting the Env interface and not the VecEnv one, but I can see from the source code that Envs are wrapped in a DummyVecEnv after being created with gym.make().
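
To illustrate what I think is happening (a rough sketch, not the actual Zoo code): my VecEnv-style reset() returns a batched observation array rather than an (obs, info) tuple, so any Gym-style unpacking of it blows up:

import numpy as np

# Stand-in for my environment's VecEnv-style reset(): it returns only a
# batch of observations, one row per parallel env (shapes are made up here).
def vec_style_reset():
    return np.zeros((256, 48))

# Gym-style code expects reset() to return an (obs, info) pair:
obs, info = vec_style_reset()
# -> ValueError: too many values to unpack (expected 2)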

Is there something I am missing? Any help would be greatly appreciated!


antoinedang added the "question" label Mar 15, 2024
@antoinedang
Author

antoinedang commented Mar 15, 2024

Looking at the source code, it seems like it could be done by adding an if/else around the env = make_vec_env(...) call, such that make_vec_env is not called if the environment is already a VecEnv, and instead we just set env = make_env(**env_kwargs). More specifically, replace lines 622-632 with:

if self._hyperparams.get("env_is_vectorized", False):
    env = make_env(num_envs=n_envs, **env_kwargs)
else:
    env = make_vec_env(
        make_env,
        n_envs=n_envs,
        seed=self.seed,
        env_kwargs=env_kwargs,
        monitor_dir=log_dir,
        wrapper_class=self.env_wrapper,
        vec_env_cls=self.vec_env_class,  # type: ignore[arg-type]
        vec_env_kwargs=self.vec_env_kwargs,
        monitor_kwargs=self.monitor_kwargs,
    )
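
If that worked, the only change on the config side would be the new key (a hypothetical name, matching the check above); presumably it would also need to be removed from the dict before the remaining hyperparameters are passed to PPO:

default_hyperparams = dict(
    policy = 'MlpPolicy',
    env_is_vectorized = True,  # hypothetical flag read by the if/else above
    # ... all other keys unchanged from the config at the top of this issue
)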

Curious if this might break things down the line, and/or if there is an already built solution I'm missing? (I'd rather not have to integrate the entire rl_zoo3 repo in my project for cleanliness' sake)

@araffin
Member

araffin commented Mar 15, 2024

> Looking at the source code, it seems like it could be done by adding an if/else in

In your case, the best is probably to fork the RL Zoo to adapt it to your needs (you can still install it as an editable package so you don't have to integrate it in your codebase).
gym.make() is supposed to return a gym.Env, not a VecEnv.
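
As a rough illustration (a sketch of the usual SB3 flow, not the exact Zoo code):

from stable_baselines3.common.env_util import make_vec_env

# make_vec_env builds n_envs individual gym.Env instances (via gym.make or a
# callable) and wraps them itself in a DummyVecEnv (or SubprocVecEnv if
# requested); there is no code path for an env that is already a VecEnv.
vec_env = make_vec_env("CartPole-v1", n_envs=4)
obs = vec_env.reset()  # VecEnv API: batched observations only, no info dict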

> Curious if this might break things down the line, and/or if there is an already built solution I'm missing? (I'd rather not have to integrate the entire rl_zoo3 repo in my project for cleanliness' sake)

We have something similar for a tentative PR with envpool integration: #355

@antoinedang
Author

> > Looking at the source code, it seems like it could be done by adding an if/else in
>
> In your case, the best is probably to fork the RL Zoo to adapt it to your needs (you can still install it as an editable package so you don't have to integrate it in your codebase). gym.make() is supposed to return a gym.Env, not a VecEnv.
>
> > Curious if this might break things down the line, and/or if there is an already built solution I'm missing? (I'd rather not have to integrate the entire rl_zoo3 repo in my project for cleanliness' sake)
>
> We have something similar for a tentative PR with envpool integration: #355

Sounds good, thanks for letting me know. I'll fork and make the changes.

In case anyone else has the same question, I'll be updating the code here:
https://github.com/mcgill-robotics/Humanoid-rl-baselines3-zoo

@araffin
Member

araffin commented Mar 15, 2024

Maybe, could you do a PR to update the docs?

araffin added the "documentation" label Mar 15, 2024
@antoinedang
Author

antoinedang commented Mar 15, 2024

> Maybe, could you do a PR to update the docs?

Sure. However, I'm new to the repo, so I'm not sure of the standards or where to do this. What exactly should I update and with what information? Should I add something along the lines of "If your custom environment implements the Stable Baselines3 VecEnv class, you will have to update the source code (see issue [....])." to https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/docs/guide/custom_env.rst?

@araffin
Member

araffin commented Mar 18, 2024

> What exactly should I update and with what information?
> https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/docs/guide/custom_env.rst

Yes, this file, with an explanation/link (link to this issue) on what to do when you have a VectorEnv that is not a gym.Env.

Something like https://stable-baselines3.readthedocs.io/en/master/guide/examples.html#sb3-and-procgenenv
