[Suggestion for V3] All RL algorithms should behave like current DDPG and automatically normalize input features #773

Closed
siferati opened this issue Mar 30, 2020 · 2 comments
Labels
v3 Discussion about V3

Comments

@siferati

I'd like to suggest a couple of features for V3, in case they haven't been suggested already:

  • All RL algorithms are able to automatically normalize input features, similar to how it currently works with DDPG. I believe it is better this way, rather than forcing the user to use wrappers, such as VecNormalize.

  • All RL algorithms are able to wrap the given environment (or list of environments) into DummyVecEnv (or list of DummyVecEnv). Again, I believe it is better for this task to fall under the developer rather than the user.

@araffin araffin added the v3 Discussion about V3 label Mar 30, 2020
@araffin
Collaborator

araffin commented Mar 30, 2020

Hello,

> All RL algorithms are able to automatically normalize input features, similar to how it currently works with DDPG. I believe it is better this way, rather than forcing the user to use wrappers, such as VecNormalize.

I would argue against that one. In the current master version (and in v3), all algorithms support normalization through the VecNormalize wrapper. My main concern with building it into the algorithms is unnecessary complexity and a loss of clarity about what is happening, unless you have an elegant way to implement that feature in PyTorch (I would be happy to see it).

Also, VecNormalize can normalize the reward as well, and it is nice to keep it separate from the RL algorithm (because it acts on, and uses information from, only the environment).
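To illustrate why normalization can sit outside the algorithm, here is a minimal sketch of a wrapper that keeps running statistics of the observations it sees. It is not the VecNormalize implementation: `RunningStat`, `NormalizeObs`, and `ToyEnv` are hypothetical names made up for this example, the code handles a single non-vectorized environment, and it assumes the classic Gym-style `step` that returns `(obs, reward, done, info)`.

```python
import math


class RunningStat:
    """Running mean and variance of a scalar stream (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Population standard deviation; 0.0 before any sample is seen.
        return math.sqrt(self.m2 / self.count) if self.count else 0.0


class NormalizeObs:
    """Hypothetical wrapper: normalizes each observation feature it returns."""

    def __init__(self, env, eps=1e-8):
        self.env = env
        self.eps = eps
        self.stats = None  # one RunningStat per feature, created lazily

    def _normalize(self, obs):
        if self.stats is None:
            self.stats = [RunningStat() for _ in obs]
        for stat, x in zip(self.stats, obs):
            stat.update(x)
        return [(x - s.mean) / (s.std + self.eps)
                for s, x in zip(self.stats, obs)]

    def reset(self):
        return self._normalize(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # The reward is passed through untouched in this sketch.
        return self._normalize(obs), reward, done, info


class ToyEnv:
    """Hypothetical environment with two features on very different scales."""

    def reset(self):
        return [0.0, 100.0]

    def step(self, action):
        return [1.0, 300.0], 1.0, False, {}


env = NormalizeObs(ToyEnv())
first_obs = env.reset()
next_obs, reward, done, info = env.step(action=None)
```

The wrapper only ever touches what the environment returns, so the algorithm behind it needs no changes; the same pattern would extend to rewards by tracking a second set of statistics.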

> All RL algorithms are able to wrap the given environment (or list of environments) into DummyVecEnv (or list of DummyVecEnv). Again, I believe it is better for this task to fall under the developer rather than the user.

What do you mean exactly?
VecEnv will be the default for v3 (cf. #576, #733). However, if you mean that all algorithms should support multiprocessing, that is a different matter.
It would be nice to have, unless it adds too much complexity. Also, it would remove the per-episode training capability for SAC or TD3, for instance, which is essential in a robotic setting.
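As a rough illustration of what "VecEnv by default" could look like from the algorithm's side, the sketch below wraps a plain environment into a one-environment vectorized wrapper unless it is already vectorized. Everything here is an assumption made for the example: `SingleProcessVecEnv`, `ensure_vectorized`, and `CountdownEnv` are hypothetical names, not the stable-baselines API, and the check on a `num_envs` attribute is just one possible way to detect a vectorized environment.

```python
class SingleProcessVecEnv:
    """Hypothetical stand-in for a vectorized wrapper running in one process."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]
        self.num_envs = len(self.envs)

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        results = []
        for env, action in zip(self.envs, actions):
            obs, reward, done, info = env.step(action)
            if done:
                # Finished environments are reset automatically, so the
                # caller always receives a valid next observation.
                obs = env.reset()
            results.append((obs, reward, done, info))
        obs, rewards, dones, infos = zip(*results)
        return list(obs), list(rewards), list(dones), list(infos)


def ensure_vectorized(env):
    """Return env unchanged if it looks vectorized, else wrap it."""
    if hasattr(env, "num_envs"):
        return env
    return SingleProcessVecEnv([lambda: env])


class CountdownEnv:
    """Hypothetical environment whose episodes last `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon, {}


vec_env = ensure_vectorized(CountdownEnv(horizon=2))
start = vec_env.reset()
step_one = vec_env.step([None])
step_two = vec_env.step([None])
```

Wrapping this way changes only the interface (batched observations, automatic resets). Running environments in parallel processes is a separate concern, which is the distinction drawn above.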

@araffin
Collaborator

araffin commented May 9, 2020

Multiprocessing for all algorithms is on the roadmap for Stable-Baselines3 v1.1+: DLR-RM/stable-baselines3#1

@araffin araffin closed this as completed May 26, 2020