[Suggestion for V3] All RL algorithms should behave like current DDPG and automatically normalize input features #773

siferati · 2020-03-30T11:49:42Z

I'd like to suggest a couple of features for V3, in case they haven't been suggested already:

All RL algorithms are able to automatically normalize input features, similar to how it currently works with DDPG. I believe it is better this way, rather than forcing the user to use wrappers, such as VecNormalize.
All RL algorithms are able to wrap the given environment (or list of environments) into DummyVecEnv (or list of DummyVecEnv). Again, I believe it is better for this task to fall under the developer rather than the user.

araffin · 2020-03-30T12:03:40Z

Hello,

> All RL algorithms are able to automatically normalize input features, similar to how it currently works with DDPG. I believe it is better this way, rather than forcing the user to use wrappers, such as VecNormalize.

I would argue against that one. In the current master version (and in v3), all algorithms support normalization through the VecNormalize wrapper. My main concerns would be unnecessary complexity and less clarity about what is happening, unless you have an elegant way to implement that feature in PyTorch (I would be happy to see it).

Also, VecNormalize can normalize rewards as well, and it is nice to keep it separate from the RL algorithm (because it acts on, and uses information from, the environment only).
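To illustrate the separation argued for here, below is a minimal sketch of observation normalization done in a wrapper around the environment rather than inside the algorithm. It is not the VecNormalize implementation: `RunningMeanStd`, `NormalizeObservation`, and `CountingEnv` are hypothetical names written for this example, observations are assumed to be scalars, and only the standard library is used.

```python
import math


class RunningMeanStd:
    """Tracks a running mean and variance with Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def var(self):
        # Population variance; falls back to 1.0 until two samples are seen.
        return self.m2 / self.count if self.count > 1 else 1.0


class NormalizeObservation:
    """Hypothetical wrapper: normalizes scalar observations outside the agent."""

    def __init__(self, env, epsilon=1e-8):
        self.env = env
        self.stats = RunningMeanStd()
        self.epsilon = epsilon

    def _normalize(self, obs):
        # Update the statistics first, then rescale with the new estimates.
        self.stats.update(obs)
        return (obs - self.stats.mean) / math.sqrt(self.stats.var + self.epsilon)

    def reset(self):
        return self._normalize(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Rewards pass through untouched in this sketch.
        return self._normalize(obs), reward, done, info


class CountingEnv:
    """Toy environment (assumption): the observation grows by 100 each step."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return 100.0 * self.t, 1.0, self.t >= 5, {}
```

The agent only ever sees the wrapped environment, so the same normalization code can be reused with any algorithm, which is the design choice being defended above.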

> All RL algorithms are able to wrap the given environment (or list of environments) into DummyVecEnv (or list of DummyVecEnv). Again, I believe it is better for this task to fall under the developer rather than the user.

What do you mean exactly?
VecEnv will be the default for v3 (cf. #576, #733). However, if you mean that all algorithms should support multiprocessing, that is something else.
It would be nice to have unless it adds too much complexity. Also, it would remove the per-episode training capability of SAC or TD3, for instance, which is essential in a robotic setting.
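As a rough picture of what "wrapping the given environment automatically" could look like, here is a small sketch. It does not reproduce the library's DummyVecEnv: `SimpleVecEnv`, `ensure_vec_env`, and `CoinEnv` are hypothetical names for this example, and the automatic reset of finished sub-environments is an assumption about how a sequential vectorized environment would behave.

```python
class SimpleVecEnv:
    """Hypothetical stand-in for a vectorized env: runs sub-envs sequentially."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]
        self.num_envs = len(self.envs)

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        results = []
        for env, action in zip(self.envs, actions):
            obs, reward, done, info = env.step(action)
            if done:
                # Assumption: a finished sub-env is reset right away, so the
                # returned observation is the first one of the next episode.
                obs = env.reset()
            results.append((obs, reward, done, info))
        observations, rewards, dones, infos = map(list, zip(*results))
        return observations, rewards, dones, infos


def ensure_vec_env(env):
    """Wrap a single environment unless it is already vectorized."""
    if isinstance(env, SimpleVecEnv):
        return env
    return SimpleVecEnv([lambda: env])


class CoinEnv:
    """Toy single environment (assumption): each episode lasts two steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return self.t, float(action), self.t >= 2, {}
```

An algorithm constructor could call something like `ensure_vec_env(env)` on its argument, so the user passes either form. Note that this only covers sequential wrapping of one environment; it says nothing about multiprocessing, which is the separate question raised above.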

araffin · 2020-05-09T12:05:21Z

Multiprocessing for all algorithms is on the roadmap for Stable-Baselines3 v1.1+: DLR-RM/stable-baselines3#1

araffin added the v3 Discussion about V3 label Mar 30, 2020

araffin closed this as completed May 26, 2020
