SB3 v1.6.0: Recurrent PPO (PPO LSTM), better defaults for learning from pixels with SAC/TD3
SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
Breaking Changes:
- Changed the way policy "aliases" are handled ("MlpPolicy", "CnnPolicy", ...), removing the former `register_policy` helper and `policy_base` parameter, and using `policy_aliases` static attributes instead (@Gregwar)
- SB3 now requires PyTorch >= 1.11
- Changed the default network architecture when using `CnnPolicy` or `MultiInputPolicy` with SAC or DDPG/TD3: `share_features_extractor` is now set to `False` by default and the default is `net_arch=[256, 256]` (instead of the previous `net_arch=[]`)
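
For projects that relied on the previous pixel-based defaults, the following is a minimal sketch of how the old behaviour could be requested explicitly through `policy_kwargs`. The names `LEGACY_POLICY_KWARGS` and `make_sac_with_legacy_defaults` are hypothetical, and the sketch assumes the former default for `share_features_extractor` was `True`, as implied by the note above.

```python
# Sketch only: LEGACY_POLICY_KWARGS and make_sac_with_legacy_defaults are
# illustrative names, not part of SB3.

# Assumed pre-1.6.0 defaults for CnnPolicy / MultiInputPolicy with SAC or TD3:
# a shared features extractor and no extra hidden layers after it.
LEGACY_POLICY_KWARGS = dict(share_features_extractor=True, net_arch=[])


def make_sac_with_legacy_defaults(env):
    """Build a SAC model on an image-based env using the assumed old defaults."""
    # Imported lazily so the constant above can be inspected without SB3 installed.
    from stable_baselines3 import SAC

    return SAC("CnnPolicy", env, policy_kwargs=LEGACY_POLICY_KWARGS)
```

Leaving `policy_kwargs` unset uses the new defaults described in the breaking change above.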
SB3-Contrib
- Added Recurrent PPO (PPO LSTM). See Stable-Baselines-Team/stable-baselines3-contrib#53
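
A minimal usage sketch for Recurrent PPO, assuming `sb3_contrib` is installed and that the `CartPole-v1` environment is available. The function name `run_recurrent_ppo` and its parameters are hypothetical. The point to note is that the LSTM states and episode-start flags have to be passed back to `predict` at every step.

```python
def run_recurrent_ppo(env_id="CartPole-v1", total_timesteps=5_000, n_eval_steps=200):
    """Train a small RecurrentPPO agent, then roll it out (illustrative helper)."""
    # Lazy imports: numpy and sb3_contrib are only needed when the function runs.
    import numpy as np
    from sb3_contrib import RecurrentPPO

    model = RecurrentPPO("MlpLstmPolicy", env_id, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    vec_env = model.get_env()
    obs = vec_env.reset()
    lstm_states = None  # None asks the policy to start from its initial hidden state
    # Flag every environment as starting a new episode on the first step.
    episode_starts = np.ones((vec_env.num_envs,), dtype=bool)
    for _ in range(n_eval_steps):
        action, lstm_states = model.predict(
            obs,
            state=lstm_states,
            episode_start=episode_starts,
            deterministic=True,
        )
        obs, rewards, dones, infos = vec_env.step(action)
        # Passing the done flags lets the policy reset its hidden state
        # for environments whose episode just ended.
        episode_starts = dones
    return model
```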
Bug Fixes:
- Fixed saving and loading large policies greater than 2GB (@jkterry1, @ycheng517)
- Fixed final goal selection strategy that did not sample the final achieved goal (@qgallouedec)
- Fixed a bug with special characters in the tensorboard log name (@quantitative-technologies)
- Fixed a bug in `DummyVecEnv`'s and `SubprocVecEnv`'s seeding function. None value was unchecked (@ScheiklP)
- Fixed a bug where `EvalCallback` would crash when trying to synchronize `VecNormalize` stats when observation normalization was disabled
- Added a check for unbounded actions
- Fixed issues due to newer version of protobuf (tensorboard) and sphinx
- Fixed exception causes all over the codebase (@cool-RR)
- Prohibit simultaneous use of optimize_memory_usage and handle_timeout_termination due to a bug (@MWeltevrede)
- Fixed a bug in `kl_divergence` check that would fail when using numpy arrays with MultiCategorical distribution
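
Since `optimize_memory_usage` and `handle_timeout_termination` can no longer be enabled together, code that wants the memory-efficient replay buffer needs to turn timeout handling off explicitly. A minimal sketch, assuming DQN on `CartPole-v1` (`BUFFER_KWARGS` and `make_memory_efficient_dqn` are hypothetical names):

```python
# Sketch only: BUFFER_KWARGS and make_memory_efficient_dqn are illustrative names.

# Timeout handling is switched off so that it does not conflict with
# optimize_memory_usage=True.
BUFFER_KWARGS = dict(handle_timeout_termination=False)


def make_memory_efficient_dqn(env_id="CartPole-v1"):
    """Build a DQN model that uses the memory-optimized replay buffer."""
    # Imported lazily so the constant above can be inspected without SB3 installed.
    from stable_baselines3 import DQN

    return DQN(
        "MlpPolicy",
        env_id,
        optimize_memory_usage=True,
        replay_buffer_kwargs=BUFFER_KWARGS,
    )
```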
Others:
- Upgraded to Python 3.7+ syntax using `pyupgrade`
- Removed redundant double-check for nested observations from `BaseAlgorithm._wrap_env` (@TibiGG)
Documentation:
- Added link to gym doc and gym env checker
- Fixed typo in PPO doc (@bcollazo)
- Added link to PPO ICLR blog post
- Added remark about breaking Markov assumption and timeout handling
- Added doc about MLFlow integration via custom logger (@git-thor)
- Updated Huggingface integration doc
- Added copy button for code snippets
- Added doc about EnvPool and Isaac Gym support