Fix PPO logging of clip_fractions (#150)
* bugfix for PPO logging of clip_fractions

* Update changelog.rst

Co-authored-by: Antonin RAFFIN <antonin.raffin@ensta.org>
diditforlulz273 and araffin authored Sep 1, 2020
1 parent f8c25d3 commit 4fd408b
Showing 2 changed files with 3 additions and 1 deletion.
2 changes: 2 additions & 0 deletions docs/misc/changelog.rst
@@ -18,6 +18,7 @@ New Features:
Bug Fixes:
^^^^^^^^^^
- Fixed a bug where the environment was reset twice when using ``evaluate_policy``
+- Fix logging of ``clip_fraction`` in PPO (@diditforlulz273)

Deprecations:
^^^^^^^^^^^^^
@@ -398,3 +399,4 @@ And all the contributors:
@MarvineGothic @jdossgollin @SyllogismRXS @rusu24edward @jbulow @Antymon @seheevic @justinkterry @edbeeching
@flodorner @KuKuXia @NeoExtended @PartiallyTyped @mmcenta @richardwu @kinalmehta @rolandgvc @tkelestemur @mloo3
@tirafesi @blurLake @koulakis @joeljosephjin @shwang @rk37 @andyshih12 @RaphaelWag @xicocaio
+@diditforlulz273
2 changes: 1 addition & 1 deletion stable_baselines3/ppo/ppo.py
@@ -228,7 +228,7 @@ def train(self) -> None:
logger.record("train/policy_gradient_loss", np.mean(pg_losses))
logger.record("train/value_loss", np.mean(value_losses))
logger.record("train/approx_kl", np.mean(approx_kl_divs))
-logger.record("train/clip_fraction", np.mean(clip_fraction))
+logger.record("train/clip_fraction", np.mean(clip_fractions))
logger.record("train/loss", loss.item())
logger.record("train/explained_variance", explained_var)
if hasattr(self.policy, "log_std"):
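
Note on the one-character change above: a minimal sketch of why the singular and plural names give different log values. It assumes, as the variable names suggest but this diff does not show, that `clip_fraction` holds the value for the most recent minibatch while `clip_fractions` accumulates one value per minibatch. The ratios and `clip_range` below are made-up illustration values, and the sketch uses the standard library instead of NumPy so it stays self-contained.

```python
from statistics import fmean

clip_range = 0.2  # assumed value, for illustration only

# Hypothetical probability ratios for three minibatches (made-up numbers).
minibatch_ratios = [
    [1.0, 1.3, 0.7, 1.1],    # 2 of 4 ratios fall outside [0.8, 1.2]
    [1.5, 1.0, 1.05, 0.95],  # 1 of 4
    [1.0, 1.1, 0.9, 1.05],   # 0 of 4
]

clip_fractions = []
for ratios in minibatch_ratios:
    # Fraction of samples in this minibatch whose ratio would be clipped.
    clip_fraction = fmean([float(abs(r - 1.0) > clip_range) for r in ratios])
    clip_fractions.append(clip_fraction)

# Before the fix: the mean of a single scalar is just the last minibatch's value.
logged_before_fix = fmean([clip_fraction])

# After the fix: the mean over every minibatch seen during the update.
logged_after_fix = fmean(clip_fractions)

print(logged_before_fix, logged_after_fix)
```

With these numbers the old expression would report only the final minibatch, while the corrected one averages all three, which is presumably what the `train/clip_fraction` metric is meant to summarize.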