Refactor common #540

Miffyli · 2019-11-04T07:09:19Z

Refactoring common to include most shared code between learning algorithms.

Description

Move (almost) all shared code out of algorithm-specific directories into common, e.g.:

a2c/utils.py defined most of the TF layers and utilities. Now in common/tf_layers.py
ReplayBuffer was defined under DQN but also used by DDPG. Moved to common/replay_buffer.py.
Some math utils like safe_mean moved from under PPO2 to common/math_util.py

Also removed unused code (in common/tf_util.py, and calc_entropy_softmax).

PPO1/TRPO still share code between themselves (defined in PPO1). This should also be moved into common, but at the same time they are the only ones using this very specific type of code (trajectory generation).

Things to-do and discuss:

Unifying some of the code. E.g. there are now two different types of parameter schedulers in common/schedules.py.

Motivation and Context

closes #503

While there is a common directory for shared code between learning algorithms, a bunch of shared code lived under individual learning algorithms and was imported from there by other code (e.g. half of the TF definitions were under a2c/utils.py). As discussed in #503, it would make more sense to have learning algorithms import shared code from common, and common should not depend on per-algorithm definitions.

This also aims to keep all backend (TensorFlow) related code in a few files, now contained in tf_util.py and tf_layers.py under common. This should help a tiny bit when transitioning to other backends.

Types of changes

Refactoring (no changes to documentation)

Checklist:

I've read the CONTRIBUTION guide (required)
I have updated the changelog accordingly (required).
I have updated the documentation accordingly.

araffin · 2019-11-04T08:10:48Z

Good that you started the refactoring =)
I will try to give you some feedback this week and, at the same time, open a PR for tf2 to have more visibility on the progress.

Miffyli · 2019-11-09T09:36:53Z

Given that this is mostly related to (and motivated by) the transition to a new backend, I think we should merge this with the TF2 branch (#542) whenever it is ready for merging.

araffin · 2019-11-09T09:39:41Z

I would merge it with both ;) (I should have the time today to look at this one)

araffin

Overall, LGTM ;) (documentation is missing but you know it)
I don't know if we should be backward compatible when merging that with master (as we did for the action noise for instance)
This compatibility is obviously not required for tf2 branch.
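For reference, backward compatibility after a move like this is often handled with a thin alias left at the old location. The following is only a minimal sketch of that general pattern, not what this PR or the action-noise change actually does: `deprecated_alias` and `_new_safe_mean` are hypothetical names, and the path strings are illustrative.

```python
import functools
import warnings


def _new_safe_mean(values):
    """Stand-in for a helper at its new home (hypothetical body)."""
    return float("nan") if len(values) == 0 else sum(values) / len(values)


def deprecated_alias(new_func, old_path, new_path):
    """Wrap new_func so calls through the old name still work but warn."""
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_path} is deprecated, use {new_path} instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return new_func(*args, **kwargs)
    return wrapper


# What the old module would keep exporting after the move (illustrative paths).
safe_mean = deprecated_alias(
    _new_safe_mean, "safe_mean under ppo2", "common/math_util.py"
)
```

Old call sites keep working and emit a `DeprecationWarning`, which gives users a release or two to update their imports before the alias is removed.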

stable_baselines/a2c/utils.py

stable_baselines/common/schedules.py

araffin · 2019-11-09T10:51:11Z

stable_baselines/common/tf_util.py

+# Logging
+# ================================================================
+
+def total_episode_reward_logger(rew_acc, rewards, masks, writer, steps):


Note for myself: we should fix this one, it has not been working properly when n_envs > 0 for a while now...

Miffyli · 2019-11-20T14:03:53Z

Regarding this PR: I am starting to have conflicted feelings about moving things to common. On one hand, it makes intuitive sense to keep all shared code in common so that algorithms do not depend on each other, but on the other hand some of the current implementations are very algorithm-specific (e.g. only used by PPO1 and TRPO). Doing this PR would end up with code in common which in reality is tightly tied to those specific algorithms.

Should we stay loyal to these guidelines (algorithms import common, common does not import algorithms), or should we figure out a better refactoring while doing this? Some refactoring was already discussed for Schedulers.
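The guideline above (algorithms import common, common does not import algorithms) could in principle be checked mechanically. Below is a rough sketch of such a check using only the standard library; `algorithm_imports` and the `ALGO_PACKAGES` list are assumptions made for illustration and are not part of this PR.

```python
import ast

# Assumed list of algorithm package names, for illustration only.
ALGO_PACKAGES = {"a2c", "acer", "ddpg", "deepq", "ppo1", "ppo2", "sac", "td3", "trpo_mpi"}


def algorithm_imports(source, root="stable_baselines", algos=ALGO_PACKAGES):
    """Return the sorted algorithm packages that `source` imports from.

    Only absolute imports are inspected; relative imports are skipped.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Also cover `from stable_baselines import a2c`.
            modules = [node.module]
            modules += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for module in modules:
            parts = module.split(".")
            if len(parts) >= 2 and parts[0] == root and parts[1] in algos:
                found.add(parts[1])
    return sorted(found)
```

Running this over every file under common and asserting an empty result would flag a regression; running it over an algorithm package and ignoring that package's own name would flag cross-algorithm imports.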

araffin

I understand your problem, so what I would do:

Put as much as you can (and think is reasonable) in common; these should be fairly general helpers/classes.
Put the helpers that are really algorithm-specific into a utils.py next to the algorithm (that's already the case for A2C and TRPO).

stable_baselines/deepq/__init__.py

Miffyli · 2020-02-16T11:40:22Z

Alright, now everything should be done. The only remaining "oddball" is GAIL importing from TRPO, but since these two are so intimately linked I decided to leave them as they are (and these algos should go to Adam's repository, as I understood).

@araffin
If good for merging, we can first merge e.g. #644 and I can sort out the merge conflicts and whatnot before merging this one :)

Miffyli · 2020-02-27T20:00:40Z

Merged #644 and other PRs. Everything is ready for review from my end.

araffin · 2020-02-27T20:33:09Z

docs/misc/changelog.rst

+
+   - Algorithms no longer import from each other.
+   - `common` no longer import from algorithms.
+   - Moved shared code to new files `common/math_util.py`, `common/buffers.py` and `common/tf_layers.py`.


Those are breaking changes, no?

Maybe it would be good to list which functions were moved and where.

Right, good point. Somehow I completely missed that ^^. I will document this.

Updated changelog with such a list.

araffin

LGTM =)

Miffyli added 15 commits October 27, 2019 13:53

Move tensorflow layer definitions to a new file

c2bbe5f

Move Scheduler from A2C utils to common schedules file

a965d6d

Add missing definitions for legacy Scheduler

3145d2c

Move tensorflow-related utilities tf_utils

80ecc97

Move total_episode_reward_logger to tf_util

99942a9

Move get_by_index to ACER codes (only used by ACER)

6f781d3

Move EpisodeStats to ACER codes (only used by ACER)

f3afa3a

Finish refactoring a2c/utils.py

4a2eeeb

Refactor ppo2.py (move shared code elsewhere)

be3eb9a

Refactor replay buffer (mode out from deepq to commons)

e309589

Remove shared function from SAC and TD3 (get_vars)

b4cee99

Remove unused code

f2a0b61

Move flatten_lists to common file

a8c38fd

Fix imports in tests

b8ba9c5

Add missing import to ACER

7990f18

Merge branch 'master' into refactor/common

aed2578

araffin reviewed Nov 9, 2019

araffin reviewed Nov 20, 2019

araffin added this to the v3.0.0 milestone Nov 20, 2019

araffin added the v3 Discussion about V3 label Nov 23, 2019

araffin mentioned this pull request Nov 23, 2019

V3.0 implementation design #576

Closed

Miffyli added 5 commits January 3, 2020 15:36

merge master

9910714

Merge branch 'master' into refactor/common

e9169f0

Fix ACER dtype error

784235b

Rename replay_buffer -> buffers

3a11de4

Remove unused import

a478dc1

araffin mentioned this pull request Feb 5, 2020

Code Cleanup #680

Open

Miffyli added 6 commits February 16, 2020 12:28

Merge branch 'master' into refactor-common

e6a1e32

Fix import in a test

158f883

Move orphan method to more social circles

012201c

Move PPO1/TRPO seg_gen to commons

71c030e

Update changelog

daf72ee

Move SAC/TD3 policy code under more suitable tf_layers

9c0426a

Miffyli marked this pull request as ready for review February 16, 2020 11:38

Miffyli requested a review from araffin February 16, 2020 11:40

Miffyli added 3 commits February 27, 2020 18:16

Merge master

67c952e

Merge branch 'master' into refactor/common

31fbbe2

Update to new traj_seg_gen

6632dad

araffin reviewed Feb 27, 2020

Add list of what was moved where

73f4a1f

araffin approved these changes Feb 28, 2020

araffin merged commit a4efff0 into master Feb 28, 2020

araffin deleted the refactor/common branch February 28, 2020 19:24
