
Port TorchRL tutorial for DDPG Loss to pytorch.org #2413

Merged
merged 27 commits into pytorch:main on Jun 13, 2023

Conversation

@nairbv (Contributor) commented Jun 2, 2023

Port existing tutorial from torchrl to pytorch.org.

Fixes #2351

Description

Copy of https://pytorch.org/rl/tutorials/coding_ddpg.html
The tutorial uses half_cheetah, so I used that for the image in the index.

cc @vmoens

Checklist

  • The issue that is being fixed is referred to in the description (see above "Fixes #ISSUE_NUMBER")
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request.

@facebook-github-bot (Contributor) commented:

Hi @nairbv!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@github-actions bot added the rl (Issues related to reinforcement learning, tutorial, DQN, and so on), docathon-h1-2023 (A label for the docathon in H1 2023), and medium labels on Jun 2, 2023
@netlify bot commented Jun 2, 2023

Deploy Preview for pytorch-tutorials-preview ready!

🔨 Latest commit: f4a5e4b
🔍 Latest deploy log: https://app.netlify.com/sites/pytorch-tutorials-preview/deploys/64888d7b2273f50008c7f833
😎 Deploy Preview: https://deploy-preview-2413--pytorch-tutorials-preview.netlify.app

@svekars (Contributor) left a comment: A few editorial suggestions

# environment and reset it when required.
# Data collectors are designed to help developers have a tight control
# on the number of frames per batch of data, on the (a)sync nature of this
# collection and on the resources allocated to the data collection (e.g. GPU,
Suggested change:
- # collection and on the resources allocated to the data collection (e.g. GPU,
+ # collection and on the resources allocated to the data collection (for example, GPU,

# Data collectors are designed to help developers have a tight control
# on the number of frames per batch of data, on the (a)sync nature of this
# collection and on the resources allocated to the data collection (e.g. GPU,
# number of workers etc).
Suggested change:
- # number of workers etc).
+ # number of workers, and so on).

# - the policy,
# - the total number of frames before the collector is considered empty,
# - the maximum number of frames per trajectory (useful for non-terminating
# environments, like dm_control ones).
Suggested change:
- # environments, like dm_control ones).
+ # environments, like the ``dm_control`` ones).

#
# .. note::
# As already mentioned above, to get a more reasonable performance,
# use a greater value for ``total_frames`` e.g. 1M.
Suggested change:
- # use a greater value for ``total_frames`` e.g. 1M.
+ # use a greater value for ``total_frames`` for example, 1M.
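
To make the collector arguments discussed in the suggestions above concrete, here is a small, hedged sketch (not part of the PR) of how they map onto a TorchRL collector. It assumes Gym's Pendulum-v1 is installed locally; in the tutorial itself the environment factory and the DDPG policy built earlier are used instead:

# Minimal sketch of the collector arguments discussed above (illustrative values).
from torchrl.collectors import SyncDataCollector
from torchrl.envs.libs.gym import GymEnv

collector = SyncDataCollector(
    create_env_fn=lambda: GymEnv("Pendulum-v1"),  # stand-in for the tutorial's env factory
    policy=None,               # None -> random policy; the tutorial passes the DDPG actor
    frames_per_batch=200,      # number of frames delivered in each batch
    total_frames=1_000,        # collector is considered empty after this many frames
    max_frames_per_traj=200,   # truncate trajectories in non-terminating environments
)
for batch in collector:        # each ``batch`` is a TensorDict of transitions
    print(batch.shape)
collector.shutdown()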

# Conclusion
# ----------
#
# In this tutorial, we have learnt how to code a loss module in TorchRL given
Suggested change:
- # In this tutorial, we have learnt how to code a loss module in TorchRL given
+ # In this tutorial, we have learned how to code a loss module in TorchRL given
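
As a complement to the conclusion above, here is a condensed, hedged sketch of the workflow it summarizes, using TorchRL's built-in DDPGLoss as a stand-in for the loss module coded in the tutorial; ``actor``, ``value_net``, and ``sampled_data`` are placeholders for the networks and replay-buffer sample built earlier:

# Hedged sketch: loss module, target-network updater, and optimizer wiring.
from torch.optim import Adam
from torchrl.objectives import DDPGLoss, SoftUpdate

loss_module = DDPGLoss(actor_network=actor, value_network=value_net)
target_updater = SoftUpdate(loss_module, eps=0.999)  # soft target-network updates
optimizer = Adam(loss_module.parameters(), lr=1e-3)

loss_vals = loss_module(sampled_data)                # TensorDict of loss terms
loss = loss_vals["loss_actor"] + loss_vals["loss_value"]
loss.backward()
optimizer.step()
optimizer.zero_grad()
target_updater.step()   # older TorchRL versions may require calling init_() first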


@svekars svekars requested a review from vmoens June 2, 2023 21:51
@svekars (Contributor) commented Jun 5, 2023

@nairbv the PR looks great overall! Can you please address the editorial suggestions, sign the CLA, and fix the spellcheck errors?

Port existing tutorial from torchrl to pytorch.org.

pytorch#2351
@svekars (Contributor) commented Jun 5, 2023

Please rebase to run against the latest GH workflow.

@nairbv (Contributor, Author) commented Jun 5, 2023

Did a rebase and made some of the stylistic changes; still waiting internally to be added to the CLA.

@nairbv (Contributor, Author) commented Jun 5, 2023

The pyspelling spell checker seems to have an issue with acronyms like DDPG, common terms like cuda, and variable names mentioned in code comments. What's the recommended way of addressing these kinds of CI errors?

nairbv and others added 4 commits June 5, 2023 16:36
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
Co-authored-by: Svetlana Karslioglu <svekars@fb.com>
@svekars (Contributor) commented Jun 5, 2023

@nairbv Check the https://github.com/pytorch/tutorials/blob/main/en-wordlist.txt for the spelling of CUDA and some other words, and for the new acronyms you can propose an update to that file. All code references (names of functions, classes, API methods, etc.) should be enclosed in double quotes.

@nairbv (Contributor, Author) commented Jun 6, 2023

> you can propose an update to that file

hmm.... that works for terms like DDPG I guess but...

> All code references (name of functions, classes, API methods, etc) should be enclosed in double quotes.

This seems unconventional for single-line code comments. For example, for ``env`` on lines 502 and 860, putting ticks around it won't cause it to be rendered the way it would be in the text of the tutorial, and that isn't how short code comments are typically written within code. It also wouldn't make sense to add variable names from a particular code block to a dictionary file.

In cases like the env example, are double ticks really the right solution? Is there any other option?

@svekars (Contributor) commented Jun 6, 2023

It is a hack, yes. But Pyspelling can't differentiate between a one-line comment and a multiline comment.
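
For illustration, the workaround looks roughly like this in the tutorial source (the comment text here is made up; only the double-backtick wrapping matters):

# Flagged by pyspelling when the identifier is written bare:
#     # reset the env before collecting a new trajectory
# Passes the check when wrapped in double backticks (the markup renders as
# inline code in the prose blocks, but stays literal inside code comments):
#     # reset the ``env`` before collecting a new trajectory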

@vmoens (Contributor) left a comment: Happy with this. See my comment about next steps.

# loss component;
# - How to use (or not) a target network, and how to update its parameters;
# - How to create an optimizer associated with a loss module.
#
vmoens (Contributor):
Some ideas of next steps:

nairbv (Contributor, Author):

Is this comment to document potential future tutorial updates, or are you suggesting we should add a "next steps" section to the content of the tutorial?

vmoens (Contributor):

We should add a "Next steps" section or something like that.
The point is that most losses have more features than this "toy example" and it could be weird to read this tutorial from top to bottom without mentioning some nice features such as customization of the tensordict keys or using the loss without tensordict at all.
For now we don't have a tutorial about these features, but I would mention that these things are possible.
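
For reference, a hedged sketch of the key-customization feature mentioned here, assuming the flexible-keys interface proposed in pytorch/rl#1175 and using the built-in DDPGLoss; ``actor`` and ``value_net`` are placeholders and the key names are illustrative:

from torchrl.objectives import DDPGLoss

loss_module = DDPGLoss(actor_network=actor, value_network=value_net)
# Remap the entries the loss reads from the input tensordict:
loss_module.set_keys(reward="my_reward", done="my_done")
# The loss can also be called without a tensordict, passing tensors as keyword
# arguments, but the exact signature depends on the TorchRL version.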

nairbv (Contributor, Author):

Added, but not sure if that's exactly how you wanted it. Your link above for flexible keys just goes to a long list of issues; you might have meant to paste something else?

vmoens (Contributor):

should have been pytorch/rl#1175


@facebook-github-bot (Contributor) commented:
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@vmoens (Contributor) commented Jun 9, 2023

Regarding https://github.com/pytorch/tutorials/actions/runs/5214703804/jobs/9414299217?pr=2413#step:8:14384
Unfortunately I can't reproduce the error locally; I will need to dig a bit deeper to make sense of it...


@vmoens (Contributor) left a comment: This should fix our bug.

Comment on lines 859 to 861
create_env_fn=[
    parallel_env,
]
vmoens (Contributor):
I would suggest not using a parallel env here, but just the async collector + a regular env.
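
A hedged sketch of that suggestion (an asynchronous multi-worker collector over plain, non-parallel environments); ``env_constructor``, ``transform_state_dict``, and ``policy`` stand in for objects defined in the tutorial, and the frame counts are illustrative:

from torchrl.collectors import MultiaSyncDataCollector

env_creator = env_constructor(transform_state_dict)   # EnvCreator for a single regular env
collector = MultiaSyncDataCollector(
    create_env_fn=[env_creator, env_creator],          # one plain env per worker
    policy=policy,
    frames_per_batch=1000,
    total_frames=10_000,
)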

Co-authored-by: Vincent Moens <vincentmoens@gmail.com>
Comment on lines 583 to 593
-def parallel_env_constructor(
-    env_per_collector,
+def env_constructor(
     transform_state_dict,
 ):
-    if env_per_collector == 1:
-
-        def make_t_env():
-            env = make_transformed_env(make_env())
-            env.transform[2].init_stats(3)
-            env.transform[2].loc.copy_(transform_state_dict["loc"])
-            env.transform[2].scale.copy_(transform_state_dict["scale"])
-            return env
-
-        env_creator = EnvCreator(make_t_env)
-        return env_creator
-
-    parallel_env = ParallelEnv(
-        num_workers=env_per_collector,
-        create_env_fn=EnvCreator(lambda: make_env()),
-        create_env_kwargs=None,
-        pin_memory=False,
-    )
-    env = make_transformed_env(parallel_env)
-    # we call `init_stats` for a limited number of steps, just to instantiate
-    # the lazy buffers.
-    env.transform[2].init_stats(3, cat_dim=1, reduce_dim=[0, 1])
-    env.transform[2].load_state_dict(transform_state_dict)
-    return env
+    def make_t_env():
+        env = make_transformed_env(make_env())
+        env.transform[2].init_stats(3)
+        env.transform[2].loc.copy_(transform_state_dict["loc"])
+        env.transform[2].scale.copy_(transform_state_dict["scale"])
+        return env
+
+    env_creator = EnvCreator(make_t_env)
+    return env_creator
vmoens (Contributor) commented Jun 12, 2023:
@svekars we can revert this and use the old parallel_env_constructor

Contributor:

def parallel_env_constructor(
    env_per_collector,
    transform_state_dict,
):
    if env_per_collector == 1:

        def make_t_env():
            env = make_transformed_env(make_env())
            env.transform[2].init_stats(3)
            env.transform[2].loc.copy_(transform_state_dict["loc"])
            env.transform[2].scale.copy_(transform_state_dict["scale"])
            return env

        env_creator = EnvCreator(make_t_env)
        return env_creator

    parallel_env = ParallelEnv(
        num_workers=env_per_collector,
        create_env_fn=EnvCreator(lambda: make_env()),
        create_env_kwargs=None,
        pin_memory=False,
    )
    env = make_transformed_env(parallel_env)
    # we call `init_stats` for a limited number of steps, just to instantiate
    # the lazy buffers.
    env.transform[2].init_stats(3, cat_dim=1, reduce_dim=[0, 1])
    env.transform[2].load_state_dict(transform_state_dict)
    return env
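
For context, a hedged sketch of how the constructor above might be wired into a collector; ``transform_state_dict`` and ``policy`` come from the tutorial, and the frame counts are illustrative:

from torchrl.collectors import SyncDataCollector

env_creator = parallel_env_constructor(
    env_per_collector=1,          # returns an EnvCreator wrapping a single env
    transform_state_dict=transform_state_dict,
)
collector = SyncDataCollector(
    create_env_fn=env_creator,
    policy=policy,
    frames_per_batch=1000,
    total_frames=10_000,
)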

@svekars (Contributor) left a comment:
LGTM in general. Need to make sure to add the image from https://github.com/pytorch/rl/blob/main/docs/source/_static/img/replaybuffer_traj.png. This also feels like an advanced tutorial rather than an intermediate one, so we might want to move it to advanced_source. We can do that in a follow-up PR.

@svekars merged commit b7c93f4 into pytorch:main on Jun 13, 2023
12 checks passed
Labels
cla signed · docathon-h1-2023 (A label for the docathon in H1 2023) · medium · rl (Issues related to reinforcement learning, tutorial, DQN, and so on)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

💡 [REQUEST] - Port TorchRL "Coding a DDPG loss" from pytorch.org/rl to pytorch.org/tutorials
4 participants