[feat] experimental: Add spectrain support #372

sidgoyal78 · 2021-02-08T06:07:04Z

Before submitting

Was this discussed/approved via a GitHub issue? (not needed for typos or doc improvements)
Did you read the contributor guideline?
Did you make sure to update the docs?
Did you write any new necessary tests?

What does this PR do?

Adds support for training models using SpecTrain-based asynchronous pipelining, which relies on weight prediction. Reference: https://arxiv.org/pdf/1809.02839.pdf
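
For readers unfamiliar with the technique, here is a minimal sketch of the weight-prediction idea, assuming the momentum-SGD formulation described in the referenced paper: the weights `s` versions ahead are estimated as `w - s * lr * v`, where `v` is the smoothed gradient. The function name `predict_weights` and its parameters are hypothetical and are not taken from this PR's code.

```python
from typing import List


def predict_weights(
    weights: List[float],
    velocity: List[float],
    lr: float,
    version_gap: int,
) -> List[float]:
    """Estimate the weights `version_gap` optimizer steps in the future.

    Assumption: each momentum-SGD step moves a weight by about -lr * v,
    where v is the smoothed gradient, so s steps move it by -s * lr * v.
    """
    return [w - version_gap * lr * v for w, v in zip(weights, velocity)]


# Hypothetical values, only to show the arithmetic.
weights = [1.0, -2.0]
velocity = [0.5, -0.25]
predicted = predict_weights(weights, velocity, lr=0.1, version_gap=2)
print(predicted)
```

A pipeline stage would run its forward and backward passes on the predicted weights instead of the stale ones, with `version_gap` depending on how many updates land before that stage's gradients are applied.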

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

benchmarks/benchmark_dataset.py

sidgoyal78 · 2021-02-12T00:15:14Z

@@ -0,0 +1,56 @@
+import torch


@msbaines: Should I put this file and experimental_ampnet.py inside a benchmarks/experimental/ folder?

Looks like this file (benchmark_dataset.py) was removed earlier.

Drive-by comment: as mentioned in another conversation, we refactored the fairscale benchmarks and now have common dataset loaders. We should try to use those unless they don't work for your use case.

Yeah, that's a great suggestion. I plan to refactor this, but I will first make a PR with xpipe (which depends on the dataloader from this script). Once that is merged, I will do the refactor.

benchmarks/experimental_ampnet.py

anj-s · 2021-03-02T01:31:19Z

@@ -518,6 +613,7 @@ def bench_mpi(args):
 parser.add_argument("--max-batch", type=int, default=4, help="Max number of batches")
 parser.add_argument("--socket-name", type=str, default=None, help="socket ifname for gloo/tp")
 parser.add_argument("--num-decoder-layers", type=int, default=10, help="Number of decoder layers in the model")
+parser.add_argument("--spectrain", action="store_true", default=False, help="Use spectrain based weight prediction")


Do we want to enable these benchmarks in CircleCI?

I think we can do that later.
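
For reference, a small self-contained sketch of how the flag added in this hunk parses. The two `add_argument` calls are copied from the diff above; the standalone parser and the sample argument lists are assumptions made for illustration and are not how `bench_mpi` is actually invoked.

```python
import argparse

# Standalone parser for illustration only; the real script defines more options.
parser = argparse.ArgumentParser(description="benchmark")
parser.add_argument("--max-batch", type=int, default=4, help="Max number of batches")
parser.add_argument(
    "--spectrain",
    action="store_true",
    default=False,
    help="Use spectrain based weight prediction",
)

# Without the flag, weight prediction stays off.
default_args = parser.parse_args([])

# Passing the flag turns it on; other options are unaffected.
spectrain_args = parser.parse_args(["--spectrain", "--max-batch", "8"])

print(default_args.spectrain, spectrain_args.spectrain, spectrain_args.max_batch)
```

Because the option uses `store_true`, existing invocations keep their current behavior unless `--spectrain` is passed explicitly.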

sidgoyal78 · 2021-03-08T21:43:53Z

@anj-s and @msbaines: Thanks for reviewing the PR. I addressed most of your comments, and it would be great if you could take a final look.

benchmarks/experimental/benchmark_dataset.py

anj-s · 2021-03-09T19:00:12Z

@@ -0,0 +1,58 @@
+# Copyright (c) Facebook, Inc. and its affiliates. All rights reserved.


Missing license section?

I think there's a discrepancy across many scripts. This is the same header comment that appears in most of the other scripts (see benchmarks/pipe.py, etc.). However, there's an extra license section in benchmarks/experimental/offload.py.

Let me open an issue and we can address it separately.

benchmarks/experimental/experimental_ampnet.py

anj-s · 2021-03-09T19:09:59Z

Thank you @sidgoyal78 for the PR and for making the changes! One more thing: when you end up refactoring, the model can also be reused, as in benchmarks/pipe.py.

sidgoyal78 · 2021-03-09T23:49:55Z

@anj-s: I opened issue #506 to address your point about the header/license. Maybe we can discuss it there, and I can send out a quick PR.

facebook-github-bot added the CLA Signed label Feb 8, 2021

sidgoyal78 requested a review from msbaines February 8, 2021 06:07

msbaines approved these changes Feb 9, 2021

sidgoyal78 commented Feb 12, 2021

anj-s reviewed Mar 2, 2021

sidgoyal78 added 2 commits March 8, 2021 13:33

experimental: Add spectrain support

d5600b5

Address review comments

b647a9d

sidgoyal78 force-pushed the experimental_spectrain branch from c83de91 to b647a9d March 8, 2021 21:42

anj-s reviewed Mar 9, 2021

anj-s approved these changes Mar 9, 2021

Address review comments

17662be

sidgoyal78 merged commit 5e8a642 into master Mar 10, 2021

min-xu-ai deleted the experimental_spectrain branch July 26, 2022 03:38
