feat: support process group #228

Merged 21 commits on Oct 15, 2021

wangraying (Member) commented Sep 27, 2021

BREAKING CHANGE:

  • model.with_bagua(...) can now specify process groups

todo bot commented Sep 27, 2021

remove the dependency on torch process group

# TODO remove the dependency on torch process group
if not dist.is_initialized():
    torch.distributed.init_process_group(
        backend="nccl",
        store=_default_store,
        rank=get_rank(),


This comment was generated by todo based on a TODO comment in be66fbc in #228. cc @BaguaSys.

pr-triage bot removed the PR: draft label on Sep 27, 2021

todo bot commented Sep 27, 2021

combine **inplace API

# TODO combine **inplace API
def alltoall_inplace(
    tensor: torch.Tensor,
    comm=None,
):
    """The in-place version of :func:`alltoall`."""


This comment was generated by todo based on a TODO comment in bc126fe in #228. cc @BaguaSys.

wangraying mentioned this pull request on Sep 28, 2021

NOBLES5E (Contributor) left a comment

see comments

Comment on lines 97 to 98
@unittest.skip("fixme")
# @skip_if_cuda_not_available()

[blackfmt] reported by reviewdog 🐶

Suggested change
@unittest.skip("fixme")
# @skip_if_cuda_not_available()
@unittest.skip("fixme")
# @skip_if_cuda_not_available()

@@ -94,7 +94,8 @@ def run_bagua_broad(rank, nprocs, bagua_params, envs, opt_class, opt_hyper_param


 class Test_Broadcast_Module(unittest.TestCase):
-    @skip_if_cuda_not_available()
+    @unittest.skip("fixme")
+    # @skip_if_cuda_not_available()

[blackfmt] reported by reviewdog 🐶

Suggested change
# @skip_if_cuda_not_available()
# @skip_if_cuda_not_available()

NOBLES5E changed the title from "feat: add support for process group" to "feat: support process group" on Oct 15, 2021
NOBLES5E merged commit cd499b8 into master on Oct 15, 2021
NOBLES5E deleted the process_group branch on October 15, 2021 at 04:39
pr-triage bot added the PR: merged label on Oct 15, 2021