
Sequence parallel strategy support. #819

Merged: 12 commits into PaddlePaddle:develop on Oct 18, 2022

Conversation

@GhostScreaming (Contributor) commented Oct 9, 2022

Add sequence parallel strategy support for the GPT pipeline parallel model. (A minimal sketch of the sequence-parallel layout follows the checklist below.)

  1. Loss curve matches its peers (mp4_pp2 and mp8).
     [loss curve screenshots: loss (1), loss_all]
  2. Both forward and backward outputs have been aligned with the peer (mp2_pp2) in the first step.
  3. The function _is_valid_send_recv_partial() in paddle/distributed/fleet/meta_parallel/pp_utils/p2p_communication.py needs to be modified; a corresponding PR to the Paddle repo will be submitted.
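
Below is a minimal NumPy sketch of the sequence-parallel idea this PR adds; the shapes, the split/gather steps, and all names are illustrative assumptions, not Paddle's actual API.

```python
# Illustrative sequence-parallel layout (assumed shapes, not Paddle's API).
# Activations of shape [seq_len, batch, hidden] are split along the sequence
# dimension across model-parallel ranks, so per-rank activation memory for
# ops like LayerNorm and dropout drops to 1/mp_degree; an all-gather restores
# the full sequence where attention needs it.
import numpy as np

mp_degree = 4                       # assumed model-parallel world size
seq_len, batch, hidden = 16, 2, 8   # toy sizes; seq_len % mp_degree == 0

full = np.random.rand(seq_len, batch, hidden).astype(np.float32)

# "Scatter": each rank keeps only its sequence shard.
shards = np.split(full, mp_degree, axis=0)

# Each rank applies its local op to its shard (stand-in elementwise op).
partial = [shard * 2.0 for shard in shards]

# "All-gather": shards are concatenated back along the sequence dimension
# before the attention block, which needs the whole sequence.
gathered = np.concatenate(partial, axis=0)
assert gathered.shape == full.shape
```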

1. Add sequence parallel strategy for GPTModelHybrid
2. Output has been checked layer by layer in both the forward
   and backward passes, and the loss curve over the first
   5000 steps matches the peer
3. Performance improves by about 10% with the sequence_parallel
   strategy compared with pretrain_gpt_1.3B_mp8
1. Add sequence_parallel option for GPTModel
2. When mp=1, the sequence_parallel option should
   always be set to False (see the guard sketch below)
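
A minimal sketch of that constraint as a config guard; the function name and warning message are hypothetical, not code from this PR.

```python
# Hypothetical config guard for the rule above: with mp_degree == 1 there is
# no model-parallel group to shard the sequence across, so sequence_parallel
# must stay False. Names are illustrative only.
def resolve_sequence_parallel(mp_degree: int, sequence_parallel: bool) -> bool:
    if mp_degree == 1 and sequence_parallel:
        print("Warning: sequence_parallel requires mp_degree > 1; disabling.")
        return False
    return sequence_parallel

assert resolve_sequence_parallel(1, True) is False   # forced off when mp=1
assert resolve_sequence_parallel(8, True) is True    # honored when mp>1
```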
@ForFishes (Member) left a comment


LGTM

@ForFishes ForFishes merged commit fa4cd96 into PaddlePaddle:develop Oct 18, 2022