How do we comprehend the factor between algBw and busBw? #235

Open
lianghao208 opened this issue Jul 20, 2024 · 5 comments

Comments

@lianghao208

AllGather, Alltoall, Gather, ReduceScatter, Scatter:

busBw = (n-1)/n * algBw

AllReduce:

busBw = 2*(n-1)/n * algBw

Broadcast, Reduce, Send/Recv:

algBw = busBw

How do we comprehend the factor between algBw and busBw?

In particular, I think the communication amount of Broadcast is the same as that of Scatter, so why are their factors different?

Also, Alltoall seems to communicate a lot more data than AllGather, so why are their factors the same?

@kiskra-nvidia
Member

You can find the explanation in https://github.com/NVIDIA/nccl-tests/blob/master/doc/PERFORMANCE.md...

In particular, regarding the difference between Broadcast and Scatter, Broadcast always needs to send out a complete buffer, whereas Scatter doesn't need to send the part destined for the root process (since that data is already there). I.e., for n == 2, Scatter needs to send out only half of the data that Broadcast needs to send.
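For reference, the factors listed in the question can be written down as a small lookup. This is only a minimal sketch: it assumes the convention that busBw is obtained by multiplying algBw by the per-collective factor, and the function names `bus_bw_factor` and `bus_bw` are hypothetical helpers, not part of nccl-tests.

```python
def bus_bw_factor(collective: str, n: int) -> float:
    """Return the assumed busBw/algBw factor for a collective on n ranks."""
    if n < 2:
        raise ValueError("need at least 2 ranks")
    name = collective.lower()
    if name == "allreduce":
        return 2 * (n - 1) / n
    if name in {"allgather", "alltoall", "gather", "reducescatter", "scatter"}:
        return (n - 1) / n
    if name in {"broadcast", "reduce", "sendrecv"}:
        return 1.0
    raise ValueError(f"unknown collective: {collective}")


def bus_bw(alg_bw: float, collective: str, n: int) -> float:
    """Convert an algorithm bandwidth into a bus bandwidth (same units)."""
    return alg_bw * bus_bw_factor(collective, n)


if __name__ == "__main__":
    for op in ("AllReduce", "AllGather", "Scatter", "Broadcast"):
        print(op, bus_bw(100.0, op, 8))
```

With n == 2, the AllReduce factor is 1 and the Scatter factor is 1/2, while Broadcast stays at 1 for any n.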

@lianghao208
Author

In particular, regarding the difference between Broadcast and Scatter, Broadcast always needs to send out a complete buffer, whereas Scatter doesn't need to send the part destined for the root process (since that data is already there). I.e., for n == 2, Scatter needs to send out only half of the data that Broadcast needs to send.

@kiskra-nvidia Thanks for the link, it helps. But I am still confused about the difference between Broadcast and Scatter. For Broadcast, do you mean the root process (which already has the complete buffer) still needs to send a complete buffer to itself, whereas Scatter doesn't? Since the root process already has the complete data, shouldn't the number of transfers be n-1 as well?

@kiskra-nvidia
Member

Perhaps we misunderstood each other. I was answering your question about the communication amount, which I understood to be a question about the volume of data. Broadcast needs to send a complete buffer S to n-1 destinations. Scatter needs to send to n-1 destinations as well, but for each destination it needs to send just 1/n-th of the buffer S. So it's the same in terms of the number of messages but not in terms of the volume of data.
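The volume comparison can be checked with a few lines of arithmetic. This is a rough sketch under a simplifying assumption: the root sends directly to each of the n-1 destinations, which is a naive model and not necessarily how NCCL's ring or tree algorithms actually move the data. The function names are hypothetical.

```python
def broadcast_root_bytes(size: int, n: int) -> int:
    """Bytes the root sends if it ships the full buffer to n-1 destinations."""
    return size * (n - 1)


def scatter_root_bytes(size: int, n: int) -> float:
    """Bytes the root sends if each of n-1 destinations gets 1/n of the buffer."""
    return size * (n - 1) / n


if __name__ == "__main__":
    size = 1024  # buffer size S in bytes
    for n in (2, 4, 8):
        print(n, broadcast_root_bytes(size, n), scatter_root_bytes(size, n))
```

Both functions count n-1 messages, but the Scatter total is smaller by a factor of n, which matches the n == 2 case where Scatter sends half as much as Broadcast.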

@lianghao208
Author

@kiskra-nvidia
So if the volume of data sent in Broadcast is S*(n-1), then the volume of data sent in Scatter will be S*(n-1)/n.
If I understand correctly, in Broadcast, the conversion between algBw and busBw would then be:

busBw = (n-1) * algBw

instead of

algBw = busBw

Do you know what I am missing?

@LJjia

LJjia commented Aug 6, 2024

I have the same question.
