Support SDXL and its distributed inference #1514
Conversation
@Zars19 thanks for the contribution to TensorRT-LLM! @nv-guomingz can you help take care of this? :) Thanks
Sure, I'll collaborate with @Zars19 on enabling SDXL with TRT-LLM.
Hi @Zars19, could you please resolve the code conflicts first?
I have resolved the conflict :) @nv-guomingz
Hi @Zars19 thanks for your patience. |
@nv-guomingz I completed the git rebase |
Any updates on the code review? I haven't received feedback for a while since rebasing the code.
The idea of patch parallelism comes from the CVPR 2024 paper DistriFusion. To reduce implementation complexity, all communication in the example is synchronous.
This helps SDXL achieve better performance, especially at very high resolutions.
Benchmark setup: A100, 50 steps, 2048x2048, SDXL
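The synchronous patch-parallel scheme described above can be sketched roughly as follows. This is a minimal single-process simulation, not the PR's actual implementation: the latent is split along the height dimension, each simulated rank denoises its own patch, and a synchronous all-gather-style concatenation reassembles the full latent before the next step. The `denoise` function is a hypothetical stand-in for one SDXL UNet step.

```python
import numpy as np

def run_patch_parallel_step(latent, denoise_fn, world_size):
    # Split the latent along the height dimension: one patch per "GPU" (rank).
    # In a real distributed setup each rank would hold only its own patch;
    # here all ranks are simulated in a single process.
    patches = np.array_split(latent, world_size, axis=0)
    # Each rank denoises its patch independently.
    outputs = [denoise_fn(p) for p in patches]
    # Synchronous communication: an all-gather reassembles the full latent
    # on every rank before the next step (no async overlap).
    return np.concatenate(outputs, axis=0)

# Toy "denoiser" standing in for one SDXL UNet step (hypothetical).
denoise = lambda x: x * 0.5

latent = np.ones((8, 8))
out = run_patch_parallel_step(latent, denoise, world_size=4)
```

In the real DistriFusion design, patch boundaries need neighboring activations, which the paper handles with stale activations from the previous step; a fully synchronous variant like this one simply trades that overlap away for simplicity.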