Match "mul + where" as the pointwise operation before softmax in attention fusion #3381

Merged (13 commits) on Aug 20, 2024

@umangyadav umangyadav commented Aug 16, 2024

Fixes #2812
Depends on ROCm/rocMLIR#1615

These changes can be removed once we have general machinery for matching against fuse_reduce.

Seeing a 10% boost on DistilGPT2 with these changes.
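For readers unfamiliar with the pattern named in the title, here is a minimal pure-Python sketch of an attention block where a `mul` (scaling) and a `where` (masking) sit between the first dot product and the softmax. It is not MIGraphX or rocMLIR code. The function names, the `fill` value, and the causal mask are assumptions made for illustration only.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k_t, v, scale, mask, fill=-1e9):
    """dot -> mul -> where -> softmax -> dot (hypothetical reference version)."""
    scores = matmul(q, k_t)                                   # dot
    scaled = [[s * scale for s in row] for row in scores]     # mul
    masked = [[s if keep else fill for s, keep in zip(row, mrow)]
              for row, mrow in zip(scaled, mask)]             # where
    probs = [softmax(row) for row in masked]                  # softmax
    return matmul(probs, v)                                   # dot

# Tiny example with a causal mask: row 0 may only attend to position 0.
q = [[1.0, 0.0], [0.0, 1.0]]
k_t = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
mask = [[True, False], [True, True]]
out = attention(q, k_t, v, scale=1.0 / math.sqrt(2.0), mask=mask)
print(out)
```

In this sketch the first output row equals the first row of `v`, because the masked position receives zero probability after the softmax. A fused kernel would compute the same result without writing the intermediate `scaled` and `masked` tensors to memory.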

@shivadbhavsar shivadbhavsar left a comment

Approving, but the MLIR SHA change should probably not be part of this PR. I think we should wait until the updated SHA in develop includes this MLIR commit before merging.

@migraphx-bot

| Test | Batch | Rate new (d872ea) | Rate old (01c94f) | Diff | Compare |
|---|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,232.01 | 3,236.49 | -0.14% | |
| torchvision-resnet50_fp16 | 64 | 7,006.13 | 6,885.59 | 1.75% | |
| torchvision-densenet121 | 32 | 2,374.45 | 2,430.36 | -2.30% | |
| torchvision-densenet121_fp16 | 32 | 4,009.18 | 4,079.72 | -1.73% | |
| torchvision-inceptionv3 | 32 | 1,638.26 | 1,634.52 | 0.23% | |
| torchvision-inceptionv3_fp16 | 32 | 2,730.66 | 2,737.62 | -0.25% | |
| cadene-inceptionv4 | 16 | 770.23 | 770.69 | -0.06% | |
| cadene-resnext64x4 | 16 | 805.79 | 807.28 | -0.18% | |
| slim-mobilenet | 64 | 7,457.44 | 7,438.46 | 0.26% | |
| slim-nasnetalarge | 64 | 208.26 | 207.42 | 0.41% | |
| slim-resnet50v2 | 64 | 3,341.91 | 3,340.22 | 0.05% | |
| bert-mrpc-onnx | 8 | 1,151.55 | 1,149.04 | 0.22% | |
| bert-mrpc-tf | 1 | 309.78 | 311.11 | -0.43% | |
| pytorch-examples-wlang-gru | 1 | 427.23 | 431.85 | -1.07% | |
| pytorch-examples-wlang-lstm | 1 | 384.36 | 386.06 | -0.44% | |
| torchvision-resnet50_1 | 1 | 809.76 | 801.13 | 1.08% | |
| cadene-dpn92_1 | 1 | 432.41 | 399.05 | 8.36% | 🔆 |
| cadene-resnext101_1 | 1 | 379.33 | 376.62 | 0.72% | |
| onnx-taau-downsample | 1 | 344.62 | 344.52 | 0.03% | |
| dlrm-criteoterabyte | 1 | 35.06 | 35.06 | -0.01% | |
| dlrm-criteoterabyte_fp16 | 1 | 57.83 | 57.35 | 0.84% | |
| agentmodel | 1 | 7,903.33 | 7,988.11 | -1.06% | |
| unet_fp16 | 2 | 57.90 | 57.75 | 0.26% | |
| resnet50v1_fp16 | 1 | 934.44 | 932.17 | 0.24% | |
| resnet50v1_int8 | 1 | 971.30 | 947.12 | 2.55% | |
| bert_base_cased_fp16 | 64 | 1,149.54 | 1,141.18 | 0.73% | |
| bert_large_uncased_fp16 | 32 | 355.58 | 351.92 | 1.04% | |
| bert_large_fp16 | 1 | 212.13 | 208.10 | 1.94% | |
| distilgpt2_fp16 | 16 | 2,157.91 | 2,154.47 | 0.16% | |
| yolov5s | 1 | 508.45 | 504.76 | 0.73% | |
| tinyllama | 1 | 43.43 | 43.37 | 0.13% | |
| vicuna-fastchat | 1 | 168.73 | 178.11 | -5.27% | 🔴 |
| whisper-tiny-encoder | 1 | 407.74 | 410.98 | -0.79% | |
| whisper-tiny-decoder | 1 | 433.28 | 421.85 | 2.71% | |

This build is not recommended to merge 🔴
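The Diff column appears to be the relative change of the new rate against the old rate. The bot's exact formula is not stated in this thread, so the sketch below is an assumption that happens to agree with the two flagged rows. The function name `percent_diff` is hypothetical.

```python
def percent_diff(rate_new, rate_old):
    """Relative change of the new rate against the old one, in percent."""
    return (rate_new - rate_old) / rate_old * 100.0

# The two rows flagged in the Compare column (rates copied from the report).
rows = {
    "cadene-dpn92_1": (432.41, 399.05),
    "vicuna-fastchat": (168.73, 178.11),
}
for name, (new, old) in rows.items():
    print(f"{name}: {percent_diff(new, old):+.2f}%")
```

Rounded to two decimals, this gives 8.36% for cadene-dpn92_1 and -5.27% for vicuna-fastchat, matching the reported values.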

@migraphx-bot


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
✅ bert_large: PASSED: MIGraphX meets tolerance
✅ yolov5s: PASSED: MIGraphX meets tolerance
✅ tinyllama: PASSED: MIGraphX meets tolerance
✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

@causten causten merged commit 016be6e into develop Aug 20, 2024
44 of 46 checks passed
@causten causten deleted the fuse_where branch August 20, 2024 14:45
Development

Successfully merging this pull request may close these issues.

Fuse where into MLIR attention