Match "mul + where" as the pointwise operation before softmax in attention fusion #3381

umangyadav · 2024-08-16T12:29:38Z

Fixes #2812
Depends on ROCm/rocMLIR#1615

These changes can be removed later, once we have general machinery for matching against fuse_reduce.

Seeing a 10% boost on DistilGPT2 with these changes.
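For readers unfamiliar with the pattern named in the title, here is a minimal pure-Python sketch of what "mul + where" before softmax computes in masked attention. It only illustrates the arithmetic; it is not the MIGraphX matcher or the rocMLIR kernel. The function names, the `fill` value, and the choice that the mask selects the scaled score when true are all assumptions made for illustration.

```python
import math


def softmax(row):
    # Numerically stable softmax over one row.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def masked_attention(q, k, v, scale, mask, fill=-1e9):
    # Hypothetical helper: first GEMM, then mul, then where, then softmax, then second GEMM.
    scores = matmul(q, transpose(k))                       # Q x K^T
    scaled = [[s * scale for s in row] for row in scores]  # "mul"
    selected = [                                           # "where"
        [s if keep else fill for s, keep in zip(srow, mrow)]
        for srow, mrow in zip(scaled, mask)
    ]
    probs = [softmax(row) for row in selected]
    return matmul(probs, v)                                # P x V


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
causal_mask = [[True, False], [True, True]]  # position 0 cannot see position 1
out = masked_attention(q, k, v, 0.5, causal_mask)
```

With the causal mask above, the first output row depends only on the first row of `v`, because the masked score is replaced by a very large negative value before softmax. Fusing the whole chain into one kernel avoids materializing the intermediate score matrix between the two GEMMs.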

shivadbhavsar

Approving, but the MLIR SHA change should probably not be part of this PR. I think we should wait until the updated SHA in develop includes this MLIR commit before merging this.

migraphx-bot · 2024-08-16T23:04:34Z

Test	Batch	Rate new d872ea	Rate old 01c94f	Diff	Compare
torchvision-resnet50	64	3,232.01	3,236.49	-0.14%	✅
torchvision-resnet50_fp16	64	7,006.13	6,885.59	1.75%	✅
torchvision-densenet121	32	2,374.45	2,430.36	-2.30%	✅
torchvision-densenet121_fp16	32	4,009.18	4,079.72	-1.73%	✅
torchvision-inceptionv3	32	1,638.26	1,634.52	0.23%	✅
torchvision-inceptionv3_fp16	32	2,730.66	2,737.62	-0.25%	✅
cadene-inceptionv4	16	770.23	770.69	-0.06%	✅
cadene-resnext64x4	16	805.79	807.28	-0.18%	✅
slim-mobilenet	64	7,457.44	7,438.46	0.26%	✅
slim-nasnetalarge	64	208.26	207.42	0.41%	✅
slim-resnet50v2	64	3,341.91	3,340.22	0.05%	✅
bert-mrpc-onnx	8	1,151.55	1,149.04	0.22%	✅
bert-mrpc-tf	1	309.78	311.11	-0.43%	✅
pytorch-examples-wlang-gru	1	427.23	431.85	-1.07%	✅
pytorch-examples-wlang-lstm	1	384.36	386.06	-0.44%	✅
torchvision-resnet50_1	1	809.76	801.13	1.08%	✅
cadene-dpn92_1	1	432.41	399.05	8.36%	🔆
cadene-resnext101_1	1	379.33	376.62	0.72%	✅
onnx-taau-downsample	1	344.62	344.52	0.03%	✅
dlrm-criteoterabyte	1	35.06	35.06	-0.01%	✅
dlrm-criteoterabyte_fp16	1	57.83	57.35	0.84%	✅
agentmodel	1	7,903.33	7,988.11	-1.06%	✅
unet_fp16	2	57.90	57.75	0.26%	✅
resnet50v1_fp16	1	934.44	932.17	0.24%	✅
resnet50v1_int8	1	971.30	947.12	2.55%	✅
bert_base_cased_fp16	64	1,149.54	1,141.18	0.73%	✅
bert_large_uncased_fp16	32	355.58	351.92	1.04%	✅
bert_large_fp16	1	212.13	208.10	1.94%	✅
distilgpt2_fp16	16	2,157.91	2,154.47	0.16%	✅
yolov5s	1	508.45	504.76	0.73%	✅
tinyllama	1	43.43	43.37	0.13%	✅
vicuna-fastchat	1	168.73	178.11	-5.27%	🔴
whisper-tiny-encoder	1	407.74	410.98	-0.79%	✅
whisper-tiny-decoder	1	433.28	421.85	2.71%	✅

This build is not recommended to merge 🔴

migraphx-bot · 2024-08-16T23:04:36Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

umangyadav added 9 commits August 14, 2024 16:00

use random mode

1a72979

use random mode by default

4cf19be

fix generate test

7902ffb

Merge remote-tracking branch 'origin/develop' into change_mode

800fc08

only use random when doing tuning

d7a428a

revert etst

ae8fcfe

add mul + where matcher for the attention

2c9609e

allow scalar

357e547

BUMP SHA to pull fix for where from rocMLIR

1e6dd2b

umangyadav requested a review from causten as a code owner August 16, 2024 12:29

Merge branch 'develop' into fuse_where

b74980f

umangyadav requested review from pfultz2 and shivadbhavsar August 16, 2024 12:43

umangyadav assigned shivadbhavsar Aug 16, 2024

bump SHA

5871ed5

shivadbhavsar approved these changes Aug 16, 2024

umangyadav and others added 2 commits August 16, 2024 20:42

fix licensing

c82994c

Merge branch 'develop' into fuse_where

d872ea1

causten merged commit 016be6e into develop Aug 20, 2024
44 of 46 checks passed

causten deleted the fuse_where branch August 20, 2024 14:45

TedThemistokleous pushed a commit that referenced this pull request Aug 21, 2024

Match "mul + where" as the pointwise operation before softmax in atte…

a3f367c

…ntion fusion (#3381)
