Generalize attention fusion #3403
Conversation
Codecov Report
All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           develop    #3403   +/-   ##
==========================================
  Coverage    92.04%   92.04%
==========================================
  Files          506      506
  Lines        20856    20864    +8
==========================================
+ Hits         19196    19204    +8
  Misses        1660     1660

View full report in Codecov by Sentry.
LGTM
This build is not recommended to merge 🔴
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
Complete solution for #2812

Changes (applicable when MLIR attention is enabled): generalize the matcher in the fuse_mlir pass to fuse the
dot -> fused_reduce -> dot (-> pointwise)
pattern.

Verified that attention fusion works as before on various transformer models in our NAS (BERT, GPT, etc.).
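For readers unfamiliar with the pattern being matched, the following is a minimal NumPy sketch of the operator chain the generalized matcher targets: a first dot (Q·Kᵀ), a fused reduction (softmax, i.e. max/sum reductions plus pointwise ops), a second dot (·V), and an optional trailing pointwise op. This is illustrative only; the shapes and the bias add are hypothetical and not taken from the PR or the MIGraphX source.

```python
import numpy as np

def softmax(x, axis=-1):
    # the "fused_reduce" stage: reductions (max, sum) combined with pointwise ops
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_like(q, k, v, bias=None):
    scores = q @ k.transpose(0, 2, 1)   # first dot
    probs = softmax(scores, axis=-1)    # fused_reduce
    out = probs @ v                     # second dot
    if bias is not None:
        out = out + bias                # optional trailing pointwise
    return out

# Hypothetical shapes: batch 2, sequence length 4, head dim 8
q = np.random.rand(2, 4, 8).astype(np.float32)
k = np.random.rand(2, 4, 8).astype(np.float32)
v = np.random.rand(2, 4, 8).astype(np.float32)
print(attention_like(q, k, v).shape)    # (2, 4, 8)
```

When this subgraph is matched, the whole chain can be handed to MLIR as a single fused attention kernel instead of separate gemm and reduction kernels.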