[AMD] Reland instruction scheduling hint changes #4940

Open
wants to merge 7 commits into
base: main

Conversation

ravil-mobile
Contributor

@ravil-mobile ravil-mobile commented Oct 17, 2024

This commit relands #4819
with the following fixes:

  • Replaced the template-based rewindUnaryOps with regular for-loops. The new approach is more robust and handles other unary ops automatically.
  • Replaced instr.sched.barriers with the ones from the rocdl dialect in upstream MLIR.
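
To illustrate the first fix, here is a minimal, self-contained sketch of the loop-based idea. It is a toy model, not the code in `SchedInstructions.cpp`: `ToyOp` and `rewindUnaryOpsSketch` are hypothetical names, and the assumption that "unary" means "has exactly one operand" is mine. The point is that a loop which only inspects the operand count does not need a list of op types, so newly introduced unary ops are skipped without further changes.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an IR operation; this is NOT an MLIR type.
struct ToyOp {
  std::string name;
  std::vector<ToyOp *> operands; // ops that produce this op's inputs
};

// Walk up the def chain while the current op has exactly one operand.
// Returns the first op that is not unary, or nullptr if the chain ends
// at a value with no defining op (modelled here as a null pointer).
inline ToyOp *rewindUnaryOpsSketch(ToyOp *op) {
  while (op != nullptr && op->operands.size() == 1) {
    op = op->operands.front();
  }
  return op;
}
```

For a chain such as `load -> convert_layout -> extf`, calling the sketch on the last op yields the `load` op, regardless of which unary ops sit in between.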

@ravil-mobile ravil-mobile changed the title from "Ravil/bug fix" to "[AMD] Fixed a bug resulted in reverting PR#4919" Oct 17, 2024
@antiagainst antiagainst changed the title from "[AMD] Fixed a bug resulted in reverting PR#4919" to "[AMD] Reland instruction scheduling hint changes" Oct 17, 2024
@ravil-mobile ravil-mobile force-pushed the ravil/bug-fix branch 2 times, most recently from 5b044e9 to 73b15e8 Compare October 18, 2024 09:57
@ravil-mobile ravil-mobile marked this pull request as ready for review October 18, 2024 09:58
@ravil-mobile ravil-mobile force-pushed the ravil/bug-fix branch 2 times, most recently from 4cb27d1 to 00ab1fe Compare October 22, 2024 10:59
Replaced the template-based impl. of `rewindUnaryOps` in
`SchedInstructions.cpp` with regular for-loops.
The new impl. is more robust and can handle
other unary ops automatically.
* add a test for the presence of OpIdx attribute
The extra check tests whether the data are loaded from HBM
using `buffer_load` instructions. The CKV3 scheduling is
skipped if the check fails.