Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rustc_span][perf] Hoist lookup sorted by words out of the loop. #114395

Merged
merged 1 commit into from
Aug 3, 2023

Conversation

ttsugriy
Copy link
Contributor

@ttsugriy ttsugriy commented Aug 3, 2023

@lqd commented on #114351 asking if sort_by_words(lookup) is computed repeatedly. I was assuming that rustc should have no difficulties to hoist it automatically outside of the loop to avoid repeated pure computation, but according to
https://godbolt.org/z/frs8Kj1rq it seems like I was wrong:
original version seems to have 2 calls per loop iteration

.LBB16_3:
        mov     rbx, qword ptr [r13]
        mov     r14, qword ptr [r13 + 8]
        lea     rdi, [rsp + 40]
        mov     rsi, rbx
        mov     rdx, r14
        call    example::sort_by_words
        lea     rdi, [rsp + 64]
        mov     rsi, qword ptr [rsp + 8]
        mov     rdx, qword ptr [rsp + 16]
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 40]
        mov     rdx, qword ptr [rsp + 56]
        mov     rsi, qword ptr [rsp + 64]
        cmp     rdx, qword ptr [rsp + 80]
        mov     qword ptr [rsp + 32], rdi
        mov     qword ptr [rsp + 24], rsi
        jne     .LBB16_5
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    al
        mov     dword ptr [rsp + 4], eax
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9

but the manually hoisted version just 1:

.LBB16_3:
        mov     r13, qword ptr [r15]
        mov     r14, qword ptr [r15 + 8]
        lea     rdi, [rsp + 64]
        mov     rsi, r13
        mov     rdx, r14
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 64]
        mov     rdx, qword ptr [rsp + 16]
        cmp     qword ptr [rsp + 80], rdx
        mov     qword ptr [rsp + 32], rdi
        jne     .LBB16_5
        mov     rsi, qword ptr [rsp + 8]
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    bpl
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9

This code is probably not very hot, but there is no reason to leave such a low hanging fruit.

@lqd commented on rust-lang#114351 asking
if `sort_by_words(lookup)` is computed repeatedly. I was assuming that
rustc should have no difficulties to hoist it automatically outside of
the loop to avoid repeated pure computation, but according to
 https://godbolt.org/z/frs8Kj1rq it seems like I was wrong:
original version seems to have 2 calls per loop iteration
```
.LBB16_3:
        mov     rbx, qword ptr [r13]
        mov     r14, qword ptr [r13 + 8]
        lea     rdi, [rsp + 40]
        mov     rsi, rbx
        mov     rdx, r14
        call    example::sort_by_words
        lea     rdi, [rsp + 64]
        mov     rsi, qword ptr [rsp + 8]
        mov     rdx, qword ptr [rsp + 16]
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 40]
        mov     rdx, qword ptr [rsp + 56]
        mov     rsi, qword ptr [rsp + 64]
        cmp     rdx, qword ptr [rsp + 80]
        mov     qword ptr [rsp + 32], rdi
        mov     qword ptr [rsp + 24], rsi
        jne     .LBB16_5
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    al
        mov     dword ptr [rsp + 4], eax
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9
```
but the manually hoisted version just 1:
```
.LBB16_3:
        mov     r13, qword ptr [r15]
        mov     r14, qword ptr [r15 + 8]
        lea     rdi, [rsp + 64]
        mov     rsi, r13
        mov     rdx, r14
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 64]
        mov     rdx, qword ptr [rsp + 16]
        cmp     qword ptr [rsp + 80], rdx
        mov     qword ptr [rsp + 32], rdi
        jne     .LBB16_5
        mov     rsi, qword ptr [rsp + 8]
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    bpl
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9
```
This code is probably not very hot, but there is no reason to leave
such a low hanging fruit.
@rustbot
Copy link
Collaborator

rustbot commented Aug 3, 2023

r? @eholk

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 3, 2023
@lqd
Copy link
Member

lqd commented Aug 3, 2023

Great, thank you!

As we said on the other PR, this code is quite cold, and IIUC only used in diagnostics: there shouldn't be any perf impact, so let's roll it up.

r? lqd @bors r+ rollup

@bors
Copy link
Contributor

bors commented Aug 3, 2023

📌 Commit 6ae2677 has been approved by lqd

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 3, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 3, 2023
[rustc_span][perf] Hoist lookup sorted by words out of the loop.

`@lqd` commented on rust-lang#114351 asking if `sort_by_words(lookup)` is computed repeatedly. I was assuming that rustc should have no difficulties to hoist it automatically outside of the loop to avoid repeated pure computation, but according to
 https://godbolt.org/z/frs8Kj1rq it seems like I was wrong:
original version seems to have 2 calls per loop iteration
```
.LBB16_3:
        mov     rbx, qword ptr [r13]
        mov     r14, qword ptr [r13 + 8]
        lea     rdi, [rsp + 40]
        mov     rsi, rbx
        mov     rdx, r14
        call    example::sort_by_words
        lea     rdi, [rsp + 64]
        mov     rsi, qword ptr [rsp + 8]
        mov     rdx, qword ptr [rsp + 16]
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 40]
        mov     rdx, qword ptr [rsp + 56]
        mov     rsi, qword ptr [rsp + 64]
        cmp     rdx, qword ptr [rsp + 80]
        mov     qword ptr [rsp + 32], rdi
        mov     qword ptr [rsp + 24], rsi
        jne     .LBB16_5
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    al
        mov     dword ptr [rsp + 4], eax
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9
```
but the manually hoisted version just 1:
```
.LBB16_3:
        mov     r13, qword ptr [r15]
        mov     r14, qword ptr [r15 + 8]
        lea     rdi, [rsp + 64]
        mov     rsi, r13
        mov     rdx, r14
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 64]
        mov     rdx, qword ptr [rsp + 16]
        cmp     qword ptr [rsp + 80], rdx
        mov     qword ptr [rsp + 32], rdi
        jne     .LBB16_5
        mov     rsi, qword ptr [rsp + 8]
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    bpl
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9
```
This code is probably not very hot, but there is no reason to leave such a low hanging fruit.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Aug 3, 2023
[rustc_span][perf] Hoist lookup sorted by words out of the loop.

``@lqd`` commented on rust-lang#114351 asking if `sort_by_words(lookup)` is computed repeatedly. I was assuming that rustc should have no difficulties to hoist it automatically outside of the loop to avoid repeated pure computation, but according to
 https://godbolt.org/z/frs8Kj1rq it seems like I was wrong:
original version seems to have 2 calls per loop iteration
```
.LBB16_3:
        mov     rbx, qword ptr [r13]
        mov     r14, qword ptr [r13 + 8]
        lea     rdi, [rsp + 40]
        mov     rsi, rbx
        mov     rdx, r14
        call    example::sort_by_words
        lea     rdi, [rsp + 64]
        mov     rsi, qword ptr [rsp + 8]
        mov     rdx, qword ptr [rsp + 16]
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 40]
        mov     rdx, qword ptr [rsp + 56]
        mov     rsi, qword ptr [rsp + 64]
        cmp     rdx, qword ptr [rsp + 80]
        mov     qword ptr [rsp + 32], rdi
        mov     qword ptr [rsp + 24], rsi
        jne     .LBB16_5
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    al
        mov     dword ptr [rsp + 4], eax
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9
```
but the manually hoisted version just 1:
```
.LBB16_3:
        mov     r13, qword ptr [r15]
        mov     r14, qword ptr [r15 + 8]
        lea     rdi, [rsp + 64]
        mov     rsi, r13
        mov     rdx, r14
        call    example::sort_by_words
        mov     rdi, qword ptr [rsp + 64]
        mov     rdx, qword ptr [rsp + 16]
        cmp     qword ptr [rsp + 80], rdx
        mov     qword ptr [rsp + 32], rdi
        jne     .LBB16_5
        mov     rsi, qword ptr [rsp + 8]
        call    qword ptr [rip + bcmp@GOTPCREL]
        test    eax, eax
        sete    bpl
        mov     rsi, qword ptr [rsp + 72]
        test    rsi, rsi
        jne     .LBB16_8
        jmp     .LBB16_9
```
This code is probably not very hot, but there is no reason to leave such a low hanging fruit.
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 3, 2023
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#113657 (Expand, rename and improve `incorrect_fn_null_checks` lint)
 - rust-lang#114237 (parser: more friendly hints for handling `async move` in the 2015 edition)
 - rust-lang#114300 (Suggests turbofish in patterns)
 - rust-lang#114372 (const validation: point at where we found a pointer but expected an integer)
 - rust-lang#114395 ([rustc_span][perf] Hoist lookup sorted by words out of the loop.)
 - rust-lang#114403 (fix the span in the suggestion of remove question mark)
 - rust-lang#114408 (Temporary remove myself from review rotation)
 - rust-lang#114415 (Skip checking of `rustc_codegen_gcc` with vendoring enabled)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 2413f50 into rust-lang:master Aug 3, 2023
11 checks passed
@rustbot rustbot added this to the 1.73.0 milestone Aug 3, 2023
@ttsugriy ttsugriy deleted the hoist-lookup branch August 3, 2023 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants