Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JIT] Cleanup to lsra inspired by #73424 #76481

Merged
merged 8 commits into from
Oct 11, 2022
Merged

Conversation

markples
Copy link
Member

@markples markples commented Oct 1, 2022

Inspired by comments (here and here)

  • Remove RefPosition* currentRefPosition = &reverseIterator as it creates two variables that are essentially the same, especially since the iterator has operators like ->. In fact, I had to spend a bit of time verifying that there weren't any updates to one but not the other in the code. (And actually Fix undefined behavior found with g++-12 ubsan #73424 did slightly impact this relationship since it moved the assignment from the loop iteration step to the loop body, but I don't believe that this mattered.)
    I renamed the "iterator" variable to the "position" name to reduce textual churn in the code. This didn't work as well for the range-based loops since they yield -references- to the underlying object so a bunch of C++ punctuation changes.
  • Break apart two complicated for conditions and remove duplication between the condition and the loop body. Search for continueLoop to see them.
  • Delete list's operator& on each iterator since they are no longer used and not part of normal iterators.

Manually verified no diffs (other than memory addresses) on a jitdump for a test case with minopts on and off to exercise the dump code in lsra. Manually verified asm diffs were spurious (differences in runtime addresses, sometimes also causing changes in the compiler internal representation size and then locations of labels).

This regains the unexplained small throughput loss in #73424.

@ghost ghost assigned markples Oct 1, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 1, 2022
@ghost
Copy link

ghost commented Oct 1, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Inspired by comments (here and here)

  • Remove RefPosition* currentRefPosition = &reverseIterator as it creates two variables that are essentially the same, especially since the iterator has operators like ->. In fact, I had to spend a bit of time verifying that there weren't any updates to one but not the other in the code. (And actually Fix undefined behavior found with g++-12 ubsan #73424 did slightly impact this relationship since it moved the assignment from the loop iteration step to the loop body, but I don't believe that this mattered.)
    I renamed the "iterator" variable to the "position" name to reduce textual churn in the code. This didn't work as well for the range-based loops since they yield -references- to the underlying object so a bunch of C++ punctuation changes.
  • Break apart two complicated for conditions and remove duplication between the condition and the loop body. Search for continueLoop to see them.
  • Delete list's operator& on each iterator since they are no longer used and not part of normal iterators.
Author: markples
Assignees: markples
Labels:

area-CodeGen-coreclr

Milestone: -

@markples
Copy link
Member Author

markples commented Oct 1, 2022

Redo of #73589 which was auto-closed

@markples
Copy link
Member Author

markples commented Oct 1, 2022

I verified no-diff (other than memory addresses) on jitdump for a test case with minopts on and off to exercise the dump code in lsra.

@markples
Copy link
Member Author

markples commented Oct 1, 2022

@jakobbotsch PTAL (since you commented on the first version of this), cc @dotnet/jit-contrib

@markples markples changed the title Cleanup to lsra inspired by #73424 [JIT] Cleanup to lsra inspired by #73424 Oct 1, 2022
@markples markples marked this pull request as ready for review October 1, 2022 09:15
@jakobbotsch
Copy link
Member

It looks like there are some diffs, not expected I assume? Also the formatting jobs aren't happy.

@markples
Copy link
Member Author

markples commented Oct 3, 2022

It looks like there are some diffs, not expected I assume? Also the formatting jobs aren't happy.

Yes, I will look. I'm not sure if the churn here is making a case for leaving the code alone or strengthening the original case asking for simplification.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jit-analyze output doesn't show up. @jakobbotsch do you know why?

image

break;
}
if (!continueLoop)
{
// Avoid loop iteration that will update currentRefPosition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just keep the break that is at the beginning of the loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complicated expressions were one of the motivations for the PR (the & iterator thing was the other). When I looked at simplifying them, I noticed that a few included the same set of cases as the logic in the rest of the loop (in this case, the list of RefTypeExpUse, RefTypeDummyDef, and RefTypeBB appears twice), so I tried to combine them.

The printf in the default case was another clue - it seems to be more like an assertion that we shouldn't get there because the original check is the same as the switch.

I'm open to suggestions for other ideas. That I have some asm diffs both encourages me to do the cleanup since I must have misread the code somewhere but also worries me a bit about churn. Is there an easy way to verify that #73424 didn't have any asm diffs? (The job seems to be gone)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense and cleaner. I was just curious about the reasoning.

@jakobbotsch
Copy link
Member

jit-analyze output doesn't show up. @jakobbotsch do you know why?

[10:03:16] Total bytes of delta: -36 (-0.00% of base)

[10:03:16] Generated asm is located under C:\h\w\A34A08D0\p\artifacts\spmi\asm.coreclr_tests.run.Linux.arm64.checked\base C:\h\w\A34A08D0\p\artifacts\spmi\asm.coreclr_tests.run.Linux.arm64.checked\diff

[10:03:16] No textual differences. Is this an issue with coredistools?

[10:03:16] 88 contexts with differences found (1 improvements, 0 regressions)

Hmm, it is a combination of #53773 and my recent change in #76238 exacerbating this problem. I think that during full replay of all contexts, we are more likely to see the problem #53773, while when we create individual disassembly files we are less likely to see it (due to less churn on the memory allocator and so on). So since we are basing the diff numbers on the first run only now we see the problem more often.

@EgorBo do you think there is a chance we can merge #76112 with the main functionality disabled? Or merge the part of it that fixes #53773?

@jakobbotsch
Copy link
Member

jakobbotsch commented Oct 3, 2022

@markples I don't think you need to worry much about those diffs then, they probably are just spurious. I'd double check the asmdiffs artifacts to check that the diffs look to be around field addresses and run some extra runs locally to check the diffs there.

@kunalspathak
Copy link
Member

@markples - jit format errors are fixed by running the following command locally and then committing the modified files.

d:\git\jitutils\bin\jit-format -a x64 -b Checked -o windows -c <repo>\src\coreclr --verbose --fix --projects dll

@markples
Copy link
Member Author

markples commented Oct 4, 2022

@markples I don't think you need to worry much about those diffs then, they probably are just spurious. I'd double check the asmdiffs artifacts to check that the diffs look to be around field addresses and run some extra runs locally to check the diffs there.

I see something different between the runs (got another when I pushed the source formatting commit, not sure if I've caused myself problems by rebasing the branch to a more current main..)

In the first one, all of the base/diff directories are identical even though the report shows a few size differences. For example, Linux.arm64 shows -36 bytes and 88 "contexts with diffs". Strangely, its base/diff only has 49 pairs of .dasm files.

in a more recent one, the size difference is gone, but now I see some line ordering changes and the first numeric one is #187 vs #0xD1FFAB1E in a movk, which looks like maybe 53773 leading to a different value that then tricks the diff-canonicalization that appears to be going on.

The first run is confusing to me, but I guess I should ignore it. I'll look at the new one, especially the line order.

@jakobbotsch
Copy link
Member

@markples Not sure if you missed my comment above, but the spurious diffs are most likely #53773 if they look like changing field addresses in the disassembly. I don't think you need to worry about those.

@jakobbotsch
Copy link
Member

Well looks like I was too quick, I completely skipped that you were quoting me :-)

I think you see identical disassembly because SPMI works in two steps: first it does a run over all contexts, finding diffs. Here it allocates a lot of memory and thus static field addresses are going to have higher probability of being different simply by how the allocator is perturbed.

In the second step, it creates disassembly for contexts with diffs. This is done with single invocations of superpmi on the contexts that it found in step 1. So the memory addresses it sees here are probably more likely to be equal since there's less noise in the environment.

The end result is that things do not look internally consistent since SPMI sees different disassembly in step 1 and step 2.

@BruceForstall
Copy link
Member

Can you show examples of the arm64 diffs (and note which MCH file they came from)? My change #76616 should fix some of the spurious diffs.

@markples
Copy link
Member Author

markples commented Oct 4, 2022

@BruceForstall The code ones so far look like this:

             movz    x0, #0xD1FFAB1E
-            movk    x0, #187 LSL #16
+            movk    x0, #0xD1FFAB1E LSL #16
             movk    x0, #0xD1FFAB1E LSL #32

which is similar to the sequence in that link.

The line ordering seems scarier, though now I'm seeing that both occur near movz/movk sequences. On the other hand, it would be hard to be far from those in that function.

             ldr     x0, [x0, #0x18]
             movz    x1, #0xD1FFAB1E      // code for hackishModuleName:hackishMethodName
             movk    x1, #0xD1FFAB1E LSL #16
+                                               ;; size=756 bbWeight=1    PerfScore 233.50
+G_M24375_IG04:        ; , extend
             movk    x1, #127 LSL #32
             ldr     x1, [x1]
-                                               ;; size=764 bbWeight=1    PerfScore 237.00
-G_M24375_IG04:        ; , extend
             blr     x1

@markples
Copy link
Member Author

markples commented Oct 4, 2022

@BruceForstall Sorry, I left this out - I think this is coreclr_tests.run.Linux.arm64.checked.mch. I'm not completely sure as I don't think I've seen a link between the dasm files and where they came from, but that is the only one to record any diffs in the summary.

@EgorBo
Copy link
Member

EgorBo commented Oct 5, 2022

@EgorBo do you think there is a chance we can merge #76112 with the main functionality disabled? Or merge the part of it that fixes #53773?

@jakobbotsch I plan to merge it this week, the main issue that blocked it has been resolved

@markples
Copy link
Member Author

Test failures:

@markples
Copy link
Member Author

apologies for the force push - in auditing the change before merging, I discovered that I'd resolved a merge conflict backwards and redoing it was the easiest path forward

@markples
Copy link
Member Author

Throughput:

Linux x64

Overall (-0.03%)

Collection PDIFF
benchmarks.run.Linux.x64.checked.mch -0.03%
coreclr_tests.run.Linux.x64.checked.mch -0.04%
libraries.crossgen2.Linux.x64.checked.mch -0.03%
libraries.pmi.Linux.x64.checked.mch -0.03%
libraries_tests.pmi.Linux.x64.checked.mch -0.03%

MinOpts (-0.05%)

Collection PDIFF
benchmarks.run.Linux.x64.checked.mch -0.10%
coreclr_tests.run.Linux.x64.checked.mch -0.05%
libraries.crossgen2.Linux.x64.checked.mch -0.06%
libraries.pmi.Linux.x64.checked.mch -0.10%
libraries_tests.pmi.Linux.x64.checked.mch -0.08%

FullOpts (-0.03%)

Collection PDIFF
benchmarks.run.Linux.x64.checked.mch -0.02%
coreclr_tests.run.Linux.x64.checked.mch -0.03%
libraries.crossgen2.Linux.x64.checked.mch -0.03%
libraries.pmi.Linux.x64.checked.mch -0.03%
libraries_tests.pmi.Linux.x64.checked.mch -0.03%

windows x86

Overall (-0.02%)

Collection PDIFF
benchmarks.run.windows.x86.checked.mch -0.01%
coreclr_tests.run.windows.x86.checked.mch -0.02%
libraries.crossgen2.windows.x86.checked.mch -0.02%
libraries.pmi.windows.x86.checked.mch -0.02%
libraries_tests.pmi.windows.x86.checked.mch -0.02%

MinOpts (-0.04%)

Collection PDIFF
benchmarks.run.windows.x86.checked.mch -0.05%
coreclr_tests.run.windows.x86.checked.mch -0.04%
libraries.crossgen2.windows.x86.checked.mch -0.04%
libraries.pmi.windows.x86.checked.mch -0.05%
libraries_tests.pmi.windows.x86.checked.mch -0.05%

FullOpts (-0.02%)

Collection PDIFF
benchmarks.run.windows.x86.checked.mch -0.01%
coreclr_tests.run.windows.x86.checked.mch -0.02%
libraries.crossgen2.windows.x86.checked.mch -0.02%
libraries.pmi.windows.x86.checked.mch -0.02%
libraries_tests.pmi.windows.x86.checked.mch -0.02%

@markples markples merged commit 20d4e30 into dotnet:main Oct 11, 2022
@markples markples deleted the cleanup/lsra branch October 11, 2022 17:13
@ghost ghost locked as resolved and limited conversation to collaborators Nov 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants