Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPMI: Improve speed significantly for large diffs #76238

Merged
merged 20 commits into from
Sep 29, 2022

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Sep 27, 2022

This starts communicating more information about diffs back from
superpmi and starts using it in the driver. The current information is
the base size, diff size, base instructions, diff instructions and
context size.

The driver uses the base size/diff size to pick a small number of
interesting contexts and only creates disassembly for these contexts.
jit-analyze is then also invoked on only these contexts, but the
contexts are picked so that they overlap with what jit-analyze would
display. Additionally, we also pick the 20 smallest contexts (in
terms of context size) -- I frequently use the smallest contexts with
diffs to investigate my changes.

The new behavior is only enabled when no custom metrics are specified.
If custom metrics are specified we fall back to producing all
disassembly files and leave it up to jit-analyze to extract and analyze
the metrics specified.

Also, the retainTopFilesOnly option is no longer necessary since our CI
pipeline will at most produce 100 .dasm files now. Another benefit is that
this means that all contexts mentioned in the jit-analyze output will now be
part of the artifacts.

The net result is that we can now get SPMI diffs for changes with even
hundreds of thousands of diffs in the same time as it takes to get diffs
for a small change.

Fix #76178

This starts communicating more information about diffs back from
superpmi and starts using it in the driver. The current information is
the base size, diff size, base instructions, diff instructions and
context size.

The driver uses the base size/diff size to pick a small number of
interesting contexts and only creates disassembly for these contexts.
jit-analyze is then also invoked on only these contexts, but the
contexts are picked so that they overlap with what jit-analyze would
display. Additionally, we also pick the top 20 smallest contexts (in
terms of context size) -- I frequently use the smallest contexts with
diffs to investigate my change.

The new behavior is only enabled when no custom metrics are specified.
If custom metrics are specified we fall back to producing all
disassembly files and leave it up to jit-analyze to extract and analyze
the metrics specified.

The net result is that we can now get SPMI diffs for changes with even
hundreds of thousands of diffs in the same time as it takes to get diffs
for a small change.

Fix dotnet#76178
This is no longer necessary since we avoid creating these disassembly
files in the first place. Another benefit is that all contexts mentioned
by jit-analyze output will now be part of the artifacts.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 27, 2022
@ghost ghost assigned jakobbotsch Sep 27, 2022
@ghost
Copy link

ghost commented Sep 27, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

This starts communicating more information about diffs back from
superpmi and starts using it in the driver. The current information is
the base size, diff size, base instructions, diff instructions and
context size.

The driver uses the base size/diff size to pick a small number of
interesting contexts and only creates disassembly for these contexts.
jit-analyze is then also invoked on only these contexts, but the
contexts are picked so that they overlap with what jit-analyze would
display. Additionally, we also pick the 20 smallest contexts (in
terms of context size) -- I frequently use the smallest contexts with
diffs to investigate my changes.

The new behavior is only enabled when no custom metrics are specified.
If custom metrics are specified we fall back to producing all
disassembly files and leave it up to jit-analyze to extract and analyze
the metrics specified.

The net result is that we can now get SPMI diffs for changes with even
hundreds of thousands of diffs in the same time as it takes to get diffs
for a small change.

Fix #76178

Author: jakobbotsch
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@@ -171,7 +181,8 @@ int __cdecl main(int argc, char* argv[])
#endif

bool collectThroughput = false;
MCList failingToReplayMCL, diffMCL;
MCList failingToReplayMCL;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will likely be useful to give the failing list the same treatment so that we can have superpmi.py replay print the smallest contexts with failures, instead of printing everything. I will leave this for a future PR however.

@kunalspathak
Copy link
Member

Very nice to see it.

@jakobbotsch jakobbotsch marked this pull request as ready for review September 27, 2022 21:02
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @AndyAyersMS
This should be ready for review barring any unforeseen failures in the related pipelines. The morph.cpp change is just to see it work, I will revert it before merging.

Hopefully I can get to see diffs for #76017 and #76185 with this change.

@jakobbotsch
Copy link
Member Author

The linux-arm64 job seems to have hung during the .mch download. The superpmi replay is failing on arm32 due to the morph change, seems like a bug in either lowering or LSRA:

[22:19:58] ISSUE: <ASSERT> #271154 D:\a\_work\1\s\src\coreclr\jit\lsra.cpp (11934) - Assertion failed 'refPosition->RegOptional()' in 'System.Numerics.Tests.ComplexTests:Abs(double,double)' during 'LSRA build intervals' (IL size 22; hash 0xbf3bc3d1; FullOpts)

I'll try a different change with large diffs.

@jakobbotsch
Copy link
Member Author

Clean run for a JIT change that has diffs in around 100k contexts: https://dev.azure.com/dnceng-public/public/_build/results?buildId=33036&view=ms.vss-build-web.run-extensions-tab

@kunalspathak
Copy link
Member

seems like a bug in either lowering or LSRA

Can we have an issue for this?

@jakobbotsch
Copy link
Member Author

seems like a bug in either lowering or LSRA

Can we have an issue for this?

Opened #76382.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Sep 29, 2022

Ping @AndyAyersMS (or maybe @kunalspathak / @BruceForstall want to take a look?)

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good.

I usually use IL size to prioritize simple repro cases, but I imagine MC size likely ends up being even a better choice.


// This is a difference in ASM outputs from Jit1 & Jit2 and not a playback failure
// We will add this MC to the diffs info if there is one.
// Otherwise this will end up in failingMCList
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this a bit puzzling. Can you say more about why we would add this to the failing MCL?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just preserving the existing behavior (it was inside InvokeNearDiffer before).
I'm not sure why we are regarding diffs as failures when -diffMCList (-diffsInfo after this change) is unspecified.

@jakobbotsch jakobbotsch merged commit cc934c7 into dotnet:main Sep 29, 2022
@jakobbotsch jakobbotsch deleted the jit-analyze-spmi branch September 29, 2022 17:24
@BruceForstall
Copy link
Member

This is great work; thanks for doing this.

subproc_helper.run_to_completion(create_replay_artifacts, self, mch_file, asm_complus_vars_full_env, text_differences, base_asm_location, diff_asm_location, ".dasm")

if self.coreclr_args.diff_jit_dump:
logging.info("Creating JitDump files: %s %s", base_dump_location, diff_dump_location)
subproc_helper.run_to_completion(create_replay_artifacts, self, mch_file, jit_dump_complus_vars_full_env, jit_dump_differences, base_dump_location, diff_dump_location, ".txt")

logging.info("Differences found. To replay SuperPMI use:")
logging.info("Differences found. To replay SuperPMI use:".format(len(diffs)))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this ".format..."? Shouldn't it use the logging '%s' format? And I don't see any replacement token.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logging uses the same conventions as the old % formatting operator. This is the newer str.format, which I think would normally be preferred... except that logging probably isn't going to format the string unless the particular logging level has been specified.

(and yes, it is missing {} or {0})

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops yes, at one point I was printing the number of diffs as part of this line, but I decided to print it on a new line at the end of each collection diff so that it gets printed last instead, and you can see it while waiting for the next diffs.

I'll keep in mind to remove this with my next change.

Copy link
Member Author

@jakobbotsch jakobbotsch Oct 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like logging can be switched to the new formatter: https://docs.python.org/3/howto/logging-cookbook.html#using-particular-formatting-styles-throughout-your-application

That would maybe be a nice cleanup at some point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, looks like 'style' came with 3.2. Not sure what version of Python is on the CI systems, but presumably better than that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I (and some others) have been using new style formatting in various scripts for a while now without issues.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SPMI should avoid DASM roundtrip for analysis
5 participants