Examples where heavy intrinsics usage runs into internal jit limits on optimization #11905

Open
AndyAyersMS opened this issue Jan 27, 2019 · 4 comments
Labels: area-CodeGen-coreclr, enhancement, tenet-performance

Comments

@AndyAyersMS
Member

Tracking issue for cases where heavy intrinsics usage leads to poor optimization because methods hit various internal jit limits.

category:cq
theme:inlining
skill-level:expert
cost:medium

@saucecontrol
Member

I closed #11903 because it's being addressed in a different way. However, even without the regression caused by the HWIntrinsics API change, that example was still very close to the JIT throttling limits without being absurdly complex. I wanted to bring over @AndyAyersMS's comment from there so it doesn't get lost, as it would be a good compromise solution for these cases.

The limits are there to prevent jit algorithms from taking up too much memory, too much time, or both. Perhaps we could tie increasing the limits into AggressiveOptimization so we have a better idea that the performance of a method is deemed critical and so optimizing it is worth the extra jit time and memory.

@benaadams
Member

@AndyAyersMS, has this become more problematic now that Arm paths are being added, or are the .IsSupported paths dropped early?

@AndyAyersMS
Member Author

I think we're ok. Early pruning helps. Also, the jit creates temps for inlinee args and locals lazily as it imports the inlinee, so increasing the number of locals in a method (say, because C# now sees much more code) should not be a problem, provided only a subset of them can be reached on any particular architecture.

@kunalspathak did some checking to make sure that adding Arm specialization to methods that already have xarch specialization didn't cause any changes in the xarch code.
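
To make the lazy-temp argument concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not how the jit is implemented: the `Branch` type, both function names, and all of the counts are hypothetical and chosen purely for illustration. It assumes that a branch guarded by another architecture's .IsSupported check is pruned before its inlinees are imported, so those inlinees never get temps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    arch: str           # architecture whose IsSupported check guards this branch
    inlinee_temps: int  # temps the inlinees in this branch would need (made-up number)


def locals_with_lazy_temps(base_locals, branches, target_arch):
    """Count locals for one target when temps are created only on import."""
    total = base_locals
    for branch in branches:
        if branch.arch != target_arch:
            # Pruned early: the inlinees are never imported, so no temps appear.
            continue
        total += branch.inlinee_temps
    return total


def locals_with_eager_temps(base_locals, branches):
    """Count locals if every branch's temps were created up front."""
    return base_locals + sum(branch.inlinee_temps for branch in branches)


# Hypothetical method with one xarch path and one Arm path.
branches = [Branch("xarch", 40), Branch("arm", 35)]

print(locals_with_lazy_temps(10, branches, "xarch"))  # 50
print(locals_with_lazy_temps(10, branches, "arm"))    # 45
print(locals_with_eager_temps(10, branches))          # 85
```

In this model, adding the Arm branch leaves the xarch count unchanged, which is consistent with the check described above.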

@SingleAccretion
Contributor

So that this doesn't get lost. From #48669:

We may want to reevaluate this limit. Last time we looked (~5 years ago) there were very few methods that came near. But perhaps things have changed.

I have collected some quick data from the PMI diffs of the shared framework (for win-x64). It looks like the situation is still that most methods have a relatively small number of locals.

Locals        Methods  Share of methods
0    - 100  : 352956 : 99.230%
100  - 200  : 2136   :  0.601%
200  - 300  : 382    :  0.107%
300  - 400  : 138    :  0.039%
400  - 500  : 31     :  0.009%
500  - 2334 : 51     :  0.014%
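
For anyone who wants to re-derive the percentages, a short Python sketch follows. The bucket counts are copied from the table above; the variable and function names are my own, and the script only redoes the arithmetic, it does not collect the PMI data.

```python
# Bucket counts copied from the table above: (locals range, number of methods).
buckets = [
    ("0 - 100", 352956),
    ("100 - 200", 2136),
    ("200 - 300", 382),
    ("300 - 400", 138),
    ("400 - 500", 31),
    ("500 - 2334", 51),
]

total = sum(count for _, count in buckets)


def share(count, total):
    """Percentage of all methods that fall into one bucket."""
    return 100.0 * count / total


def tail_count(buckets, start_index):
    """Number of methods in buckets[start_index:], i.e. at or above a bucket."""
    return sum(count for _, count in buckets[start_index:])


for label, count in buckets:
    print(f"{label:<11}: {count:>6} : {share(count, total):6.3f}%")

# Methods in the last bucket (500 or more locals), as a count and a share.
print(tail_count(buckets, 5), f"{share(tail_count(buckets, 5), total):.3f}%")
```

The counts sum to 355694 methods, of which 2738 (about 0.77%) have 100 or more locals.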

@BruceForstall removed the JitUntriaged label on Jan 24, 2023