Update optimization flags/levels in jit interface #7751

Open
JosephTremoulet opened this issue Mar 29, 2017 · 12 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@JosephTremoulet
Contributor

JosephTremoulet commented Mar 29, 2017

With AOT scenarios now including ngen and crossgen and ready-to-run, desktop and coreclr and corert, and with work in flight on IBC and tiered jitting, it's a good time to take another look at the flags used to control which optimizations the jit performs, and try to get a set in place that will give us the right interface going forward to evolve our codegen/policies in the different scenarios.

category:design
theme:jit-ee-interface
skill-level:beginner
cost:small
impact:small

@JosephTremoulet
Contributor Author

The current setup works like so:

  • When the EE asks the JIT to compile a method, it passes CorJitFlags, which include independently-settable:
    • CORJIT_FLAG_SPEED_OPT
    • CORJIT_FLAG_SIZE_OPT
    • CORJIT_FLAG_MIN_OPT "disable optimization"
    • CORJIT_FLAG_DEBUG_CODE "no code mangling"
  • The JIT translates these to JitFlags; the mapping is 1:1
  • The JIT then translates these into three independent things:
    • "debuggable code" vs not
    • Which optimizations to run. We set this to CLFLG_MINOPT, which is defined
      as CLFLG_TREETRANS, if MIN_OPT was requested or debuggable code was
      requested or we're jitting a cctor and the cctor isn't an inlinee;
      otherwise, we set this to CLFLG_MAXOPT, which is defined as all of them:
      • CLFLG_REGVAR
      • CLFLG_RNGCHKOPT
      • CLFLG_DEADASGN
      • CLFLG_CODEMOTION
      • CLFLG_QMARK
      • CLFLG_TREETRANS
      • CLFLG_INLINING
      • CLFLG_CONSTANTFOLD
      • CLFLG_STRUCTPROMOTE
    • Whether to favor size, speed, or "blended" (setting both flags would just
      result in optimizing for size, and cctors are forced to always optimize
      for size)
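
The translation described above can be sketched roughly as follows. This is only an illustration of the rules as stated in this comment, not the JIT's actual code: the bit values, the `sketch` namespace, `CodePreference`, `JitDecisions`, and `translateFlags` are all hypothetical, and applying the size preference to every cctor (inlinee or not) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Placeholder bit values: only the flag names come from the comment above.
constexpr uint32_t CORJIT_FLAG_SPEED_OPT  = 1u << 0;
constexpr uint32_t CORJIT_FLAG_SIZE_OPT   = 1u << 1;
constexpr uint32_t CORJIT_FLAG_MIN_OPT    = 1u << 2;
constexpr uint32_t CORJIT_FLAG_DEBUG_CODE = 1u << 3;
constexpr uint32_t ALL_EE_FLAGS           = (1u << 4) - 1;

// Placeholder encodings for the nine optimization classes listed above.
constexpr uint32_t CLFLG_REGVAR        = 1u << 0;
constexpr uint32_t CLFLG_RNGCHKOPT     = 1u << 1;
constexpr uint32_t CLFLG_DEADASGN      = 1u << 2;
constexpr uint32_t CLFLG_CODEMOTION    = 1u << 3;
constexpr uint32_t CLFLG_QMARK         = 1u << 4;
constexpr uint32_t CLFLG_TREETRANS     = 1u << 5;
constexpr uint32_t CLFLG_INLINING      = 1u << 6;
constexpr uint32_t CLFLG_CONSTANTFOLD  = 1u << 7;
constexpr uint32_t CLFLG_STRUCTPROMOTE = 1u << 8;
constexpr uint32_t CLFLG_MINOPT        = CLFLG_TREETRANS;
constexpr uint32_t CLFLG_MAXOPT        = (1u << 9) - 1;  // all nine bits

enum class CodePreference { Blended, Size, Speed };  // hypothetical names

struct JitDecisions {
    bool debuggableCode;
    uint32_t optFlags;
    CodePreference preference;
};

// Derives the three independent decisions from the EE-supplied flags.
inline JitDecisions translateFlags(uint32_t eeFlags, bool isCctor, bool isInlinee) {
    assert((eeFlags & ~ALL_EE_FLAGS) == 0);  // only the four flags above are modeled

    const bool minOpt    = (eeFlags & CORJIT_FLAG_MIN_OPT) != 0;
    const bool debugCode = (eeFlags & CORJIT_FLAG_DEBUG_CODE) != 0;
    const bool sizeOpt   = (eeFlags & CORJIT_FLAG_SIZE_OPT) != 0;
    const bool speedOpt  = (eeFlags & CORJIT_FLAG_SPEED_OPT) != 0;

    JitDecisions d{};
    d.debuggableCode = debugCode;

    // MINOPT if requested, if debuggable code is requested, or for a root (non-inlinee) cctor.
    const bool useMinOpt = minOpt || debugCode || (isCctor && !isInlinee);
    d.optFlags = useMinOpt ? CLFLG_MINOPT : CLFLG_MAXOPT;

    // Size wins when both flags are set; cctors are assumed to always favor size.
    if (sizeOpt || isCctor) {
        d.preference = CodePreference::Size;
    } else if (speedOpt) {
        d.preference = CodePreference::Speed;
    } else {
        d.preference = CodePreference::Blended;
    }
    return d;
}

}  // namespace sketch
```

Note that the three results are computed independently, which is why combinations such as "MINOPT but favor speed" are representable even if they are not meaningful.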

@JosephTremoulet
Contributor Author

Pulling from the list of flags today and some of the discussion around tiered jitting, we know that the following are some of the goals that we may have for code to be compiled:

  • Make sure stepping through code in the debugger will work well (we have this today in the "debuggable code" flag)
  • Make sure the jit is using the most straightforward/bulletproof lowering (we have this today in minopts)
  • Make sure the jit is going to spend as little time as possible jitting (we don't have this today, and will want this for fjit)
  • Make sure the jit generates code that will run as fast as possible (we don't have this today, and will want this for re-jitting hot kernels and/or build-lab and/or IBC)
  • Make the jit try to strike the best balance it can for single-shot scenario (this is the main scenario today)
  • Make sure the jit is sensitive to code size (we have this today in favor size / favor speed / blended)

Questions to consider: Have I left some out? Will the picture change if we get to the point that tiered compilation allows on-stack replacement and we can perform speculative optimizations, or would the above goals just combine orthogonally with a set of allowable speculative assumptions?

@JosephTremoulet
Contributor Author

If the above list is sufficient, it seems to me like it could be represented with eight distinct states:

1: Use straightforward lowering that is bulletproof and facilitates stepping through in the debugger
2: Optimize for high throughput
3 - 8: The six combinations of:
- Optimization "level" (default / high-CQ)
- Size/speed sensitivity (size-sensitive / blend / size-insensitive)

@BruceForstall
Member

Another design goal is generally simplicity: reduce the number of combinations that must be tested.

nit: "debuggable" is more than just "facilitates stepping through in the debugger", it also includes accurate variable value access (extend variable lifetimes, make sure debugger knows where the variables live). We also should improve debuggability in the presence of optimization to support live-site attach or dump debugging.

@JosephTremoulet
Contributor Author

> We also should improve debuggability in the presence of optimization to support live-site attach or dump debugging.

Good point; sometimes we'll want to change what optimizations we run to improve debuggability (item 1 in my list above), but sometimes we'll want the same optimizations but also to generate debug info as well as we can, so ISTM there should also be a "generate debug info" flag in the interface that's orthogonal to the optimization flags.

@mikedn
Contributor

mikedn commented Mar 29, 2017

> Make sure the jit is using the most straightforward/bulletproof lowering (we have this today in minopts)

Small observation - there's stuff going in lowering that is certainly not required but is done without regard to minopts, magic division for example. It behaved like this before I moved the code around and I always wondered why it doesn't bail out in minopts mode.
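
For readers unfamiliar with the term, "magic division" replaces a division by a constant with a multiply and a shift. The sketch below shows the well-known identity for unsigned 32-bit division by 3; it is a standalone illustration rather than the JIT's lowering code, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// x / 3 for any 32-bit unsigned x, computed without a divide instruction.
// 0xAAAAAAAB is (2^33 + 1) / 3, so the 64-bit product shifted right by 33
// yields the truncated quotient.
inline uint32_t divideBy3Magic(uint32_t x) {
    return static_cast<uint32_t>((static_cast<uint64_t>(x) * 0xAAAAAAABull) >> 33);
}

// Cross-checks the transformed form against the plain division it replaces.
inline void checkAgainstPlainDivision(uint32_t x) {
    assert(divideBy3Magic(x) == x / 3);
}

}  // namespace sketch
```

The transformation trades a slow divide for a multiply and a shift, which is why it counts as an optimization that a strict minopts mode could skip.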

@noahfalk
Member

> there should also be a "generate debug info" flag in the interface that's orthogonal to the optimization flags.

Is there any scenario in which we don't want debug info? I assume that nearly every scenario we have is of the form "I want to perform 0 or more optimizations that might degrade the debugging experience, but aside from forced losses in optimization, give me the best debugging experience that remains possible"

I'd guess that a flag for 'generate debug info' will always be true, in which case we could save some complexity/test time by eliminating it as a free variable.

@jkotas
Member

jkotas commented Mar 30, 2017

The flag for 'generate debug info' exists already (CORJIT_FLAG_DEBUG_INFO), and it is always set by the VM for the reasons that @noahfalk mentioned.

@JosephTremoulet
Contributor Author

> Is there any scenario in which we don't want debug info? ... I'd guess that a flag for 'generate debug info' will always be true

> The flag for 'generate debug info' exists already (CORJIT_FLAG_DEBUG_INFO), and it is always set by the VM

Interesting... I was mainly assuming we'd want this because we added it already, and (IIUC) both cl and csc make generation of PDBs (which can be quite large, especially with optimized code) optional. But if we don't think it's worth it for e.g. footprint-constrained deployments, or carry-over from .Net Native (what does it do in this regard?), then "we could save some complexity/test time by eliminating it as a free variable" sounds fine to me.

@JosephTremoulet
Contributor Author

I'm also wondering if we should have a dimension for signalling tiered jitting vs single-shot jitting vs "build lab" (by which I mean AOT compilation with some sort of opt-in for more throughput-intensive optimizations)... it's entirely conceivable, for example, that the 2nd round of tiered jitting would want to be a bit more aggressive than single-shot jitting is, or that some AOT scenarios want to be more aggressive still but not to the point of wanting to "turn the dial to 11" like tiered jitting of very hot kernels may want... my inclination is to go with @BruceForstall's point that "Another design goal is generally simplicity: reduce the number of combinations that must be tested" and avoid adding a dimension like that until/unless we have concrete plans to use it for something, so have the "first stab" use the matrix outlined above, but I'm curious what others think.

@jkotas
Member

jkotas commented Mar 30, 2017

We had knobs for controlling whether or not to generate debug info in .NET Framework 1.0. It made things like attaching a debugger to an existing process or dump debugging work poorly. We turned it on by default in .NET Framework 2.0, together with some work to minimize the debug info overhead, so that these things just work. The key value prop of .NET is that things work pretty well by default (and you are willing to pay a bit for it), without having to think about tough choices like not being able to debug vs. having a big native PDB around.

I think it would be fine to have a switch that disables generation of debug info for measurements and experiments; but the mainstream .NET experience should be debuggable by default.

> .Net Native (what does it do in this regard?)

.Net Native is on the big PDB plan only today. We got feedback that it is not what people expect from .NET. There is work to make it better, e.g. make Environment.StackTrace work well even without the big PDB around.

@JosephTremoulet
Contributor Author

Per discussion in dotnet/coreclr#10580, we'll adopt the goal of consolidating the discretionary policy decision-making on the JIT side of the interface, which means that the interface is really about communicating context to the jit.

To that end, dotnet/coreclr#10580 will add flags to identify tier 0 and tier 1 compilation requests (and distinguish them from single-shot compilation requests).

Currently, the VM sets the MIN_OPT flag if the NoOptimization flag is set in the method's metadata, or if a COMPLUS_ variable requests it. Presumably the JIT could make those checks, but on the other hand it's probably good to leave a way for the VM to force "bulletproof low-risk" compilation in the interface.

AFAICT, we aren't really using the SIZE_OPT and SPEED_OPT flags (they seem to only be set in response to COMPLUS variables), and could remove them.

Presumably we'll want to add a "compile the code as quickly as possible" flag on the jit side, using it for tier 0 and maybe cctors.
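
One way to picture "the interface communicates context, the JIT decides policy" is the sketch below. Everything in it is hypothetical: `Request`, `CompileContext`, `Policy`, `vmForcesMinOpt`, and `choosePolicy` are illustrative names, not part of the JIT-EE interface, and treating tier 1 the same as single-shot is an assumption since the thread leaves open whether tier 1 should be more aggressive.

```cpp
#include <cassert>

namespace sketch {

// Context the VM would communicate; the JIT owns the discretionary decisions.
enum class Request { SingleShot, Tier0, Tier1 };

struct CompileContext {
    Request request;
    bool forceMinOpt;  // the VM's way to demand "bulletproof low-risk" compilation
    bool isCctor;
};

enum class Policy { Bulletproof, FastJit, Default };

// VM side: MIN_OPT is requested by method metadata (NoOptimization) or by configuration.
inline bool vmForcesMinOpt(bool methodHasNoOptimization, bool configRequestsMinOpt) {
    return methodHasNoOptimization || configRequestsMinOpt;
}

// JIT side: maps the communicated context to a compilation policy.
inline Policy choosePolicy(const CompileContext& ctx) {
    if (ctx.forceMinOpt) {
        return Policy::Bulletproof;  // the VM override always wins
    }
    if (ctx.request == Request::Tier0 || ctx.isCctor) {
        return Policy::FastJit;      // "compile as quickly as possible" (cctors: "maybe")
    }
    // Assumption: tier 1 and single-shot share the default policy in this sketch.
    return Policy::Default;
}

// Small self-check showing that the VM override takes precedence over the tier.
inline void selfCheck() {
    assert(choosePolicy({Request::Tier0, true, false}) == Policy::Bulletproof);
}

}  // namespace sketch
```

Keeping the override as a separate boolean, rather than folding it into `Request`, reflects the point above that the VM should retain a way to force low-risk compilation regardless of tier.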

@JosephTremoulet JosephTremoulet removed their assignment Sep 14, 2017
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
@BruceForstall BruceForstall removed JitUntriaged CLR JIT issues needing additional triage enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue labels Nov 24, 2020

6 participants