Update optimization flags/levels in jit interface #7751

JosephTremoulet · 2017-03-29T16:46:54Z

With AOT scenarios now including ngen and crossgen and ready-to-run, desktop and coreclr and corert, and with work in flight on IBC and tiered jitting, it's a good time to take another look at the flags used to control which optimizations the jit performs, and try to get a set in place that will give us the right interface going forward to evolve our codegen/policies in the different scenarios.

category:design
theme:jit-ee-interface
skill-level:beginner
cost:small
impact:small

JosephTremoulet · 2017-03-29T16:47:59Z

The current setup works like so:

When the EE asks the JIT to compile a method, it passes CorJitFlags, which include independently-settable:
- CORJIT_FLAG_SPEED_OPT
- CORJIT_FLAG_SIZE_OPT
- CORJIT_FLAG_MIN_OPT "disable optimization"
- CORJIT_FLAG_DEBUG_CODE "no code mangling"
The JIT translates these to JitFlags; the mapping is 1:1
The JIT then translates these into three independent things:
- "debuggable code" vs not
- Which optimizations to run. We set this to CLFLG_MINOPT, which is defined
  as CLFLG_TREETRANS, if MIN_OPT was requested or debuggable code was
  requested or we're jitting a cctor and the cctor isn't an inlinee;
  otherwise, we set this to CLFLG_MAXOPT, which is defined as all of them:
  - CLFLG_REGVAR
  - CLFLG_RNGCHKOPT
  - CLFLG_DEADASGN
  - CLFLG_CODEMOTION
  - CLFLG_QMARK
  - CLFLG_TREETRANS
  - CLFLG_INLINING
  - CLFLG_CONSTANTFOLD
  - CLFLG_STRUCTPROMOTE
- Whether to favor size, speed, or "blended" (setting both flags would just
  result in optimizing for size, and cctors are forced to always optimize
  for size)
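
The translation described above could be sketched roughly as follows. This is only an illustration, not the JIT's actual code: the bit values, the `CompileRequest` struct, and the `selectOptFlags` / `selectCodePreference` helpers are hypothetical, and only the CLFLG_* names and the decision rules come from the description above.

```cpp
#include <cstdint>

// Bit assignments are illustrative assumptions, not the JIT's real values.
enum : uint32_t {
    CLFLG_REGVAR        = 1u << 0,
    CLFLG_RNGCHKOPT     = 1u << 1,
    CLFLG_DEADASGN      = 1u << 2,
    CLFLG_CODEMOTION    = 1u << 3,
    CLFLG_QMARK         = 1u << 4,
    CLFLG_TREETRANS     = 1u << 5,
    CLFLG_INLINING      = 1u << 6,
    CLFLG_CONSTANTFOLD  = 1u << 7,
    CLFLG_STRUCTPROMOTE = 1u << 8,

    // MINOPT is just tree transformations; MAXOPT is every optimization above.
    CLFLG_MINOPT = CLFLG_TREETRANS,
    CLFLG_MAXOPT = CLFLG_REGVAR | CLFLG_RNGCHKOPT | CLFLG_DEADASGN |
                   CLFLG_CODEMOTION | CLFLG_QMARK | CLFLG_TREETRANS |
                   CLFLG_INLINING | CLFLG_CONSTANTFOLD | CLFLG_STRUCTPROMOTE,
};

enum class CodePreference { Blended, Small, Fast };

// Hypothetical summary of what the JIT knows about one compile request.
struct CompileRequest {
    bool speedOpt  = false;  // CORJIT_FLAG_SPEED_OPT
    bool sizeOpt   = false;  // CORJIT_FLAG_SIZE_OPT
    bool minOpt    = false;  // CORJIT_FLAG_MIN_OPT
    bool debugCode = false;  // CORJIT_FLAG_DEBUG_CODE
    bool isCctor   = false;  // method is a class constructor
    bool isInlinee = false;  // method is being compiled as an inlinee
};

// Which optimizations to run: MINOPT for min-opts, debuggable code,
// or a cctor that is not an inlinee; MAXOPT otherwise.
inline uint32_t selectOptFlags(const CompileRequest& r) {
    const bool standaloneCctor = r.isCctor && !r.isInlinee;
    if (r.minOpt || r.debugCode || standaloneCctor) {
        return CLFLG_MINOPT;
    }
    return CLFLG_MAXOPT;
}

// Size vs. speed: size wins when both flags are set, and cctors are
// forced to size (assumed here to apply to every cctor).
inline CodePreference selectCodePreference(const CompileRequest& r) {
    if (r.sizeOpt || r.isCctor) {
        return CodePreference::Small;
    }
    if (r.speedOpt) {
        return CodePreference::Fast;
    }
    return CodePreference::Blended;
}
```

Keeping the two decisions in separate functions mirrors the point that they are independent outputs derived from the same incoming flags.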

JosephTremoulet · 2017-03-29T16:54:21Z

Pulling from the list of flags today and some of the discussion around tiered jitting, we know that the following are some of the goals that we may have for code to be compiled:

- Make sure stepping through code in the debugger will work well (we have this today in the "debuggable code" flag)
- Make sure the jit is using the most straightforward/bulletproof lowering (we have this today in minopts)
- Make sure the jit is going to spend as little time as possible jitting (we don't have this today, and will want this for fjit)
- Make sure the jit generates code that will run as fast as possible (we don't have this today, and will want this for re-jitting hot kernels and/or build-lab and/or IBC)
- Make the jit try to strike the best balance it can for the single-shot scenario (this is the main scenario today)
- Make sure the jit is sensitive to code size (we have this today in favor size / favor speed / blended)

Questions to consider: Have I left some out? Will the picture change if we get to the point that tiered compilation allows on-stack replacement and we can perform speculative optimizations, or would the above goals just combine orthogonally with a set of allowable speculative assumptions?

JosephTremoulet · 2017-03-29T17:02:11Z

If the above list is sufficient, it seems to me like it could be represented with eight distinct states:

1: Use straightforward lowering that is bulletproof and facilitates stepping through in the debugger
2: Optimize for high throughput
3 - 8: The six combinations of:
- Optimization "level" (default / high-CQ)
- Size/speed sensitivity (size-sensitive / blend / size-insensitive)

BruceForstall · 2017-03-29T17:12:48Z

Another design goal is generally simplicity: reduce the number of combinations that must be tested.

nit: "debuggable" is more than just "facilitates stepping through in the debugger", it also includes accurate variable value access (extend variable lifetimes, make sure debugger knows where the variables live). We also should improve debuggability in the presence of optimization to support live-site attach or dump debugging.

JosephTremoulet · 2017-03-29T17:39:01Z

> We also should improve debuggability in the presence of optimization to support live-site attach or dump debugging.

Good point; sometimes we'll want to change what optimizations we run to improve debuggability (item 1 in my list above), but sometimes we'll want the same optimizations but also to generate debug info as well as we can, so ISTM there should also be a "generate debug info" flag in the interface that's orthogonal to the optimization flags.

mikedn · 2017-03-29T18:02:36Z

> Make sure the jit is using the most straightforward/bulletproof lowering (we have this today in minopts)

Small observation - there's stuff going on in lowering that is certainly not required but is done without regard to minopts, magic division for example. It behaved like this before I moved the code around and I always wondered why it doesn't bail out in minopts mode.

noahfalk · 2017-03-29T23:54:31Z

> there should also be a "generate debug info" flag in the interface that's orthogonal to the optimization flags.

Is there any scenario in which we don't want debug info? I assume that nearly every scenario we have is of the form "I want to perform 0 or more optimizations that might degrade the debugging experience, but aside from forced losses in optimization, give me the best debugging experience that remains possible"

I'd guess that a flag for 'generate debug info' will always be true, in which case we could save some complexity/test time by eliminating it as a free variable.

jkotas · 2017-03-30T00:08:02Z

The flag for 'generate debug info' exists already (CORJIT_FLAG_DEBUG_INFO), and it is always set by the VM for the reasons that @noahfalk mentioned.

JosephTremoulet · 2017-03-30T00:17:43Z

> Is there any scenario in which we don't want debug info? ... I'd guess that a flag for 'generate debug info' will always be true

> The flag for 'generate debug info' exists already (CORJIT_FLAG_DEBUG_INFO), and it is always set by the VM

Interesting... I was mainly assuming we'd want this because we added it already, and (IIUC) both cl and csc make generation of PDBs (which can be quite large, especially with optimized code) optional. But if we don't think it's worth it for e.g. footprint-constrained deployments, or carry-over from .Net Native (what does it do in this regard?), then "we could save some complexity/test time by eliminating it as a free variable" sounds fine to me.

JosephTremoulet · 2017-03-30T00:26:27Z

I'm also wondering if we should have a dimension for signalling tiered jitting vs single-shot jitting vs "build lab" (by which I mean AOT compilation with some sort of opt-in for more throughput-intensive optimizations)... it's entirely conceivable, for example, that the 2nd round of tiered jitting would want to be a bit more aggressive than single-shot jitting is, or that some AOT scenarios want to be more aggressive still but not to the point of wanting to "turn the dial to 11" like tiered jitting of very hot kernels may want... my inclination is to go with @BruceForstall's point that "Another design goal is generally simplicity: reduce the number of combinations that must be tested" and avoid adding a dimension like that until/unless we have concrete plans to use it for something, so have the "first stab" use the matrix outlined above, but I'm curious what others think.

jkotas · 2017-03-30T00:46:59Z

We had knobs for controlling whether or not to generate debug info in .NET Framework 1.0. It made things like attaching a debugger to an existing process or dump debugging work poorly. We turned it on by default in .NET Framework 2.0, together with some work to minimize the debug info overhead, so that these things just work. The key value prop of .NET is that things work pretty well by default (and you are willing to pay a bit for it) and that you don't have to think about tough choices like not being able to debug vs. having a big native PDB around.

I think it would be fine to have a switch that disables generation of debug info for measurements and experiments; but the mainstream .NET experience should be debuggable by default.

> .Net Native (what does it do in this regard?)

.Net Native is on the big PDB plan only today. We got feedback that it is not what people expect from .NET. There is work to make it better, e.g. make Environment.StackTrace work well even without the big PDB around.

JosephTremoulet · 2017-03-31T14:55:04Z

Per discussion in dotnet/coreclr#10580, we'll adopt the goal of consolidating the discretionary policy decision-making on the JIT side of the interface, which means that the interface is really about communicating context to the jit.

To that end, dotnet/coreclr#10580 will add flags to identify tier 0 and tier 1 compilation requests (and distinguish them from single-shot compilation requests).

Currently, the VM sets the MIN_OPT flag if the NoOptimization flag is set in the method's metadata, or if a COMPLUS_ variable requests it. Presumably the JIT could make those checks, but on the other hand it's probably good to leave a way for the VM to force "bulletproof low-risk" compilation in the interface.

AFAICT, we aren't really using the SIZE_OPT and SPEED_OPT flags (they seem to only be set in response to COMPLUS variables), and could remove them.

Presumably we'll want to add a "compile the code as quickly as possible" flag on the jit side, using it for tier 0 and maybe cctors.
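
As a rough sketch of what "communicating context" might look like, the VM could pass facts about the request and leave the policy choice to the JIT. Everything below is hypothetical: `RequestKind`, `JitContext`, `Policy`, `choosePolicy`, and the `quickJitCctors` knob are invented names, and treating tier 1 the same as single-shot is just a placeholder since that choice is still open.

```cpp
// Hypothetical names throughout; this is not the actual JIT-EE interface.
enum class RequestKind { SingleShot, Tier0, Tier1 };

// Context the VM would communicate; it states facts, not policy.
struct JitContext {
    RequestKind kind = RequestKind::SingleShot;
    bool forceMinOpt = false;  // VM override: NoOptimization metadata or a COMPLUS_ variable
    bool isCctor     = false;
};

// Policy choices that would live on the JIT side of the interface.
enum class Policy { Bulletproof, QuickJit, Balanced };

// quickJitCctors models the "maybe cctors" part as an explicit knob.
inline Policy choosePolicy(const JitContext& ctx, bool quickJitCctors = false) {
    if (ctx.forceMinOpt) {
        return Policy::Bulletproof;  // the VM can still force low-risk compilation
    }
    if (ctx.kind == RequestKind::Tier0) {
        return Policy::QuickJit;     // compile as quickly as possible
    }
    if (ctx.isCctor && quickJitCctors) {
        return Policy::QuickJit;
    }
    return Policy::Balanced;         // single-shot and (for now) tier 1
}
```

The VM override is checked first so that a forced "bulletproof" request wins regardless of tier.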

russellhadley assigned JosephTremoulet Mar 29, 2017

JosephTremoulet removed their assignment Sep 14, 2017

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

AndyAyersMS mentioned this issue Feb 21, 2020

Add support in crossgen2 for 32-byte alignment #32602

Merged

BruceForstall added the JitUntriaged label Oct 28, 2020

BruceForstall removed the JitUntriaged, enhancement, optimization, and tenet-performance labels Nov 24, 2020
