Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify JIT label handling #51208

Merged
merged 2 commits into from
Apr 15, 2021

Conversation

BruceForstall
Copy link
Member

Remove the BBF_JMP_TARGET flag that was set early and attempted
to be maintained all through compilation. Instead, use
BBF_USE_LABEL to indicate to the emitter where we need labels.

Also, stop setting and maintaining BBF_USE_LABEL early.

Add a pass over the blocks when preparing for codegen that sets
most of the necessary BBF_USE_LABEL flags. This flag will never
be set before codegen. A few places set the flag after codegen starts,
namely genCreateTempLabel and BBJ_COND handling for alignment.
Note that this flag must be set before the block is processed for
codegen (and an insGroup is created).

Together, these changes make it easier to work with the flow graph
without worrying about maintaining these bits of information through
various optimizations.

Add a few more details about alignment processing to the dump.

There are a few code asm diffs due to alignment processing not previously
creating a label to help compute how large a loop is.

There are a lot of textual asm diffs due to there being (mostly)
fewer labels, plus some additional insGroup output. This can happen if
a block was labeled with BBF_JMP_TARGET or BBF_USE_LABEL before,
but didn't need to be, perhaps after some optimizations. Now, the flag is
never added in the first place.

There are a large number of GC info diffs. Labels are where GC info state
changes are recorded between codegen and the emitter. If we eliminate an
unnecessary emitter label, then we also eliminate a capture of the current
codegen GC state. Since the emitter is lazy at marking GC deaths, this
means that we see a lot of lengthened GC lifetimes -- until the next
label, or some other cause of GC kill. Often, you see a register kill
followed by register birth just disappear, and the register is maintained
alive across the interim.

Remove the BBF_JMP_TARGET flag that was set early and attempted
to be maintained all through compilation. Instead, use
BBF_USE_LABEL to indicate to the emitter where we need labels.

Also, stop setting and maintaining BBF_USE_LABEL early.

Add a pass over the blocks when preparing for codegen that sets
most of the necessary BBF_USE_LABEL flags. This flag will never
be set before codegen. A few places set the flag after codegen starts,
namely `genCreateTempLabel` and BBJ_COND handling for alignment.
Note that this flag must be set before the block is processed for
codegen (and an insGroup is created).

Together, these changes make it easier to work with the flow graph
without worrying about maintaining these bits of information through
various optimizations.

Add a few more details about alignment processing to the dump.

There are a few code asm diffs due to alignment processing not previously
creating a label to help compute how large a loop is.

There are a lot of textual asm diffs due to there being (mostly)
fewer labels, plus some additional insGroup output. This can happen if
a block was labeled with `BBF_JMP_TARGET` or `BBF_USE_LABEL` before,
but didn't need to be, perhaps after some optimizations. Now, the flag is
never added in the first place.

There are a large number of GC info diffs. Labels are where GC info state
changes are recorded between codegen and the emitter. If we eliminate an
unnecessary emitter label, then we also eliminate a capture of the current
codegen GC state. Since the emitter is lazy at marking GC deaths, this
means that we see a lot of lengthened GC lifetimes -- until the next
label, or some other cause of GC kill. Often, you see a register kill
followed by register birth just disappear, and the register is maintained
alive across the interim.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 14, 2021
@BruceForstall
Copy link
Member Author

@AndyAyersMS @kunalspathak @dotnet/jit-contrib PTAL

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

// an EH region, and the block that immediately follows a region. Go through the EH
// table and mark all these blocks with BBF_HAS_LABEL to make this happen.
//
// This code is closely couple with genReportEH() in the sense that any block
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This code is closely couple with genReportEH() in the sense that any block
// This code is closely coupled with genReportEH() in the sense that any block

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback around loop alignment related changes. Could you also update your PR description to say BBF_HAS_LABEL instead of BBF_USE_LABEL?

if (block->bbNext != nullptr)
{
JITDUMP("Mark " FMT_BB " as label: alignment end-of-loop\n", block->bbNext->bbNum);
block->bbNext->bbFlags |= BBF_HAS_LABEL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be an assert that block->bbNext->bbFlags contains BBF_HAS_LABEL already? In what cases it won't have that flag? I see a comment in genMarkLabelsForCodegen that says (fall-through flow doesn't require a label)...was that always true or is that what we will start seeing with your change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is a case where the following block will not necessarily have a label. It was always true that fall-through flow doesn't require a label, but before there were more cases where it "accidentally" did have a label. The alignment code never specified that it needed a label, so it would calculate loop sizes based on whatever happened to be the next label. After my change, we have fewer labels, so some loops ended up "too large" for alignment. Here, I force the label to exist when we know alignment wants one, for calculating loop sizes.

The one possibility is I could move this setting to genMarkLabelsForCodegen; I decided against it, but it doesn't really matter. Maybe centralizing the logic would be better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so some loops ended up "too large" for alignment.

Yes, we should definitely compare loopsAligned before and after to make sure we are aligning right amount of loops with this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Over aspnet/benchmarks/libraries.crossgen/tests SPMI collections, I see loops/align candidates/loops aligned of 6135/1550/647 for baseline and the same bug 646 loops aligned for diff (my change). So it appears that exactly one fewer loop was aligned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Did you figure out the CI failure and is it related?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion was due to the JIT deciding a loop was no longer a loop, then compacting the preheader with the former loop head, moving the align bit with the merge (with my new code). But since the (former) loop head no longer was the target of any branch, it didn't get a label. Solution: remove the align bit when this optimization happens.

if (bNext->isLoopAlign())
{
// Only if the new block is jump target or has label
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this condition because I did see cases where we were wrongly adding the BBF_LOOP_ALIGN flag on blocks that no longer constitute a loop. Perhaps, the failures that you are seeing in CI run is related to that. You might also want to double check the loopsAligned stats before vs. after your change because after relaxing this condition, we might be aligning too many loops (some of them are not valid candidates anymore).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that the number of loop aligned changes with my changes to the set of labels applied, so any before/after will mostly (I think) not be related to this case.

We don't have labels at this point in compilation, so we can't do the check here that was previously done.

I don't quite understand why this check was there. Say we're compacting BB1 and BB2. And BB2 is marked BBF_LOOP_ALIGN, meaning it's the top of a loop. Wouldn't the compacted BB1+BB2 also be the top of a loop? Is this a case where your recent change to fgUpdateLoopsAfterCompacting would fix any problem that you were seeing before?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly, it was the case for which you are seeing the CI failures (I may be wrong). In the original code, you might add an assert around that code and you should find cases where the hypothesis of compacting loopTop is not true.

    if (bNext->isLoopAlign())
    {
        // Only if the new block is jump target or has label
        if (((block->bbFlags & BBF_JMP_TARGET) != 0) || ((block->bbFlags & BBF_HAS_LABEL) != 0))
        {
            block->bbFlags |= BBF_LOOP_ALIGN;
            JITDUMP("Propagating LOOP_ALIGN flag from " FMT_BB " to " FMT_BB " during compacting.\n", bNext->bbNum,
                    block->bbNum);
        }
        else
       {
         assert(!"This is the case");
       }
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No SPMI diffs with that assert. I triggered a PR with the assert to see if it occurs elsewhere: #51290

@BruceForstall
Copy link
Member Author

/azp run runtime-coreclr outerloop

@BruceForstall
Copy link
Member Author

/azp run runtime-coreclr jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

JITDUMP("\n");

#if FEATURE_LOOP_ALIGN
if (begBlk->isLoopAlign())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch, I think I missed this earlier although there was one rare case that I handled in :

runtime/src/coreclr/jit/morph.cpp

Lines 16607 to 16610 in f958f2b

#if FEATURE_LOOP_ALIGN
optLoopTable[loopNum].lpFirst->bbFlags &= ~BBF_LOOP_ALIGN;
JITDUMP("Removing LOOP_ALIGN flag from bogus loop in " FMT_BB "\n",
optLoopTable[loopNum].lpFirst->bbNum);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just sent out draft PR #51313 to test how often we retain align flag on blocks that are no longer loop. Probably not because we might have hit an assert during codegen, but just want to double check.

@BruceForstall
Copy link
Member Author

All stress failures are from known problems.

@BruceForstall BruceForstall merged commit a2e36a7 into dotnet:main Apr 15, 2021
@BruceForstall BruceForstall deleted the RemoveBBF_JMP_TARGET branch April 15, 2021 16:45
@JulieLeeMSFT JulieLeeMSFT added this to the 6.0.0 milestone Apr 15, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants