Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for genReg1/genReg2->SIMD8 store on x86 windows. #52581

Merged
merged 3 commits into from
May 13, 2021

Conversation

sandreenko
Copy link
Contributor

Fixes #51506

@sandreenko sandreenko added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 11, 2021
@sandreenko sandreenko self-assigned this May 11, 2021
// and needs to be assembled into a single xmm register,
// note we can't check reg0=EAX, reg1=EDX because they could be already moved.

inst_RV_RV(ins_Copy(targetReg, TYP_FLOAT), targetReg, reg0, TYP_INT);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use the last argument on x86 except for check that this is not a byte register. On other platforms, we need type only for size reasons (emitActualTypeSize(type)) so it does not really matter if we do inst_RV_RV(ins_Copy(targetReg, TYP_FLOAT), targetReg, reg0, TYP_INT) or inst_RV_RV(ins_Copy(targetReg, TYP_FLOAT), targetReg, reg0, TYP_FLOAT) here. However, looking at the other examples I think it expects the type of the source operand, not the dest.
Example:

inst_RV_RV(ins_Copy(op1loReg, TYP_FLOAT), targetReg, op1loReg, TYP_INT);

but it would be nice to get a confirmation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its meant to be used as inst_RV_RV(ins_Copy(srcReg, tgtType), tgtReg, srcReg, srcType)

For ins_Copy(srcReg, tgtType), its validated that srcReg and tgtType are "different" (meaning one is floating-point and one is integral). The tgtType is then used to determine if int->float or float->int and the correct instruction is selected (this is just movd on x86, but fmov or mov on arm64).

For srcType, this is merely used to pick the right emit attribute so that encoding/disassembly will be correct. In the case of int32<->float it doesn't matter since both are EA_4BYTE, but the typical usage so far has just been srcType.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sandreenko - Could you please update the method definition of inst_RV_RV() and rename the parameter from type -> srcType in that case, for future maintainability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sandreenko - Could you please update the method definition of inst_RV_RV() and rename the parameter from type -> srcType in that case, for future maintainability?

Sure

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I didn't explain enough. To clarify its not necessarily srcType for all instructions, just for the ins_Copy(srcReg, tgtType) instructions.

There can be instructions where the emitAttr needs to be from the dstType or where its something else. It really depends on the particular instruction and what attribute it needs to correctly encode itself.

@sandreenko
Copy link
Contributor Author

PTAL @echesakov , @dotnet/jit-contrib . The change is similar to #46899

@sandreenko sandreenko requested a review from echesakov May 11, 2021 14:47
Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment.

src/coreclr/jit/codegenxarch.cpp Outdated Show resolved Hide resolved
@sandreenko
Copy link
Contributor Author

/azp run runtime-coreclr crossgen2-composite

the pipeline is very red, but lets see if Vector2_3_4 gets fixed in ci.

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dotnet dotnet deleted a comment from azure-pipelines bot May 11, 2021
@dotnet dotnet deleted a comment from azure-pipelines bot May 11, 2021
@sandreenko
Copy link
Contributor Author

I was able to repro it without crossgen2 using the same test with complus_JitNoStructPromotion=1, now I have checked that the new code gives us the correct results (with and without SSE4.1).

Ready for review.

Note that crossgen2 throws away the produces code and when we corerun the test we compile RunVector2Tests from scratch but it is not related to this issue, FYI @trylek, @davidwrighton .

@trylek
Copy link
Member

trylek commented May 11, 2021

Thanks Sergey for sharing the additional details. Have you been by any chance able to identify what is the reason the Crossgen2 code for the method is not being used at runtime? Do we skip it in Crossgen2 compilation? (Probably not, otherwise we wouldn't be hitting the SIMD JIT assertion.) Is that getting thrown out at runtime by some of the method fixup checks?

@sandreenko
Copy link
Contributor Author

Thanks Sergey for sharing the additional details. Have you been by any chance able to identify what is the reason the Crossgen2 code for the method is not being used at runtime? Do we skip it in Crossgen2 compilation? (Probably not, otherwise we wouldn't be hitting the SIMD JIT assertion.) Is that getting thrown out at runtime by some of the method fixup checks?

I see the method in the "composite-r2r.dll" r2rdump, thanks for providing the command for it. So runtime throws it away, not sure why.

A quick debugging shows that it is rejected here:

#ifdef FEATURE_READYTORUN
if (IsDynamicMethod() && GetLoaderModule()->IsSystem() && MayUsePrecompiledILStub())

this condition returns false so we don't try to load the R2R code.
It is either IsDynamicMethod() or GetLoaderModule()->IsSystem() but I don't have a debug VM to tell which of them rejects it.

@sandreenko
Copy link
Contributor Author

/azp run runtime

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

// Arguments:
// ins - the instruction to generate;
// reg1 - the first register to use, the dst for most instructions;
// tree - the second register to use, the src for most instructions;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tree => reg2

@sandreenko
Copy link
Contributor Author

The tests passed in https://dev.azure.com/dnceng/public/_build/results?buildId=1136599&view=results, "6 failing and 96 successful checks" is just another infra bug.

@sandreenko sandreenko merged commit d77854a into dotnet:main May 13, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jun 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Crossgen2] compiler crash in Interop\PInvoke\Vector2_3_4\Vector2_3_4 on x86
6 participants