[x86] stress failure in RayTracer.GetNaturalColor with DOTNET_JitStress=2 #102590

Closed
VSadov opened this issue May 23, 2024 · 3 comments
Labels: arch-x86, area-CodeGen-coreclr, Known Build Error

VSadov (Member) commented May 23, 2024

Assert failure(PID 14856 [0x00003a08], Thread: 2656 [0x0a60]): !CREATE_CHECK_STRING(pMT && pMT->Validate())

CORECLR! Object::ValidateInner + 0x14B (0x7436d48b)
CORECLR! Object::Validate + 0x98 (0x7436d308)
CORECLR! WKS::GCHeap::Promote + 0x8F (0x746c0c1f)
CORECLR! GcEnumObject + 0x57 (0x744866c7)
CORECLR! EnumGcRefsX86 + 0xC70 (0x74298a50)
CORECLR! EECodeManager::EnumGcRefs + 0x114 (0x74297d94)
CORECLR! GcStackCrawlCallBack + 0x156 (0x74486936)
CORECLR! Thread::MakeStackwalkerCallback + 0x48 (0x743ad355)
CORECLR! Thread::StackWalkFramesEx + 0x16C (0x743ae72f)
CORECLR! Thread::StackWalkFrames + 0x15D (0x743ae53f)
    File: E:\dotnet004\runtime\src\coreclr\vm\object.cpp:553
    Image: E:\dotnet004\runtime\artifacts\tests\coreclr\windows.x86.Checked\Tests\Core_Root\corerun.exe

To reproduce, run
runtime\artifacts\tests\coreclr\windows.x86.Checked\JIT\Performance\JIT.performance\JIT.performance.cmd
with the following environment variables:

DOTNET_GCStress=0xC
DOTNET_JitStress=2
DOTNET_ReadyToRun=0
DOTNET_TieredCompilation=0

Removing DOTNET_JitStress=2 changes the codegen considerably and makes the failure disappear, so the failure may be specific to JitStress.

It may be necessary to run the test several times (3-5, maybe more) before the failure shows up, so I'd run it in a loop.
Perhaps some uninitialized data is involved, which would explain the nondeterminism.
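
Since the failure is intermittent, the repro loop could be scripted. The following is a minimal sketch, not a tool from the repo: the helper name `run_until_failure` and the `max_runs` parameter are made up for illustration, and it assumes the test wrapper script exits with a non-zero code when the run fails.

```python
import os
import subprocess
import sys

# Environment variables copied from the report above.
STRESS_ENV = {
    "DOTNET_GCStress": "0xC",
    "DOTNET_JitStress": "2",
    "DOTNET_ReadyToRun": "0",
    "DOTNET_TieredCompilation": "0",
}


def run_until_failure(command, max_runs=20, extra_env=STRESS_ENV):
    """Run `command` repeatedly with the stress variables set.

    Returns the 1-based index of the first run that exits with a non-zero
    code, or None if all `max_runs` runs exit with 0.
    (Hypothetical helper; assumes a non-zero exit code means failure.)
    """
    env = dict(os.environ)   # keep the caller's environment
    env.update(extra_env)    # layer the stress settings on top
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(command, env=env)
        if result.returncode != 0:
            return attempt
    return None
```

On a Windows x86 Checked build this might be invoked as `run_until_failure(["cmd", "/c", r"runtime\artifacts\tests\coreclr\windows.x86.Checked\JIT\Performance\JIT.performance\JIT.performance.cmd"])`, with the path adjusted to the local enlistment.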

The failing instruction is the last one in this listing (marked with an arrow):

G_M31285_IG01:        ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
       push     ebp
       mov      ebp, esp
       push     edi
       push     esi
       push     ebx
       sub      esp, 160
       mov      gword ptr [ebp-0x9C], ecx
                    ; GC ptr vars +{V00}
       mov      bword ptr [ebp-0xA0], edx
                    ; GC ptr vars +{V01}
       mov      eax, gword ptr [ebp+0x08]
                    ; gcrRegs +[eax]
       vmovups  xmm0, xmmword ptr [ebp+0x24]
                                                ;; size=32 bbWeight=1 PerfScore 10.50
G_M31285_IG02:        ; bbWeight=1, gcVars=0000000000000900 {V00 V01}, gcrefRegs=00000001 {eax}, byrefRegs=00000000 {}, gcvars, byref
       vxorps   xmm3, xmm3, xmm3
                                                ;; size=4 bbWeight=1 PerfScore 0.33
G_M31285_IG03:        ; bbWeight=1, extend
       vmovups  xmmword ptr [ebp-0x50], xmm3
                                                ;; size=5 bbWeight=1 PerfScore 1.00
G_M31285_IG04:        ; bbWeight=1, extend
       mov      gword ptr [ebp+0x08], eax
                                                ;; size=3 bbWeight=1 PerfScore 1.00
G_M31285_IG05:        ; bbWeight=1, extend
       mov      edx, gword ptr [eax+0x08]
                    ; gcrRegs +[edx]
                                                ;; size=3 bbWeight=1 PerfScore 2.00
G_M31285_IG06:        ; bbWeight=1, extend
       mov      gword ptr [ebp-0xA4], edx
                    ; GC ptr vars +{V08}
                                                ;; size=6 bbWeight=1 PerfScore 1.00
G_M31285_IG07:        ; bbWeight=1, extend
       xor      ecx, ecx
                                                ;; size=2 bbWeight=1 PerfScore 0.25
G_M31285_IG08:        ; bbWeight=1, extend
       cmp      dword ptr [edx+0x04], 0
                                                ;; size=4 bbWeight=1 PerfScore 3.00
G_M31285_IG09:        ; bbWeight=1, extend
       jle      G_M31285_IG175
                                                ;; size=6 bbWeight=1 PerfScore 1.00
G_M31285_IG10:        ; bbWeight=4, gcVars=0000000000000940 {V00 V01 V08}, gcrefRegs=00000004 {edx}, byrefRegs=00000000 {}, gcvars, byref
                    ; gcrRegs -[eax]
       mov      dword ptr [ebp-0x10], ecx
                                                ;; size=3 bbWeight=4 PerfScore 4.00
G_M31285_IG11:        ; bbWeight=4, extend
       mov      ebx, gword ptr [edx+4*ecx+0x08]
                    ; gcrRegs +[ebx]
                                                ;; size=4 bbWeight=4 PerfScore 8.00
G_M31285_IG12:        ; bbWeight=4, extend
       vmovsd   xmm4, qword ptr [ebx+0x04]
                                                ;; size=5 bbWeight=4 PerfScore 16.00
G_M31285_IG13:        ; bbWeight=4, extend
       vinsertps xmm4, xmm4, dword ptr [ebx+0x0C], 40
                                                ;; size=7 bbWeight=4 PerfScore 12.00
G_M31285_IG14:        ; bbWeight=4, extend
       vsubps   xmm4, xmm4, xmm0
                                                ;; size=4 bbWeight=4 PerfScore 12.00
G_M31285_IG15:        ; bbWeight=4, extend
       vdpps    xmm5, xmm4, xmm4, 127
                                                ;; size=6 bbWeight=4 PerfScore 48.00
G_M31285_IG16:        ; bbWeight=4, extend
       vcvtss2sd xmm5, xmm5, xmm5
                                                ;; size=4 bbWeight=4 PerfScore 16.00
G_M31285_IG17:        ; bbWeight=4, extend
       vsqrtsd  xmm5, xmm5, xmm5
                                                ;; size=4 bbWeight=4 PerfScore 48.00
G_M31285_IG18:        ; bbWeight=4, extend
       vmovsd   qword ptr [ebp-0x98], xmm5
                                                ;; size=8 bbWeight=4 PerfScore 4.00
G_M31285_IG19:        ; bbWeight=4, extend
       vcvtsd2ss xmm6, xmm6, xmm5
                                                ;; size=4 bbWeight=4 PerfScore 16.00
G_M31285_IG20:        ; bbWeight=4, extend
       vxorps   xmm7, xmm7, xmm7
                                                ;; size=4 bbWeight=4 PerfScore 1.33
G_M31285_IG21:        ; bbWeight=4, extend
       vucomiss xmm6, xmm7
                                                ;; size=4 bbWeight=4 PerfScore 8.00
G_M31285_IG22:        ; bbWeight=4, isz, extend
       jp       SHORT G_M31285_IG24
                                                ;; size=2 bbWeight=4 PerfScore 4.00
G_M31285_IG23:        ; bbWeight=4, isz, extend
       je       SHORT G_M31285_IG27
                                                ;; size=2 bbWeight=4 PerfScore 4.00
G_M31285_IG24:        ; bbWeight=2, gcrefRegs=00000008 {ebx}, byrefRegs=00000000 {}, byref
                    ; gcrRegs -[edx]
       vmovss   xmm7, dword ptr [@RWD00]
                                                ;; size=8 bbWeight=2 PerfScore 6.00
G_M31285_IG25:        ; bbWeight=2, extend
       vdivss   xmm6, xmm7, xmm6
                                                ;; size=4 bbWeight=2 PerfScore 20.00
G_M31285_IG26:        ; bbWeight=2, isz, extend
       jmp      SHORT G_M31285_IG28
                                                ;; size=2 bbWeight=2 PerfScore 4.00
G_M31285_IG27:        ; bbWeight=2, gcrefRegs=00000008 {ebx}, byrefRegs=00000000 {}, byref
       vmovss   xmm6, dword ptr [@RWD04]
                                                ;; size=8 bbWeight=2 PerfScore 6.00
G_M31285_IG28:        ; bbWeight=4, gcrefRegs=00000008 {ebx}, byrefRegs=00000000 {}, byref
       vcvtss2sd xmm6, xmm6, xmm6
                                                ;; size=4 bbWeight=4 PerfScore 16.00
G_M31285_IG29:        ; bbWeight=4, extend
       vcvtsd2ss xmm6, xmm6, xmm6
                                                ;; size=4 bbWeight=4 PerfScore 16.00
G_M31285_IG30:        ; bbWeight=4, extend
       vbroadcastss xmm6, xmm6
                                                ;; size=5 bbWeight=4 PerfScore 4.00
G_M31285_IG31:        ; bbWeight=4, extend
       vmulps   xmm4, xmm6, xmm4
                                                ;; size=4 bbWeight=4 PerfScore 12.00
G_M31285_IG32:        ; bbWeight=4, extend
       vmovups  xmmword ptr [ebp-0x80], xmm4
                                                ;; size=5 bbWeight=4 PerfScore 4.00
G_M31285_IG33:        ; bbWeight=4, extend
       xor      eax, eax
                                                ;; size=2 bbWeight=4 PerfScore 1.00
G_M31285_IG34:        ; bbWeight=4, extend
       lea      edi, bword ptr [ebp-0x28]
                    ; byrRegs +[edi]
                                                ;; size=3 bbWeight=4 PerfScore 2.00
G_M31285_IG35:        ; bbWeight=4, extend
       mov      dword ptr [edi], eax
                                                ;; size=2 bbWeight=4 PerfScore 4.00
G_M31285_IG36:        ; bbWeight=4, extend
       mov      esi, 20
                                                ;; size=5 bbWeight=4 PerfScore 1.00
G_M31285_IG37:        ; bbWeight=4, gcrefRegs=00000008 {ebx}, byrefRegs=00000080 {edi}, byref
       mov      dword ptr [edi+esi], eax
                                                ;; size=3 bbWeight=4 PerfScore 4.00
G_M31285_IG38:        ; bbWeight=4, extend
       sub      esi, 4
                                                ;; size=3 bbWeight=4 PerfScore 1.00
G_M31285_IG39:        ; bbWeight=4, isz, extend
       jne      SHORT G_M31285_IG37
                                                ;; size=2 bbWeight=4 PerfScore 4.00
G_M31285_IG40:        ; bbWeight=4, extend
       vmovups  xmmword ptr [ebp+0x24], xmm0
                                                ;; size=5 bbWeight=4 PerfScore 4.00
G_M31285_IG41:        ; bbWeight=4, extend
       vmovsd   qword ptr [ebp-0x28], xmm0                        <--- crash here: an untracked local is reported to the GC, which sees a bad object
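
As a reading aid for the `gcrefRegs=` / `byrefRegs=` annotations in the listing, here is a small sketch that decodes those masks into register names. It assumes bit i of the mask corresponds to the i-th x86 register in encoding order (eax, ecx, edx, ebx, esp, ebp, esi, edi); that assumption agrees with the four mask/name pairs printed in the dump, but it is inferred from the listing rather than taken from the JIT sources. The function name is hypothetical.

```python
# x86 integer registers in instruction-encoding order.
# Assumption: bit i of a gcrefRegs/byrefRegs mask stands for X86_REGS[i].
X86_REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_reg_mask(mask):
    """Return the register names whose bits are set in `mask`."""
    return [name for bit, name in enumerate(X86_REGS) if mask & (1 << bit)]


# Mask values as they appear in the listing above.
for label, mask in [("IG02 gcrefRegs", 0x00000001),
                    ("IG10 gcrefRegs", 0x00000004),
                    ("IG24 gcrefRegs", 0x00000008),
                    ("IG37 byrefRegs", 0x00000080)]:
    print(label, decode_reg_mask(mask))
```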

Since this is in the middle of a SIMD instruction sequence, I wonder whether it is related to the recent vectorizing optimizations for struct copying/initialization.
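
For anyone tracing the listing by hand, the scalar zeroing sequence in IG34 through IG39, which runs just before the failing store, can be simulated to see which stack slots it clears. This is only a sketch for following the disassembly, not a diagnosis of the bug; the function name and parameters are made up for illustration.

```python
def trace_zeroing(base=-0x28, start=20, step=4):
    """Return the sorted ebp-relative offsets of the dwords set to zero.

    Mirrors the listing: edi = ebp-0x28 (IG34), one explicit store at [edi]
    (IG35), then a countdown loop storing at [edi+esi] (IG36 to IG39).
    """
    written = [base]                   # IG35: mov dword ptr [edi], eax
    esi = start                        # IG36: mov esi, 20
    while True:
        written.append(base + esi)     # IG37: mov dword ptr [edi+esi], eax
        esi -= step                    # IG38: sub esi, 4
        if esi == 0:                   # IG39: jne loops back while esi != 0
            break
    return sorted(written)


offsets = trace_zeroing()
print([hex(o) for o in offsets])       # six dwords starting at ebp-0x28
print(len(offsets) * 4, "bytes zeroed")
```

Under these assumptions the sequence clears 24 bytes starting at ebp-0x28, and the failing `vmovsd` then writes the first 8 bytes of that same region.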

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=685688
Build error leg or test failing: JIT.performance.0.1
Pull request: #102415

Error Message

{
  "ErrorMessage": "Item 'JIT.performance' did not finish running",
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=685688
Error message validated: [Item 'JIT.performance' did not finish running]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 5/26/2024 12:59:42 AM UTC

Report

| Build | Definition | Test | Pull Request |
|---|---|---|---|
| 742830 | dotnet/runtime | JIT.performance.WorkItemExecution | #104906 |
| 742514 | dotnet/runtime | JIT.performance.WorkItemExecution | #104944 |
| 742451 | dotnet/runtime | JIT.performance.WorkItemExecution | #104906 |
| 741624 | dotnet/runtime | JIT.performance.WorkItemExecution | #104906 |
| 739394 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 732644 | dotnet/runtime | JIT.performance.WorkItemExecution | #103837 |
| 732378 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 732173 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 732165 | dotnet/runtime | JIT.performance.WorkItemExecution | #104517 |
| 732116 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 732101 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 732053 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 732003 | dotnet/runtime | JIT.performance.WorkItemExecution | #104103 |
| 731745 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 731069 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 730950 | dotnet/runtime | JIT.performance.WorkItemExecution | #104445 |
| 727224 | dotnet/runtime | JIT.performance.WorkItemExecution | #104288 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 0 | 0 | 17 |

dotnet-issue-labeler bot added the area-CodeGen-coreclr label on May 23, 2024
dotnet-policy-service bot added the untriaged label on May 23, 2024

VSadov (Member, Author) commented May 23, 2024

CC: @EgorBo

VSadov added the arch-x86 and Known Build Error labels on May 23, 2024
JulieLeeMSFT added the Priority:2 label and removed the untriaged label on Jun 3, 2024
JulieLeeMSFT added this to the 9.0.0 milestone on Jun 3, 2024

EgorBo (Member) commented Jul 18, 2024

"ErrorMessage": "Item 'JIT.performance' did not finish running",

is too generic a pattern and attracts unrelated build failures here. I wasn't able to find any related GC assert in any of the builds listed in the table, nor was I able to reproduce it locally: I have been running JIT.performance on win-x86 in a loop with the env vars you listed, with no hits so far. Perhaps it has already been fixed since the end of May.
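
To illustrate why the generic message attracts unrelated failures, here is a small sketch of known-issue matching. It assumes that `ErrorMessage` is matched as a plain substring and `ErrorPattern` as a regular expression; that is my understanding of the known-issues tooling, not something stated in this issue. The two log excerpts and the function name are invented for the example, and the stricter pattern simply reuses the assert text from the original report.

```python
import re

GENERIC_MESSAGE = "Item 'JIT.performance' did not finish running"
ASSERT_TEXT = "!CREATE_CHECK_STRING(pMT && pMT->Validate())"

# Hypothetical log excerpts, made up for illustration only.
gc_assert_log = "Assert failure: " + ASSERT_TEXT + "\n" + GENERIC_MESSAGE
unrelated_log = "Work item timed out on the agent\n" + GENERIC_MESSAGE


def matches_known_issue(log_text, error_message="", error_pattern=""):
    """Assumed semantics: substring match for the message, regex for the pattern."""
    if error_message and error_message in log_text:
        return True
    if error_pattern and re.search(error_pattern, log_text):
        return True
    return False


# The generic message matches both logs, including the unrelated one.
print(matches_known_issue(gc_assert_log, error_message=GENERIC_MESSAGE))
print(matches_known_issue(unrelated_log, error_message=GENERIC_MESSAGE))

# A pattern built from the assert text matches only the GC assert log.
specific = re.escape(ASSERT_TEXT)
print(matches_known_issue(gc_assert_log, error_pattern=specific))
print(matches_known_issue(unrelated_log, error_pattern=specific))
```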

EgorBo removed the Priority:2 label on Jul 18, 2024

EgorBo (Member) commented Aug 1, 2024

Closing due to 0 hits in the last 7 days. I bet it was fixed by Jakob's PR.

EgorBo closed this as completed on Aug 1, 2024
github-actions bot locked and limited conversation to collaborators on Aug 31, 2024