
Mono JIT/HardwareIntrinsics - Massive failures #75606

Closed
karelz opened this issue Sep 14, 2022 · 13 comments · Fixed by #75768
Labels
area-Codegen-JIT-mono blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms'
karelz (Member) commented Sep 14, 2022

63 failures in each run; a recent regression as of 9/13.

Platform: mono Linux x64 Release @ Ubuntu.1804.Amd64.Open

Last 30 days - JIT.HardwareIntrinsics.X86.Ssse3 in Runfo as of 9/14

  • First failure in PR 14818 on 9/13
  • First rolling run affected: 15289

More JIT.HardwareIntrinsics.* work items are affected.

Example of error in Console log:

    JIT/HardwareIntrinsics/X86/Ssse3/Ssse3_r/Ssse3_r.sh [FAIL]
      
      Return code:      1
      Raw output file:      /datadisks/disk1/work/B0980979/w/B4440A48/uploads/Reports/JIT.HardwareIntrinsics/X86/Ssse3/Ssse3_r/Ssse3_r.output.txt
      Raw output:
      BEGIN EXECUTION
      /datadisks/disk1/work/B0980979/p/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false Ssse3_r.dll ''
      Supported ISAs:
        AES:       True
        AVX:       False
        AVX2:      False
        AVXVNNI:   False
        BMI1:      True
        BMI2:      True
        FMA:       False
        LZCNT:     True
        PCLMULQDQ: True
        POPCNT:    True
        SSE:       True
        SSE2:      True
        SSE3:      True
        SSE4.1:    True
        SSE4.2:    True
        SSSE3:     True
        X86Serialize: False
      
      Beginning test case Abs.Byte at 9/13/2022 10:36:21 PM
      Random seed: 20010415; set environment variable CORECLR_SEED to this value to repro
      
      Beginning scenario: RunBasicScenario_UnsafeRead
      Beginning scenario: RunBasicScenario_Load
      Beginning scenario: RunBasicScenario_LoadAligned
      Beginning scenario: RunReflectionScenario_UnsafeRead
      ERROR!!!-System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.

Report summary: 24-hour hit count 0, 7-day hit count 0, 1-month hit count 0.
@karelz karelz added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' area-Codegen-JIT-mono labels Sep 14, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 14, 2022
karelz (Member, Author) commented Sep 14, 2022

This is significantly blocking CI (19 impacted runs in ~1 day). We need to roll back the change that caused it or disable the tests ASAP.

@SamMonoRT @BrzVlad @fanyang-mono can you please help route it?

karelz (Member, Author) commented Sep 14, 2022

BTW: I think I saw in one of the logs that the inner exception was PlatformNotSupportedException, but I can't find it now.

@karelz karelz changed the title Mono JIT - Massive failures Mono JIT/HardwareIntrinsics - Massive failures Sep 14, 2022
@SamMonoRT SamMonoRT added this to the 8.0.0 milestone Sep 14, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Sep 14, 2022
SamMonoRT (Member) commented:
@tannergooding @fanyang-mono - can you confirm these failures are not related to #75470

tannergooding (Member) commented:
@SamMonoRT, Int128 has no relation to the HardwareIntrinsics tests; neither uses the other in any way.

The Mono llvmaot failures look to be an issue with RunReflectionScenario and are likely related to all the new SIMD bring-up work Mono has been doing.

I'd guess there isn't handling for the recursive pattern that the intrinsics currently follow.
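For context, the "recursive pattern" refers to how the hardware intrinsic APIs are declared in System.Runtime.Intrinsics: the managed body of each intrinsic simply calls itself, and a JIT that recognizes the intrinsic replaces the call with the underlying instruction. A simplified sketch (illustrative only, not the actual dotnet/runtime source; `Ssse3Sketch` is a hypothetical stand-in):

```csharp
using System.Runtime.Intrinsics;

public abstract class Ssse3Sketch
{
    // The managed body "recursively" calls itself. A JIT that recognizes
    // the intrinsic replaces this call with the PABSB instruction; a
    // runtime that cannot is expected to throw PlatformNotSupportedException
    // rather than ever executing the body (which would recurse forever).
    public static Vector128<byte> Abs(Vector128<sbyte> value) => Abs(value);
}
```

This also fits the error in the log: RunReflectionScenario invokes the intrinsic via MethodInfo.Invoke, which wraps whatever the callee throws in a TargetInvocationException, consistent with karelz's recollection of a PlatformNotSupportedException inner exception.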

tannergooding (Member) commented:
It might be related to #75438 or one of the other recent Mono SIMD changes.

danmoseley (Member) commented:
It seems from the above that these run on PR validation? So unless there's flakiness, it should be possible to figure out which PR first merged with these failures.

fanyang-mono (Member) commented:
Right, I am working on it now.

BrzVlad (Member) commented Sep 14, 2022

Here is the timeline of some issues related to simd intrinsics.

There were some failures on Android which I investigated; #74797 was a fix for them that worked by disabling intrinsics. No LLVM failures were noticed on that PR, which was merged to main roughly one week ago. I then tried to backport it to .NET 7 and noticed these same failures on the backport PR. Testing my change locally, I saw it was introducing failures that were awkward to fix, so I gave up on the backport.

Yesterday I disabled that original change via #75438, so behavior is as before. I tested again locally to see the impact of my revert, and again saw that the revert fixes the intrinsics issues for me, so I don't understand why there are still failures on CI today. Since my PR didn't have failures but the backport PR did, I don't fully trust what is going on.

Note that locally I didn't run the XUnit wrapper; I just ran the runtime tests one at a time while AOT-compiling SPC.dll and System.Runtime.Intrinsics.dll with LLVM.

fanyang-mono (Member) commented:
These are the PRs merged between a good rolling build (20220913.4) and the current bad rolling build (20220913.80):
6214022...2bc4f61

The failures are on x64 with the Mono runtime. I suspect that #75464 might have caused this. @lateralusX, any thoughts?

lateralusX (Member) commented Sep 14, 2022

The code in #75464 is guarded by the MONO_ARCH_CODE_EXEC_ONLY/MONO_VALIDATE_PLT_ENTRY_INDEX defines, which are not currently set on any platform that we run on CI.

fanyang-mono (Member) commented:
I verified that these test failures were exposed by #75438 but introduced by some other PR, because #75438 re-enables intrinsics support on non-full-AOT Mono. The failures were probably introduced by one of @matouskozak's AMD64 intrinsics PRs merged during the past week. I am currently working with @matouskozak on a fix.

The reason it wasn't caught during PR validation is that a recent change (#74601) caused the problematic CI lane (mono llvmaot Pri0 Runtime Tests Run Linux x64 release) to be skipped on Mono PR validations when it was not supposed to be. @radical is currently working on a fix for that and has opened PR #75645.

fanyang-mono (Member) commented:
The failures were actually not introduced by @matouskozak's PRs. They were caused by #75055, which disabled LLVM for the JIT fallback.

fanyang-mono (Member) commented Sep 23, 2022

I have confirmed that these tests are no longer failing on CI.

Here is the full story:
The test failures were actually caused by #75055. In that PR, the LLVM version was upgraded to 14, Mono's LLVM JIT stopped working (#75757 tracks the work to fix it), and the LLVM JIT CI lanes disabled the LLVM JIT fallback. The reason it wasn't caught by PR validation is that before this PR, Vlad had made a change (#74797) which disabled intrinsics for the LLVM JIT, so the tests passed even though the LLVM JIT fallback was disabled. As Vlad mentioned earlier, that change was later reverted by #75438, and I've confirmed that these tests started to fail right after that revert. The reason it wasn't caught by the revert PR's validation is that Ankit's CI optimization PR (#74601) caused some Mono runtime test CI lanes to not run against Mono source code changes. That issue has been fixed, together with my PR to disable these failing tests.

@ghost ghost locked as resolved and limited conversation to collaborators Oct 23, 2022
7 participants