Coredistools failure to disassemble vextracti64x4 #84967

BruceForstall · 2023-04-18T06:01:18Z

SuperPMI is reporting an asm diff in:

JIT.HardwareIntrinsics.General._Vector512_1.VectorGetAndWithLowerAndUpper__GetAndWithLowerAndUpperByte:RunBasicScenario():this

but no textual diff, for both x64 and x86. For x64, it is in 236d7997-3d71-45f9-b7d7-5241ad89a56f.windows.x64\coreclr_tests.run.windows.x64.checked.mch.

The superpmi log shows:

[10:34:40] ERROR: Decode Failure Left@ offset 1fb4f77e456
[10:34:40] ERROR: Decode Failure Right@ offset 1fb52ec6816
[10:34:40] ERROR: Decode Failure Left@ offset 1fb4f77e456
[10:34:40] ERROR: Decode Failure Right@ offset 1fb52ec6816

The issue is that coredistools is failing to disassemble:

62D3BD483BF001       vextracti64x4 ymm8, zmm6, 1

coredistools was last updated based on LLVM 13.0.1 in March 2022 with dotnet/jitutils#351 (and related). The most recent version is LLVM 16.0.0 released March 2023 (https://releases.llvm.org/). Does it have a fix for this?

I verified that this disassembles correctly in Visual Studio.

Are there other cases where vextracti64x4 is properly decoded so it's not all cases, just some?

If we can't update coredistools with a fix, we'll have to do something so SuperPMI doesn't display spurious diffs. E.g., filter out the failing MCs.

@dotnet/avx512-contrib @dotnet/jit-contrib

The text was updated successfully, but these errors were encountered:

BruceForstall · 2023-04-19T03:19:11Z

I built a coredistools.dll updated to the latest LLVM, 16.0.1, and this failure persists.

tannergooding · 2023-04-19T03:48:25Z

This looks to be an actual bug in the emitted codegen, noting the emitted BD vs the correct FD in the 3rd byte.

Emitted: 62D3BD483BF001
Correct: 62D3FD483BF001

The technical encoding is: EVEX.512.66.0F3A.W1 3B /r ib. where the EVEX encoding is 0x62 P0 P1 P2:

* P0: R  X  B  R' 0  0  m  m
* P1: W  v  v  v  v  1  p  p
* P2: z  L' L  b  V' a  a  a

We've encoded (where P1E is emitted, P1C is correct):

* P0:  1 1 0 1  0 0 1 1
* P1E: 1 0 1 1  1 1 0 1
* P1C: 1 1 1 1  1 1 0 1
* P2:  0 1 0 0  1 0 0 0

So we're simply misencoding the VEX.vvvv register here. It's meant to be stored in bit inverted format and combined with the V' bit, we've got 1_0111 emitted, which is 8.

The error comes in that vextract doesn't use the vvvv field at all, and so it must be 1_1111. Instead since the instruction is ins reg/mem, reg, imm the mod/rm field is used to store the destination.

It's not immediately obvious where we're going wrong here and we may actually have a related issue with the AVX vextractf128 and vextracti128

tannergooding · 2023-04-19T04:11:37Z

So we do document the format incorrectly as IF_RRW_RRW_CNS when it should be IF_RWR_RRD_CNS. This is a long-standing issue and something we have wrong for a couple instructions, but one that doesn't cause any problems in practice.

The issue ends up being that the instruction is incorrectly tracking the IsDstDstSrcAVXInstruction flag, which means we double encode what is supposed to be just regFor012Bits also in the regFor3456Bits. This is unique to the Avx512 extract instructions and isn't an issue for the Avx ones after all

Fix should just be removing the flag. We should also probably audit the IsDstDstSrc and IsDstSrcSrc flags for all SIMD instructions. This is mostly needed for the new the AVX512 ones, but we've recently audited the other flags in general, so it wouldn't hurt to finish doing the same here.

BruceForstall · 2023-04-19T16:44:25Z

It was confusing because both VS and windbg disassemble this instruction as we expect to see it. Presumably because they ignore the field that is incorrectly encoded?

However, when I run this program (JIT\HardwareIntrinsics\General\Vector512_1\Vector512_1_ro\Vector512_1_ro.dll), with DOTNET_TieredCompilation=0 I get an illegal instruction exception (on both AMD 7950X and Intel Xeon Platinum 8370C).

So is this test case actually running in the CI? It must be if the test ended up in the spmi coreclr_tests.run collection.

Is it running but not correctly reporting the error/failure?

tannergooding · 2023-04-19T17:42:11Z

So is this test case actually running in the CI?

The encoding is only incorrect when we have vextracti64x4 reg, reg. If we instead contain it as part of a store and emit vextracti64x4 [addr], reg we don't have any issues.

Up until yesterday we were also only running Vector512 tests in Pri1, so its possible that we weren't seeing this in CI because of that and that it will now be showing up in the outerloop jobs.

BruceForstall · 2023-04-19T17:57:00Z

The latest outerloop job (from Apr 19, 2023 at 2:00 AM PDT) has a few failures in AVX tests, but not in this one:

https://dev.azure.com/dnceng-public/public/_build/results?buildId=244156&view=ms.vss-test-web.build-test-results-tab

However, when I look at the console log for the HardwareIntrinsics_General_ro test, I see the failure:

11:03:55.549 Running test: _Vector512_1_ro::JIT.HardwareIntrinsics.General._Vector512_1.Program.GetAndWithLowerAndUpperByte()
Beginning scenario: RunBasicScenario
Fatal error. System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
   at JIT.HardwareIntrinsics.General._Vector512_1.VectorGetAndWithLowerAndUpper__GetAndWithLowerAndUpperByte.RunBasicScenario()
   at JIT.HardwareIntrinsics.General._Vector512_1.Program.GetAndWithLowerAndUpperByte()
   at Program.<<Main>$>g__TestExecutor26|0_25(System.IO.StreamWriter, System.IO.StreamWriter, <>c__DisplayClass0_0 ByRef)
   at Program.<Main>$(System.String[])
App Exit Code: -1073741795
Expected: 100
Actual: -1073741795
END EXECUTION - FAILED
FAILED
[XUnitLogChecker]: Item 'HardwareIntrinsics_General_ro' did not finish running. Checking and fixing the log...
[XUnitLogChecker]: XUnit log file has been fixed!

1259/2363 tests run.
* 1259 tests passed.
* 0 tests failed.
* 0 tests skipped.

[XUnitLogChecker]: Checking for dumps...
[XUnitLogChecker]: No crash dumps found. Continuing...
[XUnitLogChecker]: Finished!
ERROR: The process "corerun.exe" not found.
2023-04-19T11:03:58.485Z	INFO   	run.py	run(48)	main	Beginning reading of test results.
2023-04-19T11:03:58.485Z	INFO   	run.py	__init__(42)	read_results	Searching 'C:\h\w\B51E0A11\w\AD88096E\e' for test results files
2023-04-19T11:03:58.485Z	INFO   	run.py	__init__(48)	read_results	Found results file C:\h\w\B51E0A11\w\AD88096E\e\JIT\HardwareIntrinsics\HardwareIntrinsics_General_ro\HardwareIntrinsics_General_ro.testResults.xml with format xunit
2023-04-19T11:03:58.501Z	INFO   	run.py	__init__(42)	read_results	Searching 'C:\h\w\B51E0A11\w\AD88096E\uploads' for test results files
2023-04-19T11:03:58.501Z	INFO   	run.py	packing_test_reporter(30)	report_results	Packing 1259 test reports to 'C:\h\w\B51E0A11\w\AD88096E\e\__test_report.json'
2023-04-19T11:03:58.501Z	INFO   	run.py	packing_test_reporter(33)	report_results	Packed 281920 bytes
['HardwareIntrinsics_General_ro' END OF WORK ITEM LOG: Command exited with 0]

https://helixre107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-heads-main-abd91328c2f94763aa/HardwareIntrinsics_General_ro/1/console.b75ffbb8.log?helixlogtype=result

BUT, notice that even though the test failed, the harness returns 0 -- no failure.

Possibly this is a XUnitLogChecker problem?

=======

btw, the tests are invoked with:

call JIT\HardwareIntrinsics\HardwareIntrinsics_General_ro\HardwareIntrinsics_General_ro.cmd -usewatcher

I don't have a HardwareIntrinsics_General_ro.cmd in my locally built artifacts, even though I built pri-1 tests. Why?

I DO have artifacts\tests\coreclr\windows.x64.Checked\JIT\HardwareIntrinsics\General\Vector512_1\Vector512_1_ro\Vector512_1_ro.cmd, for example.

tannergooding · 2023-04-19T18:10:06Z

Possibly this is a XUnitLogChecker problem?

CC. @trylek are there any known issues here? Could we be missing reported tests failures in some cases?

Perhaps we are looking for # failed and not accounting for tests which didn't run, crashes, bad exit codes, or some timeouts?

tannergooding · 2023-04-19T18:11:37Z

I don't have a HardwareIntrinsics_General_ro.cmd in my locally built artifacts, even though I built pri-1 tests. Why?

Did you do anything related to BuildAsStandalone? I think that may stop the merged test wrappers from being built.

BruceForstall · 2023-04-19T18:12:05Z

Also, @ivdiazsa -- it looks like XUnitLogChecker is ignoring the test run exit code

BruceForstall · 2023-04-19T18:12:49Z

Did you do anything related to BuildAsStandalone? I think that may stop the merged test wrappers from being built.

Good point. I always build with BuildAsStandalone=true

BruceForstall · 2023-04-19T18:24:31Z

I opened #85056 to track the CI infrastructure problem (of not reporting this test failure)

markples · 2023-04-19T19:14:05Z

This is the case where the failure takes down the entire process. Presumably the test that you're looking for didn't run. We don't have the wrapper logic to attempt to rerun the rest of the test group after the failure. Mitigations are (1) Disable (or mark as RequiresProcessIsolation) catastrophic failures asap to unblock other testing, and (2) keep a balanced merged test group size (# number of 100s) to keep the wins from merging but reduce the collateral damage from these failures.

I see the logic in src/tests/Directory.Build.targets that disables merged test groups under BuildAsStandalone. It's possible that this isn't necessary. Similar to RequiresProcessIsolation, this generates a .cmd/.sh/local entry point, but a merged group can still execute it out-of-process. I guess often it wouldn't be useful since you can run the test standalone.

tannergooding · 2023-04-19T19:20:10Z

This is the case where the failure takes down the entire process

Shouldn't the harness try to detect this and mark itself as failing? It looks like we do capture the test process having a bad exit code and that we didn't run everything...

markples · 2023-04-19T19:24:11Z

There's an issue where one level needs to report at "success" to avoid helix retrying the job. This isn't an area that I understand - those you've already tagged (plus maybe @jkoritzinsky?) should know more.

BruceForstall · 2023-04-19T19:48:41Z

Presumably the test that you're looking for didn't run

@markples There are multiple things going on here, but the core problem is the important test here, that took down the merged group, did not get reported as failing (and in fact the entire group did not get reported as failing).

ivdiazsa · 2023-04-19T20:18:12Z

Bruce is right. It's a bug with the log checker I'm already looking into. I have no idea why it's overriding the test's exit code. It might have to do with the contents of the HelixCommand script, as the log checker is the last thing to get called.

BruceForstall added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI avx512 Related to the AVX-512 architecture labels Apr 18, 2023

BruceForstall added this to the 8.0.0 milestone Apr 18, 2023

BruceForstall self-assigned this Apr 18, 2023

BruceForstall mentioned this issue Apr 18, 2023

Update coredistools dotnet/jitutils#372

Closed

This was referenced Apr 19, 2023

Remove more unnecessary scenarios from HWIntrinsic test templates and fix timeout/failure #85026

Merged

Ensure vextractf64x4 and vextracti64x4 aren't marked DstDstSrc #85030

Merged

ghost added the in-pr There is an active PR which will close this issue when it is merged label Apr 19, 2023

BruceForstall mentioned this issue Apr 19, 2023

Test failures not reported in test run #85056

Closed

tannergooding closed this as completed in #85030 Apr 19, 2023

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 19, 2023

ghost locked as resolved and limited conversation to collaborators May 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Coredistools failure to disassemble vextracti64x4 #84967

Coredistools failure to disassemble vextracti64x4 #84967

BruceForstall commented Apr 18, 2023

BruceForstall commented Apr 19, 2023

tannergooding commented Apr 19, 2023

tannergooding commented Apr 19, 2023 •

edited

Loading

BruceForstall commented Apr 19, 2023

tannergooding commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

tannergooding commented Apr 19, 2023

tannergooding commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

markples commented Apr 19, 2023

tannergooding commented Apr 19, 2023

markples commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

ivdiazsa commented Apr 19, 2023

Coredistools failure to disassemble vextracti64x4 #84967

Coredistools failure to disassemble vextracti64x4 #84967

Comments

BruceForstall commented Apr 18, 2023

BruceForstall commented Apr 19, 2023

tannergooding commented Apr 19, 2023

tannergooding commented Apr 19, 2023 • edited Loading

BruceForstall commented Apr 19, 2023

tannergooding commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

tannergooding commented Apr 19, 2023

tannergooding commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

markples commented Apr 19, 2023

tannergooding commented Apr 19, 2023

markples commented Apr 19, 2023

BruceForstall commented Apr 19, 2023

ivdiazsa commented Apr 19, 2023

tannergooding commented Apr 19, 2023 •

edited

Loading