Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable _tzcnt_u32/_tzcnt_u64 for ARM64EC #6257

Closed
wants to merge 1 commit into from
Closed

Conversation

mcfi
Copy link
Contributor

@mcfi mcfi commented Aug 4, 2022

ARM64EC doesn't support AVX intrinsics including _tzcnt_u32/_tzcnt_u64, which are mapped to AVX2 instruction tzcnt. Therefore, neuter such intrinsics for ARM64EC compilation.

ARM64EC doesn't support AVX intrinsics including _tzcnt_u32/_tzcnt_u64, which are mapped to AVX2 instruction tzcnt. Therefore, neuter such intrinsics for ARM64EC compilation.
@waywardmonkeys
Copy link
Contributor

Would it make sense to change the logic so that this code only happens for the right processors instead of trying to flag all of the ones where it shouldn't, which is more fragile?

If so, then we could also add sections for the ARM64 platforms with MSVC to use the right intrinsics there (https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170).

@mcfi
Copy link
Contributor Author

mcfi commented Aug 5, 2022

Arm/arm64 doesn’t have same similar intrinsics now, so it’s not possible to check arm64 cpu capability and use intrinsics.

@waywardmonkeys
Copy link
Contributor

ah, right, we only look for trailing zeros, not leading.

The other part is valid though, we can change:

#if (defined(__LP64__) || defined(_WIN64)) && !defined(_M_ARM) && !defined(_M_ARM64) && !defined(_M_ARM64EC)

to check for _M_X64 instead of !_M_ARM && !_M_ARM64 && !_M_ARM64EC.

And similar for the other one ...

@waywardmonkeys
Copy link
Contributor

But now I also see that both x86_64 and arm64 can be defined at once in https://techcommunity.microsoft.com/t5/windows-kernel-internals-blog/getting-to-know-arm64ec-defines-and-intrinsic-functions/ba-p/2957235 ...

It just seems like we can make this a bit less fragile here.

@NikolajBjorner
Copy link
Contributor

Would it make sense to change the logic so that this code only happens for the right processors instead of trying to flag all of the ones where it shouldn't, which is more fragile?

Right

@@ -50,7 +50,7 @@ Revision History:

#if defined(__GNUC__)
#define _trailing_zeros32(X) __builtin_ctz(X)
#elif defined(_WINDOWS) && !defined(_M_ARM) && !defined(_M_ARM64) && !defined(__MINGW32__)
#elif defined(_WINDOWS) && !defined(_M_ARM) && !defined(_M_ARM64) && !defined(_M_ARM64EC) && !defined(__MINGW32__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be defined(_WINDOWS) && (defined(_X86/64) || ...)

@@ -62,7 +62,7 @@ static uint32_t _trailing_zeros32(uint32_t x) {
}
#endif

#if (defined(__LP64__) || defined(_WIN64)) && !defined(_M_ARM) && !defined(_M_ARM64)
#if (defined(__LP64__) || defined(_WIN64)) && !defined(_M_ARM) && !defined(_M_ARM64) && !defined(_M_ARM64EC)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also I would then set the architecture conservatively.

@NikolajBjorner
Copy link
Contributor

I think I should set the architecture conservatively to handle this bug/pull

NikolajBjorner added a commit that referenced this pull request Aug 5, 2022
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
@mcfi
Copy link
Contributor Author

mcfi commented Aug 5, 2022

Note that in ARM64EC compilation, both the _M_X64 and _M_ARM64EC macros are defined. To really only limit the scope of the tzcnt intrinsics, you need to check (defined(_M_X86) || defined(_M_X64) && !defined(_M_ARM64EC)) if you want a more conservative change. Thanks.

NikolajBjorner added a commit that referenced this pull request Aug 5, 2022
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants