Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use CopyContext to restore saved context on X86 #65490

Merged
merged 3 commits into from
Feb 18, 2022
Merged

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Feb 17, 2022

Fixes: #65292

@jkotas
Copy link
Member

jkotas commented Feb 17, 2022

Any chance we can switch to RtlRestoreContext on x86 and make it more similar to how other platforms work?

@adamsitnik
Copy link
Member

@VSadov I have built your PR and I can't repro the issue anymore (AMD, x86, Win11) 👍

@filipnavara
Copy link
Member

I've backported the fix on top of release/6.0 branch and built it locally. I can also confirm that the issue no longer reproduces.

//
// REVIEW: CopyContext may fail. in theory. How should we handle that? FailFast?
CONTEXT* pTarget = pExcepPtrs->ContextRecord;
CopyContext(pTarget, pTarget->ContextFlags, pCtx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems it would make sense to modify the ReplaceExceptionContextRecord to use CopyContext instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure if ReplaceExceptionContextRecord can be used outside of Windows. Can it?

Copy link
Member

@filipnavara filipnavara Feb 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not readily obvious but it's always guarded by HOST_WINDOWS and/or !TARGET_UNIX so it looks Windows-only. CopyContext is available on Windows 7 SP1+ which is the minimum target at the moment so that should be fine too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if ReplaceExceptionContextRecord needs to handle extended context at all if suspension resume switches completely to the RtlRestoreContext plan.

Copy link
Member Author

@VSadov VSadov Feb 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure if ReplaceExceptionContextRecord can be used outside of Windows. Can it?

I have tried moving the fix into ReplaceExceptionContextRecord and that did not build on Linux, so that answers this.

@VSadov
Copy link
Member Author

VSadov commented Feb 17, 2022

Any chance we can switch to RtlRestoreContext on x86 and make it more similar to how other platforms work?

I think it is worth trying. The reason for current approach could be compatibility with no longer interesting platforms. (Win95?)

Edit:
The actual reason is that RtlRestoreContext became available on x86 relatively recently and I am not yet sure how far back the support goes - i.e. is it available on Windows7 SP1 ?

@filipnavara
Copy link
Member

Any chance we can switch to RtlRestoreContext on x86 and make it more similar to how other platforms work?

Seems reasonable going forward but I assume that this will need servicing for 6.0 and it looks like too big of a change for backporting?

@VSadov
Copy link
Member Author

VSadov commented Feb 17, 2022

I have looked into unifying on use of RtlRestoreContext and it is promising, but not completely straightforward.

The level of API that we are targeting does not include RtlRestoreContext on x86. We would have to dynamically query/bind to it and use if present. We may also have to keep the old codepath, unless we find some other pattern that is guaranteed to work on all platforms that we support.

I think we might want to take the current fix first though - to unblock clean CI, since the issue causes random failures.

@VSadov VSadov marked this pull request as ready for review February 17, 2022 22:47
@VSadov
Copy link
Member Author

VSadov commented Feb 17, 2022

CC: @davidwrighton

// these contexts may contain extended registers and may have different format
// for reasons such as alignment or context compaction
//
// REVIEW: CopyContext may fail. in theory. How should we handle that? FailFast?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, FailFast is appropriate for failure here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, be careful calling new functions from this function. If they have destructors and the compiler chooses to inline them, the runtime will start crashing.

Copy link
Member Author

@VSadov VSadov Feb 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of code like:

            if (!CopyContext(pTarget, pTarget->ContextFlags, pCtx))
            {
                STRESS_LOG1(LF_SYNC, LL_ERROR, "ERROR: Could not set context record, lastError = 0x%x\n", GetLastError());
                ThrowLastError();
            }

Would that fit here?

It is highly unlikely that copy would fail, but it'd be nice to have some diagnostics if that start happenning.

Copy link
Member

@davidwrighton davidwrighton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a quick fix to unblock testing, I approve of this, but I'm not confident this is the right long term fix. IMO, we should change to RtlRestoreContext on operating systems that support it, and keep the existing horrifying logic for the older OS. Then of course, once we have that, we should backport the fix to servicing.

@davidwrighton
Copy link
Member

PR feedback update looks good to me.

if (!CopyContext(pTarget, pTarget->ContextFlags, pCtx))
{
STRESS_LOG1(LF_SYNC, LL_ERROR, "ERROR: Could not set context record, lastError = 0x%x\n", GetLastError());
ThrowLastError();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ThrowLastError();
EEPOLICY_HANDLE_FATAL_ERROR(COR_E_EXECUTIONENGINE);

The caller does not expect exceptions. This needs to be failfast. Throwing exception is likely going to lead to hard to diagnose crash otherwise.

@@ -2509,6 +2509,18 @@ void RedirectedThreadFrame::ExceptionUnwind()
#ifndef TARGET_UNIX

#ifdef TARGET_X86

// a helper to make sure these calls are not inlined into a filter function.
// in particular we do not want a possibility of inlined destructor calls.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be a concern once you change it to EEPOLICY_HANDLE_FATAL_ERROR. EEPOLICY_HANDLE_FATAL_ERROR is non-inlineable.

@jkotas
Copy link
Member

jkotas commented Feb 18, 2022

As a quick fix to unblock testing, I approve of this, but I'm not confident this is the right long term fix.

+1. It is likely that the OS treatment of context will evolving over time that is going to break the exception filter hack again.

@VSadov
Copy link
Member Author

VSadov commented Feb 18, 2022

It is likely that the OS treatment of context will evolving over time that is going to break the exception filter hack again.

Agree on this as well.

@VSadov
Copy link
Member Author

VSadov commented Feb 18, 2022

Thanks!!!

agocke pushed a commit to agocke/runtime that referenced this pull request Mar 8, 2022
* Use CopyContext to restore saved context on X86

* PR feedback

* more PR feedback
carlossanlop pushed a commit that referenced this pull request Mar 8, 2022
…#66120)

* Use CopyContext to restore saved context on X86 (#65490)

* Use CopyContext to restore saved context on X86

* PR feedback

* more PR feedback

* Do not copy XState other than AVX when redirecting for GC stress (#65825)

* Do not copy XState other than AVX

* #if defined(TARGET_X86) || defined(TARGET_AMD64)

* mask XState unconditionally

* Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext

* redundant supportsAVX, more clear comment

* PR feedback

* null-check the redirect context before using. (#65910)

* null-check the redirect context before using.

* tweak the comment

* do not allocate context if InitializeContext has unexpected results.

* EE Suspension on x86 should use RtlRestoreContext when available (#65878)

* RestoreContextSimulated

* probe for RtlRestoreContext

* ntdll.dll

* restore self-trap sequence

* PR feedback

* Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter

* simpler indentation.

* restore last error on the legacy path.

* Update src/coreclr/vm/threads.h

Co-authored-by: Dan Moseley <danmose@microsoft.com>

Co-authored-by: Dan Moseley <danmose@microsoft.com>
carlossanlop pushed a commit that referenced this pull request Mar 15, 2022
* Use CopyContext to restore saved context on X86 (#65490)

* Use CopyContext to restore saved context on X86

* PR feedback

* more PR feedback

* Do not copy XState other than AVX when redirecting for GC stress (#65825)

* Do not copy XState other than AVX

* #if defined(TARGET_X86) || defined(TARGET_AMD64)

* mask XState unconditionally

* Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext

* redundant supportsAVX, more clear comment

* PR feedback

* null-check the redirect context before using. (#65910)

* null-check the redirect context before using.

* tweak the comment

* do not allocate context if InitializeContext has unexpected results.

* EE Suspension on x86 should use RtlRestoreContext when available (#65878)

* RestoreContextSimulated

* probe for RtlRestoreContext

* ntdll.dll

* restore self-trap sequence

* PR feedback

* Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter

* simpler indentation.

* restore last error on the legacy path.

* Update src/coreclr/vm/threads.h

Co-authored-by: Dan Moseley <danmose@microsoft.com>

Co-authored-by: Vladimir Sadov <vsadov@microsoft.com>
Co-authored-by: Dan Moseley <danmose@microsoft.com>
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[x86] SequenceEqual returns false for equal spans
6 participants