Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HTTP/3] Fix NullReferenceException on cancellation #54334

Closed
wants to merge 5 commits into from

Conversation

CarnaViire
Copy link
Member

Add abort on Content stream disposal, fix NRE, fix abort direction, add test.
Add volatile for EOS checking, update logging to include msquic pointers.

Fixes #48624

@ghost
Copy link

ghost commented Jun 17, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Add abort on Content stream disposal, fix NRE, fix abort direction, add test.
Add volatile for EOS checking, update logging to include msquic pointers.

Fixes #48624

Author: CarnaViire
Assignees: -
Labels:

area-System.Net.Http

Milestone: -

@ghost
Copy link

ghost commented Jun 17, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Add abort on Content stream disposal, fix NRE, fix abort direction, add test.
Add volatile for EOS checking, update logging to include msquic pointers.

Fixes #48624

Author: CarnaViire
Assignees: -
Labels:

area-System.Net.Quic

Milestone: -

Comment on lines +84 to +86
var stream = Interlocked.Exchange(ref _stream, null!);
stream.Dispose();
DisposeSyncHelper(stream);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var stream = Interlocked.Exchange(ref _stream, null!);
stream.Dispose();
DisposeSyncHelper(stream);
QuicStream stream = Interlocked.Exchange(ref _stream, null!);
if (stream is not null)
{
stream.Dispose();
DisposeSyncHelper(stream);
}

@@ -81,8 +81,9 @@ public void Dispose()
if (!_disposed)
{
_disposed = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need _disposed, or does _stream being null now indicate disposal?

Comment on lines +95 to +97
var stream = Interlocked.Exchange(ref _stream, null!);
await stream.DisposeAsync().ConfigureAwait(false);
DisposeSyncHelper(stream);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var stream = Interlocked.Exchange(ref _stream, null!);
await stream.DisposeAsync().ConfigureAwait(false);
DisposeSyncHelper(stream);
QuicStream stream = Interlocked.Exchange(ref _stream, null!);
if (stream is not null)
{
await stream.DisposeAsync().ConfigureAwait(false);
DisposeSyncHelper(stream);
}

_connection.RemoveStream(_stream);
_connection = null!;
_stream = null!;
Interlocked.Exchange(ref _connection, null!).RemoveStream(stream);
Copy link
Member

@stephentoub stephentoub Jun 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we've got sole ownership of the stream now due to the Interlocked in the dispose methods above, do we still need this interlocked? Or is there another caller that could get here outside of Dispose{Async}?

Also, if it is still needed, then presumably there's a race condition around nulling this out, in which case the .RemoveStream should be ?.RemoveStream.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mostly done that to be sure the change is visible to null checks in Http3RequestStream.HandleReadResponseContentException https://github.com/dotnet/runtime/pull/54334/files#diff-b79affd636899ac98b816c1398f9dca19bc4850a939a3815c09c5a9931094a4cR1113-R1120 In the issue I'm trying to solve, disposal and this exception handling are happening concurrently. Here #52800 (comment) @scalablecory says the change might not be visible otherwise. On the other hand, @wfurt offline told me that we usually don't worry about this for example for setting and checking _disposed flag. I am not sure what should be the best practice here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest simply not nulling these fields out (i.e. _stream and _connection).

I'm not quite sure why you null them out today. Is it because you want to avoid doing stuff like Abort on the connection or AbortRead on the QuicStream if we are disposed? Because as the code stands, this is all racy and you aren't avoiding these calls consistently.

You need to either:
(a) Implement full locking here to prevent the races
(b) Avoid the races entirely by not modifying these. (There's still a race here, but it's in msquic code and they are presumably handling this today.)

@@ -1110,18 +1110,26 @@ private void HandleReadResponseContentException(Exception ex, CancellationToken
throw new IOException(SR.net_http_client_execution_error, new HttpRequestException(SR.net_http_client_execution_error, abortException));
case Http3ConnectionException _:
// A connection-level protocol error has occurred on our stream.
_connection.Abort(ex);
_connection?.Abort(ex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When can this happen? It seems weird that errors on a single HTTP stream would cause the whole connection to be killed.

throw new IOException(SR.net_http_client_execution_error, new HttpRequestException(SR.net_http_client_execution_error, ex));
}
}

private void CancelResponseContentRead()
{
if (Volatile.Read(ref _responseDataPayloadRemaining) != -1) // -1 indicates EOS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an inherent race between setting this and checking it in ReadNextDataFrameAsync.

That means you need to handle the case where the AbortRead happens after we check this in ReadNextDataFrameAsync, but before we try to actually read the frame.

As such, I would suggest not even trying to set _responseDataPayloadRemaining here. Just do AbortRead and let the next attempt to read from the underlying stream fail, and handle it as appropriate.

@CarnaViire CarnaViire marked this pull request as draft June 24, 2021 13:56
@CarnaViire
Copy link
Member Author

JFYI - I'll return to this PR after I'm back from vacation. Discussed with @karelz that this fix can wait.

@CarnaViire
Copy link
Member Author

Closing in favor of #55724

@CarnaViire CarnaViire closed this Jul 15, 2021
@karelz karelz added this to the 6.0.0 milestone Jul 22, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[HTTP/3] NullReferenceException on cancellation
4 participants