Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove forced serialization of async-over-sync in Stream base methods #53389

Closed
wants to merge 1 commit into from

Conversation

stephentoub
Copy link
Member

Fixes #45089

The base implementation of Stream.BeginRead/Write queue a work items that invoke the abstract Read/Write methods. When Stream.BeginRead/Write were introduced long ago, for reasons I’m not privy to, someone decided it would be a good idea to add protection to these methods, such that if you try to call either BeginRead or BeginWrite while a previous BeginRead or BeginWrite operation was still in flight, the synchronous call to BeginXx would synchronously block. Yuck. Back in .NET Framework 4.5 when we added Stream.Read/WriteAsync, we had to add the base implementations as wrappers for the BeginRead/Write methods, since Read/WriteAsync should pick up the overrides of those methods if they existed. The idea of propagating that synchronous blocking behavior to Read/WriteAsync was unstomachable, but for compatibility we made it so that Read/WriteAsync would still serialize, just asynchronously (later we added a fast path optimization that would skip BeginRead/Write entirely if they weren’t overridden by the derived type). That serialization, however, even though it was asynchronous, was also misguided. In addition to adding overhead, both in terms of needing a semaphore and somewhere to store it and in terms of using that semaphore for every operation, it prevents the concurrent use of read and write. In general, streams aren’t meant to be used concurrently at all, but some streams are duplex and support up to a single reader and single writer at a time. This serialization ends up blocking such duplex streams from being used (if they don’t override Read/WriteAsync), but worse, it ends up hiding misuse of streams that shouldn’t be used concurrently by masking the misuse and turning it into behavior that might appear to work but is unlikely to actually be the desired behavior.

This PR deletes that serialization and then cleans up all the cruft that was built up around it. This is a breaking change, as it’s possible code could have been relying on this (undocumented) protection; the fix for such an app is to stop doing that erroneous concurrent access, which could include applying its own serialization if necessary.

BufferedStream was explicitly using the same serialization mechanism; I left that intact, though we might want to revisit that. BufferedFileStreamStrategy was also using it, as FileStream kinda sorta somewhat tries to enable concurrent (not parallel) usage when useAsync == true on Windows.

cc: @jkotas, @geoffkizer, @adamsitnik, @marklio

@stephentoub stephentoub added area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet labels May 27, 2021
@stephentoub stephentoub added this to the 6.0.0 milestone May 27, 2021
@ghost
Copy link

ghost commented May 27, 2021

Tagging subscribers to this area: @carlossanlop
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #45089

The base implementation of Stream.BeginRead/Write queue a work items that invoke the abstract Read/Write methods. When Stream.BeginRead/Write were introduced long ago, for reasons I’m not privy to, someone decided it would be a good idea to add protection to these methods, such that if you try to call either BeginRead or BeginWrite while a previous BeginRead or BeginWrite operation was still in flight, the synchronous call to BeginXx would synchronously block. Yuck. Back in .NET Framework 4.5 when we added Stream.Read/WriteAsync, we had to add the base implementations as wrappers for the BeginRead/Write methods, since Read/WriteAsync should pick up the overrides of those methods if they existed. The idea of propagating that synchronous blocking behavior to Read/WriteAsync was unstomachable, but for compatibility we made it so that Read/WriteAsync would still serialize, just asynchronously (later we added a fast path optimization that would skip BeginRead/Write entirely if they weren’t overridden by the derived type). That serialization, however, even though it was asynchronous, was also misguided. In addition to adding overhead, both in terms of needing a semaphore and somewhere to store it and in terms of using that semaphore for every operation, it prevents the concurrent use of read and write. In general, streams aren’t meant to be used concurrently at all, but some streams are duplex and support up to a single reader and single writer at a time. This serialization ends up blocking such duplex streams from being used (if they don’t override Read/WriteAsync), but worse, it ends up hiding misuse of streams that shouldn’t be used concurrently by masking the misuse and turning it into behavior that might appear to work but is unlikely to actually be the desired behavior.

This PR deletes that serialization and then cleans up all the cruft that was built up around it. This is a breaking change, as it’s possible code could have been relying on this (undocumented) protection; the fix for such an app is to stop doing that erroneous concurrent access, which could include applying its own serialization if necessary.

BufferedStream was explicitly using the same serialization mechanism; I left that intact, though we might want to revisit that. BufferedFileStreamStrategy was also using it, as FileStream kinda sorta somewhat tries to enable concurrent (not parallel) usage when useAsync == true on Windows.

cc: @jkotas, @geoffkizer, @adamsitnik, @marklio

Author: stephentoub
Assignees: -
Labels:

area-System.IO, breaking-change, needs-breaking-change-doc-created

Milestone: 6.0.0

@runfoapp runfoapp bot mentioned this pull request May 28, 2021
The base implementation of Stream.BeginRead/Write queue a work items that invoke the abstract Read/Write methods.  When Stream.BeginRead/Write were introduced long ago, for reasons I’m not privy to, someone decided it would be a good idea to add protection to these methods, such that if you try to call either BeginRead or BeginWrite while a previous BeginRead or BeginWrite operation was still in flight, the synchronous call to BeginXx would synchronously block.  Yuck.  Back in .NET Framework 4.5 when we added Stream.Read/WriteAsync, we had to add the base implementations as wrappers for the BeginRead/Write methods, since Read/WriteAsync should pick up the overrides of those methods if they existed.  The idea of propagating that synchronous blocking behavior to Read/WriteAsync was unstomachable, but for compatibility we made it so that Read/WriteAsync would still serialize, just asynchronously (later we added a fast path optimization that would skip BeginRead/Write entirely if they weren’t overridden by the derived type).  That serialization, however, even though it was asynchronous, was also misguided.  In addition to adding overhead, both in terms of needing a semaphore and somewhere to store it and in terms of using that semaphore for every operation, it prevents the concurrent use of read and write.  In general, streams aren’t meant to be used concurrently at all, but some streams are duplex and support up to a single reader and single writer at a time.  This serialization ends up blocking such duplex streams from being used (if they don’t override Read/WriteAsync), but worse, it ends up hiding misuse of streams that shouldn’t be used concurrently by masking the misuse and turning it into behavior that might appear to work but is unlikely to actually be the desired behavior.

This PR deletes that serialization and then cleans up all the cruft that was built up around it.  This is a breaking change, as it’s possible code could have been relying on this (undocumented) protection; the fix for such an app is to stop doing that erroneous concurrent access, which could include applying its own serialization if necessary.

BufferedStream was explicitly using the same serialization mechanism; I left that intact.  BufferedFileStreamStrategy was also using it, as FileStream kinda sorta somewhat tries to enable concurrent (not parallel) usage when useAsync == true on Windows.
@marklio
Copy link

marklio commented Jun 4, 2021

I haven't yet wrapped my head around the changes yet. If someone is broken by this change, how easy is it for them to recognize there is an issue, diagnose the break, and take action to fix their problematic code (assuming they own the code in question)?

@stephentoub
Copy link
Member Author

If someone is broken by this change, how easy is it for them to recognize there is an issue, diagnose the break, and take action to fix their problematic code (assuming they own the code in question)?

It would mean they're issuing concurrent Read/WriteAsync calls on a Stream that wasn't intended to be used concurrently and that doesn't override Read/WriteAsync to provide a better implementation. It would likely manifest as exceptions (e.g. argument out of range) or corrupted data, since in that situation it's effectively removing a lock that was causing operations to be serialized. Fixing the code would entail not using the Stream in that problematic manner, which could simply mean using their own semaphore to provide that same serialization if really desired.

@marklio
Copy link

marklio commented Jun 4, 2021

It would likely manifest as exceptions (e.g. argument out of range) or corrupted data, since in that situation it's effectively removing a lock that was causing operations to be serialized.

I think the possibility of corruption is the piece that worries me here the most, particularly in cases where concurrent use is probabilistic. I'd rather have an exception/crash than data corruption. It seems completely possible for someone to run validation and not catch corruption. How possible is it for us to deliver a reliable, diagnostic exception for problematic concurrent use?

@marklio
Copy link

marklio commented Jun 4, 2021

cc @dotnet/compat

@stephentoub
Copy link
Member Author

How possible is it for us to deliver a reliable, diagnostic exception for problematic concurrent use?

It's not while also achieving the goals of the removal. Doing that would require tracking the in-flight usage, so we wouldn't be to remove all fields. And we have no way of knowing whether concurrent usage is acceptable or not, so we'd be throwing an exception even if it's something that should be allowed, e.g. for a duplex stream that supports a reader and writer concurrently. At that point, there's little benefit in taking a change here at all.

if (useAsync)

// To ensure that the buffer of a FileStream opened for async IO is flushed
// by FlushAsync in asynchronous way, we aquire a lock for every buffered WriteAsync.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "aquire" => "acquire"

@stephentoub
Copy link
Member Author

I'm going to close this for .NET 6. I'd like to revisit for .NET 7 when we have more runway.

@ghost ghost locked as resolved and limited conversation to collaborators Jul 30, 2021
@ericstj ericstj removed this from the 6.0.0 milestone Sep 30, 2021
@wfurt
Copy link
Member

wfurt commented Oct 7, 2024

Is this something we should consider for 10 @stephentoub ? It seems like the runway is as long as it gets....

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider removing Stream's semaphore
5 participants