Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using async I/O in .NET 6 #10887

Closed
dtivel opened this issue May 21, 2021 · 4 comments
Closed

Consider using async I/O in .NET 6 #10887

dtivel opened this issue May 21, 2021 · 4 comments
Labels
Functionality:Restore Priority:2 Issues for the current backlog. Tenet:Performance Performance issues

Comments

@dtivel
Copy link
Contributor

dtivel commented May 21, 2021

Reimplemented async I/O in .NET 6 offers much improved performance benefits. See this for details.

We disabled async I/O in NuGet/NuGet.Client#2488 based on dotnet/runtime#27643. .NET 6 fixes this problem.

Note that we still would need to preserve current performance on older frameworks (e.g.: NuGet.exe on .NET Framework 4.8).

@dtivel dtivel added the Tenet:Performance Performance issues label May 21, 2021
@nkolev92 nkolev92 added Functionality:Restore Priority:2 Issues for the current backlog. labels Jun 16, 2021
@dtivel
Copy link
Contributor Author

dtivel commented Jan 12, 2022

@zivkan
Copy link
Member

zivkan commented Jan 12, 2022

@dtivel I created some benchmarks to test sync vs async vs mmap IO for package extraction: https://github.com/zivkan/Benchmarks/blob/main/WritePerformance/PackageExtraction.cs

On .NET Framework I found that async IO is about 50% slower than sync IO, and even on .NET 6, it's only about the same perf as sync IO, it wasn't meaninfully faster (my memory was that it was still slightly slower, since async has additional scheduling overheads that blocking IO does not, but I don't remember the exact numbers, as my main concern was sync IO vs mmap).

As long as Visual Studio is using the .NET Framework, it would be a huge perf regression to switch to async IO, unless there's a trick I don't know about. Even in .NET 6, it's a lot of work for not much benefit, according to my benchmarks.

@dtivel
Copy link
Contributor Author

dtivel commented Jan 12, 2022

That's great. Feel free to close this issue then.

@zivkan
Copy link
Member

zivkan commented Jan 12, 2022

Looking at the blog post, and the linked NuGet PR, I see that my benchmark was doing async IO wrong, causing it to use sync over async. Hopefully I fixed it properly (locally, I haven't yet pushed to my repo), I got the following results:

.NET 6

Method MaxParallel Mean Error StdDev Ratio RatioSD
CopyTo 8 5.778 s 3.883 s 0.2129 s 1.00 0.00
CopyToAsync 8 6.100 s 3.495 s 0.1916 s 1.06 0.03
CopyToMmap 8 5.821 s 2.242 s 0.1229 s 1.01 0.03
CopyTo 32 5.887 s 3.629 s 0.1989 s 1.00 0.00
CopyToAsync 32 6.084 s 3.887 s 0.2131 s 1.03 0.02
CopyToMmap 32 6.018 s 5.811 s 0.3185 s 1.02 0.04

.NET Framework 4.8

Method MaxParallel Mean Error StdDev Ratio RatioSD
CopyTo 8 6.780 s 2.988 s 0.1638 s 1.00 0.00
CopyToAsync 8 66.611 s 2.958 s 0.1621 s 9.83 0.26
CopyToMmap 8 6.838 s 2.806 s 0.1538 s 1.01 0.05
CopyTo 32 6.575 s 7.413 s 0.4063 s 1.00 0.00
CopyToAsync 32 38.789 s 12.847 s 0.7042 s 5.92 0.41
CopyToMmap 32 6.785 s 7.217 s 0.3956 s 1.03 0.07

I tested with 8 and 32 parallel extractions because my machine has 8 physical core, 16 logical core CPU. From my previous tests running 1, 2, 4, 8, 16, and 32 max parallel extractions, I found 8 was where significant performance gains stopped on this hardware. I tested 32 here because the point of async is to allow more efficient application task switching compared to operating system thread switching, so max parallel < (logical) processor with async APIs is theoretically always slower than blocking APIs because of async task scheduling overheads.

I also had Windows Defender exclude the write directory from scanning, because on .NET Framework this has a massive perf penalty that I'm slowing trying to investigate and report to the Windows Defender team.

So, I confirm that on .NET 6 async is only slightly slower, even when MaxParallel is greater than Environment.ProcessorCount.
My guess is this is because ZipArchive does not have Async APIs, and therefore zipStream.CopyToAsync is doing sync over async for the read part of the extraction.

Whereas on .NET Framework 4.8, async IO takes 10x as long (assuming the Windows Defender perf impact is mitigated).

Since there doesn't appear to be any scenario where async file IO benefits package extraction, I'm closing this issue.

@zivkan zivkan closed this as completed Jan 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Functionality:Restore Priority:2 Issues for the current backlog. Tenet:Performance Performance issues
Projects
None yet
Development

No branches or pull requests

4 participants