
perf: Add managed packet recycling #389

Merged: 3 commits merged into dotnet:master on May 21, 2020

Conversation

@Wraith2 (Contributor) commented Jan 19, 2020

When using the managed network interface, SNIPacket instances are used to contain data being sent to and received from the socket stream. These packets rent their buffers from the ArrayPool to avoid constantly allocating and discarding large (by default 8K) arrays. This isn't being particularly effective, because a lot of packet instances are being dropped to the GC, which loses the array to the GC as well. That isn't harmful, it just isn't a good use of the ArrayPool. The allocated arrays are the most prevalent class in memory traces.

[Image: packetrecycle-before — memory trace before the change]
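
For context, the pattern at issue is roughly this; a minimal sketch (the class name is illustrative, not the driver's actual SNIPacket) of how a rented buffer is lost when the packet that owns it is dropped instead of disposed:

```cs
using System;
using System.Buffers;

// Illustrative stand-in for SNIPacket's buffer handling, not the real type.
internal sealed class RentedPacketSketch : IDisposable
{
    // Rent the (by default 8K) buffer instead of allocating it.
    private byte[] _buffer = ArrayPool<byte>.Shared.Rent(8 * 1024);

    public Span<byte> Data => _buffer;

    public void Dispose()
    {
        // Only an explicit Dispose hands the array back to the pool. If the packet
        // instance is simply dropped to the GC, the rented array is collected with it
        // and the pool has to allocate a fresh array for the next renter.
        ArrayPool<byte>.Shared.Return(_buffer);
        _buffer = Array.Empty<byte>();
    }
}
```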

This PR adds a new class in the handle hierarchy, SNIPhysicalHandle, and adds a packet pool to it. The packet pool implementation is derived from a very simple one used and tested in the aspnet extensions repo. The physical handle is used as the base class for the named pipe and TCP handles; the MARS handle composes those and defers to the physical connection for the abstract members. The abstract members are used to rent and return packets to the pool. This means that each 'real' connection has a packet pool, and that packets used to read and write from the physical connection are served from that pool. The pool is small and I think each pool should usually contain no more than 3 packets (1 send, 1 receive, 1 attn), but it can contain more. It isn't invalid to use more packets than are in the pool as long as they are created and disposed through the pool. MARS will need more packets for partial packet reconstruction.
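
The overall shape is roughly the following; a minimal sketch under assumed names (RentPacket/ReturnPacket and the sketch types are illustrative, not necessarily the members this PR adds), with a simple queue standing in for the bounded pool:

```cs
using System.Collections.Concurrent;

// Illustrative names only: a sketch of "one small packet pool per physical connection",
// not the actual SNIPhysicalHandle/SNIPacket members added by this PR.
internal sealed class PacketSketch
{
    public void Reset() { /* clear length/offset so stale data can't be replayed */ }
}

internal abstract class PhysicalHandleSketch
{
    // Stand-in for the bounded pool; the point here is ownership, not the pool type.
    private readonly ConcurrentQueue<PacketSketch> _pool = new ConcurrentQueue<PacketSketch>();

    public PacketSketch RentPacket()
        => _pool.TryDequeue(out PacketSketch packet) ? packet : new PacketSketch();

    public void ReturnPacket(PacketSketch packet)
    {
        packet.Reset();
        _pool.Enqueue(packet); // the real pool is bounded; overflow packets are just dropped
    }
}

// A MARS handle would not pool packets itself; it defers to the physical connection,
// so every packet on a given connection is rented from and returned to the same pool.
```

A fuller sketch of the bounded, interlocked pool itself appears further down the thread.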

In order to track the lifetime of packets and make sure they were always sourced from and returned to the pool, I needed to add a lot of debugging infrastructure to the packet and pool. This allowed me to track down a subtle packet reuse bug that was fixed in #350. This code is mostly debug-only. There is some additional tracing behind a compiler #define in the packet to allow tracing of individual packets; this is very expensive but very valuable when you need to know what is really going on with a piece of data.

Reuse of packets cleans up quite a lot of lifetime-handling code and means that the read and write paths no longer need to care about who is responsible for cleaning up the packet; it's handled elsewhere. It also means that the async read callback allocation can be cached and reused, which is another small memory saving.
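
On the cached callback point, the gain is roughly this (names are illustrative, not the driver's actual callback types):

```cs
using System;

// Illustrative sketch: once the pool owns packet cleanup, the read completion
// callback carries no per-call state, so one delegate instance can be reused.
internal sealed class ReadLoopSketch
{
    private readonly Action<byte[], int> _readCallback;

    public ReadLoopSketch()
    {
        // Allocated once per connection instead of a fresh closure per read.
        _readCallback = OnReadComplete;
    }

    private void OnReadComplete(byte[] buffer, int bytesRead)
    {
        // Hand the bytes to the parser; the owning pool decides when the
        // packet and its buffer go back for reuse.
    }
}
```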

This has passed all the tests and I've put it through some pretty punishing benchmarking with the DataAccessPerformance benchmark project used to test throughput performance. It's stable and roughly 7% faster on my configuration.

| Configuration | Throughput | StdDev | Name |
| --- | --- | --- | --- |
| ado-sqlclient+async+32 | 31225 | 1798 | managed master |
| ado-sqlclient+async+32 | 33519 | 2059 | managed recycle |
@cheenamalhotra added the 📈 Performance label on Feb 6, 2020
@Wraith2 (Contributor, Author) commented Feb 24, 2020

Any feedback @cheenamalhotra @David-Engel @saurabh500 ?

@David-Engel (Contributor) left a comment

I really like these perf changes, but it does take some time to digest exactly what is going on in them. I've been trying to do that today and I think it looks good, but I'd like a second set of eyes on it.

Also, please merge changes from master to get fresh build results. I think there were some existing-but-newly-failing tests that have been fixed since you originally created the PR (although I see the enclave pipeline still seems to be failing). I also adjusted the PR pipelines to retain build logs for a bit longer.

@Wraith2 (Contributor, Author) commented Feb 25, 2020

This is a pretty deep change, so I was assuming it would land post-2.0, but if it's a poor path to take I'd prefer to get that feedback so I can work along other avenues.

@David-Engel (Contributor) commented

@Wraith2 No, I think it's good. For 2.0, we are okay with a bit more risk since we are bumping the major version. And since we have our own release vehicle, it's easier for us to hotfix issues quickly. The problem with this one is just finding the time it takes to wrap your head around the changes. Feel free to tag me or request my review directly if your PRs aren't getting attention (I have to filter GitHub notification emails to other folders or my inbox is unmanageable) and I will do my best to help review them.

@saurabh500 (Contributor) commented Feb 25, 2020

I haven't looked into this PR. However, I saw that a packet recycling mechanism already exists for native SNI:

```cs
internal sealed class WritePacketCache : IDisposable
{
    private bool _disposed;
    private Stack<SNIPacket> _packets;

    public WritePacketCache()
    {
        _disposed = false;
        _packets = new Stack<SNIPacket>();
    }

    public SNIPacket Take(SNIHandle sniHandle)
    {
        SNIPacket packet;
        if (_packets.Count > 0)
        {
            // Success - reset the packet
            packet = _packets.Pop();
            SNINativeMethodWrapper.SNIPacketReset(sniHandle, SNINativeMethodWrapper.IOType.WRITE, packet, SNINativeMethodWrapper.ConsumerNumber.SNI_Consumer_SNI);
        }
        else
        {
            // Failed to take a packet - create a new one
            packet = new SNIPacket(sniHandle);
        }
        return packet;
    }

    public void Add(SNIPacket packet)
    {
        if (!_disposed)
        {
            _packets.Push(packet);
        }
        else
        {
            // If we're disposed, then get rid of any packets added to us
            packet.Dispose();
        }
    }

    public void Clear()
    {
        while (_packets.Count > 0)
        {
            _packets.Pop().Dispose();
        }
    }

    public void Dispose()
    {
        if (!_disposed)
        {
            _disposed = true;
            Clear();
        }
    }
}
```

I don't believe this cache was implemented for managed SNI. Would this help and maintain parity with the way native SNI caches packets, instead of creating another way to manage a packet cache?

@saurabh500 (Contributor) commented

BTW, I don't mean to say that what has been done in this PR is wrong.
I just wanted to check whether you have seen this other caching option, whether it would make sense to enable it for managed SNI the same way caching was done for native SNI, and what benefits this PR offers over the existing caching mechanism for write packets.
Having managed SNI cache packets the same way as native SNI builds on prior art and reduces the chances of introducing a regression.

@Wraith2 (Contributor, Author) commented Feb 25, 2020

I did see that, and in the past I've noted that vestiges of the same structure are in the managed state object but were never completed.

I chose to push the pool down to a lower layer because it makes lifetime management easier; when I was routing packets around it got awkward at a higher layer. Packets come in at the physical level, and having to navigate all the way up to the state object to get a wrapper would leave us transporting bits of incoming data up to the state object, back down to the physical level, then up again to be handled, possibly passing through demuxing or async queuing.

Why is this better? It's thread-safe because the pool uses interlocked operations, and it's bounded, which the stack is not. The aspnet pool was optimized for high performance on small pools, which is good for this approach and means we inherit the testing and design the ASP.NET team already did. It's also more localized in performance terms: any single connection does a small search for a new packet in its own pool, which prevents packet fetch delay from being unbounded. And the lifetime handling of the async packets feels a lot cleaner; that disposeAfter bool in async sends really bothered me.
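
For readers following along, the pool being described has roughly this shape; a minimal sketch of a bounded, interlocked pool in the style of the aspnet ObjectPool, not the exact code this PR imports:

```cs
using System.Threading;

// Sketch of a small bounded pool using interlocked slot swaps rather than a lock
// or an unbounded Stack<T>.
internal sealed class BoundedPoolSketch<T> where T : class
{
    private readonly T[] _items;

    public BoundedPoolSketch(int capacity) => _items = new T[capacity];

    public bool TryGet(out T item)
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            T candidate = items[i];
            // Atomically claim the slot; a racing thread that loses just tries the next slot.
            if (candidate != null &&
                Interlocked.CompareExchange(ref items[i], null, candidate) == candidate)
            {
                item = candidate;
                return true;
            }
        }
        item = null;
        return false;
    }

    public void Return(T item)
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            // Only fill an empty slot; if every slot is taken the item is dropped,
            // which keeps the pool bounded (unlike pushing onto an unbounded Stack<T>).
            if (Interlocked.CompareExchange(ref items[i], item, null) == null)
            {
                return;
            }
        }
    }
}
```

The rent/return search is O(capacity), which is cheap when the pool holds only a handful of packets per connection, and a full pool simply drops the returned item rather than growing.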

@cheenamalhotra (Member) commented

Hi @Wraith2

Please update your branch to "master" to trigger new builds.

@Wraith2 (Contributor, Author) commented Mar 20, 2020

Done, that was an irritating set of merge conflicts.

@Wraith2 (Contributor, Author) commented Apr 15, 2020

Any chance of movement on this PR? It's coming up to 3 months open now.

@Wraith2 mentioned this pull request on Apr 28, 2020
@David-Engel (Contributor) left a comment

LGTM

@Wraith2 (Contributor, Author) commented May 15, 2020

Next week this will have been waiting 4 months. I don't want to have to start buying birthday cakes for PRs, so can I get some feedback on this from the requested reviewers, please, @cheenamalhotra @karinazhou @JRahnama @DavoudEshtehari? If it's too complicated to review here, I can be available to talk it through in a teleconference if that helps.

@cheenamalhotra (Member) commented

Hi @Wraith2

We intend to take this for 2.0.0-preview-4 (hence the tag). We've all been pretty busy with complex activities in progress and haven't spent enough time on this PR yet, but we're going to review it soon. Thanks for your patience!

@cheenamalhotra (Member) left a comment

Hi @Wraith2

Could you make the final improvements mentioned in the last comments, so we can get this merged?

  • Fix extra lines
  • Update obsolete link.

@Wraith2 (Contributor, Author) commented May 21, 2020

Updated.

@cheenamalhotra merged commit 3ca3984 into dotnet:master on May 21, 2020
@Wraith2 deleted the perf-packetrecycle branch on June 17, 2020