
Reducing allocations when (de)serializing frames/commands. #732

Merged: 13 commits into rabbitmq:master on Mar 4, 2020

Conversation

@stebet (Contributor) commented Feb 24, 2020

Proposed Changes

This PR starts the work needed to reduce allocations when (de)serializing commands and frames that eventually get sent or received over the wire.

The only way to do this properly is to use Memory<byte> and slicing to read/write objects, but that raises some questions, since it does impact the public API.

Currently, a lot of allocations come from MemoryStream objects being created temporarily. If we want to get rid of them, we need to make some changes, so I propose the following:

  1. Internalize types that have anything to do with serialization and protocol handling. End users should never have to change/override them. I think it would make sense for any type deriving from Frame and MethodBase at least. This should reduce the exposed public API quite a lot. Relates to #714 (Reduce public API as much as possible).
  2. Phase out MemoryStream and NetworkBinary(Reader/Writer) and work directly with Memory<byte> instances when reading/writing types deriving from MethodBase.
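To illustrate the second point, here is a minimal sketch (assumed names, not the actual client code) of reading an AMQP-style frame header straight out of a byte span with BinaryPrimitives, instead of wrapping the bytes in a MemoryStream plus a reader object, which allocates on every frame:

```csharp
using System;
using System.Buffers.Binary;

static class FrameReader
{
    // Header layout assumed here per the AMQP 0-9-1 wire format:
    // type (1 byte), channel (ushort), payload size (uint), big-endian.
    public static (byte Type, ushort Channel, uint PayloadSize) ReadHeader(ReadOnlySpan<byte> buffer)
    {
        byte type = buffer[0];
        ushort channel = BinaryPrimitives.ReadUInt16BigEndian(buffer.Slice(1, 2));
        uint payloadSize = BinaryPrimitives.ReadUInt32BigEndian(buffer.Slice(3, 4));
        return (type, channel, payloadSize);
    }
}
```

Slicing a span allocates nothing, so the same rented buffer can be parsed field by field without any intermediate stream or reader objects.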

Types of Changes

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)

Checklist

  • I have read the CONTRIBUTING.md document
  • I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • All tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in related repositories

Further Comments

Tagging @bording for input.

@stebet stebet changed the title WIP: Reducing allocations when (de(serializing frames/commands. WIP: Reducing allocations when (de)serializing frames/commands. Feb 24, 2020
@stebet stebet requested review from lukebakken and michaelklishin and removed request for lukebakken February 24, 2020 14:20
@stebet (Contributor, Author) commented Feb 24, 2020

Current memory usage with the latest NuGet release:
[screenshot: memory profile]

Current memory usage with the latest changes in the PR:
[screenshot: memory profile]

@lukebakken lukebakken added this to the 6.0.0 milestone Feb 24, 2020
@lukebakken lukebakken self-assigned this Feb 24, 2020
Anarh2404 added a commit to Anarh2404/rabbitmq-dotnet-client that referenced this pull request Feb 24, 2020
@Anarh2404 Anarh2404 mentioned this pull request Feb 25, 2020
@stebet (Contributor, Author) commented Feb 25, 2020

Updated (see above), now using memory buffers when serializing frames as well. Huge reductions in allocations again :)

@michaelklishin (Member) commented:

@stebet thank you. We'd appreciate some guidance on how much of #734 can be folded into this PR. If most of it (an ideal scenario IMO), we should also make sure to give @Anarh2404 full credit for their contribution :)

@stebet (Contributor, Author) commented Feb 26, 2020

> @stebet thank you. We'd appreciate some guidance on how much of #734 can be folded into this PR. If most of it (an ideal scenario IMO), we should also make sure to give @Anarh2404 full credit for their contribution :)

Not a lot can be merged into this PR as this PR doesn't include the pipelines improvements from #706. I mentioned to @lukebakken the other day that I'd like to split the allocation reductions off from the pipelines work to keep the PRs more sane for review and not to mix concepts too much.

The work from @Anarh2404 definitely applies in general, though, and it would be awesome to get it in as well, but I think it applies a bit more to the pipelines PR. The async improvements (ValueTask etc.) involve a lot of API changes that might be better suited for 7.0, as they are a bit bigger, but I'd love to see what from there could apply here.

@stebet (Contributor, Author) commented Feb 26, 2020

The biggest remaining chunk of allocations can be removed if the following behavior change can take place:

When commands/messages are handled, make it an explicit and documented contract that a client wanting to hang on to the bytes of a consumed message MUST copy them within the scope in which they are received/handled, since the buffers will be disposed after the commands have been handled.

If we dispose them, we can rent them from array pools, like the (de)serialized versions, so that sending and receiving messages uses recycled buffers throughout, making allocations predictable and roughly constant.

This would make the Command.Body property a ReadOnlyMemory<byte> that is disposed once the command has been handled. This might break clients that depend on holding on to the bytes, but I'd consider that an edge case, since most clients presumably deserialize the body into other objects as they process it.
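A minimal sketch of what that contract would mean for a consumer (names are hypothetical, not the client's actual API): the body is only valid while the handler runs, so bytes needed later must be copied before returning.

```csharp
using System;

class ExampleConsumer
{
    public byte[] Saved { get; private set; }

    public void HandleDelivery(ReadOnlyMemory<byte> body)
    {
        // Safe: inspect/deserialize the bytes within the handler scope.
        Console.WriteLine($"Received {body.Length} bytes");

        // Safe: copy if the bytes are needed after the handler returns.
        Saved = body.ToArray();

        // Unsafe under the proposed contract: storing `body` itself,
        // because the underlying pooled buffer is recycled once the
        // command has been handled.
    }
}
```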

What do you say @lukebakken, @michaelklishin and @bording? Is that an acceptable API change to make for 6.0?

To give you an idea what it means, this is what it looks like with the same program as above (sending and receiving 50k messages, 512 byte body size):
[screenshot: memory profile, 512-byte bodies]

Even if I bump the body size up to 16384 bytes, this is what it looks like:
[screenshot: memory profile, 16384-byte bodies]

So pretty much constant memory usage, regardless of body size :)

@lukebakken (Contributor) commented:

@stebet that seems reasonable to me, but I'm not an end-user of this library.

@acogoluegnes what is the lifetime of the message data bytes in the Java client? Do users have a well-defined scope in which to either copy or use the data for de-serialization?

@Anarh2404 (Contributor) commented:

@michaelklishin part of the #734 doesn't make sense without #706. But I think some other changes can be ported to the current master without breaking changes. I can try to do it on the weekend.

@michaelklishin (Member) commented:

A minimum possible lifetime is the lifetime of the delivery handler. Beyond that, the client does not really control it. This would be more or less the same if/when we change the API to use ByteBuffers

@stebet (Contributor, Author) commented Feb 27, 2020

> A minimum possible lifetime is the lifetime of the delivery handler. Beyond that, the client does not really control it. This would be more or less the same if/when we change the API to use ByteBuffers

Yeah, that's what I'm thinking. If we dispose the bytes once all handlers have finished running, we can get rid of these allocations as well, since we can recycle those buffers. As long as it's well documented as a possible change in behavior, I don't see why not. Do you want me to add those changes to this PR? @michaelklishin @lukebakken @acogoluegnes
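The recycling idea under discussion could look roughly like this (assumed names, not the client's internals): rent the body buffer from the shared pool, hand the handlers a ReadOnlyMemory<byte> window over it, and return the buffer once every handler has finished.

```csharp
using System;
using System.Buffers;

static class BodyBufferDemo
{
    public static void Dispatch(int bodySize, Action<ReadOnlyMemory<byte>> handler)
    {
        // Rent may return a larger array than requested, so slice to bodySize.
        byte[] rented = ArrayPool<byte>.Shared.Rent(bodySize);
        try
        {
            // ... fill rented[0..bodySize] from the wire ...
            handler(new ReadOnlyMemory<byte>(rented, 0, bodySize));
        }
        finally
        {
            // After this point the handler must no longer touch the
            // memory it was given; the buffer goes back to the pool.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

With steady-state traffic, the pool ends up serving every delivery from the same small set of buffers, which is what makes the allocation graph flat regardless of body size.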

@lukebakken (Contributor) commented:

> Do you want me to add those changes to this PR?

@stebet yes, please add them in a single commit in the unlikely event we have to remove it. Thanks!!

@stebet (Contributor, Author) commented Feb 27, 2020

@lukebakken @michaelklishin PR updated. Pretty close to the final version. Deleted a lot of unused internal code related to NetworkBinaryReader/Writer (after merging with the public API reduction PR). Also updated the screenshot at the top with the latest memory profile :)

@stebet stebet changed the title WIP: Reducing allocations when (de)serializing frames/commands. Reducing allocations when (de)serializing frames/commands. Feb 28, 2020
@lukebakken (Contributor) left a comment:


I made some cosmetic changes on my first pass. There's a lot going on here, so I'll re-review on Monday. Thanks for a lot of hard work @stebet

@stebet (Contributor, Author) commented Feb 29, 2020

Don't hesitate to ask if something isn't clear or if you want me to comment on some changes for explanations :)

@stebet (Contributor, Author) commented Mar 4, 2020

Haven't added this yet, but here's a glimpse of what using async directly (instead of work pools) achieves, with close to no functional changes except making most of the Handle* methods return Task instead of void and turning HandleFrame and HandleCommand into async methods as well.

Here's my fork PR: stebet#2

[screenshot: memory profile with async handlers]

This also improves throughput considerably on my test environment at least (both benchmark app and tests run faster).

@lukebakken lukebakken force-pushed the serializationAllocReductions branch from c8cc691 to 5f33780 Compare March 4, 2020 15:13
@lukebakken lukebakken force-pushed the serializationAllocReductions branch from 5f33780 to b1c16b4 Compare March 4, 2020 20:13
@lukebakken (Contributor) commented:

As always very impressive work @stebet - the RabbitMQ team appreciates it a lot!

@lukebakken lukebakken merged commit 384b1c3 into rabbitmq:master Mar 4, 2020
@stebet (Contributor, Author) commented Mar 4, 2020

My pleasure :)

@hez2010 commented Apr 8, 2020

Just a suggestion: use ReadOnlySpan<T> for the exposed parameters in message-handler APIs instead of ReadOnlyMemory<T>, so that users cannot save it for later. ReadOnlySpan<T> provides almost the same interface as ReadOnlyMemory<T>. This avoids buffer-lifecycle issues, because users must copy data out of the span if they want to use it later.
Update: my mistake. ReadOnlySpan<T> cannot be used within an async method.
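The restriction mentioned in the update can be sketched as follows (illustrative code, not the client's API): ReadOnlySpan<T> is a ref struct, so the compiler rejects it as a parameter of an async method, since it cannot be stored on the heap across an await; ReadOnlyMemory<T> is an ordinary struct and works fine.

```csharp
using System;
using System.Threading.Tasks;

class SpanVsMemory
{
    // OK: a ReadOnlyMemory<byte> parameter survives across an await.
    public static async Task HandleAsync(ReadOnlyMemory<byte> body)
    {
        await Task.Yield();
        Console.WriteLine(body.Length);
    }

    // Does not compile: async methods cannot declare parameters or
    // locals of a ref struct type such as ReadOnlySpan<byte>.
    // public static async Task HandleAsync(ReadOnlySpan<byte> body) { ... }
}
```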

@stebet stebet deleted the serializationAllocReductions branch April 8, 2020 10:33