Framing benchmarks and optimizations #1016

stebet · 2021-02-11T12:06:29Z

Proposed Changes

Added benchmarks for frame generation and added optimizations which should benefit all commands sent from the clients.

Types of Changes

Bug fix (non-breaking change which fixes issue #NNNN)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause an observable behavior change in existing systems)
Documentation improvements (corrections, new content, etc)
Cosmetic change (whitespace, formatting, etc)

Checklist

I have read the CONTRIBUTING.md document
I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
All tests pass locally with my changes
I have added tests that prove my fix is effective or that my feature works
I have added necessary documentation (if appropriate)
Any dependent changes have been merged and published in related repositories

Further Comments

Main optimizations were done by reducing method calls, adding inlining and reducing code branches. There is definitely room for more improvements, but this is a pretty solid start :)

Benchmark results for serializing BasicAck, BasicDeliver and ChannelClose to their actual frames:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.21277
Intel Core i7-10700 CPU 2.90GHz, 1 CPU, 16 logical and 8 physical cores
.NET Core SDK=5.0.103
  [Host]     : .NET Core 3.1.12 (CoreCLR 4.700.21.6504, CoreFX 4.700.21.6905), X64 RyuJIT
  .NET 4.8 = .NET Framework 4.8 (4.8.4261.0), X64 RyuJIT
  Core 3.1 = .NET Core 3.1.12 (CoreCLR 4.700.21.6504, CoreFX 4.700.21.6905), X64 RyuJIT

Before

Method	Runtime	Mean	Error	StdDev	Code Size	Gen 0	Gen 1	Gen 2	Allocated
ChannelCloseWrite	.NET 4.8	82.41 ns	0.699 ns	0.653 ns	1817 B	0.0088	-	-	56 B
ChannelCloseWrite	Core 3.1	31.48 ns	0.350 ns	0.327 ns	1117 B	0.0067	-	-	56 B
BasicPublishWrite	.NET 4.8	225.96 ns	0.905 ns	0.755 ns	1817 B	0.0241	-	-	152 B
BasicPublishWrite	Core 3.1	116.96 ns	0.684 ns	0.640 ns	1117 B	0.0181	-	-	152 B
BasicPublishMemoryWrite	.NET 4.8	162.78 ns	1.383 ns	1.226 ns	1817 B	0.0241	-	-	152 B
BasicPublishMemoryWrite	Core 3.1	71.53 ns	0.239 ns	0.200 ns	1117 B	0.0181	-	-	152 B
BasicAckWrite	.NET 4.8	70.30 ns	0.504 ns	0.472 ns	1817 B	0.0088	-	-	56 B
BasicAckWrite	Core 3.1	27.09 ns	0.508 ns	0.450 ns	1117 B	0.0067	-	-	56 B

After:

Method	Runtime	Mean	Diff	Error	StdDev	Code Size	Gen 0	Gen 1	Gen 2	Allocated
ChannelCloseWrite	.NET 4.8	69.72 ns	-15,4%	0.440 ns	0.411 ns	759 B	0.0088	-	-	56 B
ChannelCloseWrite	Core 3.1	23.36 ns	-25,8%	0.482 ns	0.691 ns	1195 B	0.0067	-	-	56 B
BasicPublishWrite	.NET 4.8	200.58 ns	-11,2%	0.687 ns	0.574 ns	2814 B	0.0241	-	-	152 B
BasicPublishWrite	Core 3.1	98.16 ns	-16,1%	0.381 ns	0.356 ns	2438 B	0.0181	-	-	152 B
BasicPublishMemoryWrite	.NET 4.8	133.69 ns	-17,9%	0.336 ns	0.314 ns	2942 B	0.0241	-	-	152 B
BasicPublishMemoryWrite	Core 3.1	55.57 ns	-22,3%	0.503 ns	0.470 ns	2438 B	0.0181	-	-	152 B
BasicAckWrite	.NET 4.8	58.54 ns	-16,7%	0.245 ns	0.229 ns	759 B	0.0089	-	-	56 B
BasicAckWrite	Core 3.1	18.06 ns	-33.3%	0.381 ns	0.558 ns	1195 B	0.0067	-	-	56 B

projects/RabbitMQ.Client/client/impl/OutgoingCommand.cs

projects/RabbitMQ.Client/client/impl/ModelBase.cs

projects/RabbitMQ.Client/client/impl/OutgoingCommand.cs

projects/RabbitMQ.Client/client/impl/SessionBase.cs

projects/RabbitMQ.Client/client/impl/ISession.cs

projects/RabbitMQ.Client/client/impl/SessionBase.cs

bollhals · 2021-02-11T22:32:03Z

Nice improvements! Looks very much worth it.

Some of your improvements I also applied in #962, so we should look how we coordinate these two PRs to get the maximum out of it 🔥 🚀

Out of curiosity, while changing it, what change did bring the most benefit? Is shifting & then write as ulong that much faster?

projects/RabbitMQ.Client/client/impl/Frame.cs

projects/RabbitMQ.Client/client/impl/OutgoingCommand.cs

projects/RabbitMQ.Client/client/impl/Frame.cs

stebet · 2021-02-11T23:51:04Z

The UInt64 change helped get rid of method calls and calls to Span.Slice, which even if they are cheap, add up quickly on a hot-path, made the Write* calls smaller (Code size) which made them more inlineable.

The biggest factor was splitting up OutgoingCommand to reduce code branches (ifs) and extra method calls.

I have an idea for a few more optimizations which I'll try out tomorrow.

projects/RabbitMQ.Client/client/impl/OutgoingCommand.cs

stebet · 2021-02-12T14:48:04Z

Rebased and squashed after reviews.

projects/RabbitMQ.Client/client/impl/OutgoingCommand.cs

stebet · 2021-02-12T15:17:02Z

Updated with latest perf numbers

stebet · 2021-02-18T09:06:49Z

Anyone want to do a final review of this? If not, I think this PR is ready :) @michaelklishin @danielmarbach @bollhals

bollhals · 2021-02-18T09:12:37Z

Go for it. I'll update #962 afterwards to pickup these improvements and extend where possible.

danielmarbach reviewed Feb 11, 2021

View reviewed changes

bollhals reviewed Feb 11, 2021

View reviewed changes

bollhals reviewed Feb 12, 2021

View reviewed changes

projects/RabbitMQ.Client/client/impl/OutgoingCommand.cs Outdated Show resolved Hide resolved

stebet force-pushed the framingbenchmarksandoptimizations branch from e08aa7f to 3d920cd Compare February 12, 2021 14:47

Adding benchmarks for framing and optimizing frame generation.

e02e86f

stebet force-pushed the framingbenchmarksandoptimizations branch from 6f9ddb0 to e02e86f Compare February 12, 2021 14:50

bollhals reviewed Feb 12, 2021

View reviewed changes

projects/RabbitMQ.Client/client/impl/OutgoingCommand.cs Outdated Show resolved Hide resolved

Stefán J. Sigurðarson added 2 commits February 12, 2021 15:24

Fixing after review.

3d4bfc8

Removing redundant #ifdef

57a7d97

danielmarbach approved these changes Feb 19, 2021

View reviewed changes

Merge branch 'master' into framingbenchmarksandoptimizations

39f3ddc

michaelklishin merged commit 9437b15 into rabbitmq:master Feb 22, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Framing benchmarks and optimizations #1016

Framing benchmarks and optimizations #1016

stebet commented Feb 11, 2021 •

edited

Loading

bollhals commented Feb 11, 2021

stebet commented Feb 11, 2021 •

edited

Loading

stebet commented Feb 12, 2021

stebet commented Feb 12, 2021

stebet commented Feb 18, 2021

bollhals commented Feb 18, 2021

Framing benchmarks and optimizations #1016

Framing benchmarks and optimizations #1016

Conversation

stebet commented Feb 11, 2021 • edited Loading

Proposed Changes

Types of Changes

Checklist

Further Comments

bollhals commented Feb 11, 2021

stebet commented Feb 11, 2021 • edited Loading

stebet commented Feb 12, 2021

stebet commented Feb 12, 2021

stebet commented Feb 18, 2021

bollhals commented Feb 18, 2021

stebet commented Feb 11, 2021 •

edited

Loading

stebet commented Feb 11, 2021 •

edited

Loading