Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions for reading and writing length-prefixed data with customizable encodings for the length #2867

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

hamzahrmalik
Copy link
Contributor

Add functions for reading and writing length-prefixed data with customizable encodings for the length, particularly for quic variable-length integers (RFC 9000)

Motivation:

Many protocols require us to write data and then prefix that data with its length. But each protocol has a different way of encoding the length. This PR introduces general purpose functions which can be extended for different encoding strategies

Modifications:

Create a new protocol which defines how to encode an integer
Implement this protocol for QUIC

Provide functions on bytebuffer for writing length-prefixed buffers, strings or bytes

@hamzahrmalik hamzahrmalik added the semver/minor Adds new public API. label Sep 4, 2024
Copy link
Contributor

@glbrntt glbrntt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is neat! I've not reviewed it all but left some initial feedback and questions to make sure I understood how it all fits together.

Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
/// If you use write more or less than this many bytes, we wll need to move bytes around to make that possible
/// Therefore, you may decide to encode differently to use more bytes than you actually would need, if it is possible for you to do so
/// That way, you waste the reserved bytes, but improve writing performance
func writeIntegerWithReservedSpace(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API 'stutters' a little bit: could we use reservedCapacity to overload writeInteger? i.e.

func writeInteger(_ value: Int, to buffer: inout ByteBuffer, reservedCapacity: Int) -> Int

to buffer: inout ByteBuffer
) -> Int

/// When writing an integer using this strategy, how many bytes we should reserve
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this and writeIntegerWithReservedSpace quite unintuitive without looking at how they were used. It became clearer when I started reading "Integer" as "Length". Is that the right way to think about it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah i agree this is confusing

The idea here is that reservedSpaceForInteger is the amount of space that someone should reserve if theyre intending to write an integer using your strategy
And the writeIntegerWithReservedSpace is for when someone has already reserved some space, and now wants you to write an integer into that space. You can then make the decision on whether you want to try and use exactly that much space (this is what the quic impl does) even if you could've used less. Ie you choose whether you optimize for the smallest possible encoding of the int, or for trying to avoid moving bytes around

This is in practice currently only used by the writePrefixed* functions so yes, there the integer is a length. But i don't think it has to be a length

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be used to write an integer, but in the context of this protocol should it be used for anything other than the length?

IMO we should make the intent of the requirement clear, if you wanted to write an integer then you should just call the writeInteger(_:to:) method.

Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Copy link
Contributor

@glbrntt glbrntt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few more comments. The bulk of this looks great but still unsure around some of the naming.

/// If you use write more or less than this many bytes, we wll need to move bytes around to make that possible.
/// Therefore, you may decide to encode differently to use more bytes than you actually would need, if it is possible for you to do so.
/// That way, you waste the reserved bytes, but improve writing performance.
func writeIntegerWithReservedCapacity(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: repeating "reserved" in the name and in the parameter name. Is that necessary?

Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
Sources/NIOCore/ByteBuffer-binaryEncodedLengthPrefix.swift Outdated Show resolved Hide resolved
#elseif canImport(Bionic)
import Bionic
#else
#error("The Byte Buffer module was unable to identify your C library.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxDesiatov don't we need WASIlibc here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for catching this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need this, then it's missing across all of NIO

//===----------------------------------------------------------------------===//

#if os(Windows)
import ucrt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hamzahrmalik why do we need the libcs here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the memmove function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, in this case we should remove the imports here.

  • we shouldn't use *Unsafe* here, the most minimal subset of ByteBuffer-core.swift needs to provide enough
  • memmove isn't necessary to be called directly. UnsafeRawBufferPointer can move memory with a higher-level API (I think it's bufferPointer[0..<100] = bufferPointer[2..<102] or something like that; but again, no Unsafe in here please)

#elseif canImport(Musl)
import Musl
#elseif canImport(Bionic)
import Bionic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import Bionic
import Bionic
#elseif canImport(WASILibc)
import WASILibc

// Add the required number of bytes to the end first
self.writeRepeatingByte(0, count: requiredSpace)
// Move the data forward by that many bytes, to make space at the front
self.withVeryUnsafeMutableBytes { pointer in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we avoid *Unsafe* here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really: this is the safe wrapper around the access.

Ultimately this is implementing memmove on ByteBuffer. Implementing it safely is somewhere between very hard and impossible using the APIs we currently have. We can run through the options:

  1. Bytewise copy using writeInteger. This is going to be very costly for large moves, which is the concern we have in this method.
  2. Manually-autovectorised copy using writeMultipleInteger and wider integers. This can work, but it's still going to be very slow compared to memmove and it will be much more complex code as we have to handle a wide range of edge-cases.
  3. Move using writeImmutableBuffer. This will, unfortunately, trigger a CoW.
  4. Move using ByteBufferView. This will also trigger a CoW.

I fundamentally can't see an implementation here that performs well enough to be used on larger buffers, which is also the circumstance in which it is most likely that we'll need a move. But I'm open to other options.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if this is necessary functionality, can we add it to ByteBuffer-core.swift and use in other places too? If my memory serves me right, then @glbrntt added some memory-moving API already a few years back, might be wrong about that though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Close enough, I added copyBytes(at:to:length:) 5 years ago as part of making ByteBufferView mutable. Good memory!

https://github.com/apple/swift-nio/blame/282f5935cf3352b3d026c35eb57cb3619dd9536f/Sources/NIOCore/ByteBuffer-core.swift#L1077

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hamzahrmalik isn't that exactly what we need? memmove == copying bytes too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this looks like we could use it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver/minor Adds new public API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants