-
Notifications
You must be signed in to change notification settings - Fork 643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add functions for reading and writing length-prefixed data with customizable encodings for the length #2867
base: main
Are you sure you want to change the base?
Add functions for reading and writing length-prefixed data with customizable encodings for the length #2867
Conversation
…misable encodings for the length
b181d12
to
74ce004
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is neat! I've not reviewed it all but left some initial feedback and questions to make sure I understood how it all fits together.
/// If you use write more or less than this many bytes, we wll need to move bytes around to make that possible | ||
/// Therefore, you may decide to encode differently to use more bytes than you actually would need, if it is possible for you to do so | ||
/// That way, you waste the reserved bytes, but improve writing performance | ||
func writeIntegerWithReservedSpace( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API 'stutters' a little bit: could we use reservedCapacity
to overload writeInteger
? i.e.
func writeInteger(_ value: Int, to buffer: inout ByteBuffer, reservedCapacity: Int) -> Int
to buffer: inout ByteBuffer | ||
) -> Int | ||
|
||
/// When writing an integer using this strategy, how many bytes we should reserve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this and writeIntegerWithReservedSpace
quite unintuitive without looking at how they were used. It became clearer when I started reading "Integer" as "Length". Is that the right way to think about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah i agree this is confusing
The idea here is that reservedSpaceForInteger
is the amount of space that someone should reserve if theyre intending to write an integer using your strategy
And the writeIntegerWithReservedSpace
is for when someone has already reserved some space, and now wants you to write an integer into that space. You can then make the decision on whether you want to try and use exactly that much space (this is what the quic impl does) even if you could've used less. Ie you choose whether you optimize for the smallest possible encoding of the int, or for trying to avoid moving bytes around
This is in practice currently only used by the writePrefixed* functions so yes, there the integer is a length. But i don't think it has to be a length
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be used to write an integer, but in the context of this protocol should it be used for anything other than the length?
IMO we should make the intent of the requirement clear, if you wanted to write an integer then you should just call the writeInteger(_:to:)
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few more comments. The bulk of this looks great but still unsure around some of the naming.
/// If you use write more or less than this many bytes, we wll need to move bytes around to make that possible. | ||
/// Therefore, you may decide to encode differently to use more bytes than you actually would need, if it is possible for you to do so. | ||
/// That way, you waste the reserved bytes, but improve writing performance. | ||
func writeIntegerWithReservedCapacity( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: repeating "reserved" in the name and in the parameter name. Is that necessary?
#elseif canImport(Bionic) | ||
import Bionic | ||
#else | ||
#error("The Byte Buffer module was unable to identify your C library.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaxDesiatov don't we need WASIlibc here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks for catching this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need this, then it's missing across all of NIO
//===----------------------------------------------------------------------===// | ||
|
||
#if os(Windows) | ||
import ucrt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hamzahrmalik why do we need the libcs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the memmove function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, in this case we should remove the imports here.
- we shouldn't use
*Unsafe*
here, the most minimal subset ofByteBuffer-core.swift
needs to provide enough memmove
isn't necessary to be called directly.UnsafeRawBufferPointer
can move memory with a higher-level API (I think it'sbufferPointer[0..<100] = bufferPointer[2..<102]
or something like that; but again, no Unsafe in here please)
#elseif canImport(Musl) | ||
import Musl | ||
#elseif canImport(Bionic) | ||
import Bionic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import Bionic | |
import Bionic | |
#elseif canImport(WASILibc) | |
import WASILibc |
// Add the required number of bytes to the end first | ||
self.writeRepeatingByte(0, count: requiredSpace) | ||
// Move the data forward by that many bytes, to make space at the front | ||
self.withVeryUnsafeMutableBytes { pointer in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we avoid *Unsafe*
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really: this is the safe wrapper around the access.
Ultimately this is implementing memmove
on ByteBuffer
. Implementing it safely is somewhere between very hard and impossible using the APIs we currently have. We can run through the options:
- Bytewise copy using
writeInteger
. This is going to be very costly for large moves, which is the concern we have in this method. - Manually-autovectorised copy using
writeMultipleInteger
and wider integers. This can work, but it's still going to be very slow compared tomemmove
and it will be much more complex code as we have to handle a wide range of edge-cases. - Move using
writeImmutableBuffer
. This will, unfortunately, trigger a CoW. - Move using
ByteBufferView
. This will also trigger a CoW.
I fundamentally can't see an implementation here that performs well enough to be used on larger buffers, which is also the circumstance in which it is most likely that we'll need a move. But I'm open to other options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if this is necessary functionality, can we add it to ByteBuffer-core.swift
and use in other places too? If my memory serves me right, then @glbrntt added some memory-moving API already a few years back, might be wrong about that though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Close enough, I added copyBytes(at:to:length:)
5 years ago as part of making ByteBufferView
mutable. Good memory!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hamzahrmalik isn't that exactly what we need? memmove
== copying bytes too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this looks like we could use it here.
Add functions for reading and writing length-prefixed data with customizable encodings for the length, particularly for quic variable-length integers (RFC 9000)
Motivation:
Many protocols require us to write data and then prefix that data with its length. But each protocol has a different way of encoding the length. This PR introduces general purpose functions which can be extended for different encoding strategies
Modifications:
Create a new protocol which defines how to encode an integer
Implement this protocol for QUIC
Provide functions on bytebuffer for writing length-prefixed buffers, strings or bytes