Fix loadgen and runner to accommodate endo bundles #32
Conversation
Commits `4b8a802` to `e6c1259`
looks pretty good to me, although I'll leave the Buffer parsing parts to more experienced eyes
@kriskowal could I request your eyes on the NESM stuff, and maybe the Buffer parsing?
```js
/** @type {number} */
let breakLength;
if (!breakValue || typeof breakValue === 'number') {
  breakLength = 1;
} else if (Buffer.isBuffer(breakValue)) {
  breakLength = breakValue.length;
} else {
  breakLength = Buffer.from(breakValue, breakEncoding).length;
}
this._breakLength = breakLength;
```
It would probably be simpler to normalize the break to a single type, or even defer that responsibility to the user.
Deferring is exactly what this transform does, as it passes through to `buf.indexOf()`. However, I do need to advance the search by the number of bytes of what had been found, which is what this calculates.
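One way to realize the reviewer's normalization suggestion would be to coerce the break value to a `Buffer` once, up front: its `.length` then gives the advance amount, and the same `Buffer` can be passed straight to `buf.indexOf()`. This is a sketch, not the PR's code, and the `'\n'` default for a missing break value is an assumption:

```javascript
// Sketch: normalize the break value to a single Buffer representation.
// Hypothetical helper; the PR instead only computes the break length.
function normalizeBreak(breakValue, breakEncoding = 'utf8') {
  if (breakValue === undefined || breakValue === null) {
    return Buffer.from('\n'); // assumed default break
  }
  if (typeof breakValue === 'number') {
    return Buffer.from([breakValue]); // a single byte value
  }
  if (Buffer.isBuffer(breakValue)) {
    return breakValue;
  }
  return Buffer.from(breakValue, breakEncoding); // string input
}
```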
```js
const concatLength = this._chunks.reduce(
  (acc, { length }) => acc + length,
  endOffset,
);
```
I can read this. A utility function might be nicer. A for loop might be kinder on both people and performance.
I don't like `reduce` much either, but I've got to admit that accumulating a value is basically what it was originally meant for.
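For comparison, the same accumulation written as a plain for loop, per the reviewer's suggestion; `chunks` and `endOffset` here stand in for `this._chunks` and the transform's offset:

```javascript
// Sketch: sum the chunk lengths starting from endOffset,
// equivalent to the reduce() call above.
function totalLength(chunks, endOffset) {
  let acc = endOffset;
  for (const { length } of chunks) {
    acc += length;
  }
  return acc;
}
```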
```js
Buffer.concat(
  [...this._chunks.splice(0, this._chunks.length), buf],
  concatLength,
),
```
Buffer concatenation is expensive and avoidable.
Unfortunately I didn't find a way to avoid it here. The stream's readable side is in object mode to allow the consumer to get a single object representing the line split at the configured break value. Node doesn't seem to have a way to create a buffer for which the underlying data is a list of buffers. Since the consumer expects a Buffer-like object, I didn't want to build such a wrapper. When I saw the performance was good enough for my use case, I left it at `concat`. If you know of a better way that doesn't require a ton of engineering, I'm open to suggestions.
FYI I did find https://www.npmjs.com/package/buffers, but that basically does a concat when producing a slice out, so that's not any better than this.
Does calling `this._writeItem` on each chunk not work? I would think that would avoid an intermediate allocation.
By default Node streams may concatenate bytes as they see fit. They also have an optional object mode, which treats the data in an opaque manner (even if the data is a Buffer of bytes).
In this case the goal of the transform is to produce a sequence of lines in the form of a `Buffer` of bytes, and consumers expect to handle a line as a single `Buffer` object. As such, the readable portion of the stream transform is configured in object mode.
Some of the handling in the consumer involves piping into another stream and converting either the whole line or a subset to a string. While I could probably build a "Buffers" object that kept a list of the original buffers and supported those operations somehow, I figured one intermediate allocation wouldn't be the end of the world.
```js
 * The BufferLineTransform is reading String or Buffer content from a Readable stream
 * and writing each line as a Buffer in object mode
```
Some optimizations for both human and machine readers are more obvious if you pick either string or bytes and not generalize both.
Node streams are usually built to handle both string and Buffer, so since it was literally a single line to accept a string, I supported it.
A noteworthy design flaw of Node.js 😉
```js
    buf = buf.subarray(endOffset);
  }
  cb();
} catch (err) {
```
This probably catches more error classes than intended.
Such as? Do you see any problem with passing it all through the stream error handling logic?
I’m sure it’s fine in its form, but:
- Node.js deopts any function that contains try/catch.
- This broad of a catch clause will also defer programming errors (e.g., ReferenceError) to the stream handler. This is approximately analogous to what happens with promises regardless, so only something to avoid if it’s easy.
- This broad of a catch obscures what would throw “invalid data type” specifically.
Yeah this definitely assumes there are no programming errors. But I have to somehow guard against bad input, and since the whole function does input processing, I'm not sure the try/catch could be much more targeted.
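One way to keep the catch narrow, per the reviewer's concern, is to move the input-type validation into a small helper that throws a specific error, so only that conversion needs to sit inside the guarded region. This is a sketch with a hypothetical helper name, not the PR's code:

```javascript
// Sketch: fail fast on bad input with a specific error class, so the
// surrounding try/catch does not need to cover the whole function.
function toBuffer(chunk, encoding = 'utf8') {
  if (Buffer.isBuffer(chunk)) return chunk;
  if (typeof chunk === 'string') return Buffer.from(chunk, encoding);
  throw new TypeError('invalid data type: expected string or Buffer');
}
```

The transform could then wrap only the `toBuffer` call in a try/catch (or check `err instanceof TypeError`) and let genuine programming errors propagate.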
The introduction of endo zip bundles in Agoric/agoric-sdk#3273 required that contracts be NESM and bloated up the size of the slog file (probably in addition to other changes to the slog output). This caused the runner to both slow to a crawl when processing the slog output of the chain node, and made it impossible to deploy the loadgen code.
This PR fixes the slog processing by implementing a new line splitter stream transform that keeps the data as a `Buffer`, but outputs in object mode so that buffers are not re-concatenated. This dropped the parsing of the bootstrap slog from 3 minutes to mere seconds. As a result, a couple of timing adjustments could be made.

To accommodate the new endo bundle format while keeping backwards compatibility with older SDKs, the deployed contract code was split into its own package with `type: module`. That field is ignored by the old bundler and satisfies both the new bundler and the RESM loader that is still used by the deploy script (for backwards compatibility as well).

Once these issues were fixed, I validated that the loadgen is compatible with the most recent agoric-sdk, including with Agoric/agoric-sdk#3736.
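For illustration, the split package's manifest could be as small as the following; the package name and other fields are hypothetical, the `type` field is the point:

```json
{
  "name": "loadgen-contract",
  "private": true,
  "type": "module"
}
```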
As usual, best reviewed commit by commit.