
Fix loadgen and runner to accommodate endo bundles #32

Merged: 6 commits merged into main from mhofman/fix-endo-bundling on Oct 31, 2021

Conversation

@mhofman (Member) commented on Oct 16, 2021

The introduction of endo zip bundles in Agoric/agoric-sdk#3273 required that contracts be NESM and bloated the size of the slog file (probably in addition to other changes to the slog output). This caused the runner to slow to a crawl when processing the slog output of the chain node, and made it impossible to deploy the loadgen code.

This PR fixes the slog processing by implementing a new line-splitter stream transform that keeps the data as a Buffer but outputs in object mode, so that buffers are not re-concatenated (a rough sketch of the approach follows the list below). This dropped the time to parse the bootstrap slog from 3 minutes to mere seconds. As a result, a couple of timing adjustments could be made:

  • Only print an event delay if it's meaningful (somewhat arbitrarily set at more than 100ms)
  • Cancel timeout/sleep callbacks to avoid temporarily hanging the exiting process when the timeout is raced and ignored
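
For context, a minimal sketch of such a line-splitting transform is shown below. It is illustrative only: the class and field names are assumptions, and unlike the actual helper added in this PR it hardcodes '\n' rather than supporting a configurable break value.

```js
import { Transform } from 'stream';

// Minimal sketch: split incoming chunks on '\n' and emit each line as a
// single Buffer in object mode, so consumers get one line per 'data' event.
class LineSplitSketch extends Transform {
  constructor() {
    super({ readableObjectMode: true });
    this._pending = []; // partial line carried over between chunks
  }

  _transform(chunk, encoding, cb) {
    let buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk, encoding);
    let index;
    while ((index = buf.indexOf(0x0a)) !== -1) {
      // Stitch any pending partial line together with the start of this chunk.
      this.push(Buffer.concat([...this._pending.splice(0), buf.subarray(0, index)]));
      buf = buf.subarray(index + 1);
    }
    if (buf.length) this._pending.push(buf);
    cb();
  }

  _flush(cb) {
    // Emit whatever remains as a final, unterminated line.
    if (this._pending.length) this.push(Buffer.concat(this._pending));
    cb();
  }
}
```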

To accommodate the new endo bundle format while keeping backwards compatibility with older SDKs, the deployed contract code is split into its own package with type: module. That field is ignored by the old bundler, and satisfies both the new bundler and the RESM loader that is still used by the deploy script (also for backwards compatibility).
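
Concretely, the sub-package manifest only needs the type field on top of the usual metadata. This is a sketch; the name and version shown here are placeholders, not the actual values in the repo:

```json
{
  "name": "loadgen-contract",
  "version": "0.0.1",
  "type": "module"
}
```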

With these issues fixed, I've validated that the loadgen is compatible with the most recent agoric-sdk, including with Agoric/agoric-sdk#3736.

As usual, best reviewed commit by commit.

@warner (Member) left a comment

looks pretty good to me, although I'll leave the Buffer parsing parts to more experienced eyes

@mhofman (Member, Author) commented on Oct 30, 2021

@kriskowal could I request your eyes on the NESM stuff, and maybe the Buffer parsing?

runner/lib/helpers/buffer-line-transform.d.ts (resolved review comment)
Comment on lines +20 to +29
/** @type {number} */
let breakLength;
if (!breakValue || typeof breakValue === 'number') {
breakLength = 1;
} else if (Buffer.isBuffer(breakValue)) {
breakLength = breakValue.length;
} else {
breakLength = Buffer.from(breakValue, breakEncoding).length;
}
this._breakLength = breakLength;
Member:

It would probably be simpler to normalize the break to a single type, or even defer that responsibility to the user.

Member Author (@mhofman):

Deferring is exactly what this transform does, as it passes the break value straight through to buf.indexOf(). However, I do need to advance the search by the number of bytes that were matched, which is what this calculates.
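
Roughly, the computed length is used like this when scanning a chunk (a simplified illustration, not the exact code in this PR):

```js
// `breakValue` is handed straight to Buffer#indexOf; `breakLength` tells us
// how many bytes to skip once a match is found.
function splitOnce(buf, breakValue, breakLength) {
  const index = buf.indexOf(breakValue);
  if (index === -1) return null;
  return {
    line: buf.subarray(0, index), // the line, without the break bytes
    rest: buf.subarray(index + breakLength), // resume scanning after the break
  };
}
```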

runner/lib/helpers/buffer-line-transform.js (resolved review comment)
Comment on lines +56 to +59
const concatLength = this._chunks.reduce(
(acc, { length }) => acc + length,
endOffset,
);
Member:

I can read this. A utility function might be nicer. A for loop might be kinder on both people and performance.
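
For comparison, the for-loop form could look like this (illustrative function name, same result as the reduce above):

```js
// Accumulate the total byte length of the buffered chunks, starting from
// the offset of the line end inside the current chunk.
function concatLengthOf(chunks, endOffset) {
  let total = endOffset;
  for (const { length } of chunks) {
    total += length;
  }
  return total;
}
```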

Member Author (@mhofman):

I don't like reduce much either, but gotta admit that accumulation of a value is basically what it's originally meant for.

Comment on lines +61 to +64
Buffer.concat(
[...this._chunks.splice(0, this._chunks.length), buf],
concatLength,
),
Member:

Buffer concatenation is expensive and avoidable.

Member Author (@mhofman):

Unfortunately I didn't find a way to avoid it here. The readable side of the stream is in object mode so that the consumer gets a single object representing the line split at the configured break value. Node doesn't seem to have a way to create a Buffer whose underlying data is a list of buffers, and since the consumer expects a Buffer-like object, I didn't want to build such a wrapper. When I saw the performance was good enough for my use case, I left it at concat. If you know of a better way that doesn't require a ton of engineering, I'm open to suggestions.

Member Author (@mhofman):

FYI I did find https://www.npmjs.com/package/buffers, but that basically does a concat when producing a slice out, so that's not any better than this.

Member:

Does calling this._writeItem on each chunk not work? I would think that would avoid an intermediate allocation.

Member Author (@mhofman):

By default, Node streams may concatenate bytes as they see fit. They also have an optional object mode, which treats the data as opaque (even if the data is a Buffer of bytes).

In this case the goal of the transform is to produce a sequence of lines in the form of a Buffer of bytes, and consumers expect to handle a line as a single Buffer object. As such the readable portion of the stream transform is configured in object mode.

Some of the handling in the consumer involves piping into another stream and converting either the whole line or a subset of it to a string. While I could probably build a "Buffers" object that kept a list of the original buffers and supported those operations somehow, I figured one intermediate allocation wouldn't be the end of the world.
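
For illustration, a consumer along those lines might look like this (the file name, import path, default export, and the JSON-per-line parsing are all assumptions for the sketch):

```js
import { createReadStream } from 'fs';
// Assumed location and export of the helper added in this PR; adjust as needed.
import BufferLineTransform from './buffer-line-transform.js';

// Each 'data' event delivers one full line as a single Buffer (object mode);
// the consumer converts to a string (or parses) only where it needs to.
createReadStream('chain.slog')
  .pipe(new BufferLineTransform())
  .on('data', (lineBuf) => {
    const entry = JSON.parse(lineBuf.toString('utf8'));
    // ... process the slog entry
  });
```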

Comment on lines +8 to +9
* The BufferLineTransform is reading String or Buffer content from a Readable stream
* and writing each line as a Buffer in object mode
Member:

Some optimizations for both human and machine readers are more obvious if you pick either string or bytes and not generalize both.

Member Author (@mhofman):

Node streams are usually built to handle both strings and buffers, and since it was literally a single line to accept a string, I supported it.

Member:

A noteworthy design flaw of Node.js 😉

runner/lib/helpers/buffer-line-transform.js (resolved review comment)
buf = buf.subarray(endOffset);
}
cb();
} catch (err) {
Member:

This probably catches more error classes than intended.

Member Author (@mhofman):

Such as? Do you see any problem with passing it all through the stream error handling logic?

Member:

I’m sure it’s fine in its form, but:

  • Node.js deopts any function that contains try/catch.
  • This broad of a catch clause will also defer programming errors (e.g., ReferenceError) to the stream handler. This is approximately analogous to what happens with promises regardless, so only something to avoid if it’s easy.
  • This broad of a catch obscures what would throw “invalid data type” specifically.

Member Author (@mhofman):

Yeah this definitely assumes there are no programming errors. But I have to somehow guard against bad input, and since the whole function does input processing, I'm not sure the try/catch could be much more targeted.

@mhofman merged commit 2f1f43e into main on Oct 31, 2021
@mhofman deleted the mhofman/fix-endo-bundling branch on January 17, 2022
3 participants