Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

worker: allow specifying resource limits #26628

Closed
wants to merge 1 commit into from

Conversation

addaleax
Copy link
Member

Allow specifying resource limits for the JS engine instance created as part of a Worker.

/cc @nodejs/workers

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@addaleax addaleax added semver-minor PRs that contain new features and should be released in the next minor version. worker Issues and PRs related to Worker support. labels Mar 13, 2019
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. errors Issues and PRs related to JavaScript errors originated in Node.js core. labels Mar 13, 2019
@targos
Copy link
Member

targos commented Mar 13, 2019

What happens if the worker itself starts a new worker?

Copy link
Member

@joyeecheung joyeecheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM though I have some questions

doc/api/worker_threads.md Show resolved Hide resolved
src/node_worker.cc Show resolved Hide resolved
Null(env()->isolate()).As<Value>(),
};

MakeCallback(env()->onexit_string(), arraysize(args), args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can pass the limits back here as well to display them in the error messages?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have suggestions for how/when to pass them back? I don’t think you’re thinking about callback arguments, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have those as AliasedBuffers we could pass them back here right? (Or is that not guaranteed to be valid at this point?)

src/node_worker.cc Outdated Show resolved Hide resolved
src/node_worker.h Show resolved Hide resolved
test/parallel/test-worker-resource-limits.js Outdated Show resolved Hide resolved
src/node_worker.cc Show resolved Hide resolved
@targos
Copy link
Member

targos commented Mar 13, 2019

In case you missed it, I'll ask again:

What happens if the worker itself starts a new worker?
Can it control the child's limits?
Could the child's limits be higher than the parent's?
Does the child inherit limits from its parent by default?

@gireeshpunathil
Copy link
Member

@targos - by looking at the current design, I guess:

Can it control the child's limits?

yes, through this interface

Could the child's limits be higher than the parent's?

yes

Does the child inherit limits from its parent by default?

no

@jasnell
Copy link
Member

jasnell commented Mar 13, 2019

Definitely like the direction on this. And I like the way the failure condition is handled. The one thing that's missing is: how would a worker know what it's limits are without (as you do in the tests) passing those in out of band.

@YurySolovyov
Copy link

Any way to limit CPU usage?

@addaleax
Copy link
Member Author

@jasnell @joyeecheung I guess we could make the resource limits available on the Worker object in the parent thread and in the child thread on the worker_threads exports object. I think that would cover both of your concerns, right?

In that case the main question for me would be, is there anything we could/should do for non-Worker threads? Should we just (inaccurately) report Infinity everywhere, or not provide a resourceLimits object at all, or somehow try to get the actual limits?

@targos Yeah, the options are independent for each Worker, in the way that @gireeshpunathil explained.

Any way to limit CPU usage?

@YurySolovyov Not in this PR, sorry. That might also be pretty complex to implement, because we don’t have the VM available to do this for us.

@richardlau
Copy link
Member

How does the diagnostic report work wrt. workers? Are the resource limits being added here something that could be added to the report?

@addaleax
Copy link
Member Author

@richardlau The diagnostic report is clueless about Workers (see also #26293). But yes, when they add support, I think it makes sense to include the data.

chjj added a commit to chjj/bthreads that referenced this pull request Mar 14, 2019
chjj added a commit to chjj/bthreads that referenced this pull request Mar 14, 2019
@BridgeAR
Copy link
Member

@addaleax seems like this requires a rebase.

@addaleax
Copy link
Member Author

addaleax commented Apr 2, 2019

@jasnell @joyeecheung Before I rebase this … any thoughts on #26628 (comment)?

@joyeecheung
Copy link
Member

Should we just (inaccurately) report Infinity everywhere, or not provide a resourceLimits object at all, or somehow try to get the actual limits?

I would go with not providing the object at all. It's easier to start small and known limitations are better than..known bugs?

@fhinkel
Copy link
Member

fhinkel commented Oct 28, 2019

I'm cleaning out a few old PRs 🧹. I'm closing this due to inactivity. Please re-open if needed!

@fhinkel fhinkel closed this Oct 28, 2019
@gireeshpunathil
Copy link
Member

I don't really see any reason why this could not have landed, don't see any blockers. On the other hand, I find great value for this PR as it provides fine grained control over the worker's execution environment - something that can be leveraged in workloads that means a lot (such as cloud)

pinging @addaleax to see if I missed something.

@addaleax
Copy link
Member Author

addaleax commented Nov 1, 2019

@joyeecheung @gireeshpunathil I’ve rebased this and made the resource constraints available as an object on both the worker_threads module and the Worker instance as suggested above… Could you take another look?

@nodejs-github-bot
Copy link
Collaborator

@@ -294,10 +311,35 @@ function pipeWithoutWarning(source, dest) {
dest._maxListeners = destMaxListeners;
}

const resourceLimitsArray = new Float64Array(kTotalResourceLimitCount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to reuse resourceLimitsRaw here?

Although if we expose makeResourceLimits from this file and calculate the publicly exposed resourceLimits from lib/worker_threads.js instead it makes more sense to use two arrays in case someone created a worker internally with a resource limit..somehow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically speaking, yes, the reason is that resourceLimitsRaw is currently not defined in the main thread. And given that this is a very small typed array, I’m okay with that. If you’re concerned about the extra resource usage, I’d probably prefer changing this to not be a static variable and instead generate one for each call to parseResourceLimits()?

Copy link
Member

@joyeecheung joyeecheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (the comment above was a nit)

&loop_,
w->platform_);
CHECK_NOT_NULL(isolate);
Isolate::CreateParams params;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own understanding...WorkerThreadData is actually closer to NodeMainInstance for the main thread even though we don't have an equivalent abstraction of Worker for the main thread at the moment? (This was my impression when I tried with #29925 locally but I decided to take some more time to figure out a proper class hierarchy before moving forward..)

I wonder whether it makes sense to have Environment points to a base class of Worker and MainThread (?) if they have a longer life time than Environment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own understanding...WorkerThreadData is actually closer to NodeMainInstance for the main thread even though we don't have an equivalent abstraction of Worker for the main thread at the moment? (This was my impression when I tried with #29925 locally but I decided to take some more time to figure out a proper class hierarchy before moving forward..)

It is somewhat similar, yes.

I wonder whether it makes sense to have Environment points to a base class of Worker and MainThread (?) if they have a longer life time than Environment.

What would we use that for? And how would that fit into the embedder API? Both Workers and the main thread have a lifetime longer than the Environment, but only very slightly so…

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, Worker, NodeMainInstance and Environment all keep copies of arguments and exec arguments. We could reduce the duplication by only keeping them in Worker and MainThread. There seem to be a lot of duplication along the hierarchy at the moment..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be careful about that… deduplication is nice but right now moving towards a better embedder API is my primary goal, and linking from Environment to a superclass of Worker and NodeMainInstance sounds a bit fragile

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anything...I think a public version of that super class could be a good base for a newer set of better embedder APIs? (but maybe it's just a crazy idea, I have not given too much thought into it).

@nodejs-github-bot
Copy link
Collaborator

addaleax added a commit that referenced this pull request Nov 5, 2019
Allow specifying resource limits for the JS engine instance created
as part of a Worker.

PR-URL: #26628
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com>
@addaleax
Copy link
Member Author

addaleax commented Nov 5, 2019

Landed in d855904 🎉

@addaleax addaleax closed this Nov 5, 2019
@addaleax addaleax deleted the worker-resource-limits branch November 5, 2019 18:58
MylesBorins pushed a commit that referenced this pull request Nov 17, 2019
Allow specifying resource limits for the JS engine instance created
as part of a Worker.

PR-URL: #26628
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com>
@BridgeAR BridgeAR mentioned this pull request Nov 19, 2019
MylesBorins added a commit to BridgeAR/node that referenced this pull request Nov 21, 2019
Notable changes:

* addons:
  * Deprecate one- and two-argument `AtExit()`. Use the three-argument
    variant of `AtExit()` or `AddEnvironmentCleanupHook()` instead
    (Anna Henningsen) nodejs#30227
* child_process,cluster:
  * The `serialization` option is added that allows child process
    IPC to use the V8 serialization API (to e.g., pass through data
    types like sets or maps) (Anna Henningsen)
    nodejs#30162
* deps:
  * Update V8 to 7.9
  * Update `npm` to 6.13.0 (Ruy Adorno)
    nodejs#30271
* embedder:
  * Exposes the ability to pass cli flags / options through an API
    as embedder (Shelley Vohr)
    nodejs#30466
  * Allow adding linked bindings to Environment (Anna Henningsen)
    nodejs#30274
* esm:
  * Unflag --experimental-modules (Guy Bedford)
    nodejs#29866
* stream:
  * Add `writable.writableCorked` property (Robert Nagy)
    nodejs#29012
* worker:
  * Allow specifying resource limits (Anna Henningsen)
    nodejs#26628
* v8:
  * The Serialization API is now stable (Anna Henningsen)
    nodejs#30234

PR-URL: nodejs#30547
MylesBorins added a commit that referenced this pull request Nov 21, 2019
Notable changes:

* addons:
  * Deprecate one- and two-argument `AtExit()`. Use the three-argument
    variant of `AtExit()` or `AddEnvironmentCleanupHook()` instead
    (Anna Henningsen) #30227
* child_process,cluster:
  * The `serialization` option is added that allows child process
    IPC to use the V8 serialization API (to e.g., pass through data
    types like sets or maps) (Anna Henningsen)
    #30162
* deps:
  * Update V8 to 7.9
  * Update `npm` to 6.13.0 (Ruy Adorno)
    #30271
* embedder:
  * Exposes the ability to pass cli flags / options through an API
    as embedder (Shelley Vohr)
    #30466
  * Allow adding linked bindings to Environment (Anna Henningsen)
    #30274
* esm:
  * Unflag --experimental-modules (Guy Bedford)
    #29866
* stream:
  * Add `writable.writableCorked` property (Robert Nagy)
    #29012
* worker:
  * Allow specifying resource limits (Anna Henningsen)
    #26628
* v8:
  * The Serialization API is now stable (Anna Henningsen)
    #30234

PR-URL: #30547
MylesBorins pushed a commit that referenced this pull request Jan 12, 2020
Allow specifying resource limits for the JS engine instance created
as part of a Worker.

PR-URL: #26628
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com>
BethGriggs pushed a commit that referenced this pull request Feb 6, 2020
Allow specifying resource limits for the JS engine instance created
as part of a Worker.

PR-URL: #26628
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com>
@MylesBorins MylesBorins mentioned this pull request Feb 8, 2020
@dwinrick-lever
Copy link

A current problem with this approach is that you can't have a custom heap size for the parent process and the worker_threads. To customize the worker's heaps, you cannot customize the parent process's heap via CLI in any way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. c++ Issues and PRs that require attention from people who are familiar with C++. errors Issues and PRs related to JavaScript errors originated in Node.js core. semver-minor PRs that contain new features and should be released in the next minor version. worker Issues and PRs related to Worker support.
Projects
None yet
Development

Successfully merging this pull request may close these issues.