
workers: initial implementation #2133

Closed
wants to merge 10 commits

Conversation

petkaantonov
Contributor

#1159 retargeted to master

@alubbe

alubbe commented Jul 8, 2015

Awesome, thank you for picking this up!

@petkaantonov petkaantonov added the semver-minor PRs that contain new features and should be released in the next minor version. label Jul 8, 2015
'use strict';

if (!process.features.experimental_workers) {
  throw new Error('Experimental workers are disabled');
Contributor

Nit: Strictly speaking, they are not disabled, just not enabled.

@petkaantonov
Contributor Author

Implemented process.threadId, which is always 0 for the main thread and > 0 for workers.

Implemented a data option that lets you pass initial data to the worker (process.env cannot be used since it is process-wide). The passed data is available as process.workerData inside the worker. This is needed for running the fs and network tests in parallel when using workers.

Implemented an eval option, a boolean that you can set to true if you want the first argument to be evaluated as code rather than loaded as a file.
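
For illustration, a rough usage sketch of the options described above. The module name and Worker constructor shape are assumptions made for the example (they are not taken from this PR), so treat it as a sketch rather than the PR's actual interface:

// Hypothetical sketch only: the 'worker' module name and constructor shape
// are assumed for illustration, not confirmed by this PR.
const Worker = require('worker');

// data: initial per-worker data (process.env cannot be used, it is process-wide);
// inside the worker it surfaces as process.workerData.
const w = new Worker('./parallel-test.js', { data: { chunk: 3 } });

// eval: true treats the first argument as source code instead of a file path.
const w2 = new Worker('console.log(process.threadId, process.workerData)', {
  eval: true,
  data: 'hello'
});

The point is that initial data flows in through the constructor rather than through process.env, and process.threadId distinguishes workers (> 0) from the main thread (0).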

@Fishrock123
Contributor

@petkaantonov "io.js master merged the required libuv fix regarding to closing stdio handles on Windows"

Could you provide a link to the libuv issue up there? :)

@brendanashworth brendanashworth added the c++ Issues and PRs that require attention from people who are familiar with C++. label Jul 8, 2015

// process-relative uptime base, initialized at start-up
static double prog_start_time;
static bool debugger_running;
// Needed for potentially non-thread-safe process-globas
Contributor

s/globas/globals

@piscisaureus
Contributor

io.js master merged the required libuv fix regarding to closing stdio handles on Windows

That happened: libuv/libuv@60e515d...c619f37. I believe a libuv release is also imminent.

@petkaantonov
Contributor Author

@piscisaureus yeah, the task means that the current deps/uv in master doesn't contain the changes yet

// process.isWorkerInstance
READONLY_PROPERTY(process,
                  "isWorkerInstance",
                  Boolean::New(env->isolate(), env->is_worker_instance()));
Contributor

Is there a case when isMainInstance == isWorkerInstance?

I assume not, and that having both is probably just for convenience.

Copy link


If the main thread is guaranteed to have a process.threadId of 0, there is no need to have process.isMainInstance or process.isWorkerInstance. It would also help reduce the API surface.

On a related point, should it be process.threadId or process.tid as discussed in the last PR?
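
A minimal sketch of that suggestion, assuming the threadId guarantee described above:

// Sketch: with threadId guaranteed to be 0 only on the main thread,
// the boolean properties could be derived instead of exposed separately.
const isMainInstance = process.threadId === 0;
const isWorkerInstance = process.threadId > 0;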

Contributor Author

Yeah, they are for convenience and readability.

Contributor

I'm fine with having both. Matches how cluster works. This is a nit, but possibly consider how to shorten the names, or maybe just swap Instance for Thread.

@petkaantonov
Contributor Author

@kzc The issue you reported was actually known all along in this comment:

// Deleting WorkerContexts in response to their notification signals
// will cause use-after-free inside libuv. So the final `delete this`
// call must be made somewhere else

"somewhere else" means queuing the delete this call asynchronously on the main thread event loop. And of course this fails when the owner thread == main thread.

A significantly simpler solution (without this problem) now occurs to me: WorkerContexts to be deleted are pushed onto a global cleanup queue, which is drained between event loop iterations on the main event loop.

@kzc

kzc commented Jul 10, 2015

@petkaantonov - it's been some time since I looked at your thread worker code, but I vaguely recall it already had a cleanup queue that was intended to be called asynchronously. The problem was a nested event loop handler within a dispose function. Nesting event loops is something that really should be avoided - it creates complexity and makes it difficult to reason about the ordering of events and the correctness of a solution.

@petkaantonov
Contributor Author

@kzc btw did you want to know the worker's threadId from the worker object on its owner thread as well? As in worker.threadId?

And yeah I'll change process.threadId to process.tid for better symmetry with process.pid :)

@kzc

kzc commented Jul 10, 2015

For my needs just having process.threadId is sufficient.

This brings up a good point - in your implementation, can a given worker instance potentially be scheduled on different threads during its lifetime, or are workers always pinned to a specific thread? If not pinned, the worker instance could benefit from having a unique worker id (never reused for the lifetime of the process across all workers), which is different from a threadId.

@petkaantonov
Contributor Author

A worker is exclusively tied to a thread. I am not sure what benefit there would be from being able to schedule it on different threads; it would be very complex to implement, as you need to facilitate the ownership transfer of a V8 isolate and so on.

However tying a worker to a specific CPU core will be possible if/when libuv merges libuv/libuv#280.

@petkaantonov
Contributor Author

The use-after-free and nested event loops should be fixed now

@kzc

kzc commented Jul 10, 2015

@petkaantonov - Just curious... instead of posting delete tasks to the main thread with QueueWorkerContextCleanup() and CleanupWorkerContexts(), why don't you delete the WorkerContext at the end of WorkerContext::RunWorkerThread() when the worker thread's event loop is guaranteed to have finished?

void WorkerContext::RunWorkerThread(void* arg) {
  WorkerContext* worker = static_cast<WorkerContext*>(arg);
  worker->Run();
  delete worker;
}

@petkaantonov
Contributor Author

After Run() completes, only the resources belonging to the worker thread have been disposed. The context is still pending owner-side disposal at that point.

@ronkorving
Contributor

Very cool stuff, but is this not going to be solved/replaced by Vats "some day"? I'm probably missing something, but hope this isn't overlooked.

@petkaantonov
Contributor Author

You seem to imply that all strawman proposals will eventually be implemented but that is not the case.

@ronkorving
Contributor

I just assume a lot, out of ignorance :)

@kzc

kzc commented Jul 11, 2015

@petkaantonov - I tested your "Fix use-after-free" patch on Linux with valgrind as per the instructions here. It appears to work correctly.

You may consider getting rid of the async WorkerContext reaper on the main thread and adopting something like this instead, which I think is easier to understand and should put less of a burden on the main thread, since it would no longer have to poll the WorkerContext queue:

void WorkerContext::RunWorkerThread(void* arg) {
  WorkerContext* worker = static_cast<WorkerContext*>(arg);
  worker->Run();
  // ...wait on a libuv condition variable signalled by the
  //    owner thread at the end of WorkerContext::Dispose()...
  delete worker;
}

Unfortunately the Mac OS X EBADF/select problem mentioned in the last PR still exists. I think it's a libuv issue. There's also an unrelated Linux issue outlined below.

Using the latest workers implementation as of 1e0b6b1fd5fc93986d056798f47804d0a15a9bec and this patch:

--- a/test/workers/test-crypto.js
+++ b/test/workers/test-crypto.js
@@ -33,3 +33,3 @@ var tests = [

-var parallelism = 4;
+var parallelism = 8;
 var testsPerThread = Math.ceil(tests.length / parallelism);

running this command repeatedly:

./iojs --experimental-workers test/workers/test-crypto.js

on a 4 core Linux VM it experiences this error roughly once per 50 runs:

/opt/iojs-workers-implementation/test/common.js:484
  throw e;
        ^
Error: Running test/parallel/test-crypto-stream.js inside worker failed:
AssertionError: false == true
    at Decipheriv.end (/opt/iojs-workers-implementation/test/parallel/test-crypto-stream.js:52:5)
    at Decipheriv.<anonymous> (/opt/iojs-workers-implementation/test/common.js:371:15)
    at emitOne (events.js:82:20)
    at Decipheriv.emit (events.js:169:7)
    at done (_stream_transform.js:178:19)
    at _stream_transform.js:119:9
    at Decipheriv.Cipher._flush (crypto.js:160:5)
    at Decipheriv.<anonymous> (_stream_transform.js:118:12)
    at Decipheriv.g (events.js:260:16)
    at emitNone (events.js:67:13)
    at Worker.<anonymous> (/opt/iojs-workers-implementation/test/common.js:477:14)
    at emitOne (events.js:77:13)
    at Worker.emit (events.js:169:7)
    at onerror (worker.js:61:18)
    at WorkerBinding.workerContext._onmessage (worker.js:75:16)

on a 4 core Mac it experiences these errors roughly once per 20 runs:

 /opt/iojs-workers-implementation/test/common.js:484
   throw e;
         ^
 Error: Running test/parallel/test-crypto-hmac.js inside worker failed:
 Error: EBADF: bad file descriptor, close
     at Error (native)
     at Object.fs.closeSync (fs.js:518:18)
     at Object.fs.readFileSync (fs.js:445:21)
     at Object.Module._extensions..js (module.js:447:20)
     at Module.load (module.js:355:32)
     at Function.Module._load (module.js:310:12)
     at Function.Module.runMain (module.js:471:10)
     at process._runMain (node.js:68:18)
     at Worker.<anonymous> (/opt/iojs-workers-implementation/test/common.js:477:14)
     at emitOne (events.js:77:13)
     at Worker.emit (events.js:169:7)
     at onerror (worker.js:61:18)
     at WorkerBinding.workerContext._onmessage (worker.js:75:16)
 (node) crypto.createCredentials is deprecated. Use tls.createSecureContext instead.
 <Buffer 0c 1e e9 6b 67 d3 29 f7 94 26 87 51 bb 05 53 3f>
 Assertion failed: (r == 1), function uv__stream_osx_interrupt_select, file ../deps/uv/src/unix/stream.c, line 127.
 Abort trap: 6

Ignore the deprecation lines - they are of no consequence to this issue.

return nullptr;

// Allocate enough space to include the null terminator
size_t len = StringBytes::StorageSize(string_value, UTF8) + 1;
Contributor

Shouldn't StorageSize take an isolate?

@hax
Contributor

hax commented Jun 19, 2016

@isiahmeadows Workers should be in core to support transferring ArrayBuffers or other resources in Node. But I'm not sure whether this PR implements these features.

@siriux

siriux commented Jul 25, 2016

@bnoordhuis Is there any place to follow your work on integrating webworkers for v7?

@HyeonuPark

As the Atomics and SharedArrayBuffer APIs have reached stage 2 of the TC39 process and V8 is implementing them, I think Node.js should have a thread API in some form as a core module, to support shared memory correctly.

#bnoordhuis have you checked that sharedmem api? Can it be possible with your implementation?

@HyeonuPark

@bnoordhuis have you checked that sharedmem api? Can it be possible with your implementation?

I just wonder why I used # instead of @ :P
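
For context, a minimal sketch of the shared-memory primitives being referred to (standard SharedArrayBuffer/Atomics usage, independent of any particular worker API):

// Standard TC39 shared-memory primitives; no specific worker API assumed.
const sab = new SharedArrayBuffer(4);    // 4 bytes of shared memory
const counter = new Int32Array(sab);     // typed-array view over the shared buffer

Atomics.add(counter, 0, 1);              // atomic read-modify-write
console.log(Atomics.load(counter, 0));   // atomic read; prints 1

// A thread API only needs to make sab reachable from another thread
// (e.g. via message passing) for these operations to become cross-thread.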


// -p, --print


Here are a few blocks that are marked as "changes" when actually only some formatting / ordering changed. Probably not a good idea to risk merge conflicts because of indentation fixes?!

@dead-claudia

dead-claudia commented Dec 11, 2016 via email

@bnoordhuis
Member

I've been working on and off on a rewrite of this pull request but after discussion with other collaborators I've come to the conclusion that multi-threading support adds too many new failure modes for not enough benefit.

The primary motivation for the rewrite was improved IPC performance but I'm fairly confident by now that we can also accomplish that using more traditional means like shared memory and more efficient serialization.

I'll go ahead and close this. Thanks for your hard work, Petka.

@bnoordhuis bnoordhuis closed this Dec 11, 2016
@GnorTech
Member

For those who want to write Node.js code in a multithreaded program: NW.js implemented this by enabling Node.js in Web Workers: https://nwjs.io/blog/v0.18.4/

@pemrouz

pemrouz commented Jan 10, 2017

The primary motivation for the rewrite was improved IPC performance but I'm fairly confident by now that we can also accomplish that using more traditional means like shared memory and more efficient serialization.

Hi @bnoordhuis. Can I ask what the latest plan/thinking is for shared memory in Node (i.e. implement workers, somehow allow transferring SharedArrayBuffers with cluster, or a different API altogether)? The latest version seems to have SharedArrayBuffer (and Atomics), but there is currently no way to use this, iiuc? Also, what would be the best way to help out with this?

@addaleax
Member

I've come to the conclusion that multi-threading support adds too many new failure modes for not enough benefit.

Also… could you mention what exactly it is that you have discarded? Multi-threading with the full Node API available in each thread, or something more lightweight like a WebWorkers-style API?

@rsp
Contributor

rsp commented Jan 12, 2017

@addaleax I tried to summarize the state of this issue as well as the different types of concurrency and their pros and cons in the context of Node, and I also kept posting updates about this pull request (mostly thanks to comments from @matheusmoreira - thanks for that) in this answer on Stack Overflow:

Which would be better for concurrent tasks on node.js? Fibers? Web-workers? or Threads?

If anything is incorrect or outdated please let me know.

@bnoordhuis
Member

Can I ask what the latest plan/thinking is for shared memory in Node (i.e. implement workers, or somehow allow transferring SharedArrayBuffers with cluster, or different API altogether)?

Shared memory is not my primary focus right now, reducing the overhead of serializing/deserializing is. I ran a lot of benchmarks and in most non-contrived cases the overhead of converting to and from JSON is significantly greater (as in 70/30 or 80/20 splits) than sending it to another process.
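
As a rough illustration of where that overhead shows up (this is not the benchmark referenced above, just a sketch of timing the serialization step in isolation):

// Sketch: time JSON round-trips of a representative IPC payload on its own,
// ignoring the cost of actually handing the bytes to another process.
const payload = {
  id: 1,
  rows: Array.from({ length: 1000 }, (_, i) => ({ i, name: 'row-' + i }))
};

const start = process.hrtime();
for (let n = 0; n < 1000; n++) {
  JSON.parse(JSON.stringify(payload));
}
const [sec, nsec] = process.hrtime(start);
console.log('serialization only: ' + (sec * 1e3 + nsec / 1e6).toFixed(1) + ' ms');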

Once I get the overhead of serdes down, I'm going to look into shared memory support. It's a minefield of platform-specific quirks and limitations so it's probably going to take a while to get it merged in libuv and iron out the bugs. If you want to help out, this is probably a good place to start.

V8 5.5 or 5.6 will make it a lot easier to do efficient serdes so that's what I'm currently waiting for.

could you mention what exactly it is that you have discarded? Multi-threading with the full Node API available in each thread, or something more lightweight like a WebWorkers-style API?

The former, the node-with-threads approach. WebWorkers-style parallelism is still an option and not terribly hard to implement, but I didn't see a point in pursuing that in core; there are already add-ons that do.

@dead-claudia

@bnoordhuis

The former, the node-with-threads approach. WebWorkers-style parallelism is still an option and not terribly hard to implement, but I didn't see a point in pursuing that in core; there are already add-ons that do.

That'd be useful, except none of the modules I've seen that use true threads (instead of processes) actually support require in any way, which makes it way harder to scale. (More specifically, they currently can't, because there's no way to atomically modify the require-related caches from different threads. It has to be moved into C++ land for that to be possible, thanks to V8's lack of thread safety.)

@ronkorving
Contributor

@bnoordhuis

V8 5.5 or 5.6 will make it a lot easier to do efficient serdes so that's what I'm currently waiting for.

Out of curiosity, could you elaborate as to why this is?

@addaleax
Member

V8 5.5 or 5.6 will make it a lot easier to do efficient serdes so that's what I'm currently waiting for.

Out of curiosity, could you elaborate as to why this is?

I’m pretty sure Ben is referring to the added ValueSerializer and ValueDeserializer classes
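
(For later readers: those classes were eventually exposed to JavaScript through Node's v8 module as v8.serialize()/v8.deserialize() from Node 8 onward. A small example, assuming a Node version that ships that API:)

// Structured serialization backed by V8's ValueSerializer/ValueDeserializer,
// exposed in Node 8+ via the v8 module; not part of the PR discussed here.
const v8 = require('v8');

const value = { id: 7, when: new Date(), tags: new Map([['a', 1]]) };
const bytes = v8.serialize(value);       // a Buffer, not a JSON string
const copy = v8.deserialize(bytes);

console.log(copy.when instanceof Date);  // true -- types JSON cannot round-trip
console.log(copy.tags.get('a'));         // 1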

@ronkorving
Contributor

@addaleax Ah nice, a serializer that doesn't use JSON strings?

Sidenote: Something cluster's messaging could make use of too I imagine (would be nice as it's quite slow now imho).

@addaleax
Member

Ah nice, a serializer that doesn't use JSON strings?

I mean, I haven’t used it myself yet, but that’s sure what it sounds like. 😄

Something cluster's messaging could make use of too I imagine (would be nice as it's quite slow now imho).

Yeah, I had that thought, too. But as far as the current slowness is concerned: #10557 seems to fix quite a bit of that. :)

@NawarA

NawarA commented May 12, 2017

Is this still being worked on?

@jasnell
Member

jasnell commented May 12, 2017

@NawarA ... not at this time. If someone wanted to volunteer to pick it up, I think that would be welcome, but it's quite a task and there would be much to do.

@pemrouz

pemrouz commented May 21, 2017

@jasnell it would be useful if someone could set up a meta-tracking issue like #6980 to break down what is required to get this done - then volunteers can start picking up the pieces.

Based on the above, it's not even clear what the desired end state is. It would be good to clarify how Node users will eventually be able to use SharedArrayBuffer (e.g. shared memory between workers? shared memory between processes? using the cluster module?).
