Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend shared library interface #19005

Closed
wants to merge 66 commits into from
Closed

Conversation

cmfcmf
Copy link

@cmfcmf cmfcmf commented Feb 26, 2018

Hi everyone,

We (Tim @luminosuslight, Christian @cmfcmf, Johannes @Hannes01071995, Justus @justus-hildebrand, Max @msoechting) are a group of five students currently enrolled for a M.Sc. in Computer Science.
As part of a C++-focused seminar, we took on the project topic of extending Node.js when used as a shared library.
The goal of our project was to allow for an easier and more flexible development experience when embedding Node.js into C++ applications by giving more control to the application developer while staying backwards compatible with the existing interface (node::Start(argc, argv), etc.).

Motivation

As a C++ application developer, it can sometimes be hard to find maintained libraries with appropriate licenses. Furthermore, it can be relatively difficult to integrate C++ libraries into projects. In order to unlock the full potential of the NPM package ecosystem for C++ applications, we embedded Node.js as a shared library. We wanted to build a Qt based C++ application using NPM packages for data retrieval. However, we ran into various difficulties during the development process, such as:

  • sparse documentation
  • low controlability
    • event loop blocks calling thread, once Node.js is started
    • parameters (e.g. for Start(argc, argv)) are difficult to construct

Thus, we tried to improve the usability of Node.js as a shared library.

Thoughts about the interface

As we started with with our project, we developed some ideas of how we wished to use Node.js. We came up with two primary requirements for our interface design.

Low abstraction

We decided to stay as close as possible to v8 (using ‘pure’ v8 data types) and Node.js with just enough convenience added, so that anyone can use Node.js. An example for the convenience methods would be node::RegisterModule(), which exposes a list of given C++ methods to the JavaScript context.
Addtionally, the application does not exit if a fault occurs when using methods defined in our interface. The developer has to check the return values explicitly.

Flexible event loop

We wanted to give the user full control over the behavior of the Node.js event loop. Although the initial implementation allowed the user to start the event loop with just one call, we missed the possibility to also stop and resume the loop at will. Therefore, we introduced two different ways for the user to interact with the event processing in Node.js: The user might just (1) let Node.js take care of the event loop. This interaction is very close to the initial behavior, since the node::RunEventLoop(callback) blocks the calling thread until all events are processed. However, the user gains the possibility to react to changes by providing a callback, which is executed once per loop tick. The other way of controlling the event loop is by (2) executing single event loop ticks. By calling node::ProcessEvents(), Node.js processes events once and returns to the caller.

Main Changes

We mainly worked on refactoring the three node::Start methods inside the src/node.cc file. We introduced separated methods for (1) initialization, (2) execution and (3) deinitialization.
We took care to create the same objects in the same order as before, however, most of them are now created on the heap instead of the stack.
We defined and implemented the public interface inside the src/node_lib.h file.
The code still contains quite a few TODOs, but most of the work has been done and we feel confident our code can provide a basis for discussion and feedback around our ideas.

Examples

We created an example repository containing two types of applications:

First, there is is the node-lib-cli application. It is a simple command line app which doesn't do anything useful. However, it showcases most of the new methods we added to Node.js' interface in a simple manner.

Second, we built a really simple RSS feedreader using Qt as frontend and Node.js for reading and parsing a feed. It comes in two different versions:

The node-qt-rss version uses the old Node.js interface. The app runs even without any changes from this PR. It is mainly there to show the difficulties when using Node.js' interface in its current form.
The node-lib-qt-rss version behaves much the same except it's using the new interface proposed in this pull request, reducing the overall amount of code needed.

Here is some example code to show some of the most important methods of the new interface:

// Initialize Node.js. This will execute most of the code formerly present
// in the node::Start() method and stop execution right before the event
// loop would start running.
node::Initialize("cli-app");

// defined somewhere else:
// void doHeavyCalculation(const v8::FunctionCallbackInfo<v8::Value>& args);

// This C++ method can be made available to JS using a new convenience method.
// The third parameter binds the C++ module to a global JS object with that name.
node::RegisterModule("cpp-calculation", {
                        {"doHeavyCalculation", doHeavyCalculation},
                      }, "cppCalculation");

// Now let's execute this from JavaScript:
node::Evaluate("cppCalculation.doHeavyCalculation();");

// The JavaScript state will only be destroyed when node::Deinitialize() is called.
// This allows the following:
node::Evaluate("let text = 'This is text stored in a global variable in JS.';");
node::Evaluate("console.log(text);");

// Obviously passing JavaScript via string is only feasible for small snippets.
// You can include and execute whole files using the Run method
node::Run("/path/to/file.js");
// Please note: events generated by the JS file aren't executed unless you run the eventloop:
node::RunEventLoop();

// We can also include Node.js modules easily and call functions on them.
auto fs = node::IncludeModule("fs");
auto isolate = node::internal::isolate();
auto result = node::Call(fs.ToLocalChecked(), "existsSync", 
                         { v8::String::NewFromUtf8(isolate, "/path/to/file.txt") });
// result now holds whether or not the file exists.

For the complete interface we implemented, check out the node_lib.h file.

We encourage you to refer to our usage documentation (LIB_USAGE.md) for more sophisticated examples (i.e. QT GUI application using data from NPM modules).

We know this is quite a bit of info in just a single PR. This is why we're especially eager to get to know your opinions and feedback. Please keep in mind that we are currently still working on some issues.

Open questions
  • We added a guide for using the new interface (in addition to code documentation) inside the LIB_USAGE.md file. What would be a good place to put this information in the official node repo?
  • Should we merge our node_lib.h header into the node.h header file or keep it separate?
  • We currently have to execute evalScript('[eval]'); inside the js bootstrapping code to make the require function available as a global. We presume there may be a better way to achieve the same goal?
  • The code contains a few open TODOs and questions we had while going through the current Node.js interface. We’d be glad to start a discussion on these TODOs with you.
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

src, lib, doc

msoechting and others added 30 commits February 7, 2018 11:12
fixes issues pointed out in PR review
Deinitialization with dev as base
Replay "allow optional REPL" on new dev
Replay "refactor Start() methods to use lib functions" onto new dev
Correctly deinitialize context
Fix require not available in shared mode
* Start advanced doc

* Remove lib namespace from documentation

* Update code (thanks Johannes)

* Fix typos

* Fix a few typos

* Update usage doc

* Update comments in usage doc

* Update more comments in usage doc

* Updated with refactored code from node-embed

* Rename itemTitle to item
@cmfcmf
Copy link
Author

cmfcmf commented Apr 8, 2018

Hey @addaleax, did you find time to take a more detailed look yet? Since your last comment, we've removed a bunch of globals and made a few other changes, which we described in this now collapsed comment: #19005 (comment)

@devsnek
Copy link
Member

devsnek commented Apr 8, 2018

@cmfcmf have you looked into how much of this can be moved to some sort of standardised napi interface?

@cmfcmf
Copy link
Author

cmfcmf commented Apr 8, 2018

@cmfcmf have you looked into how much of this can be moved to some sort of standardised napi interface?

We have looked into it. While some of our methods, like Call, could certainly be made more napi compatible (I even think similar methods to our Call method already exist in the napi), the main part of this PR, which is controlling the event loop, doesn't really fit into the napi interface philosophy.

@BridgeAR
Copy link
Member

BridgeAR commented Apr 9, 2018

Ping @addaleax

@mhdawson
Copy link
Member

@yhwang FYI, use of Node.js as a shared library. We should review as part of the planning for extending the testing for the shared library to cover embedding use cases.

@mhdawson
Copy link
Member

I've just come across this PR and have not reviewed in detail, and don't want to slow thing down or get in the way but do want to add that exposing through N-API would be my preference.

@mhdawson
Copy link
Member

@cmfcmf can you expand a bit on "doesn't really fit into the napi interface philosophy."

@cmfcmf
Copy link
Author

cmfcmf commented Apr 19, 2018

@cmfcmf can you expand a bit on "doesn't really fit into the napi interface philosophy."

I'm sorry for not replying sooner @mhdawson: the difference between our embedder's API and the N-API is that N-API is designed with a focus on building native C++ addons for Node.js, with Node.js still being the main application running.
Our API approaches the C++ <-> JS interface from the other side: Instead of writing addons for Node.js, our proposal focuses on enabling C++ programmers with an already existing project to use Node.js as an "addon" in their own project, with all control over the program's control flow still laying in the hands of the main C++ application.
In our opinion, this is not what N-API promises to deliver. Maybe an adaption of the target of N-API might be possible, as to also include giving addons control over the NodeJS main loop, but we didn't feel this is what N-API was made for.

@mhdawson
Copy link
Member

@cmfcmf thanks for the response. The original scope of N-API was focussed on addons. However, we have talked about making N-API be the umbrella under which native APIs are provided by Node are defined and with the functions for addons being one sub-category.

So even though the existing doc does not make it look like a good fit, it may still be the right way to go. Using the same types, and a C interface in order to be able to decouple from v8 internals is the important part.

Would it make any sense to have a short conversation? I am trying to get the larger conversation going in nodejs/user-feedback#51 but it might make sense for us to have a short discussion on this specific issue.

* *Important* Use with caution, changing this object might break Node.js.
* @return Pointer to the `node::Environment`.
*/
Environment* environment();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m sorry I’ve not really been keeping up with this, but at its core the issue remains – these APIs lock Node.js more into maintaining global state where it shouldn’t, and that’s not a good thing.

In particular, it’s just not going to work with multi-threaded embedders, the way it is implemented right now. But even without that, conceptually Environment* instances are not global singletons.

@cmfcmf
Copy link
Author

cmfcmf commented Apr 25, 2018

Would it make any sense to have a short conversation? I am trying to get the larger conversation going in nodejs/user-feedback#51 but it might make sense for us to have a short discussion on this specific issue.

Yes, I think that'd make sense.

@mhdawson
Copy link
Member

@cmfcmf lets try to set something up for next week. What timezone are you in?

@cmfcmf
Copy link
Author

cmfcmf commented Apr 28, 2018

@cmfcmf lets try to set something up for next week. What timezone are you in?

@mhdawson I'm in Germany, which is CEST (UTC+2). I'd prefer Wednesday, if possible (I've got Wednesday off, so I'm flexible with the time).

@mhdawson
Copy link
Member

mhdawson commented May 1, 2018

@cmfcmf I'm at a conf this Wednesday. If you can send an email to the address listed in my profile I'll send out an invite for Wednesday the 9th at 10 AM EST (which I think would be 4PM your time) if that works for you.

@addaleax
Copy link
Member

addaleax commented May 1, 2018

@mhdawson I think that’s conflicting with the TSC meeting? Also, I’d like to get an invite as well if that’s okay with y’all?

@mhdawson
Copy link
Member

mhdawson commented May 3, 2018

@addaleax thanks for catching that. @cmfcmf, @addaleax would an hour earlier work for you two? 9EST 3PM CEST?

@addaleax
Copy link
Member

addaleax commented May 3, 2018

@mhdawson Not for me. :/ The slot after the TSC meeting would work at least for me, though.

Also, I think @gireeshpunathil might be interested in this? Might be wrong, but he expressed some interest around embedder support a couple times :)

One more thing: @cmfcmf It looks like your group is Berlin-based, so I would very much encourage you to have a look at openjs-foundation/summit#60 – tl;dr: A large group of Node.js core developers meets in Berlin on May 31st/June 1st, and in general anybody can join.

@mhdawson
Copy link
Member

mhdawson commented May 3, 2018

The hour after so 11EST or 5CEST works for me. @cmfcmf does that work for you?

@addaleax I was not trying to arrange the full group in this thread, although we can do that instead. I'm meeting with @yhwang (who has been working on improving shared library support and adding testing) on Monday to talk about setting up a meeting based on discussion in: nodejs/user-feedback#51

Instead, since @cmfcmf had not jumped in there (#51) I was looking to talk with him separately to ask a few questions and better understand how to integrate their work with the larger effort.

@cmfcmf
Copy link
Author

cmfcmf commented May 3, 2018

The hour after so 11EST or 5CEST works for me. @cmfcmf does that work for you?

@addaleax @mhdawson That works for me. 👍

@gireeshpunathil
Copy link
Member

I am interested and willing to join discussions that involve topologies, use cases and embedding scenarios at different levels of Node interfaces and differing capacity of the embedder. If it is specific to n-api interface conformation, I can skip it this time.

@addaleax
Copy link
Member

addaleax commented May 9, 2018

After some synchronous discussion, we decided to close this for now.

As a takeaway, it would be good to see which APIs were introduced and what problems they address, and in particular that there’s a necessity to refactor Node’s Start() functions and making the event loop more accessible to embedders.

Thanks for all of the hard work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. embedding Issues and PRs related to embedding Node.js in another project. lib / src Issues and PRs related to general changes in the lib or src directory.
Projects
None yet
Development

Successfully merging this pull request may close these issues.