-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POSIX Spawn 2: Electric Boogaloo #487
Conversation
Wow, thats like a full rewrite of the whole process setup handling in node-pty, removing the fork stuff. 👍
Hmm thats weird, any idea yet why? From what I've read, the problem with fork is the sandboxing on BigSur, where the memory transfer to the child process takes very long. In theory this problem should be gone with posix_spawn, since the process gets not clone-copied from the parent anymore. Lemme me know if you need some help or when to review, can at least assist for linux and freebsd to some degree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some early remarks from my side 😸
Thanks for the review - I'll go through it tomorrow. It's still 2x faster than fork but I've expected it to be way faster than that. I've profiled it and it's posix_spawn itself taking these 300ms, even with eg. /bin/df as file. |
I see, well thats unfortunate. Is there a mystic flag or something else, that can be set to speed up things?
Well you'd have to parse the numbers back for gid/uid back, but thats quite easy. You dont need to decompose the whole argv thingy, for exec you can simply pass it on, e.g. |
Btw solaris does not know the TIOCSCTTY ioctl. To properly attach a controlling terminal there, it is needed to do an open call to it, e.g. something like this (error handling skipped): #if defined(TIOCSCTTY)
...
#else
char *slave_path = ttyname(STDIN_FILENO);
// open implicit attaches a process to a terminal device if:
// - process has no controlling terminal yet
// - O_NOCTTY is not set
close(open(slave_path, O_RDWR));
#endif |
I think it's worth considering, as the next step after this, doing the spawn itself in an async worker, to at least avoid locking up the Electron's UI thread. |
Please take another look at it when you'll have time - I think this is as clean as it gets. Still needs testing on Linux and other UNIXes that I don't have access to. Ideally I would love this to get merged first and then work on an async spawn() separately |
Looks pretty cleaned up at a first glance, will review and test it later on under Ubuntu 18, FreeBSD 12 and OpenIndiana (solaris clone), which should cover most prominent POSIX systems. Dont have access to OpenBSD, NetBSD or AIX (to name a few more that are less common, not even sure if those have a working nodejs port). |
@Eugeny Do you know if the new spawn helper will have to be manually code-signed on OSX when packaging an Electron App? I'm a little bit concerned the extra helper may complicate thinks in that matter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works so far under linux, but ofc I have a few remarks:
- it is much much slower than the fork path (takes ~300ms on my machine to start bash)
- slow startup leads to
pty.write('abc')
writes to show up on original tty (not sure yet why this happens). Compared to the current startup this is a showstopper / API breaker, as it means, that ppl prolly will have to wait for some onReady event, before they can interact with the pty process. - cwd and tty working correctly (thus setting cwd and controlling terminal works)
- not tested: signals, uid/gid changes
- more remarks in the code (see below)
Edit:
- Linux ✔️
- Solaris ✔️
- FreeBSD ✔️
@mofux yes, but existing native modules already need to be signed anyway. Good news is that |
@jerch re performance: turns out, this is due to file closing. Removing the fd closer loop gets it down to the same 1-3 ms as fork on my machine. |
All done, perf is back to normal with the tradeoff of keeping FDs open. Error handling is revamped and passes errno back to the parent. Also added |
After loading some large modules and allocating & filling a 100 MB buffer first:
(all of the performance numbers here and above are on Linux) |
Oh well, yeah then lets not go with that security measure by default. I suggest to keep it as configurable option, so ppl can opt-in here under more dangerous situations. Maybe with an additional check whether POSIX_SPAWN_CLOEXEC_DEFAULT was defined (assuming we dont need to loop-close FDs if the system already did it for us). The danger is real, see https://labs.portcullis.co.uk/blog/exploiting-inherited-file-handles-in-setuid-programs/. And such a dangerous situation can easily be made up - imagine some terminal hub service, that pulls the authentication/authorization from some DB socket in the parent process, then uses |
I'm a dumbass who doesn't know how to benchmark stuff. Only the first spawn() takes 300 ms on macOS, subsequent spawns are all <5 ms. Looks like an async spawn() won't be necessary after all! |
Released it in the latest Tabby beta - let's see how it goes. |
@Eugeny Sorry could not yet test it on FreeBSD, but I expect no bigger surprises there (the pty impl is very similar to linux). And it works really nice and fast on linux. At first I was a bit unsure about the VFORK option, since it is blocking and sharing all memory with the parent, but after some reading it is prolly safe for the instant exec done by |
No problem. vfork() + exec() seems to be the recommended way to spawn processes in Linux, and although there's only a limited set of calls that are allowed to happen after a vfork, it should be safe to assume that glibc does it right. |
@Eugeny FYI - freebsd tested - works. (Well, some tests fail for other reasons, but not from your code...) Edit: #include <errno.h> explicit in |
Done. |
Let me know if anything else needs to be done to get this merged 🙌 |
@deepak1556 could you review this when you get a chance? |
Isn't this embarrassing? A lot of ppl suffer from this, still this PR is not worth a ~15min review/feedback. Guys if you dont feel like maintaining this lib in a healthy state - just say so. Pretty sure there are enough ppl out there to take ownership in a responsible way, even without that CLA nonsense. |
@jerch not sure if you are following the upstream libuv refactor for I don't have any additional code reviews for this PR at the moment, I will let you and @Tyriar make the decision of merging the PR in its current state. |
@deepak1556 Thx for the update on the issues behind, thats at least some information to work with (and no, I dont follow any upstream discussion on foreign repos). Itsn't a pity that I have to get cocky to get that information? Anyway, thx for the quick response. |
Sorry about that, I could have done a better communication from my side beforehand. I mostly get tied up with other core work that this kept falling through the cracks but definitely not a valid reason for slow response on this PR that is open for quite sometime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will pull into VS Code after the beta is released and see how that goes.
Here's another go at posix_spawn, with a helper handling cwd/uid/gid/CT/setuid and then exec'ing itself into the new process.
fork()
is replaced withposix_spawn()
ing a helper process, allowing 300x spawn speedup on macOS and 2 to 20x on Linux.closeFDs
, UNIX only). This is free on macOS via POSIX_SPAWN_CLOEXEC and costs 300 ms on Linux due to having to close FDs one by one.