Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dumb-init child sometimes receives SIGHUP and/or SIGCONT right after start in setsid mode due to race #136

Closed
jeffrey4l opened this issue Feb 7, 2017 · 19 comments · Fixed by #174

Comments

@jeffrey4l
Copy link

OpenStack Kolla[0] project is using dumb-init v1.1.3 before, which works very well. Recently, we upgrade to v1.2.0. But the container failed now and then. After some debug, I found the difference.

here is dumb-init v1.1.3's debug log[2]

2017-02-07 01:09:52.334863 | + docker logs --tail all nova_compute
2017-02-07 01:09:52.347658 | [dumb-init] Running in debug mode.
2017-02-07 01:09:52.347712 | [dumb-init] Child spawned with PID 7.
2017-02-07 01:09:52.347734 | [dumb-init] setsid complete.
2017-02-07 01:09:52.347790 | INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
2017-02-07 01:09:52.347834 | INFO:__main__:Validating config file
2017-02-07 01:09:52.347884 | INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
2017-02-07 01:09:52.347930 | INFO:__main__:Copying service configuration files
...

here is dumb-init v1.2.0's debug log[3], which container run successfully.

2017-02-07 01:06:13.845614 | + docker logs --tail all nova_conductor
2017-02-07 01:06:13.854610 | [dumb-init] Running in debug mode.
2017-02-07 01:06:13.854648 | [dumb-init] Child spawned with PID 7.
2017-02-07 01:06:13.854667 | [dumb-init] Received signal 1.
2017-02-07 01:06:13.854687 | [dumb-init] Forwarded signal 1 to children.
2017-02-07 01:06:13.854703 | [dumb-init] Received signal 18.
2017-02-07 01:06:13.854722 | [dumb-init] Forwarded signal 18 to children.
2017-02-07 01:06:13.854737 | [dumb-init] setsid complete.
2017-02-07 01:06:13.854765 | INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
...

No idea why it received a signal 18.

Here is dumb-init v1.2.0's debug log[4], which container run failed.

2017-02-07 01:06:13.814198 | + docker logs --tail all nova_compute
2017-02-07 01:06:13.823693 | [dumb-init] Running in debug mode.
2017-02-07 01:06:13.823732 | [dumb-init] Child spawned with PID 7.
2017-02-07 01:06:13.823751 | [dumb-init] Received signal 1.
2017-02-07 01:06:13.823766 | [dumb-init] setsid complete.
2017-02-07 01:06:13.823786 | [dumb-init] Forwarded signal 1 to children.
2017-02-07 01:06:13.823802 | [dumb-init] Received signal 18.
2017-02-07 01:06:13.823821 | [dumb-init] Forwarded signal 18 to children.
2017-02-07 01:06:13.823837 | [dumb-init] Received signal 17.
2017-02-07 01:06:13.823860 | [dumb-init] A child with PID 7 was terminated by signal 1.
2017-02-07 01:06:13.823879 | [dumb-init] Forwarded signal 15 to children.
2017-02-07 01:06:13.823900 | [dumb-init] Child exited with status 129. Goodbye.

it received singal 18, then signal 17, and children are killed.

I compared the source code of dumb-init v1.1.3 and v 1.2.0. But can not explain it. Any idea on this?

all these test is based on the same kolla code. the only variable is dumb-init version.

[0] https://github.com/openstack/kolla
[1] https://review.openstack.org/#/c/424832/
[2] http://logs.openstack.org/08/429908/3/check/gate-kolla-ansible-dsvm-deploy-ubuntu-source-ubuntu-xenial-nv/260697e/console.html#_2017-02-07_01_09_52_334863
[3] http://logs.openstack.org/81/429681/3/check/gate-kolla-ansible-dsvm-deploy-centos-source-centos-7-nv/d2b0b1b/console.html#_2017-02-07_01_06_13_845614
[4] http://logs.openstack.org/81/429681/3/check/gate-kolla-ansible-dsvm-deploy-centos-source-centos-7-nv/d2b0b1b/console.html#_2017-02-07_01_06_13_814198

@asottile
Copy link
Contributor

asottile commented Feb 7, 2017

Just to put some numbers to signals here:

1: SIGHUP: Hangup (POSIX)
15: SIGTERM: Termination (ANSI)
17: SIGCHLD: Child process has stopped or exited, changed (POSIX)
18: SIGCONT: Continue executing, if stopped (POSIX)

Exit status 129 (128+n): killed by signal 1

The default posix behaviour for SIGHUP if there is no handler is to terminate.

I don't have any answers and I haven't looked into anything, this might help someone else do that though :)

@jeffrey4l
Copy link
Author

SIGHUP is handled properly. and i also tried to send SIGHUP by docker kill -1 <container> which works.

openstack-gerrit pushed a commit to openstack/kolla that referenced this issue Feb 7, 2017
This reverts commit 49cd429.

dumb-init v1.2.0 busts the container with unknown reason now and then.
There is an upstream issue tracking this[0].

[0] Yelp/dumb-init#136

Partial-Bug: #1662383
Change-Id: I310e78307b6577110dc867d7524b5ac98f625c8e
@chriswessels
Copy link

We're seeing this too, very occasionally. The only difference between success and failure is signal 17. Happy to do testing if that would help.

@chriskuehl
Copy link
Contributor

@chriswessels are you also using OpenStack Kolla when you see this, or something else? Probably the easiest way to fix this is if we could reproduce it easily. I can spend some time testing with Kolla, but if there's an easier reproduction, that'd be cool :)

I'm also not seeing anything strange in the diff between 1.1.3 and 1.2.0, not sure how to explain this...

@chriswessels
Copy link

@chriskuehl No, we don't use OpenStack Kolla.

We do however see this error across plain docker run on Docker For Mac, through docker-compose on Mac & Linux, as well as in Kubernetes.

@jeffrey4l
Copy link
Author

@chriswessels do you have easier way to reproduce this issue?

@chriskuehl
Copy link
Contributor

We saw a really similar build failure in #164 running on our own Travis tests (we never saw this with CircleCI):

>           assert os.waitpid(pid, 0) == (pid, 0), output
E           AssertionError: [dumb-init] Running in debug mode.
E             [dumb-init] Child spawned with PID 1359.
E             [dumb-init] Received signal 1.
E             [dumb-init] setsid complete.
E             [dumb-init] Forwarded signal 1 to children.
E             [dumb-init] Received signal 18.
E             [dumb-init] Forwarded signal 18 to children.
E             [dumb-init] Received signal 17.
E             [dumb-init] A child with PID 1359 was terminated by signal 1.
E             [dumb-init] Forwarded signal 15 to children.
E             [dumb-init] Child exited with status 129. Goodbye.
E             
E           assert (1358, 33024) == (1358, 0)
E             At index 1 diff: 33024 != 0
E             Full diff:
E             - (1358, 33024)
E             ?        -- --
E             + (1358, 0)
>>> os.WEXITSTATUS(33024)
129

This is really interesting, though would be a lot easier to dig into if we could repro this locally :(

This is the diff from 1.1.3 (works according to the initial report) and 1.2.0:

diff --git a/dumb-init.c b/dumb-init.c
index cb0cbec..65e69ef 100644
--- a/dumb-init.c
+++ b/dumb-init.c
@@ -15,6 +15,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <sys/ioctl.h>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <unistd.h>
@@ -250,6 +251,15 @@ int main(int argc, char *argv[]) {
     for (i = 1; i <= MAXSIG; i++)
         signal(i, dummy);
 
+    /* detach dumb-init from controlling tty */
+    if (use_setsid && ioctl(STDIN_FILENO, TIOCNOTTY) == -1) {
+        DEBUG(
+            "Unable to detach from controlling tty (errno=%d %s).\n",
+            errno,
+            strerror(errno)
+        );
+    }
+
     child_pid = fork();
     if (child_pid < 0) {
         PRINTERR("Unable to fork. Exiting.\n");
@@ -266,6 +276,14 @@ int main(int argc, char *argv[]) {
                 );
                 exit(1);
             }
+
+            if (ioctl(STDIN_FILENO, TIOCSCTTY, 0) == -1) {
+                DEBUG(
+                    "Unable to attach to controlling tty (errno=%d %s).\n",
+                    errno,
+                    strerror(errno)
+                );
+            }
             DEBUG("setsid complete.\n");
         }
         execvp(cmd[0], &cmd[0]);

I bet the SIGHUP and then SIGCONT received at the start are related to this:

If the process is the session leader, then SIGHUP and SIGCONT signals
are sent to the foreground process group and all processes in the
current session lose their controlling tty.
(from http://man7.org/linux/man-pages/man4/tty.4.html)

@chriskuehl
Copy link
Contributor

I'm able to reproduce this in a test now and working on writing a decent regression test; it seems like there's a race involved here: if the parent half of the fork in dumb-init executes early enough, the SIGHUP and SIGCONT get forwarded early and the child doesn't die (maybe because it hasn't exec'd the child process yet?). If enough of the child half happens "before" the parent half, the SIGHUP/SIGCONT arrive later and typically kill the child.

You can force the race and repro this bug reliably by doing this:

diff --git a/dumb-init.c b/dumb-init.c
index f437b10..e499717 100644
--- a/dumb-init.c
+++ b/dumb-init.c
@@ -294,6 +294,9 @@ int main(int argc, char *argv[]) {
     } else {
         /* parent */
         DEBUG("Child spawned with PID %d.\n", child_pid);
+        sleep(1);
         for (;;) {
             int signum;
             sigwait(&all_signals, &signum);

My guess is that some environments (like Travis and the OpenStack environment mentioned in this ticket) for some reason tend to have the child half working first, maybe due to scheduler quirks. I think @asottile also mentioned seeing something like this extremely rarely (like 1/1,000,000 times on "regular" EC2 hosts) .

@chriskuehl
Copy link
Contributor

Digging through docs on how ttys work, it appears that any time you detach a session from the controlling tty (ctty), that session is sent a SIGHUP and then SIGCONT with no way to avoid it. The Linux source seems to back this up: https://github.com/torvalds/linux/blob/894025f24bd028942da3e602b87d9f7223109b14/drivers/tty/tty_jobctrl.c#L233-L244

Currently, the reason dumb-init disassociates itself from the ctty is so that it can make the child's new session the tty's primary session ID later on so that job control works.

If we want to preserve this behavior, does the parent need to wait for (and drop) the first HUP and CONT it gets...?

I wonder if tcsetpgrp like tini uses can achieve the same thing -- I think maybe it can.

@asottile
Copy link
Contributor

@asottile also mentioned seeing something like this extremely rarely (like 1/1,000,000 times on "regular" EC2 hosts) .

Yeah we were seeing this on regular EC2 hosts (c4 and c5 families) at a very low rate and really only at high load.

@bukzor
Copy link
Contributor

bukzor commented Jun 9, 2018

any time you detach a session from the controlling tty (ctty), that session is sent a SIGHUP and then SIGCONT with no way to avoid it.

These are catchable signals, so there's a way to avoid it. Perhaps I'm missing something obvious, but why not catch (and wait for) the signals before forking? Then the child can't possibly be killed by them, regardless of races.


Upon closer reading, I see that's exactly what Chris suggested. 👌

@bukzor
Copy link
Contributor

bukzor commented Jun 9, 2018

Please edit "behavior wired" in the title to (perhaps) "SIGHUP race".

@bukzor
Copy link
Contributor

bukzor commented Jun 9, 2018

Investigating setpgrp, I'm pretty sure it's implemented here (below). Unless I'm missing something, it does nothing other than throw errors.

https://github.com/bminor/glibc/blob/09533208febe923479261a27b7691abef297d604/termios/tcsetpgrp.c

@chriskuehl
Copy link
Contributor

chriskuehl commented Jun 9, 2018

@bukzor yeah, here's a WIP patch I was playing with:

commit 3c66203dc9d7e66dbcf148ad6d678b9d3d392f32 (HEAD -> fix-race)
Author:     Chris Kuehl <ckuehl@yelp.com>
AuthorDate: Mon Jun 4 12:39:14 2018 -0700
Commit:     Chris Kuehl <ckuehl@yelp.com>
CommitDate: Mon Jun 4 12:39:14 2018 -0700

    WIP fix race

diff --git a/dumb-init.c b/dumb-init.c
index f437b10..c4fd171 100644
--- a/dumb-init.c
+++ b/dumb-init.c
@@ -18,6 +18,7 @@
 #include <sys/ioctl.h>
 #include <sys/types.h>
 #include <sys/wait.h>
+#include <termios.h>
 #include <unistd.h>
 #include "VERSION.h"
 
@@ -251,13 +252,50 @@ int main(int argc, char *argv[]) {
     for (i = 1; i <= MAXSIG; i++)
         signal(i, dummy);
 
-    /* detach dumb-init from controlling tty */
-    if (use_setsid && ioctl(STDIN_FILENO, TIOCNOTTY) == -1) {
-        DEBUG(
-            "Unable to detach from controlling tty (errno=%d %s).\n",
-            errno,
-            strerror(errno)
-        );
+    /*
+     * Detach dumb-init from controlling tty, so that the child's session can
+     * attach to it instead.
+     *
+     * We want the child to be able to be the session leader of the TTY so that
+     * it can do normal job control.
+     */
+    if (use_setsid) {
+        if (ioctl(STDIN_FILENO, TIOCNOTTY) == -1) {
+            DEBUG(
+                "Unable to detach from controlling tty (errno=%d %s).\n",
+                errno,
+                strerror(errno)
+            );
+        } else {
+            DEBUG("Detached from controlling tty, now waiting for SIGHUP + SIGCONT before forking.\n");
+
+            sigset_t hup_only;
+            sigemptyset(&hup_only);
+            sigaddset(&hup_only, SIGHUP);
+
+            sigset_t cont_only;
+            sigemptyset(&cont_only);
+            sigaddset(&cont_only, SIGCONT);
+
+            int signum;
+
+            if (
+                    sigwait(&hup_only, &signum) != 0 ||
+                    sigwait(&cont_only, &signum) != 0
+            ) {
+                PRINTERR(
+                    "Unable to wait for SIGHUP + SIGCONT after detaching from ctty (errno=%d %s). Exiting.\n",
+                    errno,
+                    strerror(errno)
+                );
+                exit(1);
+            }
+
+            DEBUG("Waited for both SIGHUP and SIGCONT successfully.\n");
+        }
     }
 
     child_pid = fork();
@@ -294,6 +332,8 @@ int main(int argc, char *argv[]) {
     } else {
         /* parent */
         DEBUG("Child spawned with PID %d.\n", child_pid);
+        // TODO: just for debug, forcing the race
+        sleep(1);
+        DEBUG("Done sleeping.\n");
         for (;;) {
             int signum;
             sigwait(&all_signals, &signum);

This appears to work but I haven't had time to dig into it in detail to ensure there aren't any edge cases or other issues.

Edit: We can probably actually go ahead and fork without waiting for the SIGHUP/SIGCONT as long as dumb-init knows to drop (not forward) the first two signals it gets. May be slightly faster?

@bukzor
Copy link
Contributor

bukzor commented Jun 9, 2018

Here's the musl implementatio of setpgrp, which is more helpful: https://github.com/davidlazar/musl/blob/master/src/unistd/tcsetpgrp.c

The implementation is here: https://github.com/torvalds/linux/blob/894025f24bd028942da3e602b87d9f7223109b14/drivers/tty/tty_jobctrl.c#L459

@bukzor
Copy link
Contributor

bukzor commented Jun 9, 2018

I get suspicious (I'm sure you do too) when a problem starts adding more and more code.
I suspect there's a better way to hand off tty control from parent to child.

@chriskuehl
Copy link
Contributor

iirc we do it this way to avoid a race (we want the child to be foregrounded from the start, not some time after the fork).

If we did the above, I think we'd either have a slight race (child could exec the new process before it gets the controlling TTY), or we'd have to coordinate between the child and parent dumb-init processes to delay the exec until the TIOCSCTTY has run in the parent.

@chriskuehl
Copy link
Contributor

Agreed, that's why I've been kind of hesitant about that patch I posted above :\

@chriskuehl chriskuehl changed the title dumb-init v1.2.0 behavior wired dumb-init child sometimes receives SIGHUP and/or SIGCONT right after start in setsid mode due to race Jun 18, 2018
@chriskuehl
Copy link
Contributor

Just released v1.2.2 of dumb-init which has a patch that should prevent this: https://github.com/Yelp/dumb-init/releases/tag/v1.2.2

If you're able to try it out, would love to hear if this fixes things for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants