Restarting trell_master results in zombie process of old trell_master #71

Open
MrGaribaldi opened this issue Sep 10, 2013 · 2 comments
@MrGaribaldi
Contributor

When trell_master is restarted through the web interface, the currently running process does not exit cleanly and becomes a zombie process instead.

Cleric needed to turn undead and exorcise #67 to stop it from becoming a powerful necromancer spawning undead processes at the press of a button.

@ghost ghost assigned cdyk Sep 10, 2013
@cdyk
Contributor

cdyk commented Sep 11, 2013

A process (child) is always spawned by another process (parent). When a process terminates, the parent should be able to retrieve the exit code of its child.

So a zombie is a process that has terminated, but whose parent hasn't retrieved the exit code yet. Until the parent does so, the process keeps its entry in the process table, even though it is no longer running.
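To make the definition concrete, here is a minimal sketch in Python (not code from this project). It assumes Linux, since it reads the process state from `/proc`; the helper names `process_state` and `zombie_demo` are made up for illustration.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only), or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state letter is the field right after the parenthesised command name.
    return data[data.rindex(")") + 2]


def zombie_demo():
    """Fork a child that exits at once, observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: terminate immediately with exit code 7

    # Parent: the child has terminated, but nobody has read its exit code yet.
    before = None
    for _ in range(200):
        before = process_state(pid)
        if before == "Z":
            break
        time.sleep(0.01)

    # Retrieving the exit code removes the entry from the process table.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status)
    after = process_state(pid)
    return before, exit_code, after
```

On a Linux machine, `zombie_demo()` should report state `Z` before the `waitpid` call, the child's exit code 7, and no `/proc` entry afterwards.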

So, to get rid of a zombie process, there are three options: the parent process can declare explicitly that it doesn't care about the exit code of any of its children, the parent can retrieve the exit code, or one can terminate the parent process (so that the child gets adopted by init, which will read its exit code).
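The first option can be sketched as follows, again in Python and assuming Linux/POSIX behaviour, where setting `SIGCHLD` to `SIG_IGN` makes the kernel discard terminated children automatically. The function name `child_is_auto_reaped` is hypothetical.

```python
import os
import signal
import time


def child_is_auto_reaped():
    """Fork a child while SIGCHLD is ignored; return True once its PID has disappeared."""
    # Tell the kernel we do not care about our children's exit codes.
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: terminate immediately

        for _ in range(200):
            try:
                os.kill(pid, 0)  # signal 0 only checks that the PID still exists
            except ProcessLookupError:
                return True  # no process table entry left, so no zombie
            time.sleep(0.01)
        return False
    finally:
        # Restore the previous disposition so other code can wait on children again.
        signal.signal(
            signal.SIGCHLD,
            previous if previous is not None else signal.SIG_DFL,
        )
```

Note that this changes the signal disposition of the whole parent process, which is exactly why it is awkward to do inside a server that manages its own children.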

The problem with trell_master is that it is spawned by apache, and apache has a host of processes running. So, first of all, we cannot be sure that the apache process that terminates trell_master is the same one that spawned it. If it is a different process, the terminator is not the parent and hence cannot retrieve the exit code. (Currently, if you're lucky and both actions were done by the same process, the zombie issue is avoided.)
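The restriction that only the parent can retrieve the exit code can be illustrated with a small Python sketch. This is a simplified model, not apache itself: the calling process plays the worker that spawned the job, and a second forked process plays a different worker. The name `sibling_cannot_reap` and the exit code 42 are arbitrary choices.

```python
import os
import signal


def sibling_cannot_reap():
    """Show that a process other than the parent cannot wait on the job."""
    job = os.fork()
    if job == 0:
        signal.pause()  # the "job": sleep until it is signalled
        os._exit(0)

    other = os.fork()
    if other == 0:
        # A different worker: it knows the job's PID but is not its parent.
        code = 0
        try:
            os.waitpid(job, os.WNOHANG)
        except ChildProcessError:  # ECHILD: "not a child of this process"
            code = 42
        finally:
            os._exit(code)

    _, status = os.waitpid(other, 0)
    other_got_echild = os.WEXITSTATUS(status) == 42

    # The spawning process itself can terminate the job and reap it.
    os.kill(job, signal.SIGTERM)
    reaped_pid, job_status = os.waitpid(job, 0)
    return other_got_echild, reaped_pid == job, os.WIFSIGNALED(job_status)
```

Any process with sufficient permissions can send the terminating signal, but `waitpid` only succeeds in the process that forked the job.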

If you're unlucky, the only remaining way is to mess with the signal handlers of the apache process, and I'm pretty sure apache already uses those for its own purposes.

So, as a workaround, restarting apache should wipe all the zombie processes (as this will terminate the parent processes, and the children will be adopted by init).

In the long term, a better solution is to let an init-script start and stop trell_master the same way apache is started and stopped.

@cdyk cdyk closed this as completed Sep 11, 2013
@cdyk cdyk reopened this Sep 21, 2013
@cdyk
Contributor

cdyk commented Sep 21, 2013

This might be resolved by using a wrapper process between the apache process and the master job. The wrapper process is started instead of the master job; it starts the master process and immediately terminates. Our apache module waits on the wrapper, letting it terminate cleanly. The master process is then orphaned and adopted by init, which keeps it from becoming a zombie when it later exits.
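A minimal sketch of the wrapper idea, in Python rather than as the actual apache module. It assumes a POSIX system; the function name `spawn_detached` is hypothetical, and the pipe exists only so the demo can report who adopted the master.

```python
import os
import time


def spawn_detached():
    """Start a 'master' through a short-lived wrapper so the caller never becomes its parent."""
    read_fd, write_fd = os.pipe()
    wrapper = os.fork()
    if wrapper == 0:
        # Wrapper process: start the master, then terminate immediately.
        try:
            os.close(read_fd)
            wrapper_pid = os.getpid()
            if os.fork() == 0:
                # Master process: wait until the wrapper is gone and we are adopted.
                try:
                    while os.getppid() == wrapper_pid:
                        time.sleep(0.01)
                    os.write(write_fd, f"{os.getpid()} {os.getppid()}".encode())
                finally:
                    os._exit(0)
        finally:
            os._exit(0)

    # Caller (the apache module's role): reap the wrapper, which exits right away.
    os.close(write_fd)
    _, status = os.waitpid(wrapper, 0)
    with os.fdopen(read_fd) as f:
        master_pid, master_ppid = map(int, f.read().split())
    return wrapper, os.WEXITSTATUS(status), master_pid, master_ppid
```

The reported parent of the master is whatever process adopts orphans on the system: traditionally init (PID 1), or a designated subreaper on newer Linux setups. Either way, the caller only ever has the wrapper as a child, so it has nothing left to reap once the wrapper has exited.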
