Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix delay on reboot or power off #859

Merged
merged 3 commits into from
Mar 18, 2020

Conversation

bcressey
Copy link
Contributor

@bcressey bcressey commented Mar 17, 2020

Issue number:
#858

Description of changes:
This reverts the KillMode=mixed change from 75125f6, and replaces it with lower timeouts for starting and stopping services.

Testing done:
Terminated an instance through the API.

New settings are applied:

# systemctl show
...
DefaultTimeoutStartUSec=10s
DefaultTimeoutStopUSec=10s

Relevant console output during shutdown from an instance with running pods and host containers:

[ 9655.638443] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 9655.643105] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[ 9655.649976] systemd-journald[1026]: Received SIGTERM from PID 1 (systemd-shutdow).
[ 9665.650617] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, containerd-shim, containerd-shim, containerd-shim
[ 9665.659807] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[ 9665.665462] systemd-shutdown[1]: Sending SIGKILL to PID 3536 (containerd-shim).
[ 9665.671968] systemd-shutdown[1]: Sending SIGKILL to PID 3537 (containerd-shim).
[ 9665.678394] systemd-shutdown[1]: Sending SIGKILL to PID 4389 (containerd-shim).
[ 9665.684806] systemd-shutdown[1]: Sending SIGKILL to PID 4512 (containerd-shim).
[ 9665.691255] systemd-shutdown[1]: Sending SIGKILL to PID 4548 (containerd-shim).
[ 9665.697675] systemd-shutdown[1]: Sending SIGKILL to PID 4744 (containerd-shim).
[ 9665.705557] systemd-shutdown[1]: Unmounting file systems.
...
[ 9665.777562] systemd-shutdown[1]: Powering off.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@samuelkarp
Copy link
Contributor

Are we confident that this isn't going to cause the problem that 75125f6 was supposed to fix? In particular, it sounds like shutdown would block when there were containers that needed to exit and the containerd-shim processes weren't receiving appropriate signals.

Can you test with running containers (both host-containers and orchestrated containers) and make sure that shutdown does not block?

@bcressey
Copy link
Contributor Author

In particular, it sounds like shutdown would block when there were containers that needed to exit and the containerd-shim processes weren't receiving appropriate signals.

Per testing output, all the containerd-shim processes are sent SIGKILL. The same happens with a dev build using Docker, if I have backgrounded Docker processes running at shutdown.

Mostly this results in the same behavior as before; with KillMode=mixed other processes are not sent the initial SIGTERM, only the follow-up SIGKILL. The advantage is that shim lifecycle is now decoupled from the containerd service, making it safer to restart it.

Can you test with running containers (both host-containers and orchestrated containers) and make sure that shutdown does not block?

That's indeed what I tested.

The shutdown timeout (and revert) was the original fix. I added the global stop timeout after observing that the libcontainer tasks would also delay shutdown by 90 seconds.

@samuelkarp
Copy link
Contributor

Per testing output, all the containerd-shim processes are sent SIGKILL.

Thanks for verifying!

That's indeed what I tested.

It wasn't called out in the "Testing done" description, so I wanted to make sure.

If we are still running processes during shutdown, they are likely to
be running in containers rather than managed by the host system. They
may not expect or respond to SIGTERM, which can delay the restart for
up to 90 seconds.

In the case of an update, we expect containers to be drained by the
operator before the system is restarted. However, if the system is
powered off directly then this may not happen.

With the network down, it is unlikely that processes can complete any
useful work apart from syncing data to disk. A lower timeout means we
will reboot or power off more quickly, which allows the node or its
replacement to come up faster.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
This reverts commit 75125f6.

The reason for changing KillMode was to make the system shut down
more quickly. However, this is flawed in practice because although
the containerd shims are killed more quickly, the processes running
inside the containers are not since they are no longer part of the
unit's control group.

This configuration is also discouraged by upstream, as it means that
the containerd service cannot be safely restarted.
The default timeout for start and stop is 90 seconds, and none of the
current host services require this much time. If they did, we could
override the setting locally in the service unit.

Many of the processes on the system end up running inside scopes that
are dynamically created by the orchestrator agent. By changing the
default timeout, we ensure that these processes are stopped quickly
during shutdown or restart.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
@bcressey bcressey merged commit e849cc9 into bottlerocket-os:develop Mar 18, 2020
@bcressey bcressey deleted the shutdown-fix branch March 18, 2020 22:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants