host-ctr, host-containers: proper restarts #1230

etungsten · 2020-12-03T03:52:24Z

I recommend reviewing each commit separately

Issue number:
Fixes #1229

Description of changes:

Author: Erikson Tung <etung@amazon.com>
Date:   Sun Dec 6 20:54:03 2020 -0800

    host-containers: add safeguards against lingering host containers
    
    Now that host-ctr has the ability to rebind to existing host containers,
    we want to ensure that whenever we enable a host container, it runs
    with its latest configuration.
    
    We utilize `host-ctr`'s clean-up command to clean up any potential
    lingering host-container when we're enabling a previously disabled
    host-container and whenever a host-container is disabled.

Author: Erikson Tung <etung@amazon.com>
Date:   Fri Dec 4 20:02:37 2020 -0800

    host-ctr: add new subcommand `clean-up`
    
    Adds a new subcommand `clean-up` that checks whether a given container
    exists. If it does, `host-ctr` will attempt to kill the container task
    and delete the container.

Author: Erikson Tung <etung@amazon.com>
Date:   Fri Dec 4 18:34:07 2020 -0800

    host-ctr: refactoring
    
    Refactors `host-ctr`.
    Categorizes functionality into subcommands.

Author: Erikson Tung <etung@amazon.com>
Date:   Fri Dec 4 14:31:26 2020 -0800

    host-containers@: remove KillMode=mixed
    
    We don't need systemd to go and actively try to kill all processes
    of the unit's cgroup.

Author: Erikson Tung <etung@amazon.com>
Date:   Wed Dec 2 18:18:57 2020 -0800

    host-ctr: do not kill existing container, take over it
    
    If the host-container already exists, we should just take over the
    helm and not try to replace it with a new container. This is so that
    even if we temporarily lose connection with host-containerd, we can
    still eventually get the task status when containerd comes back up.

commit 0dbc03cdcde435aa259f34d80f6ec66a202b45a4

Author: Erikson Tung <etung@amazon.com>
Date:   Wed Dec 2 14:45:44 2020 -0800

    host-containers: 'Wants' host-containerd instead of 'BindsTo'
    
    host-containers@ systemd units should not stop when host-containerd is
    restarted or killed.
    
    By changing host-containers' dependency on host-containerd.service from
    `BindsTo=` to `Wants=` we ensure host containers tasks won't be killed if
     host-containerd temporarily stops.
    
    host-ctr then has a chance to reclaim the container task when
    host-containerd comes back up.

Testing done:

Built AMI, launched instance
sudo sheltie into the host via the admin container
Restarted host-containerd, and my ssh connection to the admin container was still alive
Checked the status of host-containers@admin and saw that it exited and restarted successfully.

host-containers@admin initially starts successfully.

Dec 03 03:09:30 host-ctr[3077]: Server listening on 0.0.0.0 port 22.
Dec 03 03:09:30 host-ctr[3077]: Server listening on :: port 22.
Dec 03 03:10:48 host-ctr[3077]: Accepted publickey for ec2-user from 123.123.123.123 port 5417 ssh2

This is where I restarted host-containerd. host-ctr loses connection to the containerd server and exits.

Dec 03 03:11:28 host-ctr[3077]: time="2020-12-03T03:11:28Z" level=error msg="failed to get container task exit status" error="rpc error: code = Unavailable desc = transport is closing"
Dec 03 03:11:28 host-ctr[3077]: time="2020-12-03T03:11:28Z" level=error msg="failed to delete container task" error="task must be stopped before deletion: running: failed precondition"
Dec 03 03:11:28 host-ctr[3077]: time="2020-12-03T03:11:28Z" level=error msg="failed to cleanup container" error="cannot delete running task admin: failed precondition"
Dec 03 03:11:28 systemd[1]: host-containers@admin.service: Main process exited, code=exited, status=1/FAILURE
Dec 03 03:11:28 systemd[1]: host-containers@admin.service: Failed with result 'exit-code'.

host-containers@admin restarts and host-ctr successfully rebinds to the admin container task that's already running.

Dec 03 03:12:14 systemd[1]: host-containers@admin.service: Scheduled restart job, restart counter is at 1.
Dec 03 03:12:14 systemd[1]: Stopped Host container: admin.
Dec 03 03:12:14 systemd[1]: Starting Host container: admin...
Dec 03 03:12:14 systemd[1]: Started Host container: admin.
Dec 03 03:12:14 host-ctr[5096]: time="2020-12-03T03:12:14Z" level=info msg="Pulling with Amazon ECR Resolver" ref="ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.5.2"
Dec 03 03:12:14 host-ctr[5096]: time="2020-12-03T03:12:14Z" level=info msg="Pulled successfully" img="ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.5.2"
Dec 03 03:12:14 host-ctr[5096]: time="2020-12-03T03:12:14Z" level=info msg=Unpacking... img="ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.5.2"
Dec 03 03:12:14 host-ctr[5096]: time="2020-12-03T03:12:14Z" level=info msg="Tagging image" imageName="328549459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-admin:v0.5.2"
Dec 03 03:12:14 host-ctr[5096]: time="2020-12-03T03:12:14Z" level=info msg="Container task is still running, proceeding to monitor it"

Testing for when both host-containers and host-containerd are restarted due to a single API transaction. We ensure the restarted host-ctr runs an up-to-date host container that reflects the setting changes:
See #1230 (comment)

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

zmrow

Just a spelling nit.

☃️

sources/host-ctr/cmd/host-ctr/main.go

etungsten · 2020-12-03T19:02:59Z

Push above adds a condition during clean up to not delete the task and container if we're returning due to host-containerd closing its connection. The deletions will fail anyways so we should not even attempt deletion.

etungsten · 2020-12-03T19:11:43Z

Push above reverts the previous force push. I realized that the change wasn't working in the way I expected it to.

Also addresses typo pointed out by @zmrow 's comment.

webern

Looks good. One probing question.

sources/host-ctr/cmd/host-ctr/main.go

samuelkarp · 2020-12-04T00:33:23Z

@etungsten wrote

I was trying to follow the golang style guide regarding error strings golang/go/wiki/CodeReviewComments#error-strings.

It also explicitly says that this doesn't apply to normal log messages. So I kept info logs the same.

(it won't let me reply inline for some reason)

This guidance applies to the error type, not to the log output (though it's intended to improve log output when errors are appended onto log lines for context). I don't have any problem with the changes you made, but I did want to clarify that.

samuelkarp

Generally LGTM with one small requested change around adding another log line. With this PR's additions, it's about time for the next change to break up the _main function into smaller pieces.

packages/os/host-containers@.service

sources/host-ctr/cmd/host-ctr/main.go

bcressey · 2020-12-04T18:59:30Z

sources/host-ctr/cmd/host-ctr/main.go

+	// Check if the target container already exists. If it does, take over the helm to manage it.
+	container, err := client.LoadContainer(ctx, containerID)


Previously we handled the case where we modify the image for a host container that's currently running, since we'd always kill it and restart afterwards.

Now it looks like if we already have "admin" running, we'll continue using it even if the image has changed. I don't think that's the behavior we want.

We should also handle toggling superpowered on and off.

If the image for the host container is changed via settings, then the corresponding host-containers@ service will be restarted by systemd via restart-commands = ["/usr/bin/host-containers"]. This means host-ctr will actually receive a termination signal and proceed to terminate the container task and restart. That hasn't changed.

What this is trying to address is host-ctr's connection to containerd being closed off suddenly and losing track of the container task status and coming back up to try and reclaim the original running host container task.
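
The rebinding behavior described above can be sketched roughly as follows. This is a minimal illustration, not host-ctr's actual code: `fakeContainerd`, `errNotFound`, and `ensureContainer` are hypothetical names, and the in-memory map stands in for host-containerd so the example runs without a containerd dependency. The only call taken from this PR is the idea behind `client.LoadContainer`.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel for "no such container".
var errNotFound = errors.New("container not found")

// fakeContainerd is an in-memory stand-in for host-containerd, used only
// for illustration. It maps container IDs to the image they were created from.
type fakeContainerd struct {
	containers map[string]string
}

// load mimics looking up an existing container by ID.
func (f *fakeContainerd) load(id string) (string, error) {
	image, ok := f.containers[id]
	if !ok {
		return "", errNotFound
	}
	return image, nil
}

// create mimics creating a new container and starting its task.
func (f *fakeContainerd) create(id, image string) {
	f.containers[id] = image
}

// ensureContainer rebinds to an existing container when there is one and
// only creates a new container when the lookup reports "not found".
func ensureContainer(rt *fakeContainerd, id, image string) (string, error) {
	_, err := rt.load(id)
	switch {
	case err == nil:
		// The container is already there: take over and monitor it.
		return "rebound", nil
	case errors.Is(err, errNotFound):
		rt.create(id, image)
		return "created", nil
	default:
		return "", fmt.Errorf("failed to load container %q: %w", id, err)
	}
}

func main() {
	rt := &fakeContainerd{containers: map[string]string{"admin": "example-admin-image"}}
	for _, id := range []string{"admin", "control", "control"} {
		outcome, err := ensureContainer(rt, id, "example-image")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", id, outcome)
	}
}
```

Note that the rebind branch in this sketch never compares the requested image with the one already running, which is exactly the gap raised earlier in this thread.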

Oh, I guess looking at the code, it seems that to actually apply a new host containers setting (whether toggling superpowered or changing the image URL), users would have to toggle the enabled setting as well. host-containers (the binary) doesn't actually handle restarts?

Is that the intended workflow? @zmrow @tjkirch ?

It doesn't seem robust to the case where host-ctr itself dies for some reason, gets restarted, and finds a running container - which may not be running with the right settings. Previously we did handle this correctly, at the cost of tearing down a running container that was otherwise fine.

Can we inspect the running container and determine that it's correct based on the image and one of the superpowered properties from the spec?

You're right, changing the image or toggling superpowered doesn't restart the affected host container today. 😞

There might be another edge case to consider - if settings for host-containerd are changed in the same transaction that disables a running host container, then host-ctr will exit after losing the connection, and not be restarted afterwards to clean up the running task.

One approach might be for host-ctr to have a cleanup mode so that it can be invoked by host-containers to delete the running task, if present.

That would handle the edge case and avoid the need to inspect the running container, because it wouldn't be running if the settings had changed.
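
A cleanup mode along these lines might look like the sketch below. It is only an illustration under stated assumptions: `fakeContainerd` and `cleanUp` are hypothetical names, and the in-memory map replaces the real host-containerd connection. The one behavior borrowed from the logs in this PR is that deleting a container whose task is still running fails, so the task has to be stopped first.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeContainerd is an in-memory stand-in for host-containerd, used only
// for illustration. It maps container IDs to whether their task is running.
type fakeContainerd struct {
	tasks map[string]bool
}

// exists reports whether a container with the given ID is known.
func (f *fakeContainerd) exists(id string) bool {
	_, ok := f.tasks[id]
	return ok
}

// killTask marks the container's task as stopped.
func (f *fakeContainerd) killTask(id string) {
	f.tasks[id] = false
}

// deleteContainer removes the container and refuses to do so while its
// task is still running.
func (f *fakeContainerd) deleteContainer(id string) error {
	if f.tasks[id] {
		return errors.New("cannot delete running task " + id)
	}
	delete(f.tasks, id)
	return nil
}

// cleanUp is a no-op when the container does not exist. Otherwise it stops
// the task if needed and then deletes the container.
func cleanUp(rt *fakeContainerd, id string) error {
	if !rt.exists(id) {
		return nil
	}
	if rt.tasks[id] {
		rt.killTask(id)
	}
	if err := rt.deleteContainer(id); err != nil {
		return fmt.Errorf("failed to clean up container %q: %w", id, err)
	}
	return nil
}

func main() {
	rt := &fakeContainerd{tasks: map[string]bool{"admin": true, "control": true}}
	if err := cleanUp(rt, "control"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("remaining containers:", len(rt.tasks))
}
```

Making the missing-container case a silent success keeps the command safe to call unconditionally from a restart command.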

Is that the intended workflow? @zmrow @tjkirch ?

(Just for historical perspective - it was intended, but we didn't like it :) We didn't have the tools at the time to restart properly on those settings changes and considered it a weakness.)

packages/os/host-containers@.service

etungsten · 2020-12-07T18:04:22Z

Push above adds additional commits to address concerns about edge cases with host-ctr potentially rebinding to an out-of-date host container when both host-containers and host-containerd are restarted.

A new clean-up mode has been added to host-ctr
Refactored host-ctr to accommodate the new subcommand.
host-containers (the rust binary) will now call host-ctr clean-up when appropriate to ensure restarted host-containers are running with the latest settings/configuration.
Removes KillMode=mixed from host-containers@ units
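
As a rough sketch of "when appropriate": per the commit message, clean-up runs when a previously disabled host container is being enabled and whenever a host container is disabled. The function below is hypothetical and written in Go for consistency with host-ctr, even though host-containers itself is a Rust binary; how the previous state is actually determined is not shown here.

```go
package main

import "fmt"

// needsCleanUp reports whether a lingering container should be removed
// before applying the new state. wasEnabled is the unit's previous state,
// nowEnabled is the state requested by the current settings.
func needsCleanUp(wasEnabled, nowEnabled bool) bool {
	if !nowEnabled {
		// Disabling: always make sure nothing is left running.
		return true
	}
	// Enabling: only clean up if the container was previously disabled.
	return !wasEnabled
}

func main() {
	for _, c := range []struct{ was, now bool }{
		{true, true}, {true, false}, {false, true}, {false, false},
	} {
		fmt.Printf("wasEnabled=%v nowEnabled=%v cleanUp=%v\n", c.was, c.now, needsCleanUp(c.was, c.now))
	}
}
```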

etungsten · 2020-12-07T18:07:29Z

Testing done for verifying the fix for the edge case:

host-ctr no longer rebinds to lingering out-of-date host-containers when both host-containers and host-containerd are restarted.

bash-5.0# 
bash-5.0# ############# Restart host-containerd
bash-5.0# 
bash-5.0# systemctl restart host-containerd
bash-5.0# sleep 1
bash-5.0# 
bash-5.0# 
bash-5.0# ############# host-ctr loses connection and there is now a lingering control host container running
bash-5.0# 
bash-5.0# systemctl status host-containers@control
● host-containers@control.service - Host container: control
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/host-containers@.service; enabled; vendor preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Mon 2020-12-07 06:35:21 UTC; 992ms ago
    Process: 3020 ExecStartPre=/usr/bin/mkdir -m 1777 -p ${LOCAL_DIR}/host-containers/control (code=exited, status=0/SUCCESS)
    Process: 3080 ExecStart=/usr/bin/host-ctr run -container-id=control -source=${CTR_SOURCE} -superpowered=${CTR_SUPERPOWERED} (code=exited, status=1/FAILURE)
   Main PID: 3080 (code=exited, status=1/FAILURE)

Dec 07 06:35:21 ip-192-168-28-193.us-west-2.compute.internal host-ctr[3080]: time="2020-12-07T06:35:21Z" level=error msg="failed to cleanup container" error="cannot delete running task control: failed precondition"
Dec 07 06:35:21 ip-192-168-28-193.us-west-2.compute.internal systemd[1]: host-containers@control.service: Main process exited, code=exited, status=1/FAILURE
Dec 07 06:35:21 ip-192-168-28-193.us-west-2.compute.internal systemd[1]: host-containers@control.service: Failed with result 'exit-code'.
bash-5.0# 
bash-5.0# ctr -a /run/host-containerd/containerd.sock task ls
TASK       PID     STATUS    
control    3739    RUNNING
admin      3823    RUNNING
bash-5.0# 
bash-5.0# ctr -a /run/host-containerd/containerd.sock container ls
CONTAINER    IMAGE                                                                                RUNTIME                  
admin        ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.5.2      io.containerd.runc.v2    
control      ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-control:v0.4.1    io.containerd.runc.v2    
bash-5.0# 
bash-5.0# 
bash-5.0# 
bash-5.0# ############# Disable host-containers@control through settings
bash-5.0# 
bash-5.0# apiclient -u /settings -m PATCH -d '{"host-containers": {"control": {"enabled": false}}}'
bash-5.0# apiclient -u /tx/commit_and_apply -m POST
["settings.host-containers.control.enabled"]
bash-5.0# sleep 3

bash-5.0# 
bash-5.0# 
bash-5.0# 
bash-5.0# ############# Restart commands succeeded 
bash-5.0# 
bash-5.0# journalctl -u apiserver -n 50
-- Logs begin at Mon 2020-12-07 06:30:11 UTC, end at Mon 2020-12-07 06:35:22 UTC. --
Dec 07 06:30:13 localhost systemd[1]: Starting Bottlerocket API server...
Dec 07 06:30:13 localhost apiserver[2487]: 06:30:13 [INFO] Starting server at /run/api.sock with 1 thread and datastore at /var/lib/bottlerocket/datastore/current
Dec 07 06:30:13 localhost systemd[1]: Started Bottlerocket API server.
Dec 07 06:30:13 localhost apiserver[2487]: 06:30:13 [INFO] Starting 1 workers
Dec 07 06:30:13 localhost apiserver[2487]: 06:30:13 [INFO] Starting "actix-web-service-"/run/api.sock"" service on "/run/api.sock" (pathname)
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] thar-be-settings started
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] Parsing stdin for updated settings
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] Requesting affected services for settings: {"settings.host-containers.control.enabled"}
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] Restarting affected services...
bash-5.0# sleep 5
bash-5.0# 
bash-5.0# 
bash-5.0# ############## host-containers@control is disabled at this point
bash-5.0# 
bash-5.0# systemctl status host-containers@control
● host-containers@control.service - Host container: control
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/host-containers@.service; disabled; vendor preset: enabled)
     Active: inactive (dead)

....
Dec 07 06:35:21 host-ctr[3080]: time="2020-12-07T06:35:21Z" level=error msg="failed to get container task exit status" error="rpc error: code = Unavailable desc = transport is closing"
...
Dec 07 06:35:21 host-ctr[3080]: time="2020-12-07T06:35:21Z" level=error msg="failed to cleanup container" error="cannot delete running task control: failed precondition"
Dec 07 06:35:21 systemd[1]: host-containers@control.service: Main process exited, code=exited, status=1/FAILURE
Dec 07 06:35:21 systemd[1]: host-containers@control.service: Failed with result 'exit-code'.
Dec 07 06:35:22 systemd[1]: Stopped Host container: control.
bash-5.0# 
bash-5.0# 
bash-5.0# 
bash-5.0# ############## The lingering control host container got cleaned up by the restart command
bash-5.0# 
bash-5.0# ctr -a /run/host-containerd/containerd.sock task ls
TASK     PID     STATUS    
admin    3823    RUNNING
bash-5.0# 
bash-5.0# ctr -a /run/host-containerd/containerd.sock container ls
CONTAINER    IMAGE                                                                              RUNTIME                  
admin        ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.5.2    io.containerd.runc.v2    
bash-5.0# 
bash-5.0# 
bash-5.0# ############## Re-enable control host-container and use a different image source for differentiation
bash-5.0# 
<459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-control:v0.4.0"}}}'
bash-5.0# apiclient -u /tx/commit_and_apply -m POST
["settings.host-containers.control.enabled","settings.host-containers.control.source"]
bash-5.0# sleep 5
bash-5.0# 
bash-5.0# 
bash-5.0# 
bash-5.0# ############# Restart commands succeeded 
bash-5.0# 
bash-5.0# journalctl -u apiserver -n 50
-- Logs begin at Mon 2020-12-07 06:30:11 UTC, end at Mon 2020-12-07 06:35:35 UTC. --
Dec 07 06:30:13 localhost systemd[1]: Starting Bottlerocket API server...
Dec 07 06:30:13 localhost apiserver[2487]: 06:30:13 [INFO] Starting server at /run/api.sock with 1 thread and datastore at /var/lib/bottlerocket/datastore/current
Dec 07 06:30:13 localhost systemd[1]: Started Bottlerocket API server.
Dec 07 06:30:13 localhost apiserver[2487]: 06:30:13 [INFO] Starting 1 workers
Dec 07 06:30:13 localhost apiserver[2487]: 06:30:13 [INFO] Starting "actix-web-service-"/run/api.sock"" service on "/run/api.sock" (pathname)
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] thar-be-settings started
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] Parsing stdin for updated settings
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] Requesting affected services for settings: {"settings.host-containers.control.enabled"}
Dec 07 06:35:22 apiserver[5871]: 06:35:22 [INFO] Restarting affected services...
Dec 07 06:35:30 apiserver[6037]: 06:35:30 [INFO] thar-be-settings started
Dec 07 06:35:30 apiserver[6037]: 06:35:30 [INFO] Parsing stdin for updated settings
Dec 07 06:35:30 apiserver[6037]: 06:35:30 [INFO] Requesting affected services for settings: {"settings.host-containers.control.source", "settings.host-containers.control.enabled"}
Dec 07 06:35:30 apiserver[6037]: 06:35:30 [INFO] Restarting affected services...
bash-5.0# sleep 10
bash-5.0# 
bash-5.0# 
bash-5.0# 
bash-5.0# ############# control host container running again
bash-5.0# systemctl status host-containers@control
● host-containers@control.service - Host container: control
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/host-containers@.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2020-12-07 06:35:30 UTC; 14s ago
    Process: 6070 ExecStartPre=/usr/bin/mkdir -m 1777 -p ${LOCAL_DIR}/host-containers/control (code=exited, status=0/SUCCESS)
   Main PID: 6088 (host-ctr)
      Tasks: 19 (limit: 9185)
     Memory: 42.5M
     CGroup: /system.slice/system-host\x2dcontainers.slice/host-containers@control.service
             └─6088 /usr/bin/host-ctr run -container-id=control -source=328549459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-control:v0.4.0 -superpowered=false

.....
bash-5.0# 
bash-5.0# 
bash-5.0# 
bash-5.0# ############# Notice that the new control host container is using the new image source.
bash-5.0# 
bash-5.0# ctr -a /run/host-containerd/containerd.sock task ls
TASK       PID     STATUS    
admin      3823    RUNNING
control    6209    RUNNING
bash-5.0# 
bash-5.0# ctr -a /run/host-containerd/containerd.sock container ls
CONTAINER    IMAGE                                                                                RUNTIME                  
admin        ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-admin:v0.5.2      io.containerd.runc.v2    
control      ecr.aws/arn:aws:ecr:us-west-2:328549459982:repository/bottlerocket-control:v0.4.0    io.containerd.runc.v2    
bash-5.0# 
bash-5.0# #

etungsten · 2020-12-07T18:36:39Z

I noticed while making these changes that host-containers, the Rust binary responsible for managing host container systemd units, unconditionally loops through all host containers to apply their current settings and tries to enable or disable each of their systemd units. This means that I can't unconditionally clean up host containers with host-ctr before host-containers attempts to enable any particular host-container unit, because it would do so for every host container regardless of its current running state and whether its settings have changed.

host-containers (the binary) currently cannot detect settings changes and enact them only on the impacted host containers. We can, however, implement this by making host-containers check whether the environment file it's writing for each container is being changed. If the environment file has changed, the host container being handled has to be restarted if it's currently running (enabled=true). If the environment file hasn't changed, we don't need to do anything for that host container. If the environment file does not exist, we're going through first boot and should apply the settings and enact them unconditionally.
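
The three cases above could be sketched like this. It is a hypothetical illustration, written in Go for consistency with host-ctr even though host-containers is a Rust binary; `decideAction`, `readExisting`, and the action names are all made up for the example. The "changed but disabled" case is not spelled out above, so the sketch assumes the file is simply rewritten without a restart.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// action names what to do for one host container (hypothetical names).
type action string

const (
	applyUnconditionally action = "apply"      // no env file yet: first boot
	restartContainer     action = "restart"    // env file changed, container enabled
	writeOnly            action = "write-only" // assumption: changed but disabled
	doNothing            action = "nothing"    // env file unchanged
)

// decideAction compares the existing env file contents with the updated
// contents and picks an action.
func decideAction(existing []byte, exists bool, updated []byte, enabled bool) action {
	switch {
	case !exists:
		return applyUnconditionally
	case bytes.Equal(existing, updated):
		return doNothing
	case enabled:
		return restartContainer
	default:
		return writeOnly
	}
}

// readExisting returns the current file contents and whether the file exists.
func readExisting(path string) ([]byte, bool, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, false, nil
	}
	if err != nil {
		return nil, false, err
	}
	return data, true, nil
}

func main() {
	dir, err := os.MkdirTemp("", "host-containers-sketch")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	envPath := filepath.Join(dir, "control.env")
	updated := []byte("CTR_SUPERPOWERED=false\nCTR_SOURCE=example-image:v1\n")

	// Run the decision twice: once before the file exists, once after.
	for i := 0; i < 2; i++ {
		existing, exists, err := readExisting(envPath)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(decideAction(existing, exists, updated, true))
		if err := os.WriteFile(envPath, updated, 0644); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
}
```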

I did not try to implement this here because it's not strictly within the scope of this PR. But I can make an issue to follow up with this if people think the approach above is sensible. Please let me know what you think @tjkirch, @bcressey .

tjkirch · 2020-12-07T18:46:54Z

^ I think it could be better to improve the way restart-commands are run so that they're always given the list of settings that have changed, rather than the setting name having to be in the restart-command itself. That way we can handle dynamic settings, and not have to inspect the system to see what changed.

For example, right now, services.motd could have a restart-command like my-command settings.motd because we know services.motd is only associated with one setting, but services.host-containers is associated with any host container name under settings.host-containers, so we can't pass (hardcode) a single setting. Instead, if the command was given the changed settings, it could alter its behavior based on what changed, like in this case where we need to know which host container changed. (It could also give us the ability to re-use restart command helpers more often, if they can branch on setting.)
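
To make that concrete, here is a small sketch of how a restart command might branch on the changed settings if it were given them. The function name and the way the list is delivered are assumptions; the key format is taken from the `commit_and_apply` output earlier in this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// hostContainersPrefix is the settings prefix seen in the test output above.
const hostContainersPrefix = "settings.host-containers."

// changedHostContainers extracts the unique host container names from a
// list of changed setting keys, ignoring unrelated settings.
func changedHostContainers(changed []string) []string {
	seen := make(map[string]bool)
	var names []string
	for _, key := range changed {
		if !strings.HasPrefix(key, hostContainersPrefix) {
			continue
		}
		rest := strings.TrimPrefix(key, hostContainersPrefix)
		// The container name is the first path segment after the prefix.
		name := strings.SplitN(rest, ".", 2)[0]
		if name == "" || seen[name] {
			continue
		}
		seen[name] = true
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	changed := []string{
		"settings.host-containers.control.enabled",
		"settings.host-containers.control.source",
		"settings.motd",
	}
	fmt.Println(changedHostContainers(changed))
}
```

With a list like this, only the named host containers would need to be cleaned up and restarted, rather than every host container on each invocation.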

(I just mean this as a different potential follow-up, not as a blocker for this PR)

samuelkarp

Can we separate the refactoring change from the rest of this PR?

sources/host-ctr/cmd/host-ctr/main.go

etungsten · 2020-12-08T20:28:21Z

Can we separate the refactoring change from the rest of this PR?

I'm inclined to keep it as part of this PR since we're adding an additional "subcommand-like" functionality to host-ctr here. It feels like the right thing to do as opposed to having to add another flag that completely changes what host-ctr does. I very much regret adding -pull-image-only as a flag instead of a subcommand.

Hopefully the refactoring changes aren't too controversial and don't produce too much churn. It's mostly refactoring stuff into functions.

samuelkarp · 2020-12-08T21:25:20Z

I'm inclined to keep it as part of this PR since we're adding an additional "subcommand-like" functionality to host-ctr here. It feels like the right thing to do as opposed to having to add another flag that completely changes what host-ctr does. I very much regret adding -pull-image-only as a flag instead of a subcommand.

Hopefully the refactoring changes aren't too controversial and don't produce too much churn. It's mostly refactoring stuff into functions.

It might be easier to pull it out into a separate PR that we merge ahead of this one. I don't expect breaking into functions to be controversial, but I'm expecting that I won't be the only one with an opinion on the subcommand implementation and it might reduce churn/rebasing effort for you to do it that way. And separating the functional changes (proper restart behavior) from the refactor should make both easier to review.

etungsten · 2020-12-09T00:50:48Z

This PR now depends on #1235 being merged before proceeding. Will rebase once it is merged.

etungsten · 2020-12-09T20:31:31Z

Push above rebases upon develop to pull in the refactor for host-ctr.

Retested, and everything still works as expected; the previous test results still stand.

zmrow

🎖️

sources/host-ctr/cmd/host-ctr/main.go

etungsten · 2020-12-10T23:10:59Z

Push above drops the commit for removing KillMode=mixed from the host containers unit files.

zmrow approved these changes Dec 3, 2020

tjkirch requested review from bcressey and samuelkarp December 3, 2020 16:42

etungsten force-pushed the host-ctr-restarts branch from 8ef22e3 to c00eb70 Compare December 3, 2020 18:56

etungsten force-pushed the host-ctr-restarts branch from c00eb70 to 7fa1683 Compare December 3, 2020 19:10

etungsten mentioned this pull request Dec 3, 2020

settings.network: add new proxy settings #1204

Merged

3 tasks

webern approved these changes Dec 3, 2020

samuelkarp reviewed Dec 4, 2020

etungsten force-pushed the host-ctr-restarts branch from 7fa1683 to 43ad5dd Compare December 4, 2020 01:39

samuelkarp approved these changes Dec 4, 2020

bcressey requested changes Dec 4, 2020

bcressey reviewed Dec 4, 2020

etungsten changed the title ~~host-ctr: proper restarts~~ host-ctr, host-containers: proper restarts Dec 7, 2020

etungsten requested review from bcressey and tjkirch December 7, 2020 18:36

samuelkarp reviewed Dec 8, 2020

etungsten marked this pull request as draft December 9, 2020 00:50

etungsten mentioned this pull request Dec 9, 2020

host-ctr: refactoring #1235

Merged

etungsten added 2 commits December 9, 2020 11:42

etungsten force-pushed the host-ctr-restarts branch from 81906dd to 69c8fce Compare December 9, 2020 20:30

etungsten marked this pull request as ready for review December 9, 2020 20:31

etungsten requested a review from samuelkarp December 9, 2020 20:32

zmrow approved these changes Dec 9, 2020

samuelkarp approved these changes Dec 10, 2020

bcressey reviewed Dec 10, 2020

etungsten added 2 commits December 10, 2020 15:09

host-ctr: add new subcommand clean-up

030a3af

Adds a new subcommand `clean-up` that checks if a given container exists, if it does, `host-ctr` will attempt to kill the container task and delete the container.

etungsten force-pushed the host-ctr-restarts branch from 69c8fce to d0781db Compare December 10, 2020 23:10

bcressey approved these changes Dec 11, 2020

etungsten merged commit 294f7af into bottlerocket-os:develop Dec 11, 2020

etungsten deleted the host-ctr-restarts branch December 11, 2020 00:24

This was referenced Dec 11, 2020

host-ctr: don't try to create the container under the systemd unit's cgroup #1237

Closed

api: improve restart-commands by passing the list of settings that have changed #1245

Closed

etungsten mentioned this pull request Mar 12, 2021

Improve restart-commands: pass list of changed settings that's triggering the restart #1389

Closed
