Automated reconciliation of boot settings #2375

markusboehme · 2022-08-24T12:44:27Z

Issue number:

Description of changes:

This PR implements what has been proposed in #2155, the addition of a new settings.boot.reboot-to-reconcile setting to automatically reboot Bottlerocket if either the kernel or systemd command line were changed during boot. At a high level, prairiedog now compares any newly written bootconfig initrd against the one already present and flags the need for a reboot. Before entering the configured systemd target, a new service unit then triggers the reboot if needed.

To reviewers, I recommend going commit by commit. The commits are fairly detailed and hopefully tell a story if followed in order.

Testing done:

No boot settings specified: No automatic reboot; host can be rebooted
Boot settings specified with reboot-to-reconcile unset: No automatic reboot; host can be rebooted and comes up with specified settings
Boot settings specified with reboot-to-reconcile set to false: No automatic reboot; host can be rebooted and comes up with specified settings
Boot settings specified with reboot-to-reconcile set to true: Automatic reboot; host comes up with specified settings
Boot settings specified with reboot-to-reconcile set to true and bootstrap containers: Automatic reboot after bootstrap containers have exited; host comes up with specified settings
Tested forward and backward migration from 1.9.2 to a faux 1.10.0 with this PR on builds of the aws-k8s-1.23 variant, both with and without reboot-to-reconcile set

Notes:

While this implements a new setting, I don't think a migration is needed as the setting is persisted and restored by prairiedog in the bootconfig initrd. Older versions of prairiedog will just ignore the setting. Update: A migration is needed as pointed out by @etungsten: While prairiedog may ignore the unknown setting, it will bite when trying to deserialize the data store. The current version comes with a migration.
If you're taking this on a test ride yourself, please see Prairiedog and the kernel may disagree about parsing boot config keys without values #2359: If you configure a boot setting that doesn't require a value, please specify the empty string for now.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

etungsten · 2022-08-24T16:47:00Z

I don't think a migration is needed as the setting is persisted and restored by prairiedog in the bootconfig initrd. Older versions of prairiedog will just ignore the setting.

We would still need a migration as we're introducing a new setting. If this new setting is not properly removed from the settings data-store on downgrade then the host will fail to boot because the settings model will not anticipate this new setting when apiserver tries to de-serialize the existing settings data-store.

markusboehme · 2022-08-24T18:08:32Z

I don't think a migration is needed as the setting is persisted and restored by prairiedog in the bootconfig initrd. Older versions of prairiedog will just ignore the setting.

We would still need a migration as we're introducing a new setting. If this new setting is not properly removed from the settings data-store on downgrade then the host will fail to boot because the settings model will not anticipate this new setting when apiserver tries to de-serialize the existing settings data-store.

I see, thanks! I was under the impression all serialization/deserialization for BootSettings would be handled by prairiedog, but clearly the settings are persisted in the data store as well.

While anything under and including settings.boot would be scrubbed by the migration that introduced that setting as a prefix in v1.8.0, this doesn't take care of any rollbacks to v1.8.0, v1.9.0, etc. I will add a new AddSettingsMigration for the settings.boot.reboot-to-reconcile key.

packages/os/bootstrap-containers@.service

bcressey

I don't quite understand the need to persist the reboot-to-reconcile flag within the actual bootconfig data.

Let me back up a step. As a rule I tend to find marker files distasteful, and one of my stronger architectural preferences is to favor API queries over ad-hoc files, since those files tend to create their own mini-API surfaces that aren't consistent.

(This isn't entirely binding - we use markers for early-boot-config and cfnsignal, and those can make sense if the alternative would be creating weird marker-file-like settings, like settings.marker-files.early-boot-config-ran = true, which I think would be taking the above principle too far.)

Two patterns that tend to avoid some of the downsides of applying this architectural preference:

if a program would be overly complicated by querying the API, or if the API isn't always available to query, it might be better to render a config file for it based on settings and just have it load its own config (see: updog, netdog).
if a program is writing a marker file so some other, easier to use utility can avoid querying the API, it might be better to add a sub-command to that program and wrap the execution of the utility instead (prairiedog sort of does this in spirit already by wrapping the mount command for the boot partition)

Applying that second pattern would lead me to expect a command like prairiedog reboot-if-required where we'd look at the bootconfig file we'd previously rendered, compare that with the kernel settings, and query the API to see if we need to reboot after all, and then run systemctl reboot ourselves. (I'm not saying this is right or better, just sharing where I'm coming from.)

Encoding the reconcile boolean in the bootconfig data could make sense if we want to optimize away that final API query - after all, we've already seen all the settings before and can record that fact somewhere - but then I'm not sure why we don't just write the marker file instead, if we find a delta after creating the config, or remove it if there's no longer a delta.

It's really that nested sequence of state that's not clear to me - why we have API settings and bootconfig and the marker file, and if we can do away with one or both of the latter bits of state.

packages/os/reboot-to-reconcile.service

markusboehme · 2022-08-26T17:25:31Z

I don't quite understand the need to persist the reboot-to-reconcile flag within the actual bootconfig data.

Originally, this is mainly a result of misunderstanding the serialization of BootSettings, i.e. assuming they would only be persisted by prairiedog. Erikson clarified this in his earlier comment. There should be nothing in the way of dropping that setting from the bootconfig data again. When invoked by sundog, prairiedog generate-boot-settings will have to query the API for the current data and merge the kernel and init settings read from the bootconfig data into it. I will try this next--I expect it will get rid of quite a bit of extra code.

Let me back up a step. As a rule I tend to find marker files distasteful, and one of my stronger architectural preferences is to favor API queries over ad-hoc files, since those files tend to create their own mini-API surfaces that aren't consistent. [...]

It's really that nested sequence of state that's not clear to me - why we have API settings and bootconfig and the marker file, and if we can do away with one or both of the latter bits of state.

Thanks for taking the time to share your thoughts on the overall design! I think a part of the confusion stems from trying to see deeper reasons in having reboot-to-reconcile in the bootconfig datat when I thought of it as just a necessity. 😊 Not doing that anymore will likely make things clearer.

I had considered, but not implemented, the idea of a new prairiedog command. Decoupling the decision to reboot and actually rebooting via a marker file was motivated by the idea a bootstrap container might find other reasons to request a reboot before the host enters its normal operational state. There is no such request for now, so perhaps it makes sense to drop this and go with the specific solution that keeps all logic inside prairiedog.

bcressey · 2022-08-26T19:00:19Z

Decoupling the decision to reboot and actually rebooting via a marker file was motivated by the idea a bootstrap container might find other reasons to request a reboot before the host enters its normal operational state. There is no such request for now, so perhaps it makes sense to drop this and go with the specific solution that keeps all logic inside prairiedog.

I was thinking that an alternative using helper commands might be a oneshot unit with multiple potential restart commands, e.g.:

ExecStart=/usr/bin/prairiedog reboot-if-required
ExecStart=/usr/bin/netdog reboot-if-required
ExecStart=/usr/bin/updog reboot-if-required

Where each program gets to encapsulate its own state checks and decision-making.

But your use case - allowing bootstrap containers to "schedule" a reboot after all other bootstrap containers have run, by writing a marker file in a well-known place - makes sense too, and wouldn't be possible with the sequenced helper commands.

I'm happy to keep the /run/reboot-required approach.

markusboehme · 2022-08-30T17:24:41Z

@bcressey I see what you mean now. This definitely looks and feels more as if made from one piece, and doesn't preclude bootstrap containers from signalling a need to reboot either: There likewise could be a /usr/bin/bootstrap-containers reboot-if-required, checking e.g. a marker file that is local to a particular bootstrap container. This even would provide more flexibility (e.g. accountability, or honoring the request only for certain containers). I'm thus moving away from the global marker file in the next iteration of the PR.

bcressey · 2022-08-31T05:20:10Z

There likewise could be a /usr/bin/bootstrap-containers reboot-if-required, checking e.g. a marker file that is local to a particular bootstrap container.

Nice! I really like this idea.

markusboehme · 2022-09-06T10:13:43Z

The force push changes the approach as discussed above:

The reboot now happens after reaching the configured.target, sidestepping the need to explicitly consider bootstrap containers and whether they're essential or not.
The initrd boot config now only contains the kernel and init command line arguments. The new reboot-to-reconcile setting is only persisted in the data store.
prairiedog grew a new reboot-if-required command to compare the initrd boot config against the boot config the host booted with, issuing a reboot if that is required and settings.boot.reboot-to-reconcile is set to true. Other first-party components can likewise introduce reboot-if-required commands of their own and hook into the existing reboot-if-required.service.
The reboot-if-required setting now comes with the migration needed for a rollback and has been tested.

There's merge conflicts to sort out in Cargo.toml and Release.toml needs merging as well. I refrained from rebasing for now to allow for easy diffing against the older revision. Will rebase to unblock the test process once folks have signaled they're in agreement with the current approach.

bcressey

Looks good!

bcressey · 2022-09-08T19:49:10Z

sources/api/prairiedog/src/bootconfig.rs

+    if new_boot_settings.reboot_to_reconcile.unwrap_or(false) {
+        /// Determines whether two optional hash maps changed. Treats empty and missing hash maps as equal.
+        fn has_changed_materially(
+            old_params: &Option<HashMap<BootConfigKey, Vec<BootConfigValue>>>,
+            new_params: &Option<HashMap<BootConfigKey, Vec<BootConfigValue>>>,
+        ) -> bool {


It might be worth splitting this out into a helper function that takes two BootSettings and returns a bool, so that it can be unit-tested.

Excellent point! I've done so now, and cleaned up a couple existing tests in the process (in a separate commit, of course).

bcressey · 2022-09-08T19:51:36Z

sources/api/prairiedog/src/bootconfig.rs

+            match (old_params, new_params) {
+                (None, None) => false,
+                (None, Some(new)) => !new.is_empty(),
+                (Some(old), None) => !old.is_empty(),
+                (Some(old), Some(new)) => old != new,
+            }


The two middle cases were what triggered my "oh, I hadn't thought of that" reaction and prompted me to look for unit tests. 😀

bcressey · 2022-09-08T20:03:52Z

sources/api/prairiedog/src/main.rs

+{
+    if is_reboot_required(socket_path).await? {
+        info!("Boot settings changed and require a reboot to take effect. Initiating reboot...");
+        command("/usr/bin/systemctl", &["reboot"])?;


Did we lose the property from the previous revision where the system would never continue starting other units?

The manual for systemctl reboot says it's mostly equivalent to:

systemctl start reboot.target --job-mode=replace-irreversibly --no-block

It might be good to somehow confirm that activate-multi-user.target never gets a chance to run. I'd guess that since it has DefaultDependencies=yes (implied) then it will conflict with shutdown.target and never actually start because the stop job will run first.

We did, for the reason that Arnaldo noticed below: I accidentally changed the systemd service type to one that wouldn't wait for anything but a successful fork. Once systemctl reboot returns, from looking at the code (and later on logs as well) the system would be working on the shutdown already. I realize this is all a bit subtle, though, so I also added an explicit loop that'll wait for systemd to send prairiedog a SIGTERM and adds a second line of defense together with the Type=oneshot.

arnaldo2792

This is great!

packages/os/os.spec

arnaldo2792 · 2022-09-09T00:23:45Z

packages/os/reboot-if-required.service

@@ -0,0 +1,10 @@
+[Unit]


I think we still want this unit to be a oneshot and configured with RemainAfterExit, right? According to the docs, since the unit has ExecStart it won't be considered oneshot by default, and the behavior of a simple service doesn't quite fit for reboot-if-required:

the service manager will consider the unit started immediately after the main service process has been forked off.
...
the service manager will immediately proceed starting follow-up units, right after creating the main service process, and before executing the service's binary. Note that this means systemctl start command lines for simple services will report success even if the service's binary cannot be invoked successfully

That's a good catch, thanks! Previously, there was a brief window of time in which the boot would proceed, which practically speaking probably wouldn't be problematic, but still is super undesirable. I set Type=oneshot explicitly now, and the debug logs confirm the mechanism is working as intended:

reboot-if-required.service: starting held back, waiting for: configured.target activate-multi-user.service: starting held back, waiting for: reboot-if-required.service reboot-if-required.service: About to execute /usr/bin/prairiedog reboot-if-required reboot-if-required.service: Changed dead -> start [ 13.305648] prairiedog[1163]: 19:35:00 [INFO] Boot settings changed and require a reboot to take effect. Initiating reboot... Unit reboot.target has alias ctrl-alt-del.target. reboot.target: Trying to enqueue job reboot.target/start/replace-irreversibly shutdown.target: Installed new job shutdown.target/start as 472 reboot-if-required.service: Job 425 reboot-if-required.service/start finished, result=canceled reboot-if-required.service: Installed new job reboot-if-required.service/stop as 510 activate-multi-user.service: Job 424 activate-multi-user.service/start finished, result=canceled activate-multi-user.service: Installed new job activate-multi-user.service/stop as 452 reboot.target: Installed new job reboot.target/start as 426

markusboehme · 2022-09-13T15:14:39Z

The latest force push contains these changes:

reboot-if-required.service now is Type=oneshot as it should have been, waiting for prairiedog to issue the reboot
Extracted a function to actually compare the boot settings for reboot-worthy changes, now coming with dedicated unit tests

I'll rebase onto develop and resolve merge conflicts once there's no need to diff against the previous revisions anymore. This will also unblock the automated testing for this PR.

bcressey

🎉

Add a new boolean settings.boot.reboot-to-reconcile to govern whether Bottlerocket should automatically reboot if kernel or systemd command line parameters are reconfigured during boot. This could happen either via user-data or via a bootstrap container. In either case, command line changes for the kernel or systemd will not take effect until the next reboot. This change only introduces the new setting to the model and threads it through in all places touching BootSettings. The flag is not yet armed, i.e. no reboot action is taken. Signed-off-by: Markus Boehme <markubo@amazon.com>

A previous commit introduced the new "settings.boot.reboot-to-reconcile" setting to the BootSettings model. Add a migration to allow rolling back to a version of Bottlerocket without the new setting on a host that has it set. Signed-off-by: Markus Boehme <markubo@amazon.com>

Refactoring, no change in behavior. The code to read /proc/bootconfig will be useful later to determine whether the system needs to be rebooted to reconcile boot settings and the current boot config. Signed-off-by: Markus Boehme <markubo@amazon.com>

The unit tests have needs for instances of the BootSettings model. Some tests already use repetitive and involved code to convert regular hash maps into the modeled types needed for creating BootSettings. Extract that code into its dedicated helper function so that the tests can remain focused on the actual test data and logic. Signed-off-by: Markus Boehme <markubo@amazon.com>

When boot settings change, "prairiedog generate-boot-config" is invoked to regenerate the initrd containing the boot config. However, this does not mean that any changes to e.g. the kernel or systemd command lines take effect yet. A reboot is needed for this. Add a new "prairiedog reboot-if-required" command that compares the current boot settings (and hence the currently persisted boot config in the initrd) against the one the host booted with to detect any changes requiring a reboot. If such changes are detected and the "settings.boot.reboot-to-reconcile" flag is set, reboot the host. For now, only changes to the kernel and systemd command lines require a reboot. Bottlerocket-specific boot settings like the newly introduced "reboot-to-reconcile" may change freely and will not be flagged by prairiedog. Signed-off-by: Markus Boehme <markubo@amazon.com>

Reboot the system towards the end of the configuration phase during boot if any of the first-party components deem this necessary. The reboot is enacted after reaching the configured.target and before reaching the multi-user.target, so e.g. user data has been applied and bootstrap containers have already run, but the host is not yet fully operational. Components can hook into the new reboot-if-required.service to be able to run at this point. They should implement a "reboot-if-required" command to trigger a reboot if it is needed. Currently, prairiedog is the only component hook into this service and implement a "reboot-if-required" command. It will trigger a reboot if boot settings have changed, i.e. if the kernel or systemd command line arguments were changed via the user data mechanism or a bootstrap container. Signed-off-by: Markus Boehme <markubo@amazon.com>

markusboehme · 2022-09-21T20:04:37Z

Thanks for the reviews! I force-pushed now to resolve the conflicts in Cargo.toml and Release.toml, unblocking the automated tests.

markusboehme requested review from bcressey, zmrow, arnaldo2792 and etungsten August 24, 2022 12:44

markusboehme changed the title ~~Feature/reboot to reconcile~~ Automated reconciliation of boot settings Aug 24, 2022

arnaldo2792 reviewed Aug 25, 2022

View reviewed changes

packages/os/bootstrap-containers@.service Outdated Show resolved Hide resolved

bcressey reviewed Aug 26, 2022

View reviewed changes

packages/os/reboot-to-reconcile.service Outdated Show resolved Hide resolved

packages/os/reboot-to-reconcile.service Outdated Show resolved Hide resolved

markusboehme force-pushed the feature/reboot-to-reconcile branch from fa7619b to 4690e1d Compare September 6, 2022 10:00

bcressey approved these changes Sep 8, 2022

View reviewed changes

arnaldo2792 reviewed Sep 9, 2022

View reviewed changes

markusboehme force-pushed the feature/reboot-to-reconcile branch from 4690e1d to 4e174f0 Compare September 13, 2022 14:54

markusboehme requested a review from bcressey September 13, 2022 14:54

bcressey approved these changes Sep 21, 2022

View reviewed changes

arnaldo2792 approved these changes Sep 21, 2022

View reviewed changes

markusboehme added 6 commits September 21, 2022 19:57

markusboehme force-pushed the feature/reboot-to-reconcile branch from 4e174f0 to d07baac Compare September 21, 2022 20:03

markusboehme added this to the 1.10.0 milestone Sep 21, 2022

markusboehme merged commit aa0087f into bottlerocket-os:develop Sep 23, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Automated reconciliation of boot settings #2375

Automated reconciliation of boot settings #2375

markusboehme commented Aug 24, 2022 •

edited

Loading

etungsten commented Aug 24, 2022 •

edited

Loading

markusboehme commented Aug 24, 2022

bcressey left a comment

markusboehme commented Aug 26, 2022

bcressey commented Aug 26, 2022

markusboehme commented Aug 30, 2022

bcressey commented Aug 31, 2022

markusboehme commented Sep 6, 2022

bcressey left a comment

bcressey Sep 8, 2022

markusboehme Sep 13, 2022

bcressey Sep 8, 2022

bcressey Sep 8, 2022

markusboehme Sep 13, 2022

arnaldo2792 left a comment

arnaldo2792 Sep 9, 2022

markusboehme Sep 13, 2022

markusboehme commented Sep 13, 2022

bcressey left a comment

markusboehme commented Sep 21, 2022

Automated reconciliation of boot settings #2375

Automated reconciliation of boot settings #2375

Conversation

markusboehme commented Aug 24, 2022 • edited Loading

etungsten commented Aug 24, 2022 • edited Loading

markusboehme commented Aug 24, 2022

bcressey left a comment

Choose a reason for hiding this comment

markusboehme commented Aug 26, 2022

bcressey commented Aug 26, 2022

markusboehme commented Aug 30, 2022

bcressey commented Aug 31, 2022

markusboehme commented Sep 6, 2022

bcressey left a comment

Choose a reason for hiding this comment

bcressey Sep 8, 2022

Choose a reason for hiding this comment

markusboehme Sep 13, 2022

Choose a reason for hiding this comment

bcressey Sep 8, 2022

Choose a reason for hiding this comment

bcressey Sep 8, 2022

Choose a reason for hiding this comment

markusboehme Sep 13, 2022

Choose a reason for hiding this comment

arnaldo2792 left a comment

Choose a reason for hiding this comment

arnaldo2792 Sep 9, 2022

Choose a reason for hiding this comment

markusboehme Sep 13, 2022

Choose a reason for hiding this comment

markusboehme commented Sep 13, 2022

bcressey left a comment

Choose a reason for hiding this comment

markusboehme commented Sep 21, 2022

markusboehme commented Aug 24, 2022 •

edited

Loading

etungsten commented Aug 24, 2022 •

edited

Loading