-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automated reconciliation of boot settings #2375
Automated reconciliation of boot settings #2375
Conversation
We would still need a migration as we're introducing a new setting. If this new setting is not properly removed from the settings data-store on downgrade then the host will fail to boot because the settings model will not anticipate this new setting when apiserver tries to de-serialize the existing settings data-store. |
I see, thanks! I was under the impression all serialization/deserialization for While anything under and including |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand the need to persist the reboot-to-reconcile
flag within the actual bootconfig data.
Let me back up a step. As a rule I tend to find marker files distasteful, and one of my stronger architectural preferences is to favor API queries over ad-hoc files, since those files tend to create their own mini-API surfaces that aren't consistent.
(This isn't entirely binding - we use markers for early-boot-config
and cfnsignal
, and those can make sense if the alternative would be creating weird marker-file-like settings, like settings.marker-files.early-boot-config-ran = true
, which I think would be taking the above principle too far.)
Two patterns that tend to avoid some of the downsides of applying this architectural preference:
- if a program would be overly complicated by querying the API, or if the API isn't always available to query, it might be better to render a config file for it based on settings and just have it load its own config (see: updog, netdog).
- if a program is writing a marker file so some other, easier to use utility can avoid querying the API, it might be better to add a sub-command to that program and wrap the execution of the utility instead (
prairiedog
sort of does this in spirit already by wrapping themount
command for the boot partition)
Applying that second pattern would lead me to expect a command like prairiedog reboot-if-required
where we'd look at the bootconfig file we'd previously rendered, compare that with the kernel settings, and query the API to see if we need to reboot after all, and then run systemctl reboot
ourselves. (I'm not saying this is right or better, just sharing where I'm coming from.)
Encoding the reconcile boolean in the bootconfig data could make sense if we want to optimize away that final API query - after all, we've already seen all the settings before and can record that fact somewhere - but then I'm not sure why we don't just write the marker file instead, if we find a delta after creating the config, or remove it if there's no longer a delta.
It's really that nested sequence of state that's not clear to me - why we have API settings and bootconfig and the marker file, and if we can do away with one or both of the latter bits of state.
Originally, this is mainly a result of misunderstanding the serialization of
Thanks for taking the time to share your thoughts on the overall design! I think a part of the confusion stems from trying to see deeper reasons in having I had considered, but not implemented, the idea of a new prairiedog command. Decoupling the decision to reboot and actually rebooting via a marker file was motivated by the idea a bootstrap container might find other reasons to request a reboot before the host enters its normal operational state. There is no such request for now, so perhaps it makes sense to drop this and go with the specific solution that keeps all logic inside prairiedog. |
I was thinking that an alternative using helper commands might be a oneshot unit with multiple potential restart commands, e.g.:
Where each program gets to encapsulate its own state checks and decision-making. But your use case - allowing bootstrap containers to "schedule" a reboot after all other bootstrap containers have run, by writing a marker file in a well-known place - makes sense too, and wouldn't be possible with the sequenced helper commands. I'm happy to keep the |
@bcressey I see what you mean now. This definitely looks and feels more as if made from one piece, and doesn't preclude bootstrap containers from signalling a need to reboot either: There likewise could be a |
Nice! I really like this idea. |
fa7619b
to
4690e1d
Compare
The force push changes the approach as discussed above:
There's merge conflicts to sort out in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
if new_boot_settings.reboot_to_reconcile.unwrap_or(false) { | ||
/// Determines whether two optional hash maps changed. Treats empty and missing hash maps as equal. | ||
fn has_changed_materially( | ||
old_params: &Option<HashMap<BootConfigKey, Vec<BootConfigValue>>>, | ||
new_params: &Option<HashMap<BootConfigKey, Vec<BootConfigValue>>>, | ||
) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth splitting this out into a helper function that takes two BootSettings
and returns a bool
, so that it can be unit-tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent point! I've done so now, and cleaned up a couple existing tests in the process (in a separate commit, of course).
match (old_params, new_params) { | ||
(None, None) => false, | ||
(None, Some(new)) => !new.is_empty(), | ||
(Some(old), None) => !old.is_empty(), | ||
(Some(old), Some(new)) => old != new, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two middle cases were what triggered my "oh, I hadn't thought of that" reaction and prompted me to look for unit tests. 😀
{ | ||
if is_reboot_required(socket_path).await? { | ||
info!("Boot settings changed and require a reboot to take effect. Initiating reboot..."); | ||
command("/usr/bin/systemctl", &["reboot"])?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we lose the property from the previous revision where the system would never continue starting other units?
The manual for systemctl reboot
says it's mostly equivalent to:
systemctl start reboot.target --job-mode=replace-irreversibly --no-block
It might be good to somehow confirm that activate-multi-user.target
never gets a chance to run. I'd guess that since it has DefaultDependencies=yes
(implied) then it will conflict with shutdown.target
and never actually start because the stop job will run first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did, for the reason that Arnaldo noticed below: I accidentally changed the systemd service type to one that wouldn't wait for anything but a successful fork
. Once systemctl reboot
returns, from looking at the code (and later on logs as well) the system would be working on the shutdown already. I realize this is all a bit subtle, though, so I also added an explicit loop that'll wait for systemd to send prairiedog a SIGTERM
and adds a second line of defense together with the Type=oneshot
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
@@ -0,0 +1,10 @@ | |||
[Unit] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we still want this unit to be a oneshot
and configured with RemainAfterExit
, right? According to the docs, since the unit has ExecStart
it won't be considered oneshot
by default, and the behavior of a simple
service doesn't quite fit for reboot-if-required
:
the service manager will consider the unit started immediately after the main service process has been forked off.
...
the service manager will immediately proceed starting follow-up units, right after creating the main service process, and before executing the service's binary. Note that this means systemctl start command lines for simple services will report success even if the service's binary cannot be invoked successfully
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good catch, thanks! Previously, there was a brief window of time in which the boot would proceed, which practically speaking probably wouldn't be problematic, but still is super undesirable. I set Type=oneshot
explicitly now, and the debug logs confirm the mechanism is working as intended:
reboot-if-required.service: starting held back, waiting for: configured.target
activate-multi-user.service: starting held back, waiting for: reboot-if-required.service
reboot-if-required.service: About to execute /usr/bin/prairiedog reboot-if-required
reboot-if-required.service: Changed dead -> start
[ 13.305648] prairiedog[1163]: 19:35:00 [INFO] Boot settings changed and require a reboot to take effect. Initiating reboot...
Unit reboot.target has alias ctrl-alt-del.target.
reboot.target: Trying to enqueue job reboot.target/start/replace-irreversibly
shutdown.target: Installed new job shutdown.target/start as 472
reboot-if-required.service: Job 425 reboot-if-required.service/start finished, result=canceled
reboot-if-required.service: Installed new job reboot-if-required.service/stop as 510
activate-multi-user.service: Job 424 activate-multi-user.service/start finished, result=canceled
activate-multi-user.service: Installed new job activate-multi-user.service/stop as 452
reboot.target: Installed new job reboot.target/start as 426
4690e1d
to
4e174f0
Compare
The latest force push contains these changes:
I'll rebase onto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
Add a new boolean settings.boot.reboot-to-reconcile to govern whether Bottlerocket should automatically reboot if kernel or systemd command line parameters are reconfigured during boot. This could happen either via user-data or via a bootstrap container. In either case, command line changes for the kernel or systemd will not take effect until the next reboot. This change only introduces the new setting to the model and threads it through in all places touching BootSettings. The flag is not yet armed, i.e. no reboot action is taken. Signed-off-by: Markus Boehme <markubo@amazon.com>
A previous commit introduced the new "settings.boot.reboot-to-reconcile" setting to the BootSettings model. Add a migration to allow rolling back to a version of Bottlerocket without the new setting on a host that has it set. Signed-off-by: Markus Boehme <markubo@amazon.com>
Refactoring, no change in behavior. The code to read /proc/bootconfig will be useful later to determine whether the system needs to be rebooted to reconcile boot settings and the current boot config. Signed-off-by: Markus Boehme <markubo@amazon.com>
The unit tests have needs for instances of the BootSettings model. Some tests already use repetitive and involved code to convert regular hash maps into the modeled types needed for creating BootSettings. Extract that code into its dedicated helper function so that the tests can remain focused on the actual test data and logic. Signed-off-by: Markus Boehme <markubo@amazon.com>
When boot settings change, "prairiedog generate-boot-config" is invoked to regenerate the initrd containing the boot config. However, this does not mean that any changes to e.g. the kernel or systemd command lines take effect yet. A reboot is needed for this. Add a new "prairiedog reboot-if-required" command that compares the current boot settings (and hence the currently persisted boot config in the initrd) against the one the host booted with to detect any changes requiring a reboot. If such changes are detected and the "settings.boot.reboot-to-reconcile" flag is set, reboot the host. For now, only changes to the kernel and systemd command lines require a reboot. Bottlerocket-specific boot settings like the newly introduced "reboot-to-reconcile" may change freely and will not be flagged by prairiedog. Signed-off-by: Markus Boehme <markubo@amazon.com>
Reboot the system towards the end of the configuration phase during boot if any of the first-party components deem this necessary. The reboot is enacted after reaching the configured.target and before reaching the multi-user.target, so e.g. user data has been applied and bootstrap containers have already run, but the host is not yet fully operational. Components can hook into the new reboot-if-required.service to be able to run at this point. They should implement a "reboot-if-required" command to trigger a reboot if it is needed. Currently, prairiedog is the only component hook into this service and implement a "reboot-if-required" command. It will trigger a reboot if boot settings have changed, i.e. if the kernel or systemd command line arguments were changed via the user data mechanism or a bootstrap container. Signed-off-by: Markus Boehme <markubo@amazon.com>
4e174f0
to
d07baac
Compare
Thanks for the reviews! I force-pushed now to resolve the conflicts in |
Issue number:
Closes #2155
Description of changes:
This PR implements what has been proposed in #2155, the addition of a new
settings.boot.reboot-to-reconcile
setting to automatically reboot Bottlerocket if either the kernel or systemd command line were changed during boot. At a high level, prairiedog now compares any newly written bootconfig initrd against the one already present and flags the need for a reboot. Before entering theconfigured
systemd target, a new service unit then triggers the reboot if needed.To reviewers, I recommend going commit by commit. The commits are fairly detailed and hopefully tell a story if followed in order.
Testing done:
reboot-to-reconcile
unset: No automatic reboot; host can be rebooted and comes up with specified settingsreboot-to-reconcile
set to false: No automatic reboot; host can be rebooted and comes up with specified settingsreboot-to-reconcile
set to true: Automatic reboot; host comes up with specified settingsreboot-to-reconcile
set to true and bootstrap containers: Automatic reboot after bootstrap containers have exited; host comes up with specified settingsaws-k8s-1.23
variant, both with and withoutreboot-to-reconcile
setNotes:
While this implements a new setting, I don't think a migration is needed as the setting is persisted and restored by prairiedog in the bootconfig initrd. Older versions of prairiedog will just ignore the setting.Update: A migration is needed as pointed out by @etungsten: While prairiedog may ignore the unknown setting, it will bite when trying to deserialize the data store. The current version comes with a migration.Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.