diff --git a/sources/updater/README.md b/sources/updater/README.md
index 1b1f1b417b6..ed34cdb8d35 100644
--- a/sources/updater/README.md
+++ b/sources/updater/README.md
@@ -46,7 +46,9 @@ If the calculated time has not passed yet, Updog returns the update timestamp to
 
 Assuming all the requirements are met, Updog requests the update images from the TUF repository and writes them to the "inactive" partition.
 
-For more information on what's Updog see [Updog](updog/)
+For more information on what's Updog see [Updog](updog/).
+For more information about update waves see [Waves](waves/).
+
 ## Signpost
 Once an update has been successfully written to the inactive partition, Updog calls the Signpost utility.
 This updates the priority bits in the GUID partition table of each partition and swaps the "active" and "inactive" partitions.
diff --git a/sources/updater/waves/README.md b/sources/updater/waves/README.md
new file mode 100644
index 00000000000..e48ce28ced9
--- /dev/null
+++ b/sources/updater/waves/README.md
@@ -0,0 +1,39 @@
+# Update waves
+
+As mentioned in the [Bottlerocket update overview](../README.md), OS updates can include "waves" for staggered deployment.
+These waves are defined in the `manifest.json`, which lives in the TUF repository.
+Each time an OS update is made available, `manifest.json` is updated with the information pertinent to that update using [updata](../updog) or its related libraries.
+Waves may be supplied to `updata` on the command line, passed as a TOML-formatted file.
+
+This directory contains a few examples of these update wave files.
+Specific details are encapsulated in each file, but they are:
+
+* `default-waves.toml`: A "normal" deployment
+* `accelerated-waves.toml`: An accelerated deployment
+* `ohno.toml`: An extremely accelerated deployment in case of emergency.
+
+## Understanding the concept of waves
+
+Waves include a *seed* and a *start time*.
+
+Each Bottlerocket node generates a "seed" for itself, which is simply a number between 0 and 2048 that determines where it falls in the update order.
+Nodes that have a seed within the current wave will update.
+All waves include the seeds of the prior wave, so if a node misses its wave for whatever reason, it still updates at a later time.
+
+## Writing wave files
+
+Wave files must be [valid TOML](https://github.com/toml-lang/toml) containing a list of `[[wave]]` entries.
+Waves defined in these files must contain two keys, `start_after` and `fleet_percentage`.
+
+`start_after` must be:
+
+* a valid RFC3339 formatted string OR
+* a string like `"7 days"` or `"2 hours"`. Additional details about valid strings can be found [here](../../parse-datetime).
+
+It represents an offset of time starting from when the operator updates the `manifest.json` file, NOT an offset starting at the time `manifest.json` is uploaded to S3. In simple terms, it is "now" plus whatever time period is specified.
+
+`fleet_percentage` must be an unsigned integer from 1 to 100.
+It represents the desired total percentage of the fleet to be updated by the time this wave is over.
+This percentage maps directly to the seed value; it's the percentage of the maximum seed, 2048.
+
+Please see the files in this directory for proper examples.
diff --git a/sources/updater/waves/accelerated-waves.toml b/sources/updater/waves/accelerated-waves.toml
new file mode 100644
index 00000000000..2f8f0540385
--- /dev/null
+++ b/sources/updater/waves/accelerated-waves.toml
@@ -0,0 +1,18 @@
+# The following represents an "accelerated" set of update waves for a much
+# quicker deployment. The deployment lasts for 1 day, and quickly increases the
+# nodes updated at once.
+[[wave]]
+start_after = '1 hour'
+fleet_percentage = 3
+
+[[wave]]
+start_after = '4 hours'
+fleet_percentage = 12
+
+[[wave]]
+start_after = '8 hours'
+fleet_percentage = 50
+
+[[wave]]
+start_after = '1 day'
+fleet_percentage = 100
diff --git a/sources/updater/waves/default-waves.toml b/sources/updater/waves/default-waves.toml
new file mode 100644
index 00000000000..d9bedae367e
--- /dev/null
+++ b/sources/updater/waves/default-waves.toml
@@ -0,0 +1,22 @@
+# The following represents a "normal" set of update waves for a typical
+# deployment. The deployment lasts for 6 days, and gradually increases the
+# nodes updated at once.
+[[wave]]
+start_after = '1 hour'
+fleet_percentage = 1
+
+[[wave]]
+start_after = '4 hours'
+fleet_percentage = 5
+
+[[wave]]
+start_after = '1 day'
+fleet_percentage = 10
+
+[[wave]]
+start_after = '3 days'
+fleet_percentage = 25
+
+[[wave]]
+start_after = '6 days'
+fleet_percentage = 100
diff --git a/sources/updater/waves/ohno.toml b/sources/updater/waves/ohno.toml
new file mode 100644
index 00000000000..6afdde78d6f
--- /dev/null
+++ b/sources/updater/waves/ohno.toml
@@ -0,0 +1,14 @@
+# The following represents an "emergency" set of update waves for a rapid
+# deployment. The deployment lasts for 3 hours, with a small initial wave,
+# and then all nodes will be updated after the third hour.
+[[wave]]
+start_after = '1 hour'
+fleet_percentage = 5
+
+[[wave]]
+start_after = '2 hours'
+fleet_percentage = 25
+
+[[wave]]
+start_after = '3 hours'
+fleet_percentage = 100
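
Note on the seed mapping described in `waves/README.md`: the README says `fleet_percentage` is the percentage of the maximum seed, 2048. The following Python sketch illustrates that arithmetic using the values from `default-waves.toml`. It is not Updog's implementation. The names `seed_threshold` and `first_eligible_offset`, the rounding down, and the inclusive `<=` comparison are all assumptions made for illustration, as is the simplification that a node becomes eligible exactly at a wave's `start_after` offset.

```python
from datetime import timedelta

MAX_SEED = 2048  # maximum seed value stated in waves/README.md

# (start_after offset, fleet_percentage) pairs copied from default-waves.toml
DEFAULT_WAVES = [
    (timedelta(hours=1), 1),
    (timedelta(hours=4), 5),
    (timedelta(days=1), 10),
    (timedelta(days=3), 25),
    (timedelta(days=6), 100),
]


def seed_threshold(fleet_percentage):
    """Assumed mapping: fleet_percentage percent of MAX_SEED, rounded down."""
    if not 1 <= fleet_percentage <= 100:
        raise ValueError("fleet_percentage must be an integer from 1 to 100")
    return MAX_SEED * fleet_percentage // 100


def first_eligible_offset(seed, waves):
    """Return the start_after offset of the earliest wave covering `seed`.

    Returns None if no wave covers the seed. The inclusive comparison is an
    assumption chosen so that a 100% wave covers every seed from 0 to 2048.
    """
    for start_after, pct in sorted(waves, key=lambda w: w[0]):
        if seed <= seed_threshold(pct):
            return start_after
    return None


if __name__ == "__main__":
    for seed in (10, 100, 2048):
        print(seed, first_eligible_offset(seed, DEFAULT_WAVES))
```

Because each wave's threshold is at least as large as the previous one, a node that misses its wave is still covered by every later wave, which matches the README's statement that all waves include the seeds of the prior wave.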