Skip to content
This repository has been archived by the owner on Sep 18, 2020. It is now read-only.

agent: set a label when after successful os-update #168

Merged
merged 2 commits into from
Jan 25, 2018

Conversation

squeed
Copy link

@squeed squeed commented Jan 18, 2018

agent: set a label when after successful os-update

This sets the label container-linux-update.v1.coreos.com/reboot-needed: "true" on nodes where the OS has been upgraded but not yet rebooted. This is so daemonsets like tectonic-torcx can run ASAP.

Fixes: #167

@squeed squeed requested a review from dghubble January 18, 2018 16:29
@squeed
Copy link
Author

squeed commented Jan 18, 2018

cc @lucab

Copy link
Member

@dghubble dghubble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to be clear, you're unsatisfied with just waiting for the before-reboot label on the node the coordinator chooses for reboot. Instead, you'd like status or reboot-needed information, but in a label so you can have a DaemonSet selecting it. The benefit is some speedup since torcx takes a while, but in this scheme might already be done.

@@ -72,6 +72,11 @@ const (
// auto-label-container-linux=true
LabelUpdateAgentEnabled = Prefix + "agent"

// Label set to "true" on nodes where the update-agent has installed a new
// version, and a reboot is needed. Equivalent to the "reboot-needed"
// annotation, except that other agents sometimes set the annotation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? "Equivalent except other agents sometimes set the annotation"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referencing the behavior of the kvo, but I doubt this is worth mentioning.

// indicate we need a reboot
if s.CurrentOperation == updateengine.UpdateStatusUpdatedNeedReboot {
glog.Info("Indicating a reboot is needed")
anno[constants.AnnotationRebootNeeded] = constants.True
labels[constants.LabelUpdateStaged] = constants.True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we do need this label, why not have it be "reboot-needed"? Its effectively acting the same as the existing reboot-needed annotation and gets set true/false in the same locations.

We have a bunch of labels and annotations documented in https://github.com/coreos/container-linux-update-operator/blob/master/doc/labels-and-annotations.md#update-agent so I'm not keen to introduce a new one. Its complicated enough to explain them. At least with reboot-needed, folks are already familiar and we can say both a label and an annotation are set in lockstep.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I didn't do that is because the reboot-needed annotation is sometimes set by other operators. But I think you're right.

@@ -80,6 +80,20 @@ func SetNodeAnnotations(nc v1core.NodeInterface, node string, m map[string]strin
})
}

// SetNodeAnnotationsLabels sets all keys in a and l to their respective values in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise -> respectively.

// SetNodeAnnotationsLabels sets all keys in a and l to their respective values in
// node's annotations and labels, likewise
func SetNodeAnnotationsLabels(nc v1core.NodeInterface, node string, a, l map[string]string) error {
return UpdateNodeRetry(nc, node, func(n *v1api.Node) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just call the SetNodeLabels and SetNodeAnnotation methods that are already there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any value in doing it in a single transaction?

@dghubble
Copy link
Member

Ah, missed #167 which has the background context.

@dghubble
Copy link
Member

dghubble commented Jan 22, 2018

So can we have the label use the prefix + reboot-needed name? We'll document that CLUO sets the reboot-needed annotation and label in lock-step. Label is easier to write deployments / daemonsets against. Annotation is easier to write programs against.

The fact Tectonic sets CLUO's annotations is an anti-pattern I've tried to squash. CLUO doesn't make guarantees when external apps are setting its own clearly named attributes. So its not a concern for your PR.

@squeed
Copy link
Author

squeed commented Jan 23, 2018

Thanks for the feedback; updated. PTAL.

Copy link
Contributor

@sdemos sdemos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other than wanting slightly more fleshed out docs, this pr looks fine to me.

@@ -30,6 +30,7 @@ A few labels may be set directly by admins to customize behavior. These are call
| id | coreos | update-agent | Reflects the ID in `/etc/os-release` |
| version | 1497.7.0 | update-agent | Reflects the VERSION in `/etc/os-release` |
| group | stable | update-agent | Reflects the GROUP in `/usr/share/coreos/update.conf` or `/etc/coreos/update.conf` |
| reboot-needed | true | update-agent | Set to true to request a coordinated reboot |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should explicitly say that this label is required to be in lockstep with the associated annotation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're at it, maybe we should say "Updates to true to request a coordinated reboot". I can see someone misreading "Set" as an instruction to manipulate it themselves.

@dghubble
Copy link
Member

I have no other issues. Just the point by @sdemos. We can do ship this feature out if needed or wait a bit after merge.

This sets the label
`container-linux-update.v1.coreos.com/reboot-needed: "true"`
on nodes where the OS has been upgraded but not yet rebooted. This is so
daemonsets like tectonic-torcx can run ASAP.

Fixes: coreos#167
@squeed
Copy link
Author

squeed commented Jan 24, 2018

Thanks for the feedback, updated. It would probably be good to release this soon.

@dghubble dghubble merged commit 420190e into coreos:master Jan 25, 2018
@dghubble dghubble mentioned this pull request Jan 25, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants