
CLUO agent should only run on desired nodes #76

Closed
robszumski opened this issue May 25, 2017 · 8 comments

robszumski commented May 25, 2017

The CLUO agent should not run on anything other than Container Linux nodes. There are a few ways to implement this:

  • CLUO has a node selector on OS type. This would depend on Tectonic adding labels for each OS type.
  • CLUO tolerates a taint that would be added to each CL node. Depends on Tectonic adding that taint.

Note that we set container-linux-update.v1.coreos.com=coreos on CL nodes, but I assume this comes from the CLUO, so it can't schedule based on that.
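
Roughly, the two options would look something like this in the agent daemonset's pod spec (a sketch only - the OS label and taint keys here are hypothetical placeholders, not labels or taints Tectonic sets today):

    # Sketch only: neither the OS-type label nor the taint below exists yet;
    # Tectonic would need to add one of them for either approach to work.
    spec:
      # Option 1: node selector on a (hypothetical) OS-type label
      nodeSelector:
        tectonic.example.com/os: container-linux
      # Option 2: tolerate a (hypothetical) taint added to each CL node
      tolerations:
        - key: tectonic.example.com/container-linux
          operator: Exists
          effect: NoSchedule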

Once we figure out a strategy, let's file an issue on the Tectonic installer.


dghubble commented Jun 27, 2017

Adding a node label to nodes and having the daemonset match it would be my preferred approach. Tainting nodes would affect other pods we don't intend to affect. For the moment, this probably needs to be implemented by changing the daemonset the operator creates - however, this is a temporary approach, as we intend to decouple the operator and daemonset creation. Later, users should simply deploy the agent on the desired nodes through some means, so CLUO won't need to contain logic for this.

See #98 or ping directly about those Tectonic discussions.


dghubble commented Jun 28, 2017

Discussing this more with @aaronlevy, it turns out this is harder than initially stated.

Goal:

update-agent should only run on selected (whitelisted) nodes:

  • Heterogeneous clusters have nodes running other OSes
  • Container Linux nodes may be diskless, PXE booted images
  • Administrator prefers to manage updates through some other mechanism (their prerogative)

It is acceptable for deployment tools (like Tectonic) to label nodes by default (their prerogative).

Initial Proposal (new users):

Users should simply add a node selector on the update-agent daemonset manifest such that update-agent only runs on correctly labeled nodes (e.g. container-linux-update.v1.coreos.com/managed=true). Set -manage-agent=false. New users in open-source can adopt this strategy now.
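
A minimal sketch of that node selector in the daemonset's pod template, assuming the desired nodes have already been labeled container-linux-update.v1.coreos.com/managed=true:

    # Whitelist approach: the agent only schedules onto nodes carrying the
    # managed=true label.
    spec:
      template:
        spec:
          nodeSelector:
            container-linux-update.v1.coreos.com/managed: "true"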

Problems for Tectonic and those who used -manage-agent=true

Tectonic 1.6 used -manage-agent=true (now discouraged, see #98 for plans to migrate away from using this) which deployed update-agent on all nodes. Even if we changed the manage-agent behavior to deploy a daemonset with a node selector to match nodes, there is no reliable mechanism to add a new label to nodes on existing clusters (non-starter):

  • update-agent adds the new whitelist label to nodes it currently runs on, before the upgrade. This doesn't work because cloud provider user-data will still launch nodes with the old labels. You cannot upgrade to a CLUO daemonset that switches from running everywhere (blacklist) to selecting nodes (whitelist).
  • No components running in Tectonic have the ability to modify user-data

Using Existing Labels:

Proposal Addendum

Support blacklisting for backward compatibility:

  • Tectonic Installer (1.6, 1.7) starts to label Container Linux nodes container-linux-update.v1.coreos.com/managed=true (for the future)
  • CLUO -manage-agent=true deploys an update-agent daemonset with a selector matching nodes that are not labeled container-linux-update.v1.coreos.com/managed=false (see the sketch after this list)
    • update-agent runs on new nodes labeled with managed=true
    • update-agent runs on old nodes without any managed label
    • Admins can manually add managed=false to exclude update-agent if desired (e.g. non Container Linux nodes)
  • In future versions of the operator, we continue deploying the daemonset with this NOT selector
  • Once the support window passes (Tectonic 1.8?) and once Tectonic can update user-data (hard problem, maybe someday), the operator can switch from the blacklist-style agent daemonset to the whitelist-style daemonset
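
A sketch of that blacklist-style selector: a plain nodeSelector can only match on equality, so expressing "not labeled managed=false" would need node affinity, roughly like this (the exact mechanics here are an assumption, not something the operator does today):

    # Sketch only: schedule the agent onto any node NOT labeled
    # container-linux-update.v1.coreos.com/managed=false. Nodes without the
    # label at all still match, which is what backward compatibility needs.
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: container-linux-update.v1.coreos.com/managed
                        operator: NotIn
                        values: ["false"]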

dghubble changed the title from "CLUO agent should only run on CL nodes" to "CLUO agent should only run on desired nodes" on Jun 29, 2017

euank commented Jun 29, 2017

To summarize some points and questions from a meeting on this:

Adding labels to existing clusters

There actually is a somewhat sane migration path if we let the operator infer the operating system version based on the nodeInfo.osImage field. It contains information akin to "osImage": "Container Linux by CoreOS 1409.5.0 (Ladybug)". This information has been accurate for the last year of kubelet-wrappers (see coreos/coreos-overlay#1889).
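
For reference, that field is visible on the node object itself, roughly like this (abbreviated output):

    # Abbreviated node object (e.g. kubectl get node <name> -o yaml); the
    # operator would infer the OS from status.nodeInfo.osImage.
    status:
      nodeInfo:
        osImage: Container Linux by CoreOS 1409.5.0 (Ladybug)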

At a minimum, it was proposed that the next version of the operator add a label, based on this information, to all Container Linux nodes.

It was proposed that the operator, if it manages the agent (-manage-agent=true), also update the daemonset to match this label. The example daemonset should also be updated.

What labels do we add/select on?

This is where things are a little more murky; the exact form these labels should take is unclear. Beware, here be bikesheds. The options discussed are:

  1. ID label -- The ID field from /etc/os-release. We can infer ID=coreos from the PRETTY_NAME we currently get via nodeInfo.osImage.
    Advantages
    This is the most generic option. Long term, the logic for adding this label could end up in the kubelet itself since, frankly, it should be labeling this instead of PRETTY_NAME
    Disadvantages
    If we wish user-configurability, an additional knob would likely be needed; editing/deleting this label would be a bad idea.
  2. container-linux-update.v1.coreos.com/managed = true -- This label could be added if no such key exists currently
    Advantages
    This is a simple new-node/on/off tristate and aligns with the above proposal.
    Disadvantages
    Simple; the granularity would not allow any additional logic we might need, but this is all "what-if"
    The operator logic for applying these labels isn't great to have to maintain forever and there's no easy path out of this.
  3. container-linux-update.v1.coreos.com/class = default -- This label would be added if no such label existed and could be edited by users to disable any operation
    Advantages
    This is in-line with how a few other things operate, e.g. ingress controllers
    This is a path towards allowing far more complex configurations
    Disadvantages
    This is a path towards allowing far more complex configurations which we might not need.
    This is both more complicated than manage agent and less flexible than totally custom selectors
  4. Arbitrary selection, defaulting to 1. and/or 2. -- This would model the problem of selecting nodes as something to be handled by a totally arbitrary labelselection query
    Advantages
    The most flexible option
    Allows leveraging kubernetes knowledge
    Less lock-in towards what a complex future might look like than class
    Disadvantages
    Only usable with some configuration; it's unclear where that configuration would live
    The most flexible option

Proposal

It's difficult to provide the full set of options for the complete picture of what we do, so I'm going to write what I propose and ask that @aaronlevy and @dghubble mention their preferences too.

Note that my preference has moved to basically be: "Let's do the bare minimum to allow 1) CLUO to not run on other distros and 2) a fairly sane escape hatch to edit the agent's daemonset selector". We can always figure out more complicated things, such as knobs the user can actually turn in a supported way, in the future.

I propose we do the following:

  1. Update the operator to apply cluo-namespaced ID=coreos style labels to each node based on osImage.
  2. Update the operator to, when manage-agent=true, also update the daemonset to match this selector.
  3. (Optionally) Update the operator to, if manage-agent=true, read the labelselector of the previous daemonset and, if it has changed, use the changed one instead.
  4. When the operator no longer manages the agent, ensure the above mentioned 2/3 logic is preserved as desired.

In the future, the kubelet should apply a generically named os-ID node label. Migrating the daemonset between these labels in the future is trivial (even if we do 3. it's not hard)
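
As a rough sketch of 1. and 2. - the exact label key is still up for debate above, so the key used here is just an assumption:

    # Sketch only: the operator would apply a cluo-namespaced OS id label to
    # each Container Linux node, derived from nodeInfo.osImage...
    metadata:
      labels:
        container-linux-update.v1.coreos.com/id: coreos

    # ...and, when -manage-agent=true, keep the agent daemonset's selector
    # matching that label.
    spec:
      template:
        spec:
          nodeSelector:
            container-linux-update.v1.coreos.com/id: coreos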

Problems not addressed:

  1. Disabling updates -- mask update-engine; this is out of scope for CLUO in general
  2. Providing an obvious supported knob to disable CLUO
    Nodes may not want to reboot for a number of reasons. These include:
    a. Very broken or special -- These nodes should probably be tainted or have update-engine masked
    b. Known regressions on update -- See above, update-engine should be masked
    c. I just don't want CLUO -- Non-Tectonic users can edit the agent's selector with kubectl. Tectonic users can too. If we ever want it to be supported, we can expose knobs.


dghubble commented Jul 13, 2017

Nice notes @euank. My aim is similar - that we find the bare minimum necessary to allow those using -manage-agent=true (now deprecated), who can't edit this themselves and stay on automated updates, to migrate to a state where the agent daemonset selects the nodes it runs on (to achieve proposal goals 1 and 2, as you said).

To reiterate, new users in open-source can just set -manage-agent=false and add any desired node selectors on the agent daemonset. The rest of discussion concerns users who are still using -manage-agent=true.

From the options, I prefer 1 or 2 since they're more minimal. I don't like 3 because it introduces a new "class" kinda design we'd write code for, which would then only be used in the deprecated code flow. I wanna focus on getting users from -manage-agent=true to -manage-agent=false (in effect at least, even though the shim code needs to stick around).

I mostly like the proposal. It's valuable to add the OS id as a label (or, later, have the kubelet add it officially) independent of this change, so I agree. I would prefer that the operator, if -manage-agent=true, not use the OS id in the agent daemonset node selector and instead use an application-specific label, which we can add by default (i.e. label strategy 2).

  • OS id - identifies the OS distro of the node. It represents part of the node's identity. There should never be a reason to remove it.
  • container-linux-update.v1.coreos.com/managed = true/false/nil - indicates whether the update-agent should run on a node. We'd ideally like this on every Container Linux node, but it serves as a pragmatic escape hatch if someone doesn't want the agent on a node for any reason (don't assume we can anticipate everything). A user may set this label to false if desired. We can decide how the update-operator should handle nil (i.e. the same as no label) - likely the operator should switch nil to true for any CL node as part of the Tectonic migration.

I prefer the application-specific label because it's scoped and versioned to CLUO, independent of node identity, it provides a migration path to the agent using node selectors, and it's less code upkeep. Open source users could, if they choose, use a similar label scheme, as it's simple to explain to fellow cluster admins. Finally, it still leaves the possibility of adding a class = default / free-form-thing later if desired.

Unfortunately, I still see no way we could ever delete the -manage-agent=true (deprecated) option so that new users and Tectonic users go through the same code. Whatever code we write here has to persist as a shim.

@dghubble

The agent already adds an application-specific id label based on the host's /etc/os-release ID field (coreos, fedora, etc), at least for nodes that have the expected file and field.

container-linux-update.v1.coreos.com/id=coreos

Is there additional information we require from nodeInfo.osImage, or do you mean switching the current id-setting mechanism from reading a host file to querying the Kubernetes API, and thus pulling from the kubelet?


euank commented Jul 18, 2017

@dghubble It's a chicken-and-egg problem if we rely on the agent to self-label the nodes it can run on (before it can run on them).

The suggestion is that the operator, or ideally in the long term the kubelet, do it in order to dodge that problem.

@dghubble

Yeah, you're right, we shouldn't rely on those labels added by the agent. Even though this code path is for people still using -manage-agent=true (deprecated), which runs the agent everywhere, that wouldn't be the case after the daemonset is updated by this migration to no longer run everywhere (say, later, when a new instance in an auto-scaling group is added). Disregard my last note; the operator should make the decision based on nodeInfo.osImage.


dghubble commented Aug 1, 2017

With respect to code changes in CLUO, this issue has been addressed.

Direct users of CLUO should switch from -manage-agent=true to -manage-agent=false and create the daemonset and deployment shown in the examples. If desired, add a node selector to the daemonset:

    spec:
      nodeSelector:
        container-linux-update.v1.coreos.com/my-favorite-label: "true"
      containers:
        ...

Tectonic clusters will get this behavior in September. Container Linux nodes will automatically be labeled with container-linux-update.v1.coreos.com/agent: "true" and the DaemonSet will node-select on that label. Set the label to false to stop the agent from running on a node. Add the label to other nodes where update-agent should run.
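
For example, excluding a node would look roughly like this on the node object (a sketch - the DaemonSet's selector simply stops matching it):

    # Node object excerpt: the DaemonSet node-selects on
    # container-linux-update.v1.coreos.com/agent=true, so a node labeled
    # "false" (or left unlabeled) will not run update-agent.
    metadata:
      labels:
        container-linux-update.v1.coreos.com/agent: "false"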

cc @robszumski

dghubble closed this as completed Aug 7, 2017