Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

baremetal: enhancement for ironic config passthrough #401

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
140 changes: 140 additions & 0 deletions enhancements/baremetal/baremetal-ironic-config-passthrough.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,140 @@
---
title: baremetal-ironic-config-passthrough
authors:
- "@derekh"
reviewers:
- TBD
approvers:
- TBD
creation-date: 2020-07-17
last-updated: 2020-07-17
status: provisional
see-also:
- "https://github.com/openshift/installer/pull/3887"
- "https://github.com/openshift/machine-api-operator/pull/645"
---

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Summary

Baremetal IPI makes use of OpenStack Ironic to provision RHCOS to each baremetal
host. Ironic is deployed in the metal3 pod and its configuration is highly opinionated.
This enhancement would provide a mechanism to pass arbitary ironic config options
into the metal3 pod in order to change how ironic and ironic-inspector are configured.

## Motivation

The current opinionated ironic works well when we have a single set of target hardware
to support, this is the ideal but hasn't been our experience. As users deploy IPI clusters
in labs, production systems, and for CI, we occasionally needs to deal with corner cases arise.

The motivation for this enhancement is to allow people to deal with their individual corner
cases without the need to insert special cases into the ironic containers. The ironic
container can stay opinionated and targeted for a specific set of environments while also
being usable where uniq envirnments require adjustments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have some examples of configuration changes that users have needed to change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, there are 3 examples ongoing that I can think of

Operators trying to deploy kni on openstack(for demos) have been looking for a way to customise IPMI timouts and retries, see here
metal3-io/ironic-image#167

For CI, we've been trying to find a way to add certain kernel args for CI only, one of this merged which I believe means IPA is no longer being logged to the console, making debugging very difficult
metal3-io/ironic-image#172
metal3-io/ironic-image#160

For debugging IPA users need to add a SSH key to kernel args (this one could probbaly be a top level config option)
metal3-io/ironic-image#150

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, there are 3 examples ongoing that I can think of

Operators trying to deploy kni on openstack(for demos) have been looking for a way to customise IPMI timouts and retries, see here
metal3-io/ironic-image#167

That seems like a reasonable thing to put into an API. Is it configurable on a per-host basis, or do we have to put a value into the ironic config file?

For CI, we've been trying to find a way to add certain kernel args for CI only, one of this merged which I believe means IPA is no longer being logged to the console, making debugging very difficult
metal3-io/ironic-image#172
metal3-io/ironic-image#160

Is there any reason we wouldn't want serial console logging all the time? Can we re-enable it and write a test to ensure it can't be disabled accidentally?

If it needs to be configurable, is it a per-host option or global?

For debugging IPA users need to add a SSH key to kernel args (this one could probbaly be a top level config option)
metal3-io/ironic-image#150

Could we use the same ssh key that is given in the install-config for access to the control plane nodes?


In addition, when a change is being considered to ironic's opinionated configurations,
deployers will be able to test this change in their deployment and make recommendations on
an the suitability of making a config update permanent in the ironic image.

### Goals

Allow users of to tweak ironic config options on the fly without the need for a new
ironic image.

### Non-Goals

Our goal is not to provide an API the ends up being commonly used with a small
subset of options. i.e. if any specific config option becomes a frequently
tweaked option for multiple environments then it should be considered a condidate
as a top-level platform configuration.

## Proposal

Add a new platform configuration "ironicExtraConf" of type map, this would consist
of keys and values to be used as config options for configuring ironic.

Each config option should be expressed as a key/value pair with the format

OS_<section>_\_<name>=<value> - where `section` and `name` are the
reprepresent the config option in ironic.conf e.g. to set a IPA ssh key and
set the number of ironic API workers

```yaml
platform:
baremetal:
ironicExtraConf: {"OS_PXE__PXE_APPEND_PARAMS":'nofb nomodeset vga=normal sshkey="ssh-rsa AAAA..."', "OS_API__API_WORKERS":"8"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ironic (and IPA) was always intended to be an implementation detail for IPI, so I'm pretty uncomfortable with adding a top-level ironicFoo option tbh.

In particular if you check the discussion on coreos/ignition#979 you'll see there are a bunch of unsolved issues wrt the OpenShift coreos-install based ramdisk vs IPA, and the options in either case are completely different.

I wonder if we could perhaps make this more generic and have some pxeExtra interface or something, or even better some interface which describes the required behavior e.g which console to use?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, upstream we want to use the oslo.config environment variable interface instead of customizing individual settings in the ironic config to cut down on the number of changes to the image. The whole point of adding environment variable handling to oslo.config was to make it easier to pass settings to OpenStack services running in containers.

Downstream in OpenShift, however, I would rather that we add explicit APIs for the things we need to allow users to change. That allows us to take advantage of the API server for some basic validation, and makes it harder for users to put the system into an unsupportable state or one where metal3 is crash-looping due to invalid settings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For readability, can you stick to canonical YAML:

platform: 
  baremetal: 
    ironicExtraConf: 
      OS_API__API_WORKERS: "8"
      OS_PXE__PXE_APPEND_PARAMS: "nofb nomodeset vga=normal sshkey=\"ssh-rsa AAAA...\""

```

### User Stories [optional]

#### Story 1

As a developer debuging baremetal ipi CI, I want to be able to log the console
of virtual baremetal nodes while they are being provisioned, in order to do
this I need to tweak IPA kernel command line options to write to ttyS0, this
is a CI only adjustment is not suitable for real baremetal (where tty0 would
be ideal).

#### Story 2

As a Deployment Operator, I want Barametal IPI deployments to be customizable to my
hardware that features some distinctive characteristic.

### Implementation Details/Notes/Constraints [optional]

A new property "ironicExtraConf" would be added to the "provisionings.metal3.io" CRD,
this property would be a map holding environment variables to be defined in the ironic
containers.

This enhancement would make use of "Config Opts From Environment" provided by
ironic's use of oslo.config(see https://specs.openstack.org/openstack/oslo-specs/specs/rocky/config-from-environment.html).
To utilize this each of the containers running an ironic service should
export an environment variable which will then automatically be picked up
by ironic.

### Risks and Mitigations

The risk here is that Deployment Operators would start making common use
of this API without providing feedback upstream. They would not be "forced"
to push improvments into a release where others could take advantage.

To mitigate against this any passthrough API provided should be marked as
experimental and any config options set unsuported, along with suitable
message to warn the operator of the situation.

## Design Details

### Open Questions [optional]

### Test Plan

### Graduation Criteria

- As we wouldn't intend to support any options changed with this API is status
would remain as `Tech Preview` once implemented.

##### Dev Preview -> Tech Preview

- Ability to update any ironic or ironic-inspector config option
- config option should be reflected in ironic on the bootstrap node
- config option should be reflected in ironic in the metal3 pod

### Upgrade / Downgrade Strategy

### Version Skew Strategy

## Implementation History


## Drawbacks

## Alternatives