Drop hard requirement on networking #211

Merged on Jul 2, 2020 (4 commits)

jlebon (Contributor) commented Jul 2, 2020

Hi,

While integrating Clevis into Fedora CoreOS and RHCOS, we've encountered problems with the way the dracut modules and Clevis units pull in networking targets. These problems are not unique to FCOS/RHCOS (see e.g. #54 and #206). This patch series fixes these dependencies to better conform to the systemd and dracut models. The most important commits in this patch series are the first two:

commit 088be96472a234b9368a437078582e8446fffc73
Date:   Thu Jul 2 11:08:55 2020 -0400

    systemd: drop hard requirement on networking

    Whether we need networking or not for unlocking an encrypted block
    device is a property of the block device in question. This is expressed
    in `/etc/crypttab` via the `_netdev` option. For example, the systemd
    cryptsetup generator[1] picks up on this and correctly orders unlocking
    of devices that need networking after `remote-fs-pre.target`.

    Thus, we shouldn't need to unconditionally require and order ourselves
    after networking comes up. Let whatever interprets `/etc/crypttab` take
    care of this.

    Add `DefaultDependencies=no` because we need to be able to run well
    before `sysinit.target`.

    [1] https://www.freedesktop.org/software/systemd/man/systemd-cryptsetup-generator.html
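
As a minimal sketch of the idea that the need for networking is a per-device property, the snippet below lists the `/etc/crypttab` entries that carry the `_netdev` option. The function name `netdev_entries`, the device names, and the UUID values are hypothetical and only serve the illustration; this is not how the systemd cryptsetup generator is implemented.

```shell
# Print the name of every crypttab entry whose options include `_netdev`.
# Input is crypttab-formatted text on stdin: name device keyfile options
netdev_entries() {
    while read -r name device keyfile options; do
        # Skip blank lines and comment lines.
        case "$name" in
            ''|'#'*) continue ;;
        esac
        # Options are comma-separated; wrap them in commas so that only an
        # exact `_netdev` token matches.
        case ",$options," in
            *,_netdev,*) printf '%s\n' "$name" ;;
        esac
    done
}

# Hypothetical entries: only `root` is marked as needing the network.
netdev_entries <<'EOF'
# name  device      keyfile  options
root    UUID=aaaa   none     luks,_netdev
data    UUID=bbbb   none     luks
EOF
```

Run against the sample input, this prints `root`, the only entry whose unlocking would need to be ordered after the network is available.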

commit 48e426f4631eeac7dc516ef4bbe25768b71cc58c
Date:   Thu Jul 2 11:08:56 2020 -0400

    dracut: drop rd.neednet=1 injection

    By default, dracut builds generic initrds which by design shouldn't have
    any configuration specific to a host baked in (as opposed to so-called
    "hostonly" initrds). This property is leveraged with great success in
    immutable hosts like Fedora CoreOS and its downstream RHCOS where the
    initrd is created server-side.

    By unconditionally injecting `rd.neednet=1`, the clevis-pin-tang dracut
    module makes it impossible to be included into a truly generic initrd,
    where one cannot make assumptions about the network (or lack thereof,
    see #54) of the target hosts.

    So with a generic initrd, how can we make sure that networking is up at
    initrd time on a host which has been configured with root-on-LUKS with a
    Tang pin? By also configuring it with `rd.neednet=1` specified on the
    kernel command-line, and possibly `ip=...` to configure the network
    interfaces.

    This is no different from root-on-{NFS,iSCSI,NBD,...}, where one must
    use explicit kernel arguments like `root=nfs:<server>:...` or
    `root=iscsi:<server>:...` or `root=nbd:<server>:...`, all of which imply
    `rd.neednet=1` (one could imagine then a `root=tang:<luks2_uuid>` type
    karg in the future which would be roughly equivalent to
    `root=UUID=<luks2_uuid> rd.neednet=1`).

    Dracut also allows one to build host-specific initrds using the
    `-H`/`--hostonly` option, and further the ability to bake the
    command-line arguments when `--hostonly-cmdline` is provided.

    So a supplementary approach here would be for `install()` to only inject
    `rd.neednet=1` if using `--hostonly-cmdline` *and* somewhere along the
    root block device hierarchy, there is a Tang-pinned LUKS device. This is
    also analogous to what other dracut modules like 95nfs and 95iscsi do.

    However, optimizations for host-only initrds should not come before
    getting correct support for generic initrds.

    Closes: #54
    Closes: #206
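
The supplementary host-only approach described in the second commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not code from this patch series: `maybe_inject_neednet`, `root_has_tang_pin`, the `ROOT_HAS_TANG_PIN` stub variable, and the `99clevis-pin-tang.conf` file name are all hypothetical, while `hostonly_cmdline` and `initdir` are assumed to be the variables dracut makes available to a module's `install()` function.

```shell
# Hypothetical stand-in for a real check that would walk the root block
# device hierarchy looking for a Tang-pinned LUKS device. Here it only
# reads an environment variable so that the sketch stays self-contained.
root_has_tang_pin() {
    [ "${ROOT_HAS_TANG_PIN:-no}" = "yes" ]
}

# Inject rd.neednet=1 only for host-only initrds built with
# --hostonly-cmdline, and only when the root device needs Tang.
# A generic initrd is left without any baked-in network requirement.
maybe_inject_neednet() {
    if [ "${hostonly_cmdline:-no}" = "yes" ] && root_has_tang_pin; then
        mkdir -p "${initdir}/etc/cmdline.d"
        echo "rd.neednet=1" > "${initdir}/etc/cmdline.d/99clevis-pin-tang.conf"
    fi
}
```

In a real dracut module the function body would sit inside `install()`. Keeping the condition in one place makes it clear that the generic case never gets the argument, which is the behavior this PR establishes first.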

To be nice to users who want to learn more about these units.

Let's match the description style that systemd itself uses for its
password agents (see e.g. `systemd-ask-password-wall.{path,service}`).
Keeping it uniform makes it more obvious that it's the exact same setup
without having to look inside it.

@jlebon jlebon mentioned this pull request Jul 2, 2020
jlebon (Contributor, Author) commented Jul 2, 2020

Looks like Travis CI is failing on centos8 because of:

/tmp/build/build/src/luks/tests/tests-common-functions: line 111: diff: command not found

and on debian:latest because of:

+docker exec '' -e CC -e DISTRO fc61de21c7aaebe8ccef07fefbe539d301665970 ./.travis/script
Error: No such container: 

Neither of which I think is related to this PR.

cgwalters left a comment

Looks great to me!

sergio-correia (Collaborator) left a comment

Thanks @jlebon, it looks good. The CI failure is unrelated, indeed. Thanks @cgwalters also for looking into it.

@sergio-correia sergio-correia merged commit fa30660 into latchset:master Jul 2, 2020
jlebon (Contributor, Author) commented Jul 2, 2020

Thanks all for the quick review! @sergio-correia Were you planning on doing a release some time soon? Would greatly appreciate these patches in Fedora soon-ish (and RHEL8 too, though will make that request through bugzilla). :)

jlebon (Contributor, Author) commented Jul 3, 2020

OK filed an RHBZ for that here: https://bugzilla.redhat.com/show_bug.cgi?id=1853651.

I was looking at the other RHBZs filed against clevis, and I think at least https://bugzilla.redhat.com/show_bug.cgi?id=1628258 and maybe https://bugzilla.redhat.com/show_bug.cgi?id=1810332 can be closed once this PR reaches Fedora.

sergio-correia (Collaborator) commented

Thanks, @jlebon. There are a couple of PRs I would like to get reviewed first, but yeah, we can have a release soon-ish.

jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Aug 11, 2020
This implements the rootmap functionality that figures out all the
dependencies required to find `/sysroot`, and injects them into the BLS
config. For more information, see:

coreos/fedora-coreos-tracker#94 (comment)

The `rdcore` code supports RAID and LUKS devices, though the latter
needs a new Clevis release with the following patches to be fully
supported:

latchset/clevis#211
latchset/clevis#217

This also implements the `root=UUID=$uuid` inject patch proposed in
coreos/fedora-coreos-tracker#465.

On its own, this unlocks reprovisioning FCOS with root on a RAID device,
or e.g. in-place reprovisioning of root on btrfs.

Closes: coreos/fedora-coreos-tracker#465
Closes: coreos/fedora-coreos-tracker#94
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Aug 20, 2020
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Aug 27, 2020
jlebon added a commit to jlebon/os that referenced this pull request Nov 17, 2020