Skip to content
This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

service: drop requirement on network-online.target #41

Merged
merged 1 commit into from
Jul 8, 2021

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Jul 7, 2021

As per systemd recommendations[1], we should avoid depending on this
unit. Instead, if/when we do implement the reporting bits, we should be
resilient to networks taking time to come online or temporary network
flakes.

This is the only thing which is actively pulling in
network-online.target right now (the other one being docker.service,
but we have less control over that one, and it's disabled by default).

[1] https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

As per systemd recommendations[1], we should avoid depending on this
unit. Instead, if/when we do implement the reporting bits, we should be
resilient to networks taking time to come online or temporary network
flakes.

This is the only thing which is actively pulling in
`network-online.target` right now (the other one being `docker.service`,
but we have less control over that one, and it's disabled by default).

[1] https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 7, 2021
We originally did this in coreos#326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (coreos#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).
Copy link
Member

@dustymabe dustymabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jlebon jlebon merged commit 603f665 into coreos:main Jul 8, 2021
@jlebon jlebon deleted the pr/drop-network-online branch July 8, 2021 13:58
@jlebon
Copy link
Member Author

jlebon commented Jul 8, 2021

Wasn't sure if we wanted to cut a release for this. Started a backport in https://src.fedoraproject.org/rpms/rust-fedora-coreos-pinger/pull-request/4.

jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 8, 2021
We originally did this in coreos#326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (coreos#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 8, 2021
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Jul 8, 2021
We originally did this in #326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Jul 8, 2021
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 19, 2021
We originally did this in coreos#326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (coreos#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).

(cherry picked from commit dd54e8c)
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 19, 2021
We originally did this in coreos#326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (coreos#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).

(cherry picked from commit dd54e8c)
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 19, 2021
We originally did this in coreos#326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (coreos#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).

(cherry picked from commit dd54e8c)
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Jul 19, 2021
We originally did this in #326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).

(cherry picked from commit dd54e8c)
jlebon added a commit to coreos/fedora-coreos-config that referenced this pull request Jul 19, 2021
We originally did this in #326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).

(cherry picked from commit dd54e8c)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants