overlay/live: support booting from live ISO without networking #326

Merged Apr 3, 2020 (3 commits)

dustymabe
Member

There is a scenario where the user wants to configure networking
after they get to the interactive bash prompt. Let's support this.

Fixes coreos/fedora-coreos-tracker#349

@cgwalters
Member

This looks sane but it heavily overlaps with #321 right?

@dustymabe
Member Author

This looks sane but it heavily overlaps with #321 right?

correct. I'm in the middle of writing up a response to #321. I wanted to post up this WIP first to show what my alternative proposal was.

@cgwalters
Member

cgwalters commented Mar 29, 2020

I guess #321 is trying to be more ambitious.

But, you're missing an important case - this needs to support coreos-installer iso embed, see discussion in coreos/ignition-dracut#161

Something like

ConditionPathExists=/config.ign

(Edit, the ! was wrong)

@dustymabe
Member Author

I guess #321 is trying to be more ambitious.

But, you're missing an important case - this needs to support coreos-installer iso embed, see discussion in coreos/ignition-dracut#161

Something like

ConditionPathExists=!/config.ign

Is there a case where someone would provide both ignition.config.url on the kernel command line and also embed an ignition config using coreos-installer iso embed?

@cgwalters
Member

cgwalters commented Mar 29, 2020

Is there a case where someone would provide both ignition.config.url on the kernel command line and also embed an ignition config using coreos-installer iso embed?

I can't think of a good use case for that, feels like we can leave it as unspecified behavior.

@bgilbert
Contributor

Ignition prioritizes the karg over user.ign.

@bgilbert
Contributor

Makes sense to me. Given the config.ign case, I think we'd want:

ConditionPathExists=/usr/lib/initrd-release
ConditionKernelCommandLine=!ip
ConditionKernelCommandLine=coreos.liveiso
ConditionPathExists=/run/ostree-live

ConditionKernelCommandLine=|ignition.config.url
ConditionPathExists=|/config.ign
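
For readers less familiar with systemd condition syntax: plain `Condition…=` lines must all hold, while lines prefixed with `|` are triggering conditions, of which at least one must hold. The shell sketch below mirrors how the list above would be evaluated. The `has_karg` and `conditions_met` helpers and the root-directory parameter are hypothetical, introduced only so the logic can be exercised outside an initrd.

```sh
# has_karg CMDLINE NAME: succeed if NAME appears as a bare word or as the
# left-hand side of an assignment, like ConditionKernelCommandLine= does.
has_karg() {
    for arg in $1; do
        case "$arg" in
            "$2"|"$2"=*) return 0 ;;
        esac
    done
    return 1
}

# conditions_met CMDLINE ROOT: evaluate the proposed conditions against a
# kernel command line string and a directory standing in for "/".
conditions_met() {
    # Regular conditions: every one of them must hold.
    [ -e "$2/usr/lib/initrd-release" ] || return 1
    has_karg "$1" ip && return 1            # "!ip": must NOT be present
    has_karg "$1" coreos.liveiso || return 1
    [ -e "$2/run/ostree-live" ] || return 1
    # Triggering ("|") conditions: at least one must hold.
    has_karg "$1" ignition.config.url && return 0
    [ -e "$2/config.ign" ] && return 0
    return 1
}
```

With this shape, a live ISO boot with neither `ignition.config.url` nor an embedded `/config.ign` skips the unit, which is the offline interactive case this PR is after.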

@dustymabe
Member Author

Makes sense to me. Given the config.ign case, I think we'd want:

So in this scenario you are assuming that config.ign could have remote references and that we need network? I was really hoping to keep the "do we need network" logic down to just a dumb check for whether ignition.config.url= exists. The problem with the embedded config.ign is that I could see a case where someone embedded a config because they didn't have network (offline installer) and also a case where they embedded config.ign just because it was convenient not to have to catch the boot prompt and add a karg.

@jlebon
Member

jlebon commented Mar 30, 2020

WDYT about coreos/ignition#956 instead of this? This would solve the generic conditional networking problem.

Though we could get something like this in as a short-term fix too if there isn't consensus on the approach taken there.

@jlebon
Member

jlebon commented Mar 30, 2020

Cross-linking coreos/ignition#956 (comment). I sanity checked that the live ISO now boots fully offline. (Haven't tried with an embedded Ignition config that pulls in networking yet, but that should Just Work.)

@cgwalters
Member

(Haven't tried with an embedded Ignition config that pulls in networking yet, but that should Just Work.)

That's what kola testiso does!

@dustymabe
Member Author

OK, I'm going to revive this PR and move it forward. One thing I'm thinking about doing here is to leave ip=dhcp on the kernel command line and just remove the rd.neednet=1 so that the coreos-liveiso-network-kargs.service only adds back rd.neednet=1. I'm not sure if that is better or worse than what I'm currently doing.

Thoughts?

@dustymabe
Member Author

OK, I'm going to revive this PR and move it forward. One thing I'm thinking about doing here is to leave ip=dhcp on the kernel command line and just remove the rd.neednet=1 so that the coreos-liveiso-network-kargs.service only adds back rd.neednet=1. I'm not sure if that is better or worse than what I'm currently doing.

Thoughts?

I think the argument for doing this is in case someone provided ip= kargs but forgot to specify rd.neednet=1 themselves.

@cgwalters
Member

I think the argument for doing this is in case someone provided ip= kargs but forgot to specify rd.neednet=1 themselves.

Doesn't seem like a strong argument to me. Using rd.neednet=1 is very well documented. And I think the elegance of reducing our default kernel arguments wins out.

@dustymabe
Member Author

I think the argument for doing this is in case someone provided ip= kargs but forgot to specify rd.neednet=1 themselves.

Doesn't seem like a strong argument to me. Using rd.neednet=1 is very well documented. And I think the elegance of reducing our default kernel arguments wins out.

A couple of us stayed longer in the open discussion today and came up with a way to get the behavior we wanted without having to leave ip= on the kernel command line in the isolinux or grub configs. I'll post up an update shortly.

@dustymabe dustymabe marked this pull request as ready for review April 2, 2020 14:21
@dustymabe
Member Author

ok - marking this as ready for review. The logic and the reasoning is captured in this description:

# This unit will run very early before the dracut-cmdline
# service and detect if we want to request dracut bring up
# networking or not. We do want to request networking if:
#
# - the user is booting the live ISO
# - the user didn't already request networking via rd.neednet
# - the user provided an ignition.config.url karg, implying
#   the need for networking
# - there is an embedded ignition config
#
# For the case of the embedded Ignition config there could be a
# case where the user embeds an Ignition config (via coreos-installer
# iso embed) but doesn't want networking. In that case we'll have
# smarter detection in the future (https://github.com/coreos/fedora-coreos-tracker/issues/443)
# but the user can override with `rd.neednet=0` now if needed.
#
# If we do determine we need network and there are no other
# `ip=` kargs then we'll use `ip=dhcp,dhcp6` by default.
#
# The requesting of network will be done by writing relevant
# dracut networking args into /etc/cmdline.d/coreos-live-network-kargs.conf
# so that it gets picked up by the dracut networking scripts later
# on in boot.
#
# This is all done because we want to support a mode where
# the user can boot the live ISO and get to an interactive
# prompt without requiring networking on boot. The user can
# then configure the networking interactively.
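
The script side of this unit can be pictured roughly as follows. This is a sketch under stated assumptions rather than the unit's actual script: the `write_network_kargs` function and its two parameters are hypothetical, and the decision whether to run at all is left to the unit's `Condition…=` lines.

```sh
# write_network_kargs CMDLINE CONF: request networking from dracut by
# writing kargs to CONF (on a real system this would be
# /etc/cmdline.d/coreos-live-network-kargs.conf).
write_network_kargs() {
    kargs="rd.neednet=1"
    # Pad with spaces so only a whole `ip=` karg matches (not e.g. `skip=`).
    case " $1 " in
        *" ip="*) ;;                        # user-provided ip= wins
        *) kargs="$kargs ip=dhcp,dhcp6" ;;  # default to DHCP otherwise
    esac
    mkdir -p "$(dirname "$2")"
    echo "$kargs" > "$2"
}
```

A caller would pass the running kernel's command line, for example `write_network_kargs "$(cat /proc/cmdline)" /etc/cmdline.d/coreos-live-network-kargs.conf`.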

In this I also added a unit to change NetworkManager-wait-online.service in the real root so that it does not show a failure if it can't get the network and times out after 5 seconds instead of 30. This improves the experience for a user who boots the Live ISO without networking.
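
One way such a reconfiguration could look is a systemd drop-in written from the initrd. The sketch below is an assumption, not the unit that was merged: the `write_wait_online_dropin` helper, the drop-in location and file name, and the exact `nm-online` command line are all illustrative.

```sh
# write_wait_online_dropin ROOT: write a drop-in for
# NetworkManager-wait-online.service under ROOT (hypothetical location).
write_wait_online_dropin() {
    dir="$1/etc/systemd/system/NetworkManager-wait-online.service.d"
    mkdir -p "$dir"
    cat > "$dir/10-liveiso.conf" <<'EOF'
[Service]
# An empty assignment clears the ExecStart= inherited from the packaged unit.
ExecStart=
# The leading "-" tells systemd to treat a non-zero exit as success;
# --timeout=5 gives up after 5 seconds (the command line is an assumption).
ExecStart=-/usr/bin/nm-online -s -q --timeout=5
EOF
}
```

Clearing and re-adding `ExecStart=` keeps the rest of the packaged unit untouched, so only the timeout and the failure handling change.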

@dustymabe dustymabe removed the hold label Apr 2, 2020
@dustymabe dustymabe changed the title WIP: overlay/live: support booting from live ISO without networking overlay/live: support booting from live ISO without networking Apr 2, 2020
@@ -0,0 +1,23 @@
# Configure NetworkManager-wait-online in the real root for the
# Live ISO to time out quicker and also not explicitly fail since
# booting the Live ISO without network is a valid use case.
Member

I think we should support an explicit API to turn off networking instead of this.

Basically all we need is something like:

coreos-installer iso embed --no-initramfs-network

which would also create /etc/coreos-install-nonet in the cpio archive along with /config.ign.

Then we'd just have a:

ConditionPathExists=!/etc/coreos-install-nonet

in the other service above.

Member

Note this configures the service in the real root, not in the initrd.

I think we should match what the Fedora live ISO does here, which is to opportunistically set up networking if available, but otherwise not fail. Was playing around with that in the Fedora 32 Workstation live ISO in both a VM without any network adapters and one connected to an isolated network and NetworkManager-wait-online.service worked just fine either way. I don't see any timeout overrides or configuration tweaks. But clearly, something is different.

Member

Note this configures the service in the real root, not in the initrd.

Oh, right.

I think we should match what the Fedora live ISO does here, which is to opportunistically set up networking if available, but otherwise not fail.

Hmm. OK yeah I guess so.

Member

@cgwalters cgwalters left a comment

OK, this looks good to me! I assume you tested it manually?

It shouldn't be terribly hard to add kola testiso --offline that starts the VM with no NICs at all; I actually did this recent PR with that in mind because we can use virtio-channels to talk to a VM with no NIC.

@dustymabe
Member Author

thanks @cgwalters! Yep, I've been testing this over and over today. Note that I'm testing this by setting up a libvirt network with no DHCP and then running nmtui (future PR) once I get the system up to configure a static IP. Here's the libvirt network XML I'm using:

<network>
  <name>nodhcp</name>
  <forward mode="nat">
    <nat>
      <port start="1024" end="65535"/>
    </nat>
  </forward>
  <bridge name="virbr100" stp="on" delay="0"/>
  <ip address="192.168.130.1" netmask="255.255.255.0">
  </ip>
</network>

I'll have to check out kola testiso.

@jlebon if you have any more suggestions on what I can change to make it more like the Live ISO let me know. When you boot the Fedora live ISO does it time out before you're able to get to a console? Maybe you just don't see the timeout/failure because it's a GUI interface?

@dustymabe
Member Author

Since I know he had voiced earlier concerns about this I talked with @arithx earlier today and he said he was good with the following logic:

# For the case of the embedded Ignition config there could be a
# case where the user embeds an Ignition config (via coreos-installer
# iso embed) but doesn't want networking. In that case we'll have
# smarter detection in the future (https://github.com/coreos/fedora-coreos-tracker/issues/443)
# but the user can override with `rd.neednet=0` now if needed.

Member

@jlebon jlebon left a comment

@jlebon if you have any more suggestions on what I can change to make it more like the Live ISO let me know. When you boot the Fedora live ISO does it time out before you're able to get to a console? Maybe you just don't see the timeout/failure because it's a GUI interface?

I add console=ttyS0 and attach to the VM console so I can see the logs. And yeah it's weird, it doesn't time out, it just... succeeds. Doesn't even seem to take more than a few seconds either. I'll have to poke around more on this, but anyway, I don't think it's a blocker!

Comment on lines 37 to 41
install_and_enable_unit "coreos-liveiso-network-kargs.service" \
"initrd.target"

install_and_enable_unit "coreos-liveiso-reconfigure-nm-wait-online.service" \
"initrd.target"
Member

One suggestion here: there's a generator already for the live ISO, so we could instead dynamically enable this. An advantage of doing that is that systemd doesn't even bother with the service unit and doesn't spam the "Skipped ..." message on every boot on all other platforms. Definitely not a blocker of course. :)

Member Author

I hadn't thought of that. If it's just an extra message in the log I think I'd like to keep the unit separate and prevent yet another heredoc in the generator and also retesting this all again.

Member

Right, definitely don't want heredocs either. I meant more doing the ln -s in the generator. See e.g. https://github.com/coreos/ignition-dracut/blob/6136be3d9d38d7926a61cd4d1b4ba5f9baf0892f/dracut/30ignition/ignition-generator#L39-L40. Anyway, as is is fine with me too!

Member Author

ahh I just realized you were referring to both units and not just the reconfigure-nm-wait-online one. ok let me look into that.

Member Author

so right now the add_requires in the live generator makes them all a requirement of initrd-root-fs.target. Currently I had made them required by initrd.target. I could modify the add_requires() function like is done in other generators or I could just make them required by initrd-root-fs.target. Thoughts?

Member

Yeah, I think enhancing add_requires like in Ignition to allow specifying the target makes sense (and keeping the new units to initrd.target).
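
A sketch of what an `add_requires` that takes the target as a parameter might look like, modeled on the symlink approach in the Ignition generator linked above. `UNIT_DIR` standing for the generator's output directory is an assumption here, as is the exact shape of the function.

```sh
# Directory the generator writes into; systemd passes it as the first
# argument to a generator. The /tmp fallback exists only for this sketch.
UNIT_DIR="${1:-/tmp}"

# add_requires UNIT TARGET: make TARGET require UNIT by symlinking UNIT
# into TARGET's .requires directory.
add_requires() {
    requires_dir="${UNIT_DIR}/${2}.requires"
    mkdir -p "${requires_dir}"
    ln -sf "../${1}" "${requires_dir}/${1}"
}
```

The generator could then call, for instance, `add_requires coreos-liveiso-network-kargs.service initrd.target` only when it detects a live ISO boot, so other platforms never see the unit.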

# This is all done because we want to support a mode where
# the user can boot the live ISO and get to an interactive
# prompt without requiring networking on boot. The user can
# then configure the networking interactively.
Member

Awesome documentation!

Member Author

thanks :)

# then configure the networking interactively.
#
[Unit]
Description=conditionally add networking kargs for live ISO
Member

Maybe "Request live ISO networking" ?

Member Author

Does this message show up in the logs even if it doesn't run? If so then I'd like to keep some words in there that indicates it is conditional

Member

@jlebon jlebon Apr 3, 2020

Yeah, it'll print:

Condition check resulted in being skipped.

If so then I'd like to keep some words in there that indicates it is conditional

Is it though? Once the unit is activated, we're definitely activating networking, no?

Member Author

Is it though? Once the unit is activated, we're definitely activating networking, no?

I think that is a "no" answer to my question: "Does this message show up in the logs even if it doesn't run?" If that is the case I'm 👍 to changing the wording.

Member

Ahhh hehe, the GitHub markdown elided the critical part of my message. It prints:

Condition check resulted in $description being skipped.

What I mean is that all the conditionals are part of the systemd unit itself already, so whether the unit is skipped or not is already reflected in what systemd does (and prints). Since the script itself unconditionally enables rd.neednet, it'd be cleaner to say something like "Request live ISO networking".

Comment on lines 16 to 19
# Note that because of the priority of /etc/cmdline.d/*.conf it doesn't
# matter if we do this check or if we unconditionally write ip=dhcp,dhcp6
# because it will never take precedence over an ip= arg on the kernel
# command line.
Member

Hmm, not sure I follow this... doesn't that mean we don't need to check for ip at all then and can just unconditionally print ip=dhcp,dhcp6?

Member Author

@dustymabe dustymabe Apr 3, 2020

yep, basically because of the way /etc/cmdline.d/*.conf gets merged with the kernel command line we could unconditionally write ip=dhcp,dhcp6 and the right thing would still happen but I think it would be misleading. IMHO future developers who come in here and try to figure out what is going on are better off with the current conditional logic and comment.

scratch all of that. I now think it does matter because it's valid for a user to provide multiple ip= kargs so if we unconditionally add it then we'd end up with something like

ip=dhcp,dhcp6 ip=192.168.130.2::192.168.130.1:255.255.255.0:fcos:eth0:none:192.168.130.1

that dracut/NM would then parse and that's not what we want.

I'll delete all but the first line of the comment if you agree.

Member

OK, yup that makes sense!

Member Author

@dustymabe dustymabe Apr 3, 2020

see.. if I hadn't tried to "do the right thing" and also comment the hell out of this (even though the comment was wrong) I would have unconditionally added ip=dhcp,dhcp6 😜

@dustymabe
Member Author

@jlebon if you have any more suggestions on what I can change to make it more like the Live ISO let me know. When you boot the Fedora live ISO does it time out before you're able to get to a console? Maybe you just don't see the timeout/failure because it's a GUI interface?

I add console=ttyS0 and attach to the VM console so I can see the logs. And yeah it's weird, it doesn't time out, it just... succeeds. Doesn't even seem to take more than a few seconds either. I'll have to poke around more on this, but anyway, I don't think it's a blocker!

This is on a system with a network without DHCP? See #326 (comment) for how I'm setting up a libvirt network to test this. On the Fedora Live ISO what does systemctl cat NetworkManager-wait-online.service show? I wonder if they are doing some overrides.

@jlebon
Member

jlebon commented Apr 3, 2020

This is on a system with a network without DHCP? See #326 (comment) for how I'm setting up a libvirt network to test this.

Ahh yup, good catch! I tested no network adapters, and with a network adapter isolated, but still with DHCP and those worked fine. An isolated network adapter without DHCP does indeed cause NetworkManager-wait-online.service to time out and error out after 30s.

@dustymabe
Member Author

ok pushed up a change to address code review comments. Once you say it looks good I'll give a final round of testing.

Member

@jlebon jlebon left a comment

Nice, LGTM!

This matches what is done in our other generators already and
will make it easier to enable units for different targets.
@dustymabe
Member Author

rebased on top of latest testing-devel - now doing final testing

There is a scenario where the user wants to configure networking
after they get to the interactive bash prompt. Let's support this.

Fixes coreos/fedora-coreos-tracker#349
This adds coreos-liveiso-reconfigure-nm-wait-online.service which will
configure NetworkManager-wait-online.service in the real root to time out
quicker and also not show a failure if there is no connection.

Doing this for the Live ISO improves the user experience when booting
the Live ISO without network.
@dustymabe
Member Author

ok pushed one final commit to fix a problem introduced in the last change.. merging

@dustymabe dustymabe merged commit 34f3b09 into coreos:testing-devel Apr 3, 2020
@dustymabe dustymabe deleted the dusty-no-network-liveiso branch April 3, 2020 16:54
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 7, 2021
We originally did this in coreos#326 because we wanted to support booting the
live ISO without networking. This was solved on the initramfs side by
the conditional networking work (coreos#426). But for the real root, this was
still useful because if booting the ISO interactively on a system
without any network, or a non-DHCP network, we didn't want the user to
have to wait until the service timed out before getting a shell.

The core issue however is that we're requesting `network-online.target`
at all. It's an "active unit" which means that it's only pulled in the
transaction, possibly delaying boot, if another systemd unit needs it.
And ideally, no service would need it as per:

https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

In our case, this unit was fedora-coreos-pinger. We drop that
requirement here:

coreos/fedora-coreos-pinger#41

With that, we no longer pull in `network-online.target` and so no longer
delay reaching the console even if NetworkManager isn't able to get an
active connection for whatever reason. This matches how it works on
traditional Fedora as well.

Having a short timeout actually also had a counterproductive effect in
the automated install case. There, `coreos-installer.service` does pull
in `network-online.target` (which with
coreos/coreos-installer#565 we could consider
dropping as advised by systemd, though we probably should bump the
number of retries some more in that case), but because of the short
timeout, we genuinely may not yet have the network fully up before we
run (see https://bugzilla.redhat.com/show_bug.cgi?id=1967483).
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 8, 2021
jlebon added a commit that referenced this pull request Jul 8, 2021
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 19, 2021
(cherry picked from commit dd54e8c)
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 19, 2021
(cherry picked from commit dd54e8c)
jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jul 19, 2021
(cherry picked from commit dd54e8c)
jlebon added a commit that referenced this pull request Jul 19, 2021
(cherry picked from commit dd54e8c)
jlebon added a commit that referenced this pull request Jul 19, 2021
(cherry picked from commit dd54e8c)
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
Development

Successfully merging this pull request may close these issues.

Booting from ISO w/o DHCP server fails and ends in emergency mode