overlay: add 15copy-installer-network dracut module #346

Merged
@@ -0,0 +1,45 @@
# This unit will run early in boot and detect if the user copied
# in firstboot networking config files into the installed system
# (most likely by using `coreos-installer install --copy-network`).
# Since this unit is modifying network configuration there are some
# dependencies that we have:
#
# - Need to look for networking configuration on the /boot partition
#     - i.e. after /dev/disk/by-label/boot is available
# - Need to run before networking is brought up.
#     - This is done in nm-run.sh [1] that runs as part of dracut-initqueue [2]
#     - i.e. Before=dracut-initqueue.service
# - Need to make sure karg networking configuration isn't applied
#     - There are two ways to do this.
#         - One is to run *before* the nm-config.sh [3] that runs as part of
#           dracut-cmdline [4] and `ln -sf /bin/true /usr/libexec/nm-initrd-generator`.
#             - i.e. Before=dracut-cmdline.service
#         - Another is to run *after* nm-config.sh [3] in dracut-cmdline [4]
#           and just delete all the files created by nm-initrd-generator.
#             - i.e. After=dracut-cmdline.service, but Before=dracut-initqueue.service
Member

This is a clear and obvious downside of the "running NM via dracut initqueue"...

Member Author

Agreed. Though when we do finally get NM running via systemd, we'll need to work with the team so that there are two separate services: one for running nm-initrd-generator and one for starting NM. That would let us hook in between them, or maybe just say Before=nm-initrd-generator.service.

#     - We'll go with the second option here because the need for the /boot
#       device (mentioned above) means we can't start before dracut-cmdline.service
#
# [1] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/nm-run.sh
# [2] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/module-setup.sh#L37
# [3] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/nm-config.sh
# [4] https://github.com/dracutdevs/dracut/blob/master/modules.d/35network-manager/module-setup.sh#L36
#
[Unit]
Description=Copy CoreOS Firstboot Networking Config
ConditionPathExists=/usr/lib/initrd-release
DefaultDependencies=false
Before=ignition-diskful.target
Before=dracut-initqueue.service
After=dracut-cmdline.service
# Since we are mounting /boot/, require the device first
Requires=dev-disk-by\x2dlabel-boot.device
After=dev-disk-by\x2dlabel-boot.device

[Service]
Type=oneshot
RemainAfterExit=yes
# MountFlags=slave guarantees that the umount of /boot happens:
# /boot will only be mounted for the lifetime of the unit.
MountFlags=slave
ExecStart=/usr/sbin/coreos-copy-firstboot-network
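
The `dev-disk-by\x2dlabel-boot.device` name in the `Requires=`/`After=` lines above is the systemd-escaped form of `/dev/disk/by-label/boot`. As a rough illustration of where the `\x2d` comes from, here is a minimal sketch. The `escape_device_path` helper is hypothetical and simplified: it only handles `/` and `-`, whereas `systemd-escape --path` covers other characters as well.

```shell
set -eu

# Hypothetical helper: a simplified stand-in for `systemd-escape --path`.
# It only handles '/' and '-'; the real tool escapes other characters too.
escape_device_path() {
    local path="$1"
    path="${path#/}"    # drop the leading slash
    path="${path%/}"    # drop a trailing slash, if any
    # First turn every literal '-' into \x2d, then turn '/' separators into '-'.
    path=$(printf '%s' "${path}" | sed -e 's/-/\\x2d/g' -e 's|/|-|g')
    printf '%s.device\n' "${path}"
}

escape_device_path /dev/disk/by-label/boot
# prints: dev-disk-by\x2dlabel-boot.device
```

The order of the two substitutions matters: escaping `-` first keeps the dashes that replace `/` from being escaped in turn.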
@@ -0,0 +1,66 @@
#!/bin/bash
set -euo pipefail

# For a description of how this is used see coreos-copy-firstboot-network.service

bootmnt=/mnt/boot_partition
mkdir -p ${bootmnt}
bootdev=/dev/disk/by-label/boot
firstboot_network_dir_basename="coreos-firstboot-network"
initramfs_firstboot_network_dir="${bootmnt}/${firstboot_network_dir_basename}"
initramfs_network_dir="/run/NetworkManager/system-connections/"
realroot_firstboot_network_dir="/boot/${firstboot_network_dir_basename}"

# Mount /boot. Note that we mount /boot but never unmount it ourselves: we
# run in a systemd unit with MountFlags=slave, so it is unmounted for us.
# Mount as read-only since we don't strictly need write access and we may be
# running alongside other code that also has it mounted ro.
mountboot() {
    # Wait for up to 5 seconds for the boot device to be available
    # The After=...*boot.device in the systemd unit should be enough
    # but there appears to be some race in the kernel where the link under
    # /dev/disk/by-label exists but mount is not able to use the device yet.
    # We saw errors like this in CI:
    #
    # [ 4.045181] systemd[1]: Found device /dev/disk/by-label/boot.
    # [ OK ] Found device /dev/disk/by-label/boot
    # [ 4.051500] systemd[1]: Starting Copy CoreOS Firstboot Networking Config...
    # Starting Copy CoreOS Firstboot Networking Config
    # [ 4.060573] vda: vda1 vda2 vda3 vda4
    # [ 4.063296] coreos-copy-firstboot-network[479]: mount: /mnt/boot_partition: special device /dev/disk/by-label/boot does not exist.
    #
    mounted=0
    for x in {1..5}; do
        if mount -o ro ${bootdev} ${bootmnt}; then
            echo "info: ${bootdev} successfully mounted."
            mounted=1
            break
        else
            echo "info: retrying ${bootdev} mount in 1 second..."
            sleep 1
        fi
    done
    if [ "${mounted}" == "0" ]; then
        echo "error: ${bootdev} mount did not succeed" 1>&2
        return 1
    fi
}

mountboot || exit 1

if [ -n "$(ls -A ${initramfs_firstboot_network_dir} 2>/dev/null)" ]; then
    # Clear out any files that may have already been generated from
    # kargs by nm-initrd-generator
    rm -f ${initramfs_network_dir}/*
    # Copy files that were placed into boot (most likely by coreos-installer)
    # to the appropriate location for NetworkManager to use the configuration.
    echo "info: copying files from ${initramfs_firstboot_network_dir} to ${initramfs_network_dir}"
    mkdir -p ${initramfs_network_dir}
    cp -v ${initramfs_firstboot_network_dir}/* ${initramfs_network_dir}/
    # If we make it to the realroot (successfully ran ignition) then
    # clean up the files in the firstboot network dir
    echo "R ${realroot_firstboot_network_dir} - - - - -" > \
        /run/tmpfiles.d/15-coreos-firstboot-network.conf
Comment on lines +62 to +63
Member

Can we move this out so we always do this if the directory exists, not just if it's not empty?

Member Author

I'd prefer not. Deleting the files (presumably on success) will already make things hard to debug. This would make it worse I think.
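
For readers following this thread, the two conditions being weighed (directory exists vs. directory is non-empty) can be sketched as follows. The helper names `dir_exists` and `dir_nonempty` are hypothetical and the demo runs in a throwaway directory rather than the real /boot; only the `ls -A` test mirrors what the script in this PR does.

```shell
set -eu

# Hypothetical helpers contrasting the two conditions under discussion.
dir_exists()   { [ -d "$1" ]; }
# Same test the script uses: non-empty output from `ls -A`.
dir_nonempty() { [ -n "$(ls -A "$1" 2>/dev/null)" ]; }

# Demonstration in a throwaway directory (not the real /boot).
demo=$(mktemp -d)
netdir="${demo}/coreos-firstboot-network"
mkdir "${netdir}"

report() {
    if dir_exists "${netdir}"; then echo "exists: yes"; else echo "exists: no"; fi
    if dir_nonempty "${netdir}"; then echo "non-empty: yes"; else echo "non-empty: no"; fi
}

report                                   # empty directory
touch "${netdir}/eth0.nmconnection"      # placeholder connection file
report                                   # directory with one file
rm -rf "${demo}"
```

With the check as written in the PR, an empty directory is left in place; the suggested change would schedule its removal as well.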

else
    echo "info: no files to copy from ${initramfs_firstboot_network_dir}. skipping"
fi
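
The retry loop in `mountboot` could also be expressed as a small generic helper. The following is only a sketch of that idea: `retry` is a hypothetical function that is not part of this PR, and unlike the loop above it does not sleep after the final failed attempt.

```shell
set -eu

# Hypothetical generic form of the retry loop; not part of the PR.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "${i}" -le "${attempts}" ]; do
        if "$@"; then
            return 0
        fi
        echo "info: attempt ${i}/${attempts} failed: $*" 1>&2
        # Do not sleep after the final attempt.
        if [ "${i}" -lt "${attempts}" ]; then
            sleep "${delay}"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example with a harmless command; the script's mount call would look like:
#   retry 5 1 mount -o ro /dev/disk/by-label/boot /mnt/boot_partition
retry 3 0 true && echo "info: command succeeded"
```

Keeping the command as trailing arguments avoids `eval` and preserves argument quoting.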
@@ -0,0 +1,16 @@
install_and_enable_unit() {
    unit="$1"; shift
    target="$1"; shift
    inst_simple "$moddir/$unit" "$systemdsystemunitdir/$unit"
    mkdir -p "$initdir/$systemdsystemunitdir/$target.requires"
    ln_r "../$unit" "$systemdsystemunitdir/$target.requires/$unit"
}

install() {
    inst_simple "$moddir/coreos-copy-firstboot-network.sh" \
        "/usr/sbin/coreos-copy-firstboot-network"
    # Only run this when ignition runs and only when the system
    # has disks. ignition-diskful.target should suffice.
    install_and_enable_unit "coreos-copy-firstboot-network.service" \
        "ignition-diskful.target"
}
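
To visualize what `install_and_enable_unit` is meant to leave in the initrd, here is a rough sketch using plain coreutils instead of dracut's `inst_simple` and `ln_r`. It assumes `$systemdsystemunitdir` is `/usr/lib/systemd/system`, and `enable_unit_in_tree` is a hypothetical stand-in, so treat the resulting layout as an approximation rather than dracut's exact output.

```shell
set -eu

# Assumption: units live under usr/lib/systemd/system inside the initrd.
unitdir="usr/lib/systemd/system"

# Hypothetical stand-in for install_and_enable_unit using plain coreutils.
# $1 = root of the tree (dracut's $initdir), $2 = unit, $3 = target
enable_unit_in_tree() {
    local root="$1" unit="$2" target="$3"
    mkdir -p "${root}/${unitdir}/${target}.requires"
    # Placeholder for the real unit file that inst_simple would copy in.
    : > "${root}/${unitdir}/${unit}"
    # Relative link, so it stays valid wherever the tree is unpacked.
    ln -s "../${unit}" "${root}/${unitdir}/${target}.requires/${unit}"
}

# Build the layout in a throwaway directory and list it.
tree=$(mktemp -d)
enable_unit_in_tree "${tree}" coreos-copy-firstboot-network.service ignition-diskful.target
(cd "${tree}" && find . | sort)
rm -rf "${tree}"
```

The symlink under `ignition-diskful.target.requires/` is what pulls the service in whenever that target is started.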