System gets stuck when initiating a reboot (x86_64/EFI/Bookworm) #7104

Open
ioctl2 opened this issue Jun 11, 2024 · 19 comments
Labels
External bug 🐞 For bugs which are not caused by DietPi. Solution available 🥂 Definite solution has been done Testing/testers required 🔽 Waiting for user reply ⏳
Milestone

Comments

@ioctl2

ioctl2 commented Jun 11, 2024

Creating a bug report/issue

  • [Y ] I have searched the existing open and closed issues

Required Information

  • DietPi version | cat /boot/dietpi/.version
    G_DIETPI_VERSION_CORE=9
    G_DIETPI_VERSION_SUB=5
    G_DIETPI_VERSION_RC=1
    G_GITBRANCH='master'
    G_GITOWNER='MichaIng'

  • Distro version | echo $G_DISTRO_NAME $G_RASPBIAN
    bookworm

  • Kernel version | uname -a
    Linux asus-psff 6.1.0-21-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03) x86_64 GNU/Linux

  • SBC model | echo $G_HW_MODEL_NAME or (EG: RPi3)
    Manufacturer: ASUSTeK COMPUTER INC.
    Product Name: VM40B

  • Power supply used | (EG: 5V 1A RAVpower)
    The PSU is fine.

  • SD card used | (EG: SanDisk ultra)
    Lexar 64G

Additional Information (if applicable)

  • Software title | (EG: Nextcloud)
    N/A
  • Was the software title installed freshly or updated/migrated?
    Fresh deployment of the OS + all updates applied.
  • Can this issue be replicated on a fresh installation of DietPi?
    Yes.
  • Bug report ID | echo $G_HW_UUID

Steps to reproduce

  1. Deploy DietPi Bookworm in UEFI mode
  2. Apply all available updates (as of June 10, 2024. May not be necessary.)
  3. Try to reboot via terminal by running reboot

Expected behaviour

  • System should reboot using the normal reboot command. This works in vanilla Debian Bookworm.

Actual behaviour

System starts the reboot process but gets stuck on Stopping networking.service - Raise network interfaces... / Stopping ifup@eth0.service - ifup for eth0
At this point it's stuck. If I hit Ctrl+Alt+Del, it continues to reboot and everything is fine (as in, the system reboots and comes back).
Also, reboot -f reboots the system without the issue above.

Extra details

  • Main ethernet interface configured via DHCP.
  • WLAN present but disabled.
  • This is happening on two dissimilar x86 systems that I tried DietPi on.
@ioctl2
Author

ioctl2 commented Jun 11, 2024

Updates:

  • I popped in a USB stick with Ubuntu 20.04 preinstalled on it. It boots and reboots fine.
  • I thought the issue was only with rebooting, but poweroff gets stuck too and the system does not power off.

@MichaIng
Owner

MichaIng commented Jun 11, 2024

Since this is what we changed with last DietPi release, can you test whether this makes a difference:

sed -i '/^[^#].*network-pre.target/s/^/#/' /etc/systemd/system/ifupdown-pre.service.d/dietpi.conf
systemctl daemon-reload
reboot

You probably need to reboot a second time to see the effect.
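
For reference, the sed expression above prefixes a # to every line that does not already start with # and that mentions network-pre.target. A minimal sketch for previewing its effect on a scratch copy instead of the real drop-in follows. The function name comment_out_network_pre and the sample file content are assumptions made for illustration; the sample lines are modelled on the Wants=/Before= entries of dietpi.conf.

```shell
# Hypothetical helper: comment out active lines that mention network-pre.target.
# Takes the file to edit as its only argument; relies on GNU sed's in-place mode.
comment_out_network_pre() {
    sed -i '/^[^#].*network-pre.target/s/^/#/' "$1"
}

# Preview on a scratch file instead of the real drop-in.
scratch="$(mktemp)"
printf '%s\n' '[Unit]' 'Wants=network-pre.target' 'Before=network-pre.target' > "$scratch"
comment_out_network_pre "$scratch"
cat "$scratch"
rm -f "$scratch"
```

Running the helper a second time on the same file changes nothing, since the matching lines then already start with #.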

@ioctl2
Author

ioctl2 commented Jun 11, 2024

I gave it a try but unfortunately no dice:
image

@MichaIng
Owner

Was this the first reboot, or did you try a 2nd reboot? Because I am not sure whether it has an effect on the 1st reboot, despite systemd reload.

So the last thing we see in the logs that did not finish is ifup@eth0.service. What it does is:

ifdown eth0

Can you try this command manually from console, and see whether it hangs as well?

@ioctl2
Author

ioctl2 commented Jun 11, 2024

I first rebooted remotely with reboot -f, then tried another reboot without the -f flag. It got stuck and I had to go to the device to hit Ctrl+Alt+Del. I then tried to reboot from the terminal while in front of it, with the same issue persisting.

Earlier, when I ran systemctl stop networking before trying to reboot, it worked. I haven't tried ifdown eth0 yet.

@ioctl2
Author

ioctl2 commented Jun 12, 2024

ifdown eth0 followed by reboot also worked.
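
As a stopgap, the two manual steps reported to work in this thread (bringing the interface down, then rebooting) could be wrapped in a small helper. This is only a sketch under assumptions: the function name reboot_after_ifdown and the DRY_RUN switch are hypothetical, and eth0 is taken from this report.

```shell
# Hypothetical wrapper: bring the interface down first, then reboot.
# Set DRY_RUN=1 to print the commands instead of running them.
reboot_after_ifdown() {
    iface="${1:-eth0}"    # interface name; eth0 as in this report
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ifdown $iface"
        echo "reboot"
        return 0
    fi
    # reboot only runs if ifdown succeeded
    ifdown "$iface" && reboot
}
```

For example, DRY_RUN=1 reboot_after_ifdown eth0 only prints the two commands. Note that bringing down the interface that carries an SSH session drops that session, so the real call is best made from a local console.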

@MichaIng
Owner

Hmm, so stopping either networking.service or ifup@eth0.service prior to reboot works? Maybe the problem occurs when both try to stop concurrently, causing one of them to hang.

The question is why this was never an issue before. Our recent change to /etc/systemd/system/ifupdown-pre.service.d/dietpi.conf had some impact on the order in which those services are started and stopped, so it would have been a good explanation. But since reverting it did not help, I have no idea what changed. Neither systemd nor ifupdown had an upgrade recently.

Just to rule it out, can you show the content of the file:

cat /etc/systemd/system/ifupdown-pre.service.d/dietpi.conf

And try to move it out of place completely?

mv /etc/systemd/system/ifupdown-pre.service.d/dietpi.conf{,_bak}

I played around a bit on a VM with DHCP, starting, stopping and restarting those services, and rebooting, but could not replicate the issue. However, note that networking.service does not configure hotplug interfaces when it starts, but does de-configure them when it stops. So on shutdown, both services de-configure the eth0 interface, and maybe when both try to do that concurrently in a particular way, or one tries to do so while the other is at a particular stage, they can block each other. A solution would be to add After=networking.service to ifup@.service, so that on shutdown all hotplug interfaces are de-configured first, and networking.service only brings down the loopback interface afterwards.

However, we should find out whether our service ordering caused this, because in general I see no reason why those two services could not have run/stopped concurrently before. It would have been a more random occurrence, but perfectly possible.
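
The After=networking.service idea from this comment could be tried with a systemd drop-in. The following is a minimal sketch, not the fix eventually adopted in this thread: the function name write_ifup_ordering_dropin and the file name ordering.conf are hypothetical, while the drop-in directory is the one already used for dietpi.conf. Since systemd stops units in the reverse of their start order, After=networking.service makes ifup@ instances stop before networking.service.

```shell
# Hypothetical helper: write an ordering drop-in for ifup@.service below a root.
# Pass a scratch directory to preview the result, or / for the real system.
write_ifup_ordering_dropin() {
    [ "$#" -eq 1 ] || { echo "usage: write_ifup_ordering_dropin ROOT" >&2; return 1; }
    root="${1%/}"    # strip one trailing slash, so / becomes an empty prefix
    dir="$root/etc/systemd/system/ifup@.service.d"
    mkdir -p "$dir" || return 1
    # ordering.conf is an assumed file name
    cat << '_EOF_' > "$dir/ordering.conf"
[Unit]
After=networking.service
_EOF_
}
```

After writing the file on the real system, systemctl daemon-reload is needed for systemd to pick it up.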

@MichaIng MichaIng added this to the v9.6 milestone Jun 12, 2024
@MichaIng
Owner

MichaIng commented Jun 12, 2024

Found it, or at least part of the issue. I am still not sure how our ordering might affect it, or whether it really does (the testing above would help), since I cannot replicate the issue, but I thought it through and found the following:

  1. As mentioned above, my theory is that concurrent ifdown calls for the same interface can hang up each other.
  2. Now networking.service brings down all interfaces, and ifup@eth0 doubles that for eth0 on shutdown, which is at least unnecessary, and seems to be the reason for your issue.
  3. Hence I was thinking about ways to prevent ifup@eth0 from being stopped at all at shutdown. And interestingly, I found that Debian was aiming for this already: https://salsa.debian.org/debian/ifupdown/-/commit/3d06b084f15081c3f0740f50ea50291919aa522c
  4. That commit uses exactly the argument I had in mind, and using system.slice in combination with DefaultDependencies=no is the documented way to achieve it. However, ifup@eth0 is stopped on shutdown despite that.
  5. So I thought that maybe our ordering broke it, but it doesn't. Another look at the original service file shows Conflicts=shutdown.target, which is actually part of what DefaultDependencies=yes implies, and hence counteracts the DefaultDependencies=no directive above.
  6. Conflicts=shutdown.target was added after system.slice in a dedicated commit, and no one seems to have ever recognised that this broke the aim of the previous commit: https://salsa.debian.org/debian/ifupdown/-/commit/117cfa63ed4f3dd3d81d2e09a0903052340e3d18
  7. I first went crazy, being unable to unset/reset Conflicts=shutdown.target with an empty override in our /etc/systemd/system/ifup@.service.d/dietpi.conf. It turns out that while this works for ExecStart= and others, it does not work for Before=, After=, Conflicts= and other dependency directives. So there is no way to fix this bug without overriding the whole unit file.

Hence, here is the proper solution:

cat << '_EOF_' > /etc/systemd/system/ifup@.service
[Unit]
Description=ifup for %I
After=local-fs.target network-pre.target apparmor.service systemd-sysctl.service
Before=network.target network-online.target
BindsTo=sys-subsystem-net-devices-%i.device
After=sys-subsystem-net-devices-%i.device
DefaultDependencies=no
IgnoreOnIsolate=yes

[Service]
Type=oneshot
# avoid stopping on shutdown via stopping system-ifup.slice
Slice=system.slice
ExecStart=/sbin/ifup --allow=hotplug %I
ExecStop=/sbin/ifdown %I
RemainAfterExit=true
TimeoutStartSec=5min
_EOF_
rm -R /etc/systemd/system/ifup@.service.d

This is a copy of /lib/systemd/system/ifup@.service, with Conflicts= removed and our other enhancements added. So this service does not stop anymore on shutdown, as originally intended by Debian, since networking.service brings down all interfaces already.
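
One way to confirm that the replacement unit is in effect is to reload systemd and check whether shutdown.target is still listed among the unit's conflicts. The sketch below assumes a hypothetical helper name, has_shutdown_conflict, and parses a single line as printed by systemctl show -p Conflicts.

```shell
# Hypothetical helper: succeeds if a "Conflicts=..." line, as printed by
# "systemctl show -p Conflicts <unit>", lists shutdown.target.
has_shutdown_conflict() {
    # Strip the "Conflicts=" prefix and pad with spaces to match whole words.
    case " ${1#Conflicts=} " in
        *" shutdown.target "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage on the affected system (left commented out on purpose):
# systemctl daemon-reload
# if has_shutdown_conflict "$(systemctl show -p Conflicts ifup@eth0.service)"; then
#     echo "ifup@eth0.service still conflicts with shutdown.target"
# fi
```

An empty Conflicts= line means the unit no longer conflicts with shutdown.target.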

@MichaIng
Owner

MichaIng commented Jun 12, 2024

Merge request sent upstream: https://salsa.debian.org/debian/ifupdown/-/merge_requests/23

And on our end: 5dac7e3

MichaIng added a commit that referenced this issue Jun 12, 2024
- Network | Resolved a rare issue, where shutdowns could hang, when networking.service and ifup@.service instances try to bring down the same network interface concurrently. Many thanks to @ioctl2 for reporting this issue: #7104
@MichaIng MichaIng added Testing/testers required 🔽 Solution available 🥂 Definite solution has been done External bug 🐞 For bugs which are not caused by DietPi. and removed Investigating 🤔 labels Jun 12, 2024
@ioctl2
Author

ioctl2 commented Jun 12, 2024

The file you asked about contains this:

 cat /etc/systemd/system/ifupdown-pre.service.d/dietpi.conf
# Assure that ifupdown-pre always waits for udev to settle: https://dietpi.com/forum/t/6415/28
# Assure it finishes before ifup@.service instances start: https://github.com/MichaIng/DietPi/issues/6951
[Unit]
#Wants=network-pre.target
#Before=network-pre.target

[Service]
ExecStart=
ExecStart=/bin/dash -c '[ "$CONFIGURE_INTERFACES" = "no" ] || [ ! -x /bin/udevadm ] || udevadm settle'

I will try the fix and will report back.

@ioctl2
Author

ioctl2 commented Jun 12, 2024

Unfortunately the patch did not resolve the issue on my end. It's possible I applied the patch incorrectly, though. Is there anything I can check?

@ioctl2
Author

ioctl2 commented Jun 20, 2024

Is there any other workaround?

@MichaIng
Owner

@ioctl2
Can you check this:

systemctl cat ifup@eth0.service
systemctl show -p Conflicts ifup@eth0.service

@ioctl2
Author

ioctl2 commented Jun 22, 2024

Here are the outputs:

# systemctl cat ifup@eth0.service
# /etc/systemd/system/ifup@.service
[Unit]
Description=ifup for %I
After=local-fs.target network-pre.target apparmor.service systemd-sysctl.service
Before=network.target network-online.target
BindsTo=sys-subsystem-net-devices-%i.device
After=sys-subsystem-net-devices-%i.device
DefaultDependencies=no
IgnoreOnIsolate=yes

[Service]
Type=oneshot
# avoid stopping on shutdown via stopping system-ifup.slice
Slice=system.slice
ExecStart=/sbin/ifup --allow=hotplug %I
ExecStop=/sbin/ifdown %I
RemainAfterExit=true
TimeoutStartSec=5min
# systemctl show -p Conflicts ifup@eth0.service
Conflicts=

@MichaIng MichaIng mentioned this issue Jul 2, 2024
@MichaIng
Owner

MichaIng commented Jul 2, 2024

Sorry for the late reply. In my tests, ifup@eth0.service is not stopped anymore on shutdown with this change. Can you check your logs/screen output to see whether it is really still being stopped in your case?
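
If shutdown messages can be captured as text, a small filter can pick out the lines that show an ifup@ instance being stopped. This is a sketch under assumptions: the function name ifup_stop_lines is hypothetical, and the pattern is based on the "Stopping ifup@eth0.service" message quoted in the original report.

```shell
# Hypothetical filter: read log lines on stdin and print those that show
# the ifup@<interface>.service instance being stopped.
ifup_stop_lines() {
    grep -E "Stop(ping|ped) ifup@${1:-eth0}\.service"
}

# Possible usage, assuming logs of the previous boot were kept
# (requires persistent journal storage, which may not be enabled):
# journalctl -b -1 -o cat | ifup_stop_lines eth0
```

The filter exits with a non-zero status when no such line is found, which would match the expected behaviour after the fix.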

@MichaIng
Owner

@ioctl2
Ping, so I know whether to further investigate or close 🙂.

@ioctl2
Author

ioctl2 commented Jul 10, 2024

Sorry about that. The issue persists on the one test system I have.

@MichaIng
Owner

@ioctl2
Do shutdown logs/output still show ifup@eth0 being stopped?

@ioctl2
Author

ioctl2 commented Jul 15, 2024

I had to move the affected system into its permanent home, and reboots are now disruptive. I can try to replicate the issue again on a similar system, though I'm not sure how quickly.


2 participants