Root reprovisioning CI failures #619
If it's a race, one reason this might appear in CI versus locally is that we're running a lot of other tests in parallel in CI. I'm trying this in a local run:
Another possible reason this is happening in CI is that those nodes have a lot of physical CPUs.
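If anyone wants to chase the parallelism angle locally, something along these lines should approximate it; this is just a sketch, and the test-name glob and parallelism count are assumptions, not the exact CI invocation:

```
# From inside a coreos-assembler working directory, run the root-reprovision
# kola tests with elevated parallelism to mimic a busy CI node.
cosa kola run --parallel 8 'ext.config.root-reprovision.*'
```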
OK exciting, I had to go to …
I think this is something like: we need to …
A few runs of …
This reverts commit 522ca22. These are constantly failing in CI, we don't know why yet. ref https://github.com/coreos/fedora-coreos-config/issues/591
Will look into this.
We have a new Clevis release now with the fixes we need, so add the packages to the manifest. This is all that's needed to support root-on-LUKS since the rest of the rootfs replacement stack is already LUKS-aware. Sample Ignition config:

```
{
  "ignition": { "version": "3.2.0-experimental" },
  "storage": {
    "luks": [
      {
        "name": "myluksdev",
        "device": "/dev/disk/by-id/virtio-disk1",
        "clevis": { "tpm2": true },
        "label": "root"
      }
    ],
    "filesystems": [
      {
        "device": "/dev/disk/by-id/dm-name-myluksdev",
        "format": "xfs",
        "wipeFilesystem": true,
        "label": "root"
      }
    ]
  }
}
```

Not adding tests for now until we've resolved coreos/fedora-coreos-tracker#619.
We have a new Clevis release now with the fixes we need, so add the packages to the manifest. This is all that's needed to support root-on-LUKS since the rest of the rootfs replacement stack is already LUKS-aware. Sample Ignition config:

```
{
  "ignition": { "version": "3.2.0-experimental" },
  "storage": {
    "luks": [
      {
        "name": "myluksdev",
        "device": "/dev/disk/by-id/virtio-disk1",
        "clevis": { "tpm2": true },
        "label": "root"
      }
    ],
    "filesystems": [
      {
        "device": "/dev/mapper/myluksdev",
        "format": "xfs",
        "wipeFilesystem": true,
        "label": "root"
      }
    ]
  }
}
```

Not adding tests for now until we've resolved coreos/fedora-coreos-tracker#619.
Do you remember which of the two tests hit this? (…)
I'm not sure, and it looks like my kola run got GC'd. If I had to guess I'd say …
This reverts commit 77787a0. I can't reproduce the failures in coreos/fedora-coreos-tracker#619 locally or on the CI cluster, and the logs from previous failures are stale. Let's re-enable the tests and if it comes back we'll debug more deeply.
I can't reproduce this, either locally or on the CI cluster. Let's optimistically revert the revert for now and I'll dig into errors if they pop up?
Let's keep this open for now. If anyone sees this happening again, please post the logs as an attachment here so they don't get GC'ed.
https://jenkins-coreos-ci.apps.ocp.ci.centos.org/blue/organizations/jenkins/github-ci%2Fcoreos%2Fcoreos-assembler/detail/PR-1715/6/pipeline
Both tests from Sohan's links above show SELinux-related issues:
Hmm, it almost seems like there's a mismatch between the labels in the policy and the ones on disk? I'm trying to reproduce this manually on the same CI cluster but not having any luck so far. I'll have to poke at this through PRs.
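For reference, a minimal sketch of how one could check the policy-vs-disk hypothesis on a node or image where this reproduces; the paths below are placeholders, not taken from the CI logs:

```
# Compare the on-disk SELinux context of a file against what the loaded policy expects.
matchpathcon -V /var/lib/systemd/random-seed
# Dry run: list anything restorecon would want to relabel under a suspect tree.
restorecon -nvR /etc
```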
One interesting thing is that the revert was merged 4 days ago and only now are we seeing failures. This coincides with the cluster currently having NFS issues: https://pagure.io/centos-infra/issue/26#comment-685779. Ordinarily these tests run in an emptyDir, though Jenkins still has to write back the job logs over NFS, and perhaps hangs in the hypervisor are causing hangs in the guest? Not sure.
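If NFS is a suspect, one quick sanity check from inside the pod would be to confirm what actually backs the directory the job does its heavy I/O in (the path below is a placeholder):

```
# Show the source device and filesystem type backing the working directory;
# an emptyDir should show up as the node's local filesystem rather than nfs.
findmnt -T /srv
```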
So what's different about these jobs from a CI perspective, though? Is it the amount of data written into the qcow2 overlay?
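One hedged way to test the overlay theory would be to compare the allocated size of the qcow2 overlays left behind by a passing and a failing test; the path below is a placeholder, since kola's temp layout may differ:

```
# "disk size" in the output is the actual allocation, i.e. roughly how much
# data the guest wrote into the overlay during the test.
qemu-img info /var/tmp/kola-temp/disk.qcow2
```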
The loopback issue is fixed now and we need a newer kernel for the rootfs reprovisioning tests to work. Closes: coreos/fedora-coreos-tracker#619
The loopback issue should be fixed now and we need a newer kernel for the rootfs reprovisioning tests to work. Closes: coreos/fedora-coreos-tracker#619
OK, finally got to the end of this. I think it's only Ignition which is consistently failing on this, and that is fixed by coreos/ignition#1093. It should've been obvious that we were using an older kernel which doesn't support reading labels. The smoking gun was comparing the output of …
vs
…
Notice the …. I do think there are other flakes, but at this point I think they're rare enough to live with for now. (At the very least, there's the xfs traceback Colin found above, and the ….)
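For anyone hitting something similar, a hedged way to double-check whether the kernel in use actually exposes SELinux labels on the filesystem in question (the path is just an example):

```
# Kernel version in the environment under test.
uname -r
# If labels are readable, this dumps the security.selinux xattrs; if the
# kernel/filesystem combination can't read them, the output is empty.
getfattr -m security.selinux -d /sysroot/ostree 2>/dev/null
ls -Z /sysroot/ostree
```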
While chasing down problems seen on a hotfixed RHCOS 4.9 build, we observed some races between randomizing the rootfs UUID and mounting the rootfs. The fix for this is believed to be in coreos/fedora-coreos-config#1357. Speaking with @sandeen about the race, he identified this issue as one that could be traced to the same root cause. See also downstream BZ https://bugzilla.redhat.com/show_bug.cgi?id=2055258
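To make the race concrete, the sequence being raced is roughly the following; this is only a sketch of the shape of the problem, not the actual units or scripts involved:

```
# Step A: regenerate the rootfs UUID on first boot so cloned images don't
# end up with identical filesystem UUIDs.
xfs_admin -U generate /dev/disk/by-label/root
# Step B: mount the root filesystem. If B can start before A has fully
# finished (no strict ordering between the two), the mount may observe a
# half-updated superblock and fail.
mount /dev/disk/by-label/root /sysroot
```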
The tests in coreos/fedora-coreos-config#503 are failing periodically in CI runs on multiple repos, e.g.:
Seems to relate to SELinux in at least some cases.
Sohan and I couldn't reproduce this locally but I am increasingly thinking it's a race condition.