lstat /sys/fs/cgroup/devices/machine.slice/libpod-SHA.scope: ENOENT #11784

Closed
edsantiago opened this issue Sep 29, 2021 · 17 comments
Labels
flakes: Flakes from Continuous Integration
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@edsantiago
Member

New flake in f33-root:

[+0920s] not ok 236 podman selinux: shared context in (some) namespaces
         # (from function `is' in file test/system/helpers.bash, line 508,
         #  in test file test/system/410-selinux.bats, line 126)
         #   `is "$output" "$context_c1" "new container, run with --pid of existing one "' failed
         # # podman rm --all --force
         # # podman ps --all --external --format {{.ID}} {{.Names}}
         # # podman images --all --format {{.Repository}}:{{.Tag}} {{.ID}}
         # quay.io/libpod/testimage:20210610 9f9ec7f2fdef
         # # podman run -d --name myctr quay.io/libpod/testimage:20210610 top
         # 1f043d30e46e9f85a55a13e7bd72f16316cfd56534e42e699d656ffd3d20da09
         # # podman exec myctr cat -v /proc/self/attr/current
         # system_u:system_r:container_t:s0:c364,c713^@
         # # podman run --name myctr2 --ipc container:myctr quay.io/libpod/testimage:20210610 cat -v /proc/self/attr/current
         # system_u:system_r:container_t:s0:c364,c713^@
         # # podman run --rm --pid container:myctr quay.io/libpod/testimage:20210610 cat -v /proc/self/attr/current
         # system_u:system_r:container_t:s0:c364,c713^@time="2021-09-28T17:08:59-05:00" level=warning msg="lstat /sys/fs/cgroup/devices/machine.slice/libpod-11f69a5b1d699bf9ab9e8b5fa8994e43b3ea7c3d0f0d1e1bc0d5bf33d37cccae.scope: no such file or directory"
         # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
         # #|     FAIL: new container, run with --pid of existing one 
         # #| expected: 'system_u:system_r:container_t:s0:c364,c713^@'
         # #|   actual: 'system_u:system_r:container_t:s0:c364,c713^@time="2021-09-28T17:08:59-05:00" level=warning msg="lstat /sys/fs/cgroup/devices/machine.slice/libpod-11f69a5b1d699bf9ab9e8b5fa8994e43b3ea7c3d0f0d1e1bc0d5bf33d37cccae.scope: no such file or directory"'
         # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As of this writing, this flake does not cause a CI failure, because system tests and integration tests do not check for extra cruft. THIS IS GOING TO CHANGE, at least in system tests.

I cannot reproduce this with podman-3.4.0-0.10.rc2.fc33 in 30 minutes of looping, but my cirrus-flake-grep tool shows it happening as far back as June (when I started collecting CI logs). All the instances I see are root; none are rootless.

@edsantiago edsantiago added the flakes Flakes from Continuous Integration label Sep 29, 2021
@edsantiago
Member Author

@giuseppe PTAL

@giuseppe
Member

runc generates that error.

Not sure if it is a regression, but it appeared with opencontainers/runc@cbb0a79

Simple reproducer (as root):

# podman run -d --name foo alpine top
# podman run --rm --pid container:foo alpine true
WARN[0000] lstat /sys/fs/cgroup/devices/machine.slice/libpod-f2498bb96e51a783698380494e772b9a13cf3d044fc229cc9e4710e4eb10f811.scope: no such file or directory 
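
A minimal sketch (added for illustration, not part of the original comment) of looping this reproducer until the warning shows up; the container name, image, and iteration count are arbitrary, and it assumes running as root with runc as the OCI runtime:

package main

import (
	"fmt"
	"os/exec"
	"strings"
)

func main() {
	for i := 1; i <= 100; i++ {
		// Best-effort cleanup from the previous iteration; errors are ignored.
		exec.Command("podman", "rm", "-f", "foo").Run()
		if err := exec.Command("podman", "run", "-d", "--name", "foo", "alpine", "top").Run(); err != nil {
			fmt.Println("setup failed:", err)
			return
		}
		// The flaky step: join the pid namespace of the running container.
		out, _ := exec.Command("podman", "run", "--rm", "--pid", "container:foo", "alpine", "true").CombinedOutput()
		if strings.Contains(string(out), "no such file or directory") {
			fmt.Printf("hit the warning on iteration %d:\n%s", i, out)
			return
		}
	}
	fmt.Println("no warning in 100 iterations")
}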

@kolyshkin FYI

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Nov 2, 2021

Since this is not a Podman issue, should I close this?

@edsantiago
Member Author

cirrus-flake-grep reports that this is still happening, and only on f33, which makes sense if it's a runc bug. Here are two recent examples: pr 11956 and pr 12107, both f33 root.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@edsantiago
Member Author

Still happening.

Please remember that these do not cause actual CI failures, so my flake logger only catches them when they appear alongside an actual CI-failure-causing flake. The stats above probably underrepresent the true frequency.

@vrothberg vrothberg changed the title cgroupsv1(?): lstat /sys/fs/cgroup/devices/machine.slice/libpod-SHA.scope: ENOENT lstat /sys/fs/cgroup/devices/machine.slice/libpod-SHA.scope: ENOENT Dec 14, 2021
@vrothberg
Member

vrothberg commented Dec 14, 2021

There's a race condition (see below). It seems that the freezer state has already changed when runc attempts to freeze (i.e., it shouldn't attempt to freeze at all, AFAICS).

dda8f53a07a7a39771e646e66148bb4b4d3952db44439ea563b8396bb868ac7f                                                                                                                             
22ff177983d71ff968598f880053a937393b7c9ea3fd14a00ae36ab49db4b851                                                                                                                             
ERRO[0000] STATE: FROZEN                                                                                                                                                                     
WARN[0000] freezer not supported: openat2 /sys/fs/cgroup/machine.slice/libpod-2f526821ca315a919d32aad899b9817121363405c3829650b03fd512817a3801.scope/cgroup.freeze: no such file or directory
ERRO[0000] STATE: THAWED                                                                                                                                                                     
WARN[0000] lstat /sys/fs/cgroup/machine.slice/libpod-2f526821ca315a919d32aad899b9817121363405c3829650b03fd512817a3801.scope: no such file or directory                                       

I used the following diff to get the error log:

diff --git a/libcontainer/cgroups/fs2/freezer.go b/libcontainer/cgroups/fs2/freezer.go
index 8917a6411d68..b3ed1626c851 100644                                               
--- a/libcontainer/cgroups/fs2/freezer.go                                             
+++ b/libcontainer/cgroups/fs2/freezer.go                                             
@@ -12,9 +12,11 @@ import (                                                           
                                                                                      
        "github.com/opencontainers/runc/libcontainer/cgroups"                         
        "github.com/opencontainers/runc/libcontainer/configs"                         
+       "github.com/sirupsen/logrus"                                                  
 )                                                                                    
                                                                                      
 func setFreezer(dirPath string, state configs.FreezerState) error {                  
+       logrus.Errorf("STATE: %s", state)                                             
        var stateStr string                                                           
        switch state {                                                                
        case configs.Undefined:                                                       

@kolyshkin @cyphar could you have a look? I am not familiar with the runc code, and I think you know where to poke.
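
To make the sequence in that log easier to follow, here is a standalone sketch (an illustration added in editing, not runc's actual code) of a freeze, thaw, and lstat against a cgroup v2 scope path like the one above; when systemd has already removed the scope directory, every step fails with ENOENT, producing warnings of exactly this shape:

package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// setFreezer mimics writing cgroup.freeze on the unified hierarchy:
// "1" freezes the cgroup, "0" thaws it.
func setFreezer(dirPath, state string) error {
	file := filepath.Join(dirPath, "cgroup.freeze")
	if err := os.WriteFile(file, []byte(state), 0o644); err != nil {
		return fmt.Errorf("freezer not supported: %w", err)
	}
	return nil
}

func main() {
	// Placeholder scope path; in the log it is the libpod-<id>.scope of the
	// already-exited container.
	scope := "/sys/fs/cgroup/machine.slice/libpod-0000000000000000.scope"

	// Order as observed above: freeze, (the device-rule update would run
	// here), thaw, then an lstat of the scope directory.
	if err := setFreezer(scope, "1"); err != nil {
		fmt.Fprintln(os.Stderr, "WARN", err) // corresponds to the "freezer not supported" warning
	}
	if err := setFreezer(scope, "0"); err != nil {
		fmt.Fprintln(os.Stderr, "WARN", err)
	}
	if _, err := os.Lstat(scope); errors.Is(err, fs.ErrNotExist) {
		fmt.Fprintln(os.Stderr, "WARN", err) // corresponds to the "lstat ...: no such file or directory" warning
	}
}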

@kolyshkin
Contributor

There's a race condition (see below). It seems that the freezer state has changed when attempting to freeze (i.e., it shouldn't attempt to freeze afaiks).

From what I see, this is just two calls to setFreezer -- the first one to freeze it, the second one to unfreeze it. No race here.

I will take a closer look later.

@vrothberg
Member

Thanks! To elaborate on why I think there's a race: each time it fails, state != configs.Undefined, which made me believe that the specific path isn't always present or that some condition must be waited on. But it's just an uninformed guess; I am not familiar with the runc code base.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Jan 18, 2022

@kolyshkin Any progress on this?

@kolyshkin
Contributor

I've had no time to look at it; hopefully later this week (I have a separate browser window open with this as a reminder 😁).

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@kolyshkin
Contributor

Not stale

@giuseppe
Member

I think the cgroup could have been cleaned up by systemd while runc was still trying to use it.

Should we close this issue? I don't think there is anything we can do from the Podman side.
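
A hedged sketch of the mitigation this explanation suggests (an illustration only, not an actual runc or Podman patch): once systemd may have removed the transient scope, ENOENT from the cgroup path simply means there is nothing left to do, so it can be skipped rather than logged as a warning:

package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// scopeStillExists reports whether the transient scope cgroup is still there.
// ENOENT is expected (systemd may clean the scope up as soon as the
// container's main process exits) and is treated as "already gone", not as
// an error worth warning about.
func scopeStillExists(scopePath string) (bool, error) {
	_, err := os.Lstat(scopePath)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, fs.ErrNotExist):
		return false, nil
	default:
		return false, err
	}
}

func main() {
	// Placeholder path standing in for the libpod-<id>.scope from the logs.
	exists, err := scopeStillExists("/sys/fs/cgroup/machine.slice/libpod-0000000000000000.scope")
	fmt.Println("scope exists:", exists, "err:", err)
}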

@vrothberg
Member

I agree.

edsantiago added a commit to edsantiago/libpod that referenced this issue Jan 10, 2023
To silence my find-obsolete-skips script:
 - containers#11784 : issue closed wont-fix
 - containers#15013 : issue closed, we no longer test with runc
 - containers#15014 : bump timeout, see if that fixes things
 - containers#15025 : issue closed, we no longer test with runc

...and one FIXME not associated with an issue, ubuntu-related,
and we no longer test ubuntu.

Signed-off-by: Ed Santiago <santiago@redhat.com>
edsantiago added a commit to edsantiago/libpod that referenced this issue Jul 13, 2023
To silence my find-obsolete-skips script, remove the '#'
from the following issues in skip messages:

  containers#11784 containers#15013 containers#15025 containers#17433 containers#17436 containers#17456

Also update the messages to reflect the fact that the issues
will never be fixed.

Also remove ubuntu skips: we no longer test ubuntu.

Also remove one buildah skip that is no longer applicable:

Fixes: containers#17520

Signed-off-by: Ed Santiago <santiago@redhat.com>
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023