podman stop, after kube play: Storage for container xxx has been removed #19702

Closed
edsantiago opened this issue Aug 22, 2023 · 21 comments · Fixed by #20456
Labels
flakes (Flakes from Continuous Integration) · locked - please file new issue/PR (Assist humans wanting to comment on an old issue or PR with locked comments)

Comments

@edsantiago
Member

Seeing this often when I cherrypick #18442

...podman kube something
...test passes, then we get to AfterEach:
# podman [options] stop --all -t 0
           time="2023-08-22T07:11:18-05:00" level=error msg="Storage for container 5ec989fdc4954fe14c74f4e34345a34aeec61a0e2ee518219b01fa06e0445fca has been removed"
           59ded94ff300dd69bf0199a8f360cc1ac5a5262b03ccd34458407b421a765013
           4d93d65b97916b0b650c4ff50e4a1f1a8a4c657780c07a4a7546b7a96fc0b125
           5ec989fdc4954fe14c74f4e34345a34aeec61a0e2ee518219b01fa06e0445fca

FAIL expected no stderr

e.g. f38 root. Twice.
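For anyone who wants to poke at this outside CI, here is a rough manual approximation of the failing flow, assuming a throwaway YAML generated from a scratch container. This is illustrative only, not the actual e2e test; names and paths are made up:

```bash
#!/usr/bin/env bash
# Hand-rolled approximation of the failing sequence: kube play a pod, then run
# the AfterEach-style teardown and insist that `podman stop --all` stays silent
# on stderr. Illustrative only; the real test lives in the Ginkgo e2e suite.
set -euo pipefail

yaml=$(mktemp --suffix=.yaml)
podman create --name seedctr quay.io/libpod/testimage:20221018 top >/dev/null
podman kube generate seedctr > "$yaml"
podman rm seedctr >/dev/null

podman kube play "$yaml" >/dev/null

stderr=$(podman stop --all -t 0 2>&1 >/dev/null)   # the AfterEach step
podman pod rm -a -t 0 -f >/dev/null
podman rm -a -f >/dev/null
rm -f "$yaml"

if [ -n "$stderr" ]; then
    echo "FAIL expected no stderr:"
    echo "$stderr"
    exit 1
fi
```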

@edsantiago
Member Author

The list so far:

  • fedora-37 : int podman fedora-37 root host sqlite
    • 08-22 23:24 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
    • 08-22 23:24 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved CIDFile annotation in yaml
    • 08-22 20:02 in TOP-LEVEL [AfterEach] Podman start podman start single container by id
    • 08-22 08:24 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
  • fedora-37 : int podman fedora-37 rootless host sqlite
    • 08-22 23:21 in TOP-LEVEL [AfterEach] Podman run networking podman run network bind to 127.0.0.1
    • 08-22 23:21 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Apparmor annotation in yaml
    • 08-22 19:57 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
    • 08-22 08:18 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved PublishAll annotation in yaml
    • 08-22 08:18 in TOP-LEVEL [AfterEach] Podman run networking podman run --net container: and --uts container:
  • fedora-38 : int podman fedora-38 root container sqlite
    • 08-22 20:00 in TOP-LEVEL [AfterEach] Podman pod create podman start infra container different image
  • fedora-38 : int podman fedora-38 root host boltdb
    • 08-22 23:21 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
  • fedora-38 : int podman fedora-38 root host sqlite
    • 08-22 08:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved init annotation in yaml
    • 08-22 08:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
  • fedora-38 : int podman fedora-38 rootless host boltdb
    • 08-22 23:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Seccomp annotation in yaml
  • fedora-38 : int podman fedora-38 rootless host sqlite
    • 08-22 23:22 in TOP-LEVEL [AfterEach] Podman container clone podman container clone basic test
  • rawhide : int podman rawhide rootless host sqlite
    • 08-22 23:20 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved CIDFile annotation in yaml
    • 08-22 19:58 in TOP-LEVEL [AfterEach] Podman play kube podman play kube deployment more than 1 replica test correct command

@edsantiago
Member Author

New correlated symptom seen in f38 root:

# podman [options] stop --all -t 0
error opening file `/run/crun/b9824ce14f1c9cc3a51644f709ac9cf48a57ecf10acce92b55346c156a3d7fcb/status`: No such file or directory
time="2023-08-26T17:00:08-05:00" level=error msg="Storage for container b9824ce14f1c9cc3a51644f709ac9cf48a57ecf10acce92b55346c156a3d7fcb has been removed"

This is in #17831 with @giuseppe's #19760 cherrypicked. Could be coincidence. I can't look into it now.

@edsantiago
Member Author

I'm giving up on this: I am pulling the stderr-on-teardown checks from my flake-check PR. It's too much, costing me way too much time between this and #19721. Until these two are fixed, I can't justify the time it takes me to sort through these flakes.

FWIW, here is the catalog so far:

  • fedora-37 : int podman fedora-37 root host sqlite
    • 08-27 21:45 in TOP-LEVEL [AfterEach] Podman play kube podman play kube test with hostPID
    • 08-27 09:09 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
    • 08-23 23:14 in TOP-LEVEL [AfterEach] Podman play kube podman play kube should not rename pod if container in pod has same name
    • 08-23 17:36 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
    • 08-23 17:36 in TOP-LEVEL [AfterEach] Podman kube generate podman generate kube - --privileged container
    • 08-23 13:41 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Apparmor annotation in yaml
    • 08-22 23:24 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
    • 08-22 23:24 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved CIDFile annotation in yaml
    • 08-22 20:02 in TOP-LEVEL [AfterEach] Podman start podman start single container by id
    • 08-22 08:24 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
  • fedora-37 : int podman fedora-37 rootless host sqlite
    • 08-28 13:20 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved PublishAll annotation in yaml
    • 08-28 11:35 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Seccomp annotation in yaml
    • 08-27 21:41 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved CIDFile annotation in yaml
    • 08-26 18:21 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
    • 08-24 19:02 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved init annotation in yaml
    • 08-24 11:21 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved privileged annotation in yaml
    • 08-23 23:11 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved init annotation in yaml
    • 08-23 13:39 in TOP-LEVEL [AfterEach] Podman play kube podman play kube RunAsUser
    • 08-22 23:21 in TOP-LEVEL [AfterEach] Podman run networking podman run network bind to 127.0.0.1
    • 08-22 23:21 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Apparmor annotation in yaml
    • 08-22 19:57 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
    • 08-22 08:18 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved PublishAll annotation in yaml
    • 08-22 08:18 in TOP-LEVEL [AfterEach] Podman run networking podman run --net container: and --uts container:
  • fedora-38 : int podman fedora-38 root container sqlite
    • 08-28 21:51 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
    • 08-28 08:57 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
    • 08-22 20:00 in TOP-LEVEL [AfterEach] Podman pod create podman start infra container different image
  • fedora-38 : int podman fedora-38 root host boltdb
    • 08-24 18:59 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved init annotation in yaml
    • 08-24 18:59 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Seccomp annotation in yaml
    • 08-24 11:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
    • 08-22 23:21 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
  • fedora-38 : int podman fedora-38 root host sqlite
    • 08-28 08:53 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved privileged annotation in yaml
    • 08-27 21:41 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
    • 08-26 18:23 in TOP-LEVEL [AfterEach] Podman play kube podman play kube test with hostPID
    • 08-23 17:32 in TOP-LEVEL [AfterEach] Podman run networking podman run network bind to 127.0.0.1
    • 08-23 17:32 in TOP-LEVEL [AfterEach] Podman pod create podman start infra container different image
    • 08-22 08:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved init annotation in yaml
    • 08-22 08:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Label annotation in yaml
  • fedora-38 : int podman fedora-38 rootless host boltdb
    • 08-22 23:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Seccomp annotation in yaml
  • fedora-38 : int podman fedora-38 rootless host sqlite
    • 08-23 17:30 in TOP-LEVEL [AfterEach] Podman container clone podman container clone basic test
    • 08-22 23:22 in TOP-LEVEL [AfterEach] Podman container clone podman container clone basic test
  • rawhide : int podman rawhide root host sqlite
    • 08-28 13:19 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with --no-trunc
    • 08-28 11:37 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved privileged annotation in yaml
    • 08-23 17:31 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved PublishAll annotation in yaml
    • 08-23 17:31 in TOP-LEVEL [AfterEach] Podman pod create podman start infra container different image
    • 08-23 13:36 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved Seccomp annotation in yaml
  • rawhide : int podman rawhide rootless host sqlite
    • 08-28 13:17 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with image data
    • 08-23 17:30 in TOP-LEVEL [AfterEach] Podman play kube podman play kube with auto update annotations for first container only
    • 08-22 23:20 in TOP-LEVEL [AfterEach] Podman play kube podman kube play test with reserved CIDFile annotation in yaml
    • 08-22 19:58 in TOP-LEVEL [AfterEach] Podman play kube podman play kube deployment more than 1 replica test correct command

Seen in: int podman fedora-37/fedora-38/rawhide root/rootless container/host boltdb/sqlite

@vrothberg
Member

I looked into it. We could demote the log from error to info, as other code locations do, but I want to understand how this can happen.

@edsantiago, let me know if this is urgent. I can make a change without fully understanding what's going on.

@vrothberg
Member

More background in cabe134

@edsantiago
Member Author

@vrothberg thank you, this is not urgent from my perspective. The background is: @Luap99 and I would really like to enable checks in CI for spurious warnings. We can't actually do that, because there are soooooooo many, but every few months I try and see what new ones have cropped up. This one and #19721 are, I think, new warnings since the last time I ran checks. "New", to me, means that something changed recently, and me whining loudly might trigger a memory in someone.

Of course, priority may change if this starts showing up in the field.

@vrothberg
Member

I stared at the code for a long time and tried to reproduce, but have not been successful so far. There's clearly a bug somewhere: Podman should not attempt to unmount a container's root FS when it is not mounted (anymore). I'd like to get to the root of that.

What I find curious: @edsantiago, we don't see the error log in the system tests, do we?

@edsantiago
Member Author

It's a warning message that doesn't affect exit status, and we don't check for those in system tests.

@edsantiago
Member Author

...and it's high time that I do something about that (checking for warnings in system tests). I have a proof-of-concept, it's working nicely, but I now need to go through the dozens of failures looking for which are bugs and which are genuinely ok. That will be next week. TTFN.
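For reference, the core of such a check can be sketched in a few lines of shell; this is a hypothetical illustration, not the actual bats helper from the proof-of-concept (the wrapper name is made up):

```bash
# Hypothetical "no stray warnings" wrapper: run a podman command and fail if
# anything that looks like a logrus warning or error leaks onto its output.
run_podman_quiet() {
    local output
    output=$(podman "$@" 2>&1) || { echo "$output" >&2; return 1; }
    if grep -Eq 'level=(warning|error)' <<<"$output"; then
        echo "unexpected warning/error from 'podman $*':" >&2
        echo "$output" >&2
        return 1
    fi
    printf '%s\n' "$output"
}
```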

@edsantiago
Member Author

And, now that we're checking for warnings in system tests, here we are (f38 rootless):

<+010ms> # $ container inspect --format {{.HostConfig.NetworkMode}} 7473a06df14dd276e9ac3a082c5ec5041cf21f5b2927ef4133a71a0edeabdf61

<+013ms> # $ stop -a -t 0
<+286ms> # e49f3b5cd6d7e85bbbacc22cef8af9cd78d1cd85579b086e4528724e93595876
         # 7473a06df14dd276e9ac3a082c5ec5041cf21f5b2927ef4133a71a0edeabdf61
         #
<+009ms> # $ pod rm -t 0 -f test_pod
<+180ms> # time="2023-09-14T20:24:11-05:00" level=error msg="Unable to clean up network for container 7473a06df14dd276e9ac3a082c5ec5041cf21f5b2927ef4133a71a0edeabdf61: \"unmounting network namespace for container 7473a06df14dd276e9ac3a082c5ec5041cf21f5b2927ef4133a71a0edeabdf61: failed to unmount NS: at /run/user/5306/netns/netns-14948909-38e3-c3db-a42b-db7a3fcac98b: no such file or directory\""
         # time="2023-09-14T20:24:11-05:00" level=error msg="Storage for container 7473a06df14dd276e9ac3a082c5ec5041cf21f5b2927ef4133a71a0edeabdf61 has been removed"
         # 74a54e482c6d98ea12b6bd4a5cb2ce96fe6a008a60b92e6bfe65dd66495e3a0d

(I don't know why the podman command is gone. I'll deal with that next week.)

@vrothberg
Member

@edsantiago, I am under the impression that we only get this error in the context of pods (and kube, which uses pods). Have you seen this error when removing a container that is not part of a pod?

@edsantiago
Member Author

@vrothberg TBH I have no idea. I tend to look at flakes and let my hindbrain look for common patterns, only delving deep when necessary. Here my brain noticed "kube play"; it will take deliberate effort to poke deeper. It's now on my TODO list.

@vrothberg
Member

vrothberg commented Sep 18, 2023

Thanks, @edsantiago !

I took a look at the code before PTO but could not find any obvious issue. With cabe134 also giving no clear indication of how this can happen, I think we're in for a treasure hunt.

Re: pods: #4033 mentions --pod=xxx as well, which increases my confidence that we need a closer look at the code for removing containers inside a pod. There are a number of conditionals (also impacting locking) which may reveal what's going on.

@edsantiago
Member Author

@vrothberg in looking at the flake lists above, I noticed this one, container clone, which has nothing to do with kube. I wrote a reproducer, ran it, and bam, in about ten minutes:

# while :;do /tmp/foo2.sh || break;done
...

ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7
1147f28dc5aba39d7646ce7dff672b55cf0c74c78dce92ed2315b18321a745e2
9300e5454658f3d4d0d522c98f65eebfe31ef0a07e55cd55574227191f88c371
ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7
ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7
9300e5454658f3d4d0d522c98f65eebfe31ef0a07e55cd55574227191f88c371
1147f28dc5aba39d7646ce7dff672b55cf0c74c78dce92ed2315b18321a745e2

FAILED
time="2023-09-18T12:29:25-04:00" level=error msg="IPAM error: failed to get ips for container ID ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7 on network podman"
time="2023-09-18T12:29:25-04:00" level=error msg="IPAM error: failed to find ip for subnet 10.88.0.0/16 on network podman"
time="2023-09-18T12:29:25-04:00" level=error msg="tearing down network namespace configuration for container ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7: netavark: open container netns: open /run/netns/netns-be34b3e0-1ebe-20e7-54c4-1cdbc8b1546c: IO error: No such file or directory (os error 2)"
time="2023-09-18T12:29:25-04:00" level=error msg="Unable to clean up network for container ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7: \"unmounting network namespace for container ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7: failed to unmount NS: at /run/netns/netns-be34b3e0-1ebe-20e7-54c4-1cdbc8b1546c: no such file or directory\""
time="2023-09-18T12:29:25-04:00" level=error msg="Storage for container ca7fff1bbc5ad7643201687a4898619bf158f2a09b01299c1ae614fc687e5ac7 has been removed"

Because I was sloppy & lazy, I don't know if the errors are coming from the stop or the rm. I'm now trying to narrow it down. It might be a while, because I'm also trying to see if it reproduces with testimage:20221018. So far it does not, which suggests something differs between that and alpine; my first guess is that the alpine entrypoint is sh, but I'm really just guessing.
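(The script itself isn't pasted above, so here is a hypothetical reconstruction of what such a clone-based reproducer might look like; image, names, and ordering are guesses modeled on the "container clone basic test", not the real /tmp/foo2.sh:)

```bash
#!/usr/bin/env bash
# Hypothetical reconstruction of the reproducer loop (the real /tmp/foo2.sh is
# not shown in this thread): create a container, clone it twice, start one
# clone, then stop/rm everything and watch stderr.

one_iteration() {
    podman create --name base quay.io/libpod/testimage:20221018 sh
    podman container clone base base-clone
    podman container clone base base-clone1
    podman start base-clone1

    err=$(podman stop --all -t 0 2>&1 >/dev/null)
    err+=$(podman rm -a -f -t 0 2>&1 >/dev/null)

    if [ -n "$err" ]; then
        echo FAILED
        echo "$err"
        return 1
    fi
}

while :; do one_iteration || break; done
```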

I am 99% confident that this is correlated with #19721 (I see the correlation in my flake logs too) but I can't begin to guess what the connection is.

@edsantiago
Member Author

With testimage, no failure. But with create testimage sh, it fails. The new reproducer includes the while loop, so just run it:

# /tmp/foo3.sh
...
e57c422616e5467aed877756aaae65eabb17782e0c62c9972064090ea9ca5aa8
1dc64b57e7fa8c6055667d5c4cb928d4288ceb1a6bd4b66ecbc9ecce85553a6b
1deab6842e13f58a7db6f025c9f04c05437bedddb98cbebf03feed3ab13e6e49
e57c422616e5467aed877756aaae65eabb17782e0c62c9972064090ea9ca5aa8

FAILED IN STOP
time="2023-09-18T13:14:55-04:00" level=error msg="IPAM error: failed to get ips for container ID e57c422616e5467aed877756aaae65eabb17782e0c62c9972064090ea9ca5aa8 on network podman"
time="2023-09-18T13:14:55-04:00" level=error msg="IPAM error: failed to find ip for subnet 10.88.0.0/16 on network podman"
time="2023-09-18T13:14:55-04:00" level=error msg="tearing down network namespace configuration for container e57c422616e5467aed877756aaae65eabb17782e0c62c9972064090ea9ca5aa8: netavark: open container netns: open /run/netns/netns-75483bef-09ef-d471-4538-1dca691fc819: IO error: No such file or directory (os error 2)"
time="2023-09-18T13:14:55-04:00" level=error msg="Unable to clean up network for container e57c422616e5467aed877756aaae65eabb17782e0c62c9972064090ea9ca5aa8: \"unmounting network namespace for container e57c422616e5467aed877756aaae65eabb17782e0c62c9972064090ea9ca5aa8: failed to unmount NS: at /run/netns/netns-75483bef-09ef-d471-4538-1dca691fc819: no such file or directory\""
time="2023-09-18T13:14:55-04:00" level=error msg="Storage for container e57c422616e5467aed877756aaae65eabb17782e0c62c9972064090ea9ca5aa8 has been removed"
# bin/podman ps -a
CONTAINER ID  IMAGE                              COMMAND     CREATED        STATUS                    PORTS       NAMES
1dc64b57e7fa  quay.io/libpod/testimage:20221018  sh          6 minutes ago  Created                               exciting_chaum
1deab6842e13  quay.io/libpod/testimage:20221018  sh          6 minutes ago  Created                               exciting_chaum-clone
e57c422616e5  quay.io/libpod/testimage:20221018  sh          6 minutes ago  Exited (0) 6 minutes ago              exciting_chaum-clone1

@edsantiago
Member Author

It finally failed with plain testimage (no sh) but it took much longer. Failure looks identical to my eye:

...
65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e
1a1ca5ade903fdcb212ee27449e91b29ebc5ca768fe36f03bab1e8341bbb3081
65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e
6a6a1673966acb0c6afa809675419f1589d1e189acb5e5358262f789ff299232

FAILED IN STOP
time="2023-09-18T16:46:34-04:00" level=error msg="IPAM error: failed to get ips for container ID 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e on network podman"
time="2023-09-18T16:46:34-04:00" level=error msg="IPAM error: failed to find ip for subnet 10.88.0.0/16 on network podman"
time="2023-09-18T16:46:34-04:00" level=error msg="tearing down network namespace configuration for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e: netavark: open container netns: open /run/netns/netns-85a4b447-1c5b-7d70-50dd-1796ad8e5fe5: IO error: No such file or directory (os error 2)"
time="2023-09-18T16:46:34-04:00" level=error msg="Unable to clean up network for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e: \"unmounting network namespace for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e: failed to unmount NS: at /run/netns/netns-85a4b447-1c5b-7d70-50dd-1796ad8e5fe5: no such file or directory\""
time="2023-09-18T16:46:34-04:00" level=error msg="Storage for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e has been removed"

@Luap99
Member

Luap99 commented Sep 19, 2023

It finally failed with plain testimage (no sh) but it took much longer. Failure looks identical to my eye:

...
65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e
1a1ca5ade903fdcb212ee27449e91b29ebc5ca768fe36f03bab1e8341bbb3081
65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e
6a6a1673966acb0c6afa809675419f1589d1e189acb5e5358262f789ff299232

FAILED IN STOP
time="2023-09-18T16:46:34-04:00" level=error msg="IPAM error: failed to get ips for container ID 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e on network podman"
time="2023-09-18T16:46:34-04:00" level=error msg="IPAM error: failed to find ip for subnet 10.88.0.0/16 on network podman"
time="2023-09-18T16:46:34-04:00" level=error msg="tearing down network namespace configuration for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e: netavark: open container netns: open /run/netns/netns-85a4b447-1c5b-7d70-50dd-1796ad8e5fe5: IO error: No such file or directory (os error 2)"
time="2023-09-18T16:46:34-04:00" level=error msg="Unable to clean up network for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e: \"unmounting network namespace for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e: failed to unmount NS: at /run/netns/netns-85a4b447-1c5b-7d70-50dd-1796ad8e5fe5: no such file or directory\""
time="2023-09-18T16:46:34-04:00" level=error msg="Storage for container 65c2dfb97b929bdcb69a37ec26990f03ef814b0f668fe9d3c88847a511e6288e has been removed"

This shows that we try to clean up twice, which cannot work.

@edsantiago
Member Author

The past two weeks. Mostly in #17831 except for the one in sys tests, where there are no flake retries:

  • fedora-37 : int podman fedora-37 root host sqlite
    • 09-22 12:12 in Podman kube play test with reserved volumes-from annotation in yaml
  • fedora-38 : int podman fedora-38 root host sqlite
    • 09-26 19:10 in Podman kube play test with reserved volumes-from annotation in yaml
  • fedora-38 : int podman fedora-38 rootless host sqlite
    • 09-27 10:49 in Podman kube play test with reserved volumes-from annotation in yaml
  • fedora-38 : sys podman fedora-38 rootless host sqlite
  • fedora-39 : int podman fedora-39 root host sqlite
    • 09-28 08:39 in Podman kube play test with reserved volumes-from annotation in yaml
  • fedora-39 : int podman fedora-39 rootless host boltdb
    • 09-28 08:36 in Podman kube play test with reserved volumes-from annotation in yaml
    • 09-26 19:01 in Podman kube play test with reserved volumes-from annotation in yaml

Seen in: int/sys fedora-37/fedora-38/fedora-39 root/rootless boltdb/sqlite

@giuseppe
Member

Is this still happening? 8ac2aa7 could have fixed it.

@edsantiago
Member Author

Last seen Oct 11, and a quick check shows that some of these failures included #20299, so I'm reluctant to close just yet. ("Quick check" means scrolling to the top of the error log, clicking Base commit, and doing an in-page search for "20299", which merged Oct 9; a scripted git equivalent is sketched after the list below.)

  • fedora-38 : int podman fedora-38 root host sqlite
    • 10-09 12:18 in Podman kube play test with reserved volumes-from annotation in yaml
  • fedora-38 : sys podman fedora-38 rootless host sqlite
  • fedora-39β : int podman fedora-39β root host sqlite
    • 10-11 16:57 in Podman kube play test with reserved volumes-from annotation in yaml
    • 10-10 20:22 in Podman kube play test with reserved volumes-from annotation in yaml
  • fedora-39β : int podman fedora-39β rootless host boltdb
    • 10-11 09:30 in Podman kube play test with reserved volumes-from annotation in yaml
    • 10-10 16:44 in Podman kube play test with reserved volumes-from annotation in yaml
  • rawhide : int podman rawhide rootless host sqlite
    • 10-11 07:57 in Podman kube play test with reserved volumes-from annotation in yaml

Seen in: int+sys podman fedora-38+fedora-39β+rawhide root+rootless host boltdb+sqlite
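As an aside, that manual base-commit check can also be scripted. A hypothetical git equivalent of what's described above, where BASE_COMMIT is a placeholder for the SHA shown at the top of the error log:

```bash
# Does the CI run's base commit already contain the merge of #20299?
BASE_COMMIT=0123abc   # placeholder: copy the "Base commit" SHA from the CI log
pr_merge=$(git log origin/main --merges --grep='#20299' --format=%H -n1)
if git merge-base --is-ancestor "$pr_merge" "$BASE_COMMIT"; then
    echo "base commit includes #20299"
else
    echo "base commit predates #20299"
fi
```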

@edsantiago
Member Author

Today, with up-to-date main. f38 root.

rhatdan added a commit to rhatdan/podman that referenced this issue Oct 23, 2023
There is a potential race condition where we see a message about a
removed container that could in fact be caused by a container that is
not mounted; this change should clarify which one is causing it.

Also, if the container does not exist, just warn the user instead of
reporting an error; there is not much the user can do.

Fixes: containers#19702

[NO NEW TESTS NEEDED]

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
github-actions bot added the locked - please file new issue/PR label Jan 23, 2024
github-actions bot locked as resolved and limited conversation to collaborators Jan 23, 2024