Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermediate layers for tagged images are incorrectly pruned #10832

Closed
Enchufa2 opened this issue Jul 1, 2021 · 43 comments · Fixed by containers/common#684 or #10983
Closed

Intermediate layers for tagged images are incorrectly pruned #10832

Enchufa2 opened this issue Jul 1, 2021 · 43 comments · Fixed by containers/common#684 or #10983
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@Enchufa2
Copy link

Enchufa2 commented Jul 1, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

podman-image-prune incorrectly prunes intermediate layers for tagged images.

Steps to reproduce the issue:

Consider the following Dockerfile:

FROM scratch
ENV test1=test1
ENV test2=test2

then,

$ podman build . -t test
STEP 1: FROM scratch
STEP 2: ENV test1=test1
--> f106d67818d
STEP 3: ENV test2=test2
STEP 4: COMMIT test
--> 3775b6bda38
Successfully tagged localhost/test:latest
3775b6bda38dc3cf77b00e540706c367bc7d6fff606bbc7eab75842fcb2f41a8
$ podman image prune 
WARNING! This will remove all dangling images.
Are you sure you want to continue? [y/N] y
f106d67818dd8ee583d3fb3052a87705ac8c953c9c9e3ce453b51a9da57e1271

Describe the results you received:

As you can see, intermediate layer f106d67818d was removed.

Describe the results you expected:

It shouldn't be removed. FWIW, docker doesn't remove it.

Output of podman version:

Version:      3.2.1
API Version:  3.2.1
Go Version:   go1.16.3
Built:        Mon Jun 14 21:12:29 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.21.0
  cgroupControllers: []
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.27-2.fc34.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 8
  distribution:
    distribution: fedora
    version: "34"
  eventLogger: journald
  hostname: ****
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.12.11-300.fc34.x86_64
  linkmode: dynamic
  memFree: 6931976192
  memTotal: 16467206144
  ociRuntime:
    name: crun
    package: crun-0.20.1-1.fc34.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.20.1
      commit: 0d42f1109fd73548f44b01b3e84d04a279e99d2e
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.9-1.fc34.x86_64
    version: |-
      slirp4netns version 1.1.8+dev
      commit: 6dc0186e020232ae1a6fcc1f7afbc3ea02fd3876
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 14918758400
  swapTotal: 16932397056
  uptime: 50h 1m 6.28s (Approximately 2.08 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/****/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.fc34.x86_64
      Version: |-
        fusermount3 version: 3.10.4
        fuse-overlayfs: version 1.5
        FUSE library version 3.10.4
        using FUSE kernel interface version 7.31
  graphRoot: /home/****/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 12
  runRoot: /run/user/1000
  volumePath: /home/****/.local/share/containers/storage/volumes
version:
  APIVersion: 3.2.1
  Built: 1623697949
  BuiltTime: Mon Jun 14 21:12:29 2021
  GitCommit: ""
  GoVersion: go1.16.3
  OsArch: linux/amd64
  Version: 3.2.1

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.2.1-1.fc34.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

No

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 1, 2021
@mheon
Copy link
Member

mheon commented Jul 1, 2021

@vrothberg PTAL. I think this is definitely the recursive pruning grabbing it, though I'm not sure why Docker would bother retaining it?

@vrothberg
Copy link
Member

Thanks for reaching out, @Enchufa2!

I am not sure that's related to it being recursive. Per Docker docs: "_ A dangling image is one that is not tagged and is not referenced by any container_"

Which is the case for the intermediate image. It looks like Docker behavior doesn't match its docs.

@Enchufa2, can you elaborate why removing the intermediate image may not be desirable?

Note: it predates the libimage work.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

Let me explain my use case. I'm developing an application composed of several microservices closely interrelated, and I have a multistage Dockerfile that looks like this:

FROM centos:7 as base
# some basic configuration stuff
# update the base and clean the base image
FROM base as common
# install common software and other stuff
FROM common as service1
# install and configure specific stuff
FROM common as service2
# etc.

and I have a compose file with all the targets. Now, suppose I just modify some piece of code, some configuration file... that is added in the last layers of the services. Then, I rebuild, and I end up with something like this

$ podman images
REPOSITORY                        TAG     IMAGE ID
localhost/project_service2        latest  service2_id_2
localhost/project_service1        latest  service1_id_2
<none>                            <none>  service2_id_1
<none>                            <none>  service1_id_1
localhost/project_common          latest  common_id_1
localhost/project_base            latest  base_id_1

After several changes, several rebuilds, the list of "none" grows rapidly, as you imagine. What I want is to prune those dangling ones without having to do it by hand, one by one, but without losing all the intermediate steps in my tagged images. Otherwise, the next time I rebuild, I lost cache for common and everything is rebuilt from the start, which is very time consuming (and the whole idea of a multistage build no longer makes sense).

BTW, to my understanding, the layer f106d67818d above is part of a tagged image, it's not dangling.

@vrothberg
Copy link
Member

Thank you, that was quite helpful. So it seems may need to extend the definition of dangling: it's an image without a tag and without a tagged parent image.

@mtrmac @nalind, I'd like to know your thoughts on that.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

Also FWIW, I always assumed that the behaviour was the same as docker, and thus I've been running podman image prune for this project for months without any trouble. I detected this problem just last week and I was quite surprised. Is it possible that v3.2.0 and previous versions were working as I intended and v3.2.1 changed behaviour?

@vrothberg
Copy link
Member

I tried Podman v3.0.1 which is also pruning, so it looks like Podman always did that.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

Weird. What changed then? How was I able to rebuild after pruning without rebuilding everything?

@vrothberg
Copy link
Member

Weird. What changed then? How was I able to rebuild after pruning without rebuilding everything?

I only tested the reproducer of this issue. Maybe there is more to it?

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

Thank you, that was quite helpful. So it seems may need to extend the definition of dangling: it's an image without a tag and without a tagged parent image.

To confirm, do mean that only images with the immediate parent tagged (but not more distant children) would be excluded? “Without a tagged ancestor” would stop pruning almost all images AFAICS.


If you have looked into this, can you write down the exact shape of the layer/tag graphs, please?

I’m guessing it is something like

base-image -> base-config, base-top-layer
intermediate-image -> intermediate-config, base-top-layer (used to represent config only)
final-image -> final-config, base-top-layer

or do the intermediate images have their own top layers on top of base-top-layer?


Either way my first very vague uninformed impression is to view it the other way, that the problem is not really the pruning, but the caching: even if we pruned intermediate-image, it seems possible to me that a build that reuses base-image could after the two or more ENV steps notice that it is equivalent to final-image, and continue using the cache for later build steps. (I haven’t checked, it might be too much code require exponential time to find).

If we are not changing the filesystem, we don’t really need to reuse a parent, the config changes alone are pretty cheap. And the current pruning rule seems easier to explain to users (“after a podman image prune, the image list will not show any untagged images”); if we do have to preserve the cache images, we can’t tell anymore, all that easily, whether those cache images are required for build caching of the existing descendant images, or whether all the relevant descendants have already been removed and the unnamed image is truly dangling.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

Another issue, probably related to this. Consider the Dockerfile in my first comment. Then, we produce a first build:

$ podman build . -t test        
STEP 1: FROM scratch
STEP 2: ENV test1=test1
--> a6e0cbdfa3e
STEP 3: ENV test2=test2
STEP 4: COMMIT test
--> d82f2c78728
d82f2c78728198673f9ee354ce85d453de5c93e728ba7eb4628b60dd02e217a1

Then, we change the last line of the Dockerfile to produce a second build:

$ podman build . -t test
STEP 1: FROM scratch
STEP 2: ENV test1=test1
--> Using cache a6e0cbdfa3eb3649be86041e77b16ffff8d8ad1ecdbc157f6891d6c7eb7d1c6c
--> a6e0cbdfa3e
STEP 3: ENV test2=test3
STEP 4: COMMIT test
--> 53bde127cab
53bde127cab68f853eb438f1c15fd7539853545ababee07cd58e5d48f4b0c5bc

Now, I see this, which doesn't look right to me:

$ podman images
REPOSITORY                                 TAG      IMAGE ID      CREATED        SIZE
localhost/test                             latest   53bde127cab6  2 seconds ago  897 B
<none>                                     <none>   a6e0cbdfa3eb  7 seconds ago  769 B
<none>                                     <none>   d82f2c787281  7 seconds ago  897 B

Why, because a6e0cbdfa3eb is an intermediate step for 53bde127cab6 (and for d82f2c787281!). So even if I clean images one by one by hand, I need to track down what is being used to avoid removing it! This doesn't feel right.

I tried the same with docker, and docker

  1. doesn't show the intermediate a6e0cbdfa3eb;
  2. only removes d82f2c787281 when I run docker image prune.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

the current pruning rule seems easier to explain to users

I don't think so. The current pruning rule basically implies squashing all tagged images into a single layer. I don't think any user expects that. If I'd like to squash it, I would have added --squash to podman build.

(“after a podman image prune, the image list will not show any untagged images”)

Exactly. Then, why is it removing things I don't see when I list the images available?

if we do have to preserve the cache images, we can’t tell anymore, all that easily, whether those cache images are required for build caching of the existing descendant images, or whether all the relevant descendants have already been removed and the unnamed image is truly dangling.

I don't know podman's internals, but it doesn't seem too complicated to keep track of this. In fact, docker does. It's evident that docker treats leaf and intermediate images differently. When you run docker image prune, docker prunes dangling leaf images, and removes parent intermediate images only if they're not part of any other tagged leaf.

Also, from the discussions e.g., in the moby repo, it is evident that there are many workflows out there relying a cron job that runs docker image prune with some regularity (because, otherwise, the output from docker images quickly becomes unmanageable), and people do not expect to lose their intermediate layers from properly tagged leafs.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

the current pruning rule seems easier to explain to users

I don't think so. The current pruning rule basically implies squashing all tagged images into a single layer.

ENV doesn’t need to produce layers; it’s just extra overhead.


At this point I think we need to see the exact graph shapes to have a specific discussion.

@vrothberg
Copy link
Member

Aaah. Dangling for Docker implies not having children (see https://github.com/moby/moby/blob/master/daemon/images/image_prune.go#L79).

@vrothberg
Copy link
Member

Which in Podman terms is an "intermediate" image.

@vrothberg
Copy link
Member

I think what we need to do is:

diff --git a/pkg/domain/infra/abi/images.go b/pkg/domain/infra/abi/images.go
index 5992181d3c01..940783daba3e 100644
--- a/pkg/domain/infra/abi/images.go
+++ b/pkg/domain/infra/abi/images.go
@@ -46,7 +46,7 @@ func (ir *ImageEngine) Prune(ctx context.Context, opts entities.ImagePruneOption
        }
 
        if !opts.All {
-               pruneOptions.Filters = append(pruneOptions.Filters, "dangling=true")
+               pruneOptions.Filters = append(pruneOptions.Filters, "intermediate=true")
        }
 
        var pruneReports []*reports.PruneReport

Likely, some tests need to be changed as well. @mtrmac WDYT?

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

Either way my first very vague uninformed impression is to view it the other way, that the problem is not really the pruning, but the caching: even if we pruned intermediate-image, it seems possible to me that a build that reuses base-image could after the two or more ENV steps notice that it is equivalent to final-image, and continue using the cache for later build steps. (I haven’t checked, it might be too much code require exponential time to find).

… or to put this another way, a sequence of 1000 ENV and other config-edits might (in some vague conceptual sense) not need to create any storage.Image objects, and the associated I/O and locking overhead, at all, assuming a “sufficiently smart and efficient” implementation. That would probably make everyone happy, with no dangling untagged intermediate images to prune, and no cache breakage on prune. But that “sufficiently smart and efficient” might be an insurmountable ask.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

the current pruning rule seems easier to explain to users

I don't think so. The current pruning rule basically implies squashing all tagged images into a single layer.

ENV doesn’t need to produce layers; it’s just extra overhead.

For me, as a user, one instruction = one layer that is somehow cached. I used ENV above for the sake of the argument, just because it's cheap. But think about multiple RUN instructions installing or configuring stuff.

At this point I think we need to see the exact graph shapes to have a specific discussion.

Apologies, not sure if this is directed to me or to @vrothberg. If you need my help with this, I would need some instructions on how to do such thing. :)

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

I think what we need to do is:

What about #10832 (comment)? Should I open another issue?

@vrothberg
Copy link
Member

What about #10832 (comment)? Should I open another issue?

Yes, that would be great, thank you.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

-               pruneOptions.Filters = append(pruneOptions.Filters, "dangling=true")
+               pruneOptions.Filters = append(pruneOptions.Filters, "intermediate=true")

I can’t see how this should work: if there is a chain of taggedBase → cache1 → cache2 → cache3, with the cache* images untagged descendants, AFAICS with dangling=true this selects cache3 and libimage.Image.remove then recursively removes all cache* images but not taggedBase. With intermediate=true this selects cache1 and cache2, but not cache3, and no layers are actually removed, only some image objects.

Aaah. Dangling for Docker implies not having children (see https://github.com/moby/moby/blob/master/daemon/images/image_prune.go#L79).

So Docker’s prune only removes images with no children, and the proposed fix is to… only remove images with children? I’m confused.


Can we talk directed acyclic graphs and precise shapes and specific examples, please? Or if this seems obvious, feel free to tell me to do my homework, I have only lazily asked for the layer/tag graphs and I didn’t do the work to reproduce this myself.

Right now I certainly don’t know what is going on and it rather feels that the three of us don’t have a shared language, let alone a shared understanding of what the issues and tradeoffs are.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

the current pruning rule seems easier to explain to users

I don't think so. The current pruning rule basically implies squashing all tagged images into a single layer.

ENV doesn’t need to produce layers; it’s just extra overhead.

For me, as a user, one instruction = one layer that is somehow cached.

The “layer” I am talking about is a specific implementation object (containers/storage.Layer).

I used ENV above for the sake of the argument, just because it's cheap. But think about multiple RUN instructions installing or configuring stuff.

The distinction between ENV and RUN, AFAICS, matters.

@vrothberg
Copy link
Member

Can we talk directed acyclic graphs and precise shapes and specific examples, please?

Only untagged leafs should be pruned (recursively)?

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

Sounds intuitive. How does that fail in the original issue?

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

@mtrmac Ok, let me try this. The example above would be:

tag0 → cache1 → tag1

podman image prune removes cache1; docker image prune doesn't.

A more complicated example from my use case would be something like this. First multistage build:

base → cache1 → common → cache2 → cache3 → service1
                                ↳ cache4 → service2

podman image prune removes all caches; docker image prune doesn't remove anything. Then, I modify something, second build:

base → cache1 → common → cache2 → cache3 → <none>
                                ↳ cache4 → <none>
                                ↳ cache5 → service1
                                ↳ cache6 → service2

podman image prune removes all caches plus <none>; docker image prune removes just cache3, cache4 and <none>.

@vrothberg
Copy link
Member

Sounds intuitive. How does that fail in the original issue?

We prune any untagged node, whether it's a leaf or not.

So Docker’s prune only removes images with no children, and the proposed fix is to… only remove images with children? I’m confused.

Yes, the intermediate filter is not the correct one since it's an untagged one with children. I begin to think that the dangling filter should actually be extended to perform a children check.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

And solution in pseudo-code:

dag ← get_dag()
while length(x ← get_untagged_leafs(dag)):
    remove(dag, x)

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

And another argument based on podman's current behaviour. If we have:

base → cache1 → tag1
              ↳ tag2

Then, if we podman rmi either tag1 or tag2, but not both, then cache1 is not removed, as expected. So image prune should be consistent with that.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

@mtrmac Ok, let me try this. The example above would be:

tag0 → cache1 → tag2

Are the objects here images or layers?

podman image prune removes cache1; docker image prune doesn't.

Ahh, so I think I’ve been obtuse. cache1 counts as an libimage.Image.IsDangling image. And even if tag0/cache1/tag2 did share the top layer, the parent/child/dangling/intermediate checks in the pruning algorithm are based on layerTree’s parent/child definition, which reconstructs the links based on history matching; i.e. regardless of what the layer tree looks like (and c/storage.Image doesn’t have any parent links for images), the cache images look like the above.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 1, 2021

@mtrmac Ok, let me try this. The example above would be:

tag0 → cache1 → tag2

Are the objects here images or layers?

I think I'm probably talking layers here. But I'm not sure. :) What's the difference?

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

I begin to think that the dangling filter should actually be extended to perform a children check.

That looks consistent with the linked prune code from a quick check, but I didn’t carefully review their definition of parent/child. (The same filters are used for image list, and the code looks a bit more convoluted there, but I think is also consistent with that.)

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

Are the objects here images or layers?

I think I'm probably talking layers here. But I'm not sure. :) What's the difference?

Never mind, my mistake; the logic in is in image terms (as ~heuristically determined by layerTree, not by physical links), and layers don’t matter much.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 1, 2021

I begin to think that the dangling filter should actually be extended to perform a children check.

That looks consistent with the linked prune code from a quick check

BTW unlike the filters, it’s not at all obvious to me that Image.IsDangling should be modified as well. It would directly break IsIntermediate, which is easily remedied, but more importantly ImageSummary.Dangling is AFAICS completely undocumented. Guessing at its purpose I suspect the current behavior is correct there.

@vrothberg
Copy link
Member

I will take a look today or early next week. Likely a regression from libimage.

@vrothberg
Copy link
Member

Summary from some further investigation:

Podman's definition of dangling so far was "untagged". Docker's definition was "untagged without children". Since Podman aims to be compatible, this should be fixed which is fairly easy.

Looking at the initial issue/reproducer, I think there's some more thinking required:

~ $ cat F1
FROM scratch
ENV test1=test1
ENV test2=test2
~ $ cat F2
FROM scratch
ENV test1=test1
ENV test3=test3
~ $ podman build -f F1 -t test
STEP 1: FROM scratch
STEP 2: ENV test1=test1
--> 4833f21e3ae
STEP 3: ENV test2=test2
STEP 4: COMMIT test
--> 062de00e7b1
Successfully tagged localhost/test:latest
062de00e7b11440e5e687d1bf860e1e2495c71473d3b7ba2af7f7317a4b4b54b
~ $ podman build -f F2
STEP 1: FROM scratch
STEP 2: ENV test1=test1
--> Using cache 4833f21e3ae508c5eaa71de978418b2157402e43714c4dbff7d067d73a10260e
--> 4833f21e3ae
STEP 3: ENV test3=test3
STEP 4: COMMIT
--> 1a611267076
1a61126707673f4789a9a16db0e3015b11dbcec5e9e60cb2263d795d75bcc25c

Now, let's have a look at the images:

~ $ podman images -a
REPOSITORY      TAG         IMAGE ID      CREATED         SIZE
<none>          <none>      1a6112670767  34 seconds ago  897 B
<none>          <none>      4833f21e3ae5  41 seconds ago  770 B
localhost/test  latest      062de00e7b11  41 seconds ago  897 B
~ $ podman image tree 1a6112670767
Image ID: 1a6112670767
Tags:     []
Size:     897B
No Image Layers

~ $ podman image tree 4833f21e3ae5
Image ID: 4833f21e3ae5
Tags:     []
Size:     770B
No Image Layers

~ $ podman image tree 062de00e7b11
Image ID: 062de00e7b11
Tags:     [localhost/test:latest]
Size:     897B
No Image Layers

As @mtrmac mentioned above. None of these images has a physical layer, so Podman is unable to detect any parent-child relation. Docker is able to since it's storing these relations directly in the image store and sets them explicitly during build.

I am fairly certain that Docker won't list two images as parent/child if they aren't built locally but pulled.

My conclusion for now: fixing the dangling filter is easy. Supporting the reproducer may be a tough cookie.

@Enchufa2
Copy link
Author

Enchufa2 commented Jul 9, 2021

The reproducer is quite silly. :) I don't think any user cares about an ENV layer that takes literally no time to rebuild. But if a layer that makes some package installations is removed, that is a problem.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 9, 2021

Podman is unable to detect any parent-child relation

Podman’s parent/child heuristic IIRC does try to pair images with the same top layer using history entries. Is that completely broken, or is that only not working in the case of images with no layers at all? (I honestly thought that it isn’t even possible to construct an image with no layers.) If it’s the latter, that looks like something that should be possible to fix without that much effort.

@vrothberg
Copy link
Member

Podman’s parent/child heuristic IIRC does try to pair images with the same top layer using history entries. Is that completely broken, or is that only not working in the case of images with no layers at all?

The latter. The images do not have any (top) layer, so they cannot be matched.

(I honestly thought that it isn’t even possible to construct an image with no layers.) If it’s the latter, that looks like something that should be possible to fix without that much effort.

Can you elaborate on that? I had a thought (but didn't check if it would be correct) to do the history checks on all "empty" images.

The reproducer is quite silly. :) I don't think any user cares about an ENV layer that takes literally no time to rebuild. But if a layer that makes some package installations is removed, that is a problem.

Note that images and layers are conceptually different. So an image object can be removed while its layers remain in the storage.

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 9, 2021

The latter. The images do not have any (top) layer, so they cannot be matched.

If it’s the latter, that looks like something that should be possible to fix without that much effort.

Can you elaborate on that? I had a thought (but didn't check if it would be correct) to do the history checks on all "empty" images.

Basically that. If we right now match a.topLayer == b.topLayer || a.topID == b.topLayer.Parent, and treat any "" as “no match”, we can just have "" match "" (but not look for "".Parent, of course).

@mtrmac
Copy link
Collaborator

mtrmac commented Jul 9, 2021

BTW we should also match a single-layer image against a possible parent with no layers.

vrothberg added a commit to vrothberg/common that referenced this issue Jul 19, 2021
As discussed in github.com/containers/podman/issues/10832 the definition
of a "dangling" image in Podman has historically been incorrect.  While
the Docker docs describe a dangling image as an image without a tag, and
Podman implemented the filters as such, Docker actually implemented the
filters for images without a tag and without children.

Refine the dangling filters and hence `IsDangling()` to only return true
if an image is untagged and has no children.

Also correct the comments of `IsIntermediate()`.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
vrothberg added a commit to vrothberg/common that referenced this issue Jul 20, 2021
As discussed in github.com/containers/podman/issues/10832 the definition
of a "dangling" image in Podman has historically been incorrect.  While
the Docker docs describe a dangling image as an image without a tag, and
Podman implemented the filters as such, Docker actually implemented the
filters for images without a tag and without children.

Refine the dangling filters and hence `IsDangling()` to only return true
if an image is untagged and has no children.

Also correct the comments of `IsIntermediate()`.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
@vrothberg vrothberg reopened this Jul 21, 2021
vrothberg added a commit to vrothberg/libpod that referenced this issue Jul 22, 2021
By proxy by vendoring containers/common. Previously, a "dangling" image
was an untagged image; just a described in the Docker docs. The
definition of dangling has now been refined to an untagged image without
children to be compatible with Docker.

Further update a redundant image-prune test.

Fixes: containers#10998
Fixes: containers#10832
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
vrothberg added a commit to vrothberg/libpod that referenced this issue Jul 22, 2021
By proxy by vendoring containers/common. Previously, a "dangling" image
was an untagged image; just a described in the Docker docs. The
definition of dangling has now been refined to an untagged image without
children to be compatible with Docker.

Further update a redundant image-prune test.

Fixes: containers#10998
Fixes: containers#10832
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
vrothberg added a commit to vrothberg/libpod that referenced this issue Jul 22, 2021
By proxy by vendoring containers/common. Previously, a "dangling" image
was an untagged image; just a described in the Docker docs. The
definition of dangling has now been refined to an untagged image without
children to be compatible with Docker.

Further update a redundant image-prune test.

Fixes: containers#10998
Fixes: containers#10832
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
vrothberg added a commit to vrothberg/libpod that referenced this issue Jul 26, 2021
By proxy by vendoring containers/common. Previously, a "dangling" image
was an untagged image; just a described in the Docker docs. The
definition of dangling has now been refined to an untagged image without
children to be compatible with Docker.

Further update a redundant image-prune test.

Fixes: containers#10998
Fixes: containers#10832
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
@Enchufa2
Copy link
Author

Enchufa2 commented Sep 3, 2021

No issues so far with v3.3.1 containing this. Thanks again for fixing this issue.

@vrothberg
Copy link
Member

Glad it's working. Thanks for letting us know!

@dustymabe
Copy link
Contributor

Yep - seems to be working for me too.

@vrothberg
Copy link
Member

Excellent, thanks for reporting @dustymabe

dustymabe added a commit to dustymabe/fedora-coreos-pipeline that referenced this issue Oct 19, 2021
jlebon pushed a commit to coreos/fedora-coreos-pipeline that referenced this issue Oct 19, 2021
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants