Alpine images too big (should use multi-staged build) #339

Open
harmv opened this issue Feb 3, 2023 · 15 comments

Comments

@harmv

harmv commented Feb 3, 2023

I noticed that the docker-postgis alpine images are far too big

eg: postgis/postgis 13-3.3-alpine a21d01173429 2 weeks ago 556MB

Cause

The Dockerfile does not use a multi-staged build.

Steps to reproduce:

Actual result

the final image contains, besides the required postgres & postgis binaries, also unneeded build stuff (g++, gcc, clang-dev, perl, autoconf, automake, etc, etc)

Expected

The final image is much smaller.
Only the required binaries are in the final image. (postgres + postgis)
The intermediate build stuff (compilers etc.) is not in the final image.

@ImreSamu
Member

ImreSamu commented Feb 3, 2023

Hi @harmv

Thank you for sending this question!
The large image size is a valid concern, but I don't know how to significantly reduce the size even more.
( Sure I have some ideas, but not in the direction of a multi-stage build. )

The upstream postgres image is also not multi-staged ( https://github.com/docker-library/postgres/blob/master/13/alpine/Dockerfile )
and does not include the build packages.

imho: There must be some way to reduce the size, so if you have some proof-of-concept PR, I'd be happy to look into it.

the final image contains, besides the required postgres & postgis binaries,
also unneeded build stuff (g++, gcc, clang-dev, perl, autoconf, automake, etc, etc)

Could you please provide me with a more detailed test description for this?
How can I detect the packages you have detected?

$ docker pull docker.io/postgis/postgis:13-3.3-alpine

13-3.3-alpine: Pulling from postgis/postgis
Digest: sha256:dea154c9000546b9bcc07cf55646563d7a7401637083c82867472b130bed27b8
Status: Image is up to date for postgis/postgis:13-3.3-alpine
docker.io/postgis/postgis:13-3.3-alpine

$ docker run -it --rm docker.io/postgis/postgis:13-3.3-alpine sh


/ # gcc
sh: gcc: not found
/ # autoconf
sh: autoconf: not found
/ # automake
sh: automake: not found
/ # perl
sh: perl: not found

 
/ # apk info
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.17/main: No such file or directory
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.17/community: No such file or directory
alpine-baselayout-data
musl
busybox
busybox-binsh
alpine-baselayout
alpine-keys
ca-certificates-bundle
libcrypto3
libssl3
ssl_client
zlib
apk-tools
scanelf
musl-utils
libc-utils
xz-libs
libgcc
libstdc++
libuuid
libcom_err
libffi
libverto
krb5-conf
keyutils-libs
krb5-libs
gdbm
libsasl
libldap
ncurses-terminfo-base
ncurses-libs
libedit
libxml2
libgpg-error
libgcrypt
libxslt
zstd-libs
readline
tzdata
icu-libs
icu-data-full
bash
su-exec
zstd
nss_wrapper
.postgresql-rundeps
ca-certificates
openexr
libbz2
brotli-libs
nghttp2-libs
libcurl
cfitsio
libdeflate
libexpat
freexl
geos
giflib
libsz
hdf5
hdf5-cpp
aom-libs
libde265
numactl
x265-libs
libheif
libjpeg-turbo
json-c
kealib
minizip
liburiparser
libkml
mariadb-connector-c
hdf5-hl
netcdf
unixodbc
libtirpc-conf
libtirpc
ogdi
openjpeg
pcre2
libpng
freetype
fontconfig
lcms2
libwebp
tiff
poppler
libpq
sqlite-libs
proj
qhull
libxau
libmd
libbsd
libxdmcp
libxcb
libx11
libxext
libxrender
pixman
cairo
libgeotiff
lz4-libs
librttopo
libspatialite
librasterlite2
xerces-c
gdal
llvm15-libs
pcre
protobuf-c
.postgis-rundeps
/ # 
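
The manual checks above can also be scripted. The following is only a minimal sketch: the helper name find_build_packages is hypothetical, and the list of build packages is simply the one named in the report (g++, gcc, clang-dev, perl, autoconf, automake). It reads a package list such as the `apk info` output on stdin and prints any build-time package it finds.

```shell
# Hypothetical helper: read a package list (one name per line, e.g. the
# output of `apk info`) on stdin and print any build-time packages found.
find_build_packages() {
  # -x matches whole lines only, so runtime libraries such as libgcc or
  # libstdc++ are not mistaken for the gcc or g++ compilers.
  grep -x -E 'gcc|g\+\+|clang-dev|perl|autoconf|automake' || true
}

# Example run on a few names taken from the listing above (prints nothing,
# because none of them is a build package):
printf '%s\n' libgcc libstdc++ bash gdal | find_build_packages
```

Against a real image, the input could come from something like `docker run --rm docker.io/postgis/postgis:13-3.3-alpine apk info 2>/dev/null | find_build_packages`; an empty result would match the listing above.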

@ImreSamu
Member

ImreSamu commented Feb 4, 2023

Hi @harmv,

I think I found the reason for the increase in the alpine image size.

in the build log :
2023-01-30T05:23:59.3772866Z (84/136) Upgrading llvm15-libs (15.0.6-r0 -> 15.0.7-r0)
And now the image has an extra "263 MB /usr/lib/libLLVM-15.so"

The dive CI tool detects this as well:

$ CI=true dive  docker.io/postgis/postgis:13-3.3-alpine
  Using default CI config
Image Source: docker://docker.io/postgis/postgis:13-3.3-alpine
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 76.2937 %
  wastedBytes: 263481244 bytes (264 MB)
  userWastedPercent: 47.9720 %
Inefficient Files:
Count  Wasted Space  File Path
    2        263 MB  /usr/lib/libLLVM-15.so
    2        428 kB  /etc/ssl/certs/ca-certificates.crt
    3        276 kB  /lib/apk/db/installed
    2         56 kB  /usr/local/share/postgresql/postgresql.conf.sample
    3         43 kB  /lib/apk/db/scripts.tar
    2        2.4 kB  /etc/passwd
    2        1.4 kB  /etc/group
    2         874 B  /etc/shadow
    3         414 B  /lib/apk/db/triggers
    3         282 B  /etc/apk/world
    2          86 B  /etc/shells
    2           0 B  /usr/lib/llvm15/lib/libLLVM-15.so
    2           0 B  /usr/bin/unlzma
    2           0 B  /usr/bin/factor
    2           0 B  /usr/bin/uniq
    2           0 B  /usr/bin/unexpand
    2           0 B  /usr/bin/tty
    2           0 B  /usr/bin/truncate
    2           0 B  /usr/bin/tr
    2           0 B  /usr/bin/timeout
    2           0 B  /usr/bin/test
    2           0 B  /usr/bin/tee
    2           0 B  /usr/bin/tail
    2           0 B  /usr/bin/tac
    2           0 B  /usr/bin/sum
    3           0 B  /usr/bin/strings
    2           0 B  /usr/bin/split
    2           0 B  /usr/bin/sort
    2           0 B  /usr/bin/shuf
    2           0 B  /usr/bin/shred
    2           0 B  /usr/bin/sha512sum
    2           0 B  /usr/bin/sha256sum
    2           0 B  /usr/bin/sha1sum
    2           0 B  /usr/bin/seq
    2           0 B  /usr/bin/realpath
    2           0 B  /usr/bin/readlink
    2           0 B  /usr/bin/printf
    2           0 B  /usr/bin/paste
    2           0 B  /usr/bin/od
    2           0 B  /usr/bin/nproc
    2           0 B  /usr/bin/nohup
    2           0 B  /usr/bin/nl
    2           0 B  /usr/bin/mkfifo
    2           0 B  /usr/bin/md5sum
    2           0 B  /usr/bin/lzma
    2           0 B  /usr/bin/lzcat
    2           0 B  /usr/bin/install
    2           0 B  /usr/bin/id
    2           0 B  /usr/bin/hostid
    2           0 B  /usr/bin/head
    2           0 B  /usr/bin/fold
    2           0 B  /usr/bin/unlink
    2           0 B  /usr/bin/expr
    2           0 B  /usr/bin/expand
    2           0 B  /usr/bin/env
    2           0 B  /usr/bin/du
    2           0 B  /usr/bin/dirname
    2           0 B  /usr/bin/cut
    2           0 B  /usr/bin/comm
    2           0 B  /usr/bin/cksum
    2           0 B  /usr/bin/basename
    2           0 B  /usr/bin/awk
    2           0 B  /usr/bin/[
    2           0 B  /tmp
    2           0 B  /lib/apk/exec
    2           0 B  /usr/bin/unxz
    2           0 B  /usr/bin/wc
    2           0 B  /lib/apk/db/lock
    2           0 B  /usr/bin/who
    2           0 B  /usr/bin/whoami
    2           0 B  /usr/bin/xzcat
    2           0 B  /bin/uname
    2           0 B  /bin/true
    2           0 B  /bin/touch
    3           0 B  /bin/tar
    2           0 B  /bin/sync
    2           0 B  /bin/stty
    2           0 B  /usr/bin/yes
    2           0 B  /bin/sleep
    2           0 B  /bin/rmdir
    2           0 B  /bin/rm
    2           0 B  /bin/pwd
    2           0 B  /bin/printenv
    2           0 B  /bin/nice
    2           0 B  /bin/mv
    2           0 B  /bin/mktemp
    2           0 B  /bin/mknod
    2           0 B  /bin/mkdir
    2           0 B  /bin/ls
    2           0 B  /bin/ln
    2           0 B  /bin/false
    2           0 B  /bin/echo
    2           0 B  /bin/df
    2           0 B  /bin/dd
    2           0 B  /bin/date
    2           0 B  /bin/cp
    2           0 B  /bin/chown
    2           0 B  /bin/chmod
    2           0 B  /bin/chgrp
    2           0 B  /bin/cat
    2           0 B  /bin/base64
    2           0 B  /usr/sbin/chroot
    2           0 B  /usr/lib/libLLVM-15.0.6.so
    2           0 B  /bin/stat
Results:
  FAIL: highestUserWastedPercent: too many bytes wasted, relative to the user bytes added (%-user-wasted-bytes=0.4797200053524177 > threshold=0.1)
  SKIP: highestWastedBytes: rule disabled
  FAIL: lowestEfficiency: image efficiency is too low (efficiency=0.7629371198317819 < threshold=0.9)
Result:FAIL [Total:3] [Passed:0] [Failed:2] [Warn:0] [Skipped:1]

(now) I can't think of a better solution than to wait until the base image ( postgres:15-alpine3.17 ) is updated.

Lesson learned: After each docker build we should run the Dive CI tool as a check to detect similar things as soon as possible.
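
Such a post-build check might look like the sketch below. It assumes the dive binary is on the PATH and that dive exits with a non-zero status when its CI rules fail; the function name check_image_efficiency is hypothetical and not part of this repository's tooling.

```shell
# Hypothetical CI helper: fail the pipeline when dive's CI rules fail.
# Assumes `dive` is installed and returns non-zero on a FAIL result.
check_image_efficiency() {
  image=$1
  # CI=true makes dive run non-interactively with its default CI config,
  # as in the transcript above.
  if CI=true dive "$image"; then
    echo "dive check passed: $image"
  else
    echo "dive check FAILED: $image" >&2
    return 1
  fi
}
```

A build script could then call, for example, `check_image_efficiency docker.io/postgis/postgis:13-3.3-alpine` right after `docker build`, so that a duplicated file like libLLVM-15.so is flagged on the day it appears.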

@harmv
Author

harmv commented Feb 5, 2023

Hm...

Reading through your comments, I now realize that my initial report is not correct.
Although you don't have a multi-stage build, you do have a nice clean-up step that indeed cleans everything up. And because that's in the same RUN command, you should end up with an image that only contains what is required.

That should be just as good as a multi-staged build.
Just not for libLLVM-15.so, as you found: that unintentionally got upgraded, so the image now carries it twice (once in the base layer and once in the new layer).

So maybe a simpler solution is possible.
Prevent the duplication/upgrade of libLLVM
Or, better if possible, do we need libLLVM at all in the final image? Ensure it gets deleted.

@harmv
Author

harmv commented Feb 8, 2023

Interestingly, it might be the case that libLLVM is not needed at all to run.
I seem to be able to startup (and use) postgis perfectly fine when manually removing that library.

As a test I added

 && rm /usr/lib/libLLVM-15.so

to the RUN step in the postgis Dockerfile.

After that I can fire up the db perfectly fine and run my own project's GIS-related unit tests against it.

Is this a bug in the postgres Dockerfile?
They have llvm15-libs-15.0.6-r0 in .postgresql-rundeps. Should that be filtered out? Or is that library required?

	runDeps="$( \
		scanelf --needed --nobanner --format '%n#p' --recursive /usr/local \
			| tr ',' '\n' \
			| sort -u \
			| awk 'system("[ -e /usr/local/lib/" $1 " ]") == 0 { next } { print "so:" $1 }' \
# Remove plperl, plpython and pltcl dependencies by default to save image size
# To use the pl extensions, those have to be installed in a derived image
			| grep -v -e perl -e python -e tcl \
	)"; \

There is some trickery to manually filter out perl, python & tcl. Is such a thing also needed for llvm15-libs?

If so, that would save double the space. (The reported problem was that libLLVM appears twice in the final image; instead of appearing once, it would then not appear at all.)

[edit] It seems LLVM is required for JIT. Whether you need that in the alpine image is debatable.
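
For illustration only, this is roughly what such an extra filter could look like. It is a simplified, hypothetical variant of the pipeline quoted above: the function name filter_run_deps is made up, the scanelf and awk steps are replaced by plain sonames on stdin, and it assumes the LLVM soname contains the string LLVM (as in libLLVM-15.so). Applying it upstream would drop JIT support, so it is a sketch of the idea, not a recommendation.

```shell
# Hypothetical variant of the upstream filter: takes sonames on stdin
# (one per line, standing in for the scanelf output) and prints apk "so:"
# dependencies, dropping perl, python, tcl and, additionally, LLVM.
filter_run_deps() {
  sort -u \
    | grep -v -e perl -e python -e tcl -e LLVM \
    | sed 's/^/so:/'
}

# prints: so:libxml2.so.2
printf '%s\n' libLLVM-15.so libperl.so libxml2.so.2 libxml2.so.2 \
  | filter_run_deps
```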

@harmv
Author

harmv commented Feb 8, 2023

I suggested to postgres that they drop llvm from runDeps. See: docker-library/postgres#1044

@harmv
Author

harmv commented Feb 8, 2023

Anyway, I think postgis would benefit from a multi-stage build in its Dockerfile.
In that case you never depend on whether or not your build steps require library versions that differ from those in the postgres image.
Build-time dependencies then never add to the final image size.

I'll take a stab at this this week, and provide you with an example. (in 10 days or so)

Stay tuned...

@ImreSamu
Member

ImreSamu commented Feb 8, 2023

[edit] It seems LLVM is required for JIT. Whether you need that in the alpine image is debatable.

imho:
"Breaking change" is not good for existing users, so I would be very careful about that.
In the future, it is expected that the SFCGAL package will also be included and this will lead to a further increase in the alpine image size.

However, a -slim version might make sense in the future ( postgis/postgis:13-3.3-alpine-slim ) ( No SFCGAL, No LLVM, ... )

Anyway, I think postgis would benefit from a multi-stage build in its Dockerfile.
In that case you never depend on whether or not your build steps
require library versions that differ from those in the postgres image. ....

imho:
It probably masks docker-image size growth problems like this one, but the cost of doing so should be investigated.

For example, might there be unanticipated secondary effects that could affect stability?
E.g. if we build PostGIS with LLVM 15.0.7 and then have LLVM 15.0.6 in the final image, could this cause some unexpected problems? Theoretically API and ABI compatibility is guaranteed, but if you need to debug a problem, it's not all the same. And this is just one possible problem with multi-stage build.

Although there is already a thorough PostGIS test at build time
( make -j$(nproc) check RUNTESTFLAGS=--extension PGUSER=postgres ) ,
this test must be run on the final layer - to guarantee that everything is perfect with LLVM 15.0.6.
This is also important for the future arm64 alpine image, where buildx+qemu will add another layer of complexity ( #312 ).

Currently, only the -master image is multi-stage based, but it does not have such a high stability requirement.

@harmv
Author

harmv commented Feb 8, 2023

imho:
"Breaking change" is not good for existing users, so I would be very careful about that.
In the future, it is expected that the SFCGAL package will also be included ( #293 ) and this will lead to a further increase in the alpine image size.

Yeah, I agree. If the dependency of postgresql on llvm is to be cut, they should do that upstream (postgres) and not here (and they probably won't do that).

For example, might there be unanticipated secondary effects that could affect stability?
E.g. if we build PostGIS with LLVM 15.0.7 and then have LLVM 15.0.6 in the final image, could this cause some unexpected problems? Theoretically API and ABI compatibility is guaranteed, but if you need to debug a problem, it's not all the same. And this is just one possible problem with multi-stage build.

I gave this some thought.
No, there is no secondary effect. Quite the reverse.
It's the postgres binary that has a runtime dependency on libLLVM, not postgis.
The postgres binaries have been compiled against LLVM 15.0.6, but you redistribute them, unintentionally, with LLVM 15.0.7. That should work... and it does, due to ABI compatibility etc. But actually it's your current build configuration that requires ABI compatibility, and not a (future, if any :) ) multi-stage build.
Your postgis extensions do not link against any LLVM library; those libs are just needed during compilation, for the tools used. You (postgis) have no runtime dependency on libLLVM.

So, adding a multi-staged build should be preferable from a stability point of view as well (besides the size issue).
And that will leave the dependency of postgresql on libLLVM (15.0.6) in place.

... this test must be run on the final layer

Yeah, agreed

Currently, only the -master image is multi-stage based but there is not such a high stability requirement.

Ah nice, so there is a working example of a similar multi staged build. I'll take a look.

@harmv
Author

harmv commented Feb 13, 2023

Please check out my attempt (I edited the Dockerfile directly, not the template):

harmv@4283d74
https://github.com/harmv/docker-postgis/blob/multi-stage-test-339/13-3.3/alpine/Dockerfile

pro:

  • smaller size
  • correct version of libLLVM in final docker image

Size

postgis             test            50501359732a   11 minutes ago   475MB   <---- multi-stage build size
postgis/postgis     13-3.3-alpine   e3a43f7cff24   2 weeks ago      556MB   <---- original size

libLLVM version is correct now

/ # find /usr -name "*LVM*"
/usr/lib/llvm15/lib/libLLVM-15.so
/usr/lib/libLLVM-15.so
/usr/lib/libLLVM-15.0.6.so

Is this something you would consider taking?

@ImreSamu
Member

@harmv :

Now the latest `postgis/postgis:13-3.3-alpine` is 425MB and your proposal is 475MB.
Both Dockerfiles are quite complex, so offhand I don't know why there is such a big difference ( +50 MB ).

$ docker pull postgis/postgis:13-3.3-alpine 
13-3.3-alpine: Pulling from postgis/postgis
Digest: sha256:b8814d6d2bc03df56dedd40180ea30b41bb14aff7c926cd75c4dd5a06d467a9b
Status: Image is up to date for postgis/postgis:13-3.3-alpine
docker.io/postgis/postgis:13-3.3-alpine

$ docker images postgis/postgis:13-3.3-alpine 
REPOSITORY        TAG             IMAGE ID       CREATED        SIZE
postgis/postgis   13-3.3-alpine   1837b569ef1f   11 hours ago   425MB

@harmv
Author

harmv commented Feb 13, 2023

Hm..
I do understand why the latest got much smaller (556 MB -> 425 MB): the upstream image got its LLVM library upgraded.
So you no longer get the (unintentional) upgrade of libLLVM (15.0.6 -> 15.0.7).

docker pull  postgres:13-alpine3.17

docker images postgres:13-alpine3.17
REPOSITORY   TAG             IMAGE ID       CREATED      SIZE
postgres     13-alpine3.17   55f14697b527   2 days ago   238MB

docker run -it  postgres:13-alpine3.17 sh

/ # ls -lh /usr/lib/libLL*
lrwxrwxrwx    1 root     root          13 Feb 11 05:09 /usr/lib/libLLVM-15.0.7.so -> libLLVM-15.so
-rwxr-xr-x    1 root     root      125.3M Jan 14 09:37 /usr/lib/libLLVM-15.so

I'm confused about the 50 MB diff too, though.

I can reproduce it

postgis             test            3120829531b9   53 seconds ago   475MB        <--- my test
postgis             org             7fea91299d0b   3 minutes ago    425MB           <--- original Dockerfile

Something is wrong in my attempt, for sure.
Somehow the COPY + RUN layers together are 50 MB larger (115 + 122 = 237 MB versus 187 MB):

orig: RUN   187 MB
test: COPY  115 MB
      RUN   122 MB

Your trick, CI=true dive ..., to the rescue:

$ CI=true dive postgis:test
  Using default CI config
Image Source: docker://postgis:test
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 89.3331 %
  wastedBytes: 100981692 bytes (101 MB)
  userWastedPercent: 21.5920 %
Inefficient Files:
Count  Wasted Space  File Path
    2         18 MB  /usr/local/bin/postgres
    2        3.0 MB  /usr/local/lib/postgresql/bitcode/postgres.index.bc
    2        1.9 MB  /usr/local/bin/ecpg
    2        1.7 MB  /usr/local/share/postgresql/postgres.bki
    2        1.5 MB  /usr/local/bin/psql

OK, I got it. This is caused by the line COPY --from=builder /usr/local /usr/local, which causes every file that was not changed to be duplicated across the layers. Some consider that a bug; there is a docker ticket for it, as I just found.

So in order to work around this, I need a more specific COPY command, not all of /usr/local. Can you give me some hints? Which contents of the postgis build/install should be copied?
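
One possible way to find out which files a narrower COPY would need is to compare the builder stage's /usr/local with an untouched copy from the base image and list only the files that are new or changed. The sketch below shows that comparison on two plain directories; the function name list_changed_files and both directory arguments are assumptions for illustration, and it is not necessarily what any commit linked in this thread does.

```shell
# Hypothetical helper: list files under BUILD_DIR that are missing from,
# or differ from, the same relative path under BASE_DIR.
# Note: only regular files are considered, and paths containing newlines
# are not handled in this sketch.
list_changed_files() {
  base_dir=$1
  build_dir=$2
  ( cd "$build_dir" && find . -type f | sort ) | while IFS= read -r rel; do
    # cmp -s is silent and returns non-zero if the files differ or the
    # base copy does not exist.
    if ! cmp -s "$base_dir/$rel" "$build_dir/$rel" 2>/dev/null; then
      printf '%s\n' "${rel#./}"
    fi
  done
}
```

In a builder stage, BASE_DIR could be a snapshot of /usr/local taken before the PostGIS build, and the printed files could be collected into a staging directory so that the final stage copies only those, leaving unchanged files such as /usr/local/bin/postgres out of the new layer.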

I am a bit surprised about the versioning of docker images, though...
postgres:13-alpine3.17 of two weeks ago is not the same as postgres:13-alpine3.17 of now.
postgis/postgis:13-3.3-alpine of two weeks ago is not the same as postgis/postgis:13-3.3-alpine of now (it's actually 131 MB smaller).
Maybe this is expected... however, I just didn't see that one coming :)

@ImreSamu
Member

I am a bit surprised about the versioning of docker images, though...
postgres:13-alpine3.17 of two weeks ago is not the same as postgres:13-alpine3.17 of now.
postgis/postgis:13-3.3-alpine of two weeks ago is not the same as postgis/postgis:13-3.3-alpine of now (it's actually 131 MB smaller).
Maybe this is expected... however, I just didn't see that one coming :)

There were 2 big upgrades last weekend, so postgres:13-alpine3.17 was regenerated.

And there is a weekly cron ( cron: '15 5 * * 1', i.e. every Monday at 05:15 ) that regenerates ALL postgis/postgis images.

.. I need a more specific COPY command, not all of /usr/local.
Can you give me some hints? which contents of the postgis build/install should be copied?

not really, it's strange and unfamiliar territory for me too.

@harmv
Author

harmv commented Feb 16, 2023

I managed to get the size down to 425 MB by adding a workaround for docker issue 21950.

While doing that, I encountered docker issue 45015.

See: harmv@82325bd

End result (13-3.3-alpine):

postgis test d5628d771874 19 minutes ago 425MB

@harmv
Author

harmv commented Feb 16, 2023

My initial bug report was for the huge size.

Investigation showed that this was caused by 2 issues:

  1. inclusion of libLLVM (enabling the JIT feature in postgres)
  2. version inconsistency of alpine between upstream (postgres) and postgis (resulting in libLLVM being stored twice)

Given that:

  • issue 1, the inclusion of libLLVM, is intentional upstream (postgres wants JIT enabled in the alpine version), so it is not ours to change here
  • issue 2, the version inconsistency, was recently solved by a rebuild of the postgres alpine docker images
  • my suggested multi-staged build solution is not smaller (it's now the same size)
  • my suggested multi-staged build implementation is rather messy (also due to some open docker bugs)

I suggest we just close this issue and leave everything as is.

@phillipross
Contributor

@harmv I'll close out the issue. thanks for the investigation efforts!
