Skip to content

Commit

Permalink
Auto merge of rust-lang#119290 - Kobzol:ci-docker-registry-cache, r=<…
Browse files Browse the repository at this point in the history
…try>

Cache CI Docker images in ghcr registry

This PR changes the way `rust-lang` caches Docker images used in CI workflows. Before, the intermediate Docker layers were manually exported from `docker history` and backed up in S3. However, this approach doesn't work any more with the Docker version used by GitHub Actions since August 2023. We had to revert to disabling Docker BuildKit to make the old caching work, but this workaround will stop working eventually, after GitHub updates Docker again and the old build backend will be removed.

This PR changes the caching to use [Docker caching](https://docs.docker.com/build/cache/) instead. There are several backends for the cache, for our use-case S3 and Docker registry makes sense. This PR uses the Docker registry backend and uses the ghcr.io registry.

The caching creates a Docker image labeled `rust-ci`, which is currently stored to the `ghcr.io/rust-lang-ci` package registry. This image appears [here](https://ghcr.io/rust-lang-ci/rust-ci). The image is stored in `rust-lang-ci` and not `rust-lang`, because `try` and `auto` builds run in the context of that repository, so the used `GITHUB_TOKEN` has permissions for it (unlike for `rust-lang`).

For pull request CI runs, the provided `GITHUB_TOKEN` reduces its permissions automatically to `packages: read`, which means that we won't be able to write the Docker image. If we're not able to write, we won't have anything to read. So I disabled the caching entirely for PR runs (it makes it slightly faster to build the Docker image if we don't have to deal with exporting and using a separate build driver). Note that before this PR, we also weren't able to read or write the cache on PR runs.

Related issue: rust-lang/infra-team#81

r? `@Mark-Simulacrum`
  • Loading branch information
bors committed Dec 29, 2023
2 parents dc450f9 + 6f90144 commit 180eb9e
Show file tree
Hide file tree
Showing 3 changed files with 37 additions and 47 deletions.
7 changes: 7 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ jobs:
CI_JOB_NAME: "${{ matrix.name }}"
CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
HEAD_SHA: "${{ github.event.pull_request.head.sha || github.sha }}"
DOCKER_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
SCCACHE_BUCKET: rust-lang-ci-sccache2
TOOLSTATE_REPO: "https://github.com/rust-lang-nursery/rust-toolstate"
CACHE_DOMAIN: ci-caches.rust-lang.org
Expand Down Expand Up @@ -168,10 +169,13 @@ jobs:
if: "success() && !env.SKIP_JOB && (github.event_name == 'push' || env.DEPLOY == '1' || env.DEPLOY_ALT == '1')"
auto:
name: "auto - ${{ matrix.name }}"
permissions:
packages: write
env:
CI_JOB_NAME: "${{ matrix.name }}"
CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
HEAD_SHA: "${{ github.event.pull_request.head.sha || github.sha }}"
DOCKER_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
SCCACHE_BUCKET: rust-lang-ci-sccache2
DEPLOY_BUCKET: rust-lang-ci2
TOOLSTATE_REPO: "https://github.com/rust-lang-nursery/rust-toolstate"
Expand Down Expand Up @@ -561,11 +565,14 @@ jobs:
if: "success() && !env.SKIP_JOB && (github.event_name == 'push' || env.DEPLOY == '1' || env.DEPLOY_ALT == '1')"
try:
name: "try - ${{ matrix.name }}"
permissions:
packages: write
env:
DIST_TRY_BUILD: 1
CI_JOB_NAME: "${{ matrix.name }}"
CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
HEAD_SHA: "${{ github.event.pull_request.head.sha || github.sha }}"
DOCKER_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
SCCACHE_BUCKET: rust-lang-ci-sccache2
DEPLOY_BUCKET: rust-lang-ci2
TOOLSTATE_REPO: "https://github.com/rust-lang-nursery/rust-toolstate"
Expand Down
72 changes: 25 additions & 47 deletions src/ci/docker/run.sh
Original file line number Diff line number Diff line change
Expand Up @@ -74,25 +74,6 @@ if [ -f "$docker_dir/$image/Dockerfile" ]; then

cksum=$(sha512sum $hash_key | \
awk '{print $1}')

url="https://$CACHE_DOMAIN/docker/$cksum"

echo "Attempting to download $url"
rm -f /tmp/rustci_docker_cache
set +e
retry curl --max-time 600 -y 30 -Y 10 --connect-timeout 30 -f -L -C - \
-o /tmp/rustci_docker_cache "$url"

docker_archive_hash=$(sha512sum /tmp/rustci_docker_cache | awk '{print $1}')
echo "Downloaded archive hash: ${docker_archive_hash}"

echo "Loading images into docker"
# docker load sometimes hangs in the CI, so time out after 10 minutes with TERM,
# KILL after 12 minutes
loaded_images=$(/usr/bin/timeout -k 720 600 docker load -i /tmp/rustci_docker_cache \
| sed 's/.* sha/sha/')
set -e
printf "Downloaded containers:\n$loaded_images\n"
fi

dockerfile="$docker_dir/$image/Dockerfile"
Expand All @@ -103,39 +84,36 @@ if [ -f "$docker_dir/$image/Dockerfile" ]; then
context="$script_dir"
fi
echo "::group::Building docker image for $image"
echo "Image checksum ${cksum}"

# As of August 2023, Github Actions have updated Docker to 23.X,
# which uses the BuildKit by default. It currently throws aways all
# intermediate layers, which breaks our usage of S3 layer caching.
# Therefore we opt-in to the old build backend for now.
export DOCKER_BUILDKIT=0
retry docker \
build \
--rm \
-t rust-ci \
-f "$dockerfile" \
"$context"
# On PR jobs, we don't have permissions to write to the cache, so we should not use
# `docker login` nor caching.
if [ "$PR_CI_JOB" -eq 1 ]
then
retry docker build --rm -t rust-ci -f "$dockerfile" "$context"
else
docker buildx create --use --driver docker-container

# Login to Docker registry
echo ${DOCKER_TOKEN} | docker login ghcr.io --username rust-lang-ci --password-stdin

dest="type=registry,ref=ghcr.io/rust-lang-ci/rust-ci:${cksum},compression=zstd,mode=min"

retry docker \
buildx \
build \
--rm \
-t rust-ci \
-f "$dockerfile" \
--cache-from type=registry,ref=ghcr.io/rust-lang-ci/rust-ci:${cksum} \
--cache-to ${dest} \
--output=type=docker \
"$context"
fi
echo "::endgroup::"

if [ "$CI" != "" ]; then
s3url="s3://$SCCACHE_BUCKET/docker/$cksum"
upload="aws s3 cp - $s3url"
digest=$(docker inspect rust-ci --format '{{.Id}}')
echo "Built container $digest"
if ! grep -q "$digest" <(echo "$loaded_images"); then
echo "Uploading finished image $digest to $url"
set +e
# Print image history for easier debugging of layer SHAs
docker history rust-ci
docker history -q rust-ci | \
grep -v missing | \
xargs docker save | \
gzip | \
$upload
set -e
else
echo "Looks like docker image is the same as before, not uploading"
fi
# Record the container image for reuse, e.g. by rustup.rs builds
info="$dist/image-$image.txt"
mkdir -p "$dist"
Expand Down
5 changes: 5 additions & 0 deletions src/ci/github-actions/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ x--expand-yaml-anchors--remove:
CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
# commit of PR sha or commit sha. `GITHUB_SHA` is not accurate for PRs.
HEAD_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
DOCKER_TOKEN: ${{ secrets.GITHUB_TOKEN }}

- &public-variables
SCCACHE_BUCKET: rust-lang-ci-sccache2
Expand Down Expand Up @@ -345,6 +346,8 @@ jobs:
auto:
<<: *base-ci-job
name: auto - ${{ matrix.name }}
permissions:
packages: write
env:
<<: [*shared-ci-variables, *prod-variables]
if: github.event_name == 'push' && github.ref == 'refs/heads/auto' && github.repository == 'rust-lang-ci/rust'
Expand Down Expand Up @@ -725,6 +728,8 @@ jobs:
try:
<<: *base-ci-job
name: try - ${{ matrix.name }}
permissions:
packages: write
env:
DIST_TRY_BUILD: 1
<<: [*shared-ci-variables, *prod-variables]
Expand Down

0 comments on commit 180eb9e

Please sign in to comment.