GitHub Actions: How to Implement Caching For Releases (Rust as Example)

tl;dr: Please find a working reference implementation on GitHub.

Update: I turned the example below into a GitHub Action, available at hendrikmaus/custom-cache-action.

If you are reading this at a later point in time, GitHub might have delivered this feature.

Context

As of today, which is November 20th 2021, GitHub Actions does support sharing caches between git tags, i.e. when building a release of v1.1.0, one cannot re-use the cache built when v1.0.0 was released. That is because the native caching system is built for branches.1

Aside: this is not entirely true. You can build and store a cache on a tag, but you can only re-use it, when building the very same tag again.

1

"(...) the cache action searches for key and restore-keys in the parent branch and upstream branches." source

Workaround

Since I like to write Rust code, and it is no mystery that Rust compile times can be long, I built a very rudimentary workaround that leverages GitHub Packages2 to store a build cache.

Please mind that the following approach lacks all the more advanced caching features, but it should still be a good start:

2

GitHub Packages supports various storage concepts and is free for open-source projects.

Now, how does it work? We'll leverage the Docker registry feature of GitHub Packages. The workflow will create a Dockerfile with a single layer, which is the cache content. The base image is scratch, so there is really nothing else in it. The image is pushed, so that subsequent releases can pull it.

Aside: Why not GitHub Actions Artifacts? They cannot be shared between workflows; they are intended to share data between jobs in a single workflow.

In order to restore the cache, the workflow will first check if it can pull the cache image. If it can, a neat trick is used to get the content out of it: One can expand the filesystem of a container image without actually running the image. In this case, there wouldn't be anything to run inside it really. Once the filesystem was overlayed, the content can be copied out.

Let's see it:

The Workflow Definition

---
name: Release

on:
  release:
    types:
      - published

env:
  CACHE_REGISTRY: ghcr.io
  CACHE_IMAGE: ${{ github.repository }}
  CACHE_TAG: release-cache

defaults:
  run:
    shell: bash

jobs:
  release:
    name: Release
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v2

      - name: Setup Rust
        uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable

      - name: Log in to the Container registry
        uses: docker/login-action@v1
        with:
          registry: ${{ env.CACHE_REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Restore cache
        run: |
          if ! docker pull "${CACHE_REGISTRY}/${CACHE_IMAGE}:${CACHE_TAG}" &>/dev/null; then
            echo "No cache found"
            exit 0
          fi

          tempdir=$(mktemp -d)
          cd "${tempdir}"

          echo "creating cache container filesystem"
          # this container isn't actually started; we just gain access to its filesystem
          # it also does not contain 'bash", but we need to provide some argument, which is ignored
          docker create -ti --name cache_storage "${CACHE_REGISTRY}/${CACHE_IMAGE}:${CACHE_TAG}" bash
          docker cp cache_storage:cache.tgz .

          echo "expanding cache to disk"
          tar xpzf cache.tgz -P

          echo "cleaning up"
          docker rm cache_storage &>/dev/null
          cd -
          rm -rf "${tempdir}"

      - name: Compile
        uses: actions-rs/cargo@v1
        with:
          command: build
          args: --release

      # ... additional release steps

      - name: Save cache
        run: |
          tempdir=$(mktemp -d)
          cd "${tempdir}"

          paths=()
          paths+=( "$GITHUB_WORKSPACE/target" )
          paths+=( "/usr/share/rust/.cargo/registry/index" )
          paths+=( "/usr/share/rust/.cargo/registry/cache" )
          paths+=( "/usr/share/rust/.cargo/git" )

          echo "building cache tarball"
          tar --ignore-failed-read -cpzf cache.tgz "${paths[@]}" -P

          cat <<-EOF > Dockerfile
          FROM scratch
          LABEL org.opencontainers.image.description="Release cache of ${GITHUB_REPOSITORY}"
          COPY cache.tgz .
          EOF

          echo "building cache container image"
          docker build --tag "${CACHE_REGISTRY}/${CACHE_IMAGE}:${CACHE_TAG}" --file Dockerfile .
          docker push "${CACHE_REGISTRY}/${CACHE_IMAGE}:${CACHE_TAG}"

          echo "cleaning up"
          docker rmi "${CACHE_REGISTRY}/${CACHE_IMAGE}:${CACHE_TAG}"
          cd -
          rm -rf "${tempdir}"

As mentioned before, this particular example is focussed on Rust. To adapt it to your use-case, update the list of paths to tar up and swap out the Rust-specific steps for your own. That should give you a solid place to start.

Results

In the reference implementation, I did two releases:

Once something changes in Cargo.lock, I reckon it will update the crates.io index, download the missing/updated crates and then only re-compile what needs to be.

Conclusion

This topic is a known issue, and it can be tracked at actions/cache#556. I hope that GitHub will one day natively support release caches in the existing infrastructure.

However, this workaround can serve as a solid base to speed up release builds in the meantime.

Comments

You can comment on this article using GitHub Discussions - every article is available as discussion right after its release.

Alternatively, you can reach out to me on Twitter.