Caching files between GitHub Action executions

This would be good for caching pip downloads in Python. For example, Travis CI has

cache: pip

to cache

$HOME/.cache/pip

https://docs.travis-ci.com/user/caching/#pip-cache

2 Likes

Thank you for being here. For this specific case, we recommend using artifacts. Artifacts are the files created when you build and test your code; for example, they might include binary or package files, test results, screenshots, or log files. Artifacts are associated with the workflow run where they were created and can be used by another job or deployed.

Our team created an action for uploading artifacts from your workflow here:

https://github.com/actions/upload-artifact

And for downloading them as well:

https://github.com/actions/download-artifact
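
As a rough sketch, passing files between two jobs in the same workflow run might look like this (the job names, build step, and paths are illustrative, not taken from either README):

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    # produce some output to share (illustrative build step)
    - run: mkdir -p dist && echo hello > dist/output.txt
    - uses: actions/upload-artifact@master
      with:
        name: dist
        path: dist
  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
    # downloads the artifact into a directory named after it, dist/ here
    - uses: actions/download-artifact@master
      with:
        name: dist
    - run: cat dist/output.txt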

If you have any specific questions about either of those actions, we ask that you open an issue in the respective repository as our Actions engineering team monitors both repositories.

10 Likes

I think the point of caching is to avoid downloading at all. Downloading an artifact won’t give a significant margin over installing the dependencies directly, and it needs an additional step to upload. So artifacts can’t be used to speed up dependency installation by any significant margin.

5 Likes

That sounds fine for specific files, but what about Docker cache? Are you suggesting we can upload/download the entire /var/lib/docker directory?

1 Like

That’s actually something Docker BuildKit allows, but very few Docker registries support it just yet.
And the version of Docker on the runner nodes is rather old, too.
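
For reference, the BuildKit feature I mean is the inline cache export, which would look something like this on a runner with Docker 19.03+ (the registry and image name are illustrative, and the registry must accept cache manifests):

# opt in to BuildKit, embed cache metadata in the image, and reuse it on the next run
- run: |
    DOCKER_BUILDKIT=1 docker build \
      --build-arg BUILDKIT_INLINE_CACHE=1 \
      --cache-from registry.example.com/app:latest \
      -t registry.example.com/app:latest .
    docker push registry.example.com/app:latest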

@duncan3dc, you have probably already found this out since you posted, but I’m leaving this here for anyone else who might happen upon this discussion.

Both the upload and download of artifacts do work with whole directories. For instance, adding a second file to the upload-artifact example, as…

steps:
- uses: actions/checkout@v1

- run: mkdir -p path/to/artifact

- run: echo hello > path/to/artifact/world_1.txt
- run: echo hello > path/to/artifact/world_2.txt

# uploading a directory path publishes everything inside it as one artifact
- uses: actions/upload-artifact@master
  with:
    name: my-artifact
    path: path/to/artifact

…does upload the whole directory with both files as a single artifact.

I can also verify that you cannot upload /var/lib/docker, as you will receive an access-denied error when trying to do so.

Regardless, the upload-artifact and download-artifact actions don’t satisfy the original request, as they do not persist files from one workflow execution to the next.

1 Like

Is the artifact unique per branch and PR, as it is with Travis caches?

https://docs.travis-ci.com/user/caching/ says:

  • Travis CI fetches the cache for every build, including branches and pull requests.
  • If a branch does not have its own cache, Travis CI fetches the cache of the repository’s default branch.
  • There is one cache per branch and language version / compiler version / JDK version / Gemfile location / etc.
  • Only modifications made to the cached directories from normal pushes are stored.

8 Likes

Based on the terse conversation at https://github.com/actions/download-artifact/issues/3, it looks like download-artifact is not a solution to caching across invocations of workflows.

My use case is a project using Go, where I would like to preserve GOCACHE from one workflow invocation to another. This particular project takes about 11 minutes to run go test -race ./.... If it could use the build cache from a previous run, I would expect it to take less than a minute, as many of the test results would be cached, and most of the compilation results would be too.

Not being able to transfer GOCACHE between runs is a significant hindrance to adopting GitHub Actions.
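
For concreteness, this is the kind of job I would want a restore/save pair wrapped around, with GOCACHE pinned to an explicit path (illustrative) so a cache step would have a stable directory to persist:

jobs:
  test:
    runs-on: ubuntu-latest
    env:
      # pin the Go build cache to a known path so it could be saved and restored
      GOCACHE: /home/runner/.cache/go-build
    steps:
    - uses: actions/checkout@v1
    # ~11 minutes on a cold cache; a warm GOCACHE would reuse most
    # compilation and test results
    - run: go test -race ./...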

2 Likes

We appreciate the feedback; it’s clear to us that this is necessary. We’re working on caching packages and artifacts between workflow executions, and we expect to have it by mid-November.

193 Likes

That’s great to hear! We weren’t comfortable moving over from Circle without this.

Looking forward to it.

8 Likes

Awesome! Thanks 🙂

1 Like

We are also waiting on this to move over completely. <3

2 Likes

Glad you guys are working on this as well. I use GitHub Actions to build some intermediate packages, and caching is the only way to do it, since artifacts won’t really cut it for large objects; they’d take too long to upload.

2 Likes

Looking forward to it! My use case:

On a macOS instance, cache some global Node dependencies (expo-cli, react-native, and @sentry/cli), and possibly also some CocoaPods.

1 Like

That’s cool, man. I just got accepted into the beta and I’m really loving it!

When this is live I might get rid of CircleCI altogether 😄!

2 Likes

For those who can’t wait for the official cache solution, implementing one is not that hard.

You need external storage. I used an S3 bucket for that.

Do an aws s3 sync <remote> <local> before the build, then an aws s3 sync <local> <remote> after the build.

Your build must be smart enough to use the files you download from the remote location.

It’s working well for me.
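
A minimal workflow sketch of this approach, using the pip cache from earlier in the thread; the bucket name is illustrative, and the AWS credentials are assumed to be stored as repository secrets:

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_DEFAULT_REGION: us-east-1
    steps:
    - uses: actions/checkout@v1
    # restore the cache; "|| true" tolerates a missing cache on the first run
    - run: aws s3 sync s3://my-ci-cache/pip ~/.cache/pip || true
    - run: pip install -r requirements.txt
    # save the refreshed cache for the next run
    - run: aws s3 sync ~/.cache/pip s3://my-ci-cache/pip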

6 Likes

Any updates on this? Will it be part of the Nov 13 release?
Eagerly waiting for it.
Uploading artifacts is also failing for me with this error

Now that there is a release date for the General Availability of GitHub Actions, can we expect caching support to follow soon after, or will it be delayed?

I know that, strictly speaking, a truly isolated test environment requires not sharing dependencies between builds or even between jobs, but for private projects it would be a shame to waste billable hosted-runner minutes on repeatedly downloading the same files.

4 Likes

Yay! https://github.com/actions/cache
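
For the pip case from the top of the thread, a minimal sketch based on the examples in that repository (the key layout is illustrative):

steps:
- uses: actions/checkout@v1
# restore pip's download cache, keyed on the requirements file; the action
# saves the cache at the end of the job if the key was not an exact hit
- uses: actions/cache@v1
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
- run: pip install -r requirements.txt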

8 Likes