Caching for C++ builds

Hello all,

I am trying to set up caching for a C++ build. The project uses GNU Make, and it occurred to me early on that relying on Make’s own mechanism for caching would be unwise, since it uses mtime to decide what is out of date. So instead I am using ccache, which takes the contents of the input files into account, as you would hope. This seems to work on my Ubuntu build, but I have been having trouble getting it to work on the Windows build, and I think the cause is a limitation of the cache action: I accidentally saved an empty cache by specifying the wrong directory, and I think that entry keeps getting picked up first.

In most of the caching examples I have seen, the cache key is a hash of a relevant file such as a lockfile. However, since what we’re caching here is build artifacts, there is no single file whose hash meaningfully identifies the whole bundle of cached objects. So instead, I’ve put the commit hash into the key and added a restore key that matches the key’s prefix.
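
Roughly, the step looks like the sketch below. The step name, the `ccache` label in the key, and the `~/.ccache` path are illustrative assumptions here, not copied from my actual workflow:

```yaml
# Sketch only: step name, key label, and cache path are illustrative assumptions.
- name: Cache ccache objects
  uses: actions/cache@v2
  with:
    path: ~/.ccache
    # Unique per commit, so each new commit saves its own cache entry.
    key: ${{ runner.os }}-ccache-${{ github.sha }}
    # Prefix match: consulted when no entry exists for this exact commit.
    restore-keys: |
      ${{ runner.os }}-ccache-
```

The exact `key` will almost never hit for a new commit, so in practice every restore goes through the prefix in `restore-keys`.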

The trouble is, I don’t think there’s any way for me to:

  • Delete stale cache entries
  • Specify that I only want to load the latest cache

Is there any workaround for this? Although it may not currently be an intended use case of GitHub’s cache action, I believe it will save significant CPU time on the runners, so I would like to find a decent solution if possible. Without any object cache, the Windows builds take ~30 minutes…

Even without a solution, this will still work to some degree as long as I can ensure there are only valid caches under the restore prefix, but it will likely be a lot less effective at saving CPU time than intended :frowning:

Thank you; any hints on how I can improve this would be appreciated. I’m thinking that if I put the date into the cache key, I can at least get it to pull a more recent cache when there are multiple builds a day, but it would be really nice to find a way to always pull the latest cache.
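
What I have in mind for the date idea is something like the following sketch. The step id `date`, the output name `today`, and the cache path are made up for illustration, and I am assuming a bash shell is available on the runner:

```yaml
# Sketch only: the step id `date` and output name `today` are illustrative.
- name: Get current date
  id: date
  shell: bash
  # Writes e.g. today=2020-01-31 (UTC) as a step output.
  run: echo "today=$(date -u +%Y-%m-%d)" >> "$GITHUB_OUTPUT"

- name: Cache ccache objects
  uses: actions/cache@v2
  with:
    path: ~/.ccache
    key: ${{ runner.os }}-ccache-${{ steps.date.outputs.today }}-${{ github.sha }}
    # Restore keys are tried in order: same-day caches first, then any cache.
    restore-keys: |
      ${{ runner.os }}-ccache-${{ steps.date.outputs.today }}-
      ${{ runner.os }}-ccache-
```

This would only narrow the match to the current day; it would not by itself say which of several same-day caches is chosen.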

P.S. you can see my latest attempts over here, but it will likely disappear as things get restructured into another repository.


@jchv,

Regarding your two questions:

  1. Delete stale cache entries:
    GitHub removes any cache entry that has not been accessed in over 7 days. There is currently no way to delete cache entries ourselves.

  2. Specify that I only want to load the latest cache:
    I have a workaround that you could consider.

The cache action first searches for cache hits for key and restore-keys in the branch containing the workflow run. If there are no hits in the current branch, the cache action searches for key and restore-keys in the parent branch and upstream branches.

To make the key hit the latest cache in the current branch or in its parent and upstream branches, you can add a file (e.g. hashFile.txt) to these branches. Every time you update the dependent build and save it as a new cache in your repository, also update the content of ‘hashFile.txt’ so that its hash changes to a new, unique value. Then use the hash of ‘hashFile.txt’ in the cache key.
For example:

- name: Cache
  id: cache
  uses: actions/cache@v2
  with:
    path: ~/.ccache
    key: ${{ runner.os }}-ccache-${{ hashFiles('hashFile.txt') }}
    restore-keys: ${{ runner.os }}-ccache-

This way, the workflow always uses the current hash of ‘hashFile.txt’, which points to the latest cache.
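
Since entries cannot be deleted by hand, one possible way to keep a bad cache (such as the empty one you mentioned) from being restored is to put a version token into both the key and the restore prefix, and bump it whenever you want to abandon the old entries. A minimal sketch, assuming a hypothetical `CACHE_VERSION` variable and the same `~/.ccache` path as above:

```yaml
# Sketch only: CACHE_VERSION is a hypothetical token that you bump by hand.
on: push

env:
  CACHE_VERSION: v2

jobs:
  build:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v2
      - name: Cache
        id: cache
        uses: actions/cache@v2
        with:
          path: ~/.ccache
          # Changing CACHE_VERSION changes both the key and the prefix,
          # so entries saved under the old version no longer match.
          key: ${{ runner.os }}-ccache-${{ env.CACHE_VERSION }}-${{ hashFiles('hashFile.txt') }}
          restore-keys: |
            ${{ runner.os }}-ccache-${{ env.CACHE_VERSION }}-
```

The entries saved under the old version are not deleted by this. They simply stop matching, and should then be removed by the 7-day rule once nothing accesses them.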