Copilot Lvl 2
Message 1 of 4

GitHub Actions and Docker Builds - Removing Docker Cache Problems

I have a large Docker image that sometimes fails to build because of its size.

 

I'm using [Docker Build with Cache Action](https://github.com/whoan/docker-build-with-cache-action) for build caches, but sometimes the build fails because the cache plus the new layers is more than 14 GB.
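For context, the relevant step in my workflow looks roughly like this (the action version and inputs here are abbreviated placeholders, not my exact config):

```yaml
- name: Build image with layer cache
  uses: whoan/docker-build-with-cache-action@v5
  with:
    image_name: my-org/my-image            # placeholder image name
    username: ${{ secrets.DOCKER_USERNAME }}
    password: ${{ secrets.DOCKER_PASSWORD }}
```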

 

I'm trying to catch this error and run a second build with no cache, but it seems some Docker artifacts (images, layers, or similar) are still lingering. So I ran a step with `docker system prune --force`, which removed around 700 MB of files. That wasn't enough, so I ran a more aggressive command, `docker system prune --force --all --volumes`. This appears to have removed some of the components GitHub Actions itself requires, as the next step fails to run with the error:

 

```
Unable to find image 'e87b52:0c9a57e74a414c4bbe60bd043fcbf313' locally
```
 
So I'm wondering if there's a better way to clear out the previous step's Docker leftovers, perhaps by using a filter on the `docker system prune` command.

Does anyone know whether the Docker images GitHub Actions itself requires have a label set that I could filter on?
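For illustration, something like this is what I had in mind, assuming the leftover objects could be distinguished by a label (the `stage=build` label here is hypothetical):

```yaml
- name: Prune only my build leftovers
  if: failure()
  run: |
    # Inspect which images carry the (hypothetical) label
    docker images --filter "label=stage=build"
    # Prune only objects carrying that label, leaving everything else alone
    docker system prune --force --filter "label=stage=build"
```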
3 Replies
GitHub Partner
Message 2 of 4

Re: GitHub Actions and Docker Builds - Removing Docker Cache Problems

@alexgleith ,

 

I noticed you are using GitHub-hosted runners to run your jobs. Each virtual machine typically provides about 14 GB of SSD disk space (see the GitHub-hosted runners documentation); the actual free space may be a little more or less.

Because the layers of your image build total more than 14 GB, the build can easily exceed that limit and fail with the error "No space left on device".

 

I don't think using the `docker system prune` command to remove unused data will help much here. It can never free more than the 14 GB total, because the disk size is fixed when the runner starts up. And, as you found, you may mistakenly remove important data or dependencies.

 

Currently, as a workaround, I recommend using self-hosted runners for these jobs. You can install a self-hosted runner on your own machine (or a VM) with more disk space available.
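For reference, pointing a job at a self-hosted runner only requires changing the `runs-on` key. A minimal sketch (the build command is a placeholder):

```yaml
jobs:
  build:
    # Targets any registered self-hosted runner; labels such as
    # [self-hosted, linux, x64] can select a specific machine.
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v2
      - run: docker build -t my-image .   # placeholder build command
```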

Copilot Lvl 2
Message 3 of 4

Re: GitHub Actions and Docker Builds - Removing Docker Cache Problems

Hey @BrightRan 

 

I'm hoping to avoid self-hosted runners unless absolutely necessary.

 

The issue I experience is that if I build the image without caching, it works, but takes 25 minutes.

 

If I build with caching, it takes 5 minutes.

 

If I build after a change early in the Dockerfile, the caches have already been downloaded, and the build runs out of disk space because the cache plus the new layers is too much.

 

Here's what I'm hoping to achieve:

  1. Build the image with a cache (should take 5 mins)
  2. If step 1 fails because it runs out of disk, clear the cache and free up the disk space it used
  3. Since step 1 failed, build the image without using a cache (will take 25 mins)

 

What I'm having trouble with is that step 3 fails because there are leftover Docker layers (or something) from step 1.

If I just build without a cache it works fine, but it always takes 25 mins. I want both the cache and a way to handle the cache getting too big!
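For what it's worth, here's a sketch of how I imagine that flow in a single job, using `continue-on-error` and the step `outcome` to drive the fallback (the action version, image name, and prune command are guesses on my part):

```yaml
steps:
  - uses: actions/checkout@v2
  # Step 1: fast cached build; don't fail the whole job if it runs out of disk
  - name: Build with cache
    id: cached
    continue-on-error: true
    uses: whoan/docker-build-with-cache-action@v5
    with:
      image_name: my-org/my-image   # placeholder
  # Step 2: only if the cached build failed, try to reclaim the space it used
  - name: Clear cache and leftover layers
    if: steps.cached.outcome == 'failure'
    run: docker image prune --force --all
  # Step 3: fall back to a clean, uncached build
  - name: Build without cache
    if: steps.cached.outcome == 'failure'
    run: docker build --no-cache -t my-org/my-image .
```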

GitHub Partner
Message 4 of 4

Re: GitHub Actions and Docker Builds - Removing Docker Cache Problems

@alexgleith ,

I have a workaround you could consider as a reference: run the cached build and the uncached fallback build (steps 1 and 3 above) as two separate jobs. That means they run on two different runners.

For example, a simple demo:

jobs:
  job1: # build the image with a cache
  job2: # build the image without using a cache
    needs: job1
    if: needs.job1.result == 'failure'

You just need to make job2 depend on job1 (via `needs`) and check job1's result in job2's `if` conditional. Because each job in a workflow executes in a fresh instance of the virtual machine, you don't need to clean the cache or free up the disk.
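A minimal sketch of that idea (job names, action version, and the build command are placeholders):

```yaml
jobs:
  job1:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # Build the image with a cache (the fast path)
      - uses: whoan/docker-build-with-cache-action@v5
        with:
          image_name: my-org/my-image   # placeholder

  job2:
    runs-on: ubuntu-latest
    needs: job1
    # Only run this fallback job when job1 failed
    if: needs.job1.result == 'failure'
    steps:
      - uses: actions/checkout@v2
      # Build the image without using a cache (the slow path)
      - run: docker build --no-cache -t my-org/my-image .
```

One thing to note: the workflow run will still show job1's failure even when job2 succeeds.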