Are tag-less container images deleted?

I’ve been trying out the GHCR with a very simple container, including build on Actions. So far I have the build always push to the beta tag. I noticed that after a second git push that triggered the container build and push to GHCR there is no trace left of the image that was previously tagged beta (i.e., no untagged version in the versions list).

Is this intentional? I’m not complaining about the behavior, I just want to know if I can rely on it to e.g. push a nightly tag or similar without worrying about cleaning up old versions. :wink:

There should be an untagged version left behind assuming the new build is different. If there was no difference then you’d see this behaviour. We’re not deleting tag-less containers, they stick around until you delete them.

Thanks for the explanation! Content-wise the containers are identical, just different labels, so I suppose that explains it. :slightly_smiling_face:

In that case I will have to look into cleanup for nightly builds and the like.

I experimented with this a bit more and confirmed that tag-less images are still accessible if you have the digest. However, there seems to be no way to discover them: they don’t show up on the GitHub package page (only tagged versions are listed there), and the Docker HTTP API V2 documentation doesn’t mention any way to find them either.

Is there any other way I’m missing?

Can you take a look at the versions page? This is part of the settings page. I thought it had all the versions available and you could click into them.

Evidently it shows only tagged versions. Two digests I still have saved for that image, both formerly tagged beta:

sha256:8090e11e86f6a5be8138cfc5026e170123497bfa1e72ec75a6cb831135bba36b
sha256:676d25701f6af4f8e3b958392c30e18af0e71a48fcf3ccd4ef3621ab6b041ba3

I just tested that I can still pull the images using those digests (yes), but they don’t show up on the versions page, or anywhere else that I’m aware of.
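For anyone following along, pulling by digest works because the registry HTTP API addresses a manifest by either a tag or a digest. Below is a minimal Python sketch of how such references are formed. The repository name `OWNER/IMAGE` is a placeholder, and the helper names `digest_reference` and `manifest_url` are made up for illustration; they are not part of any GHCR tooling.

```python
import re

# A sha256 digest is the prefix "sha256:" followed by 64 lowercase hex digits.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def _check_digest(digest: str) -> None:
    """Raise ValueError if the string is not a well-formed sha256 digest."""
    if not DIGEST_RE.match(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")


def digest_reference(repository: str, digest: str, registry: str = "ghcr.io") -> str:
    """Build the name@digest form that `docker pull` accepts."""
    _check_digest(digest)
    return f"{registry}/{repository}@{digest}"


def manifest_url(repository: str, digest: str, registry: str = "ghcr.io") -> str:
    """Build the registry API V2 manifest URL, using the digest as the reference."""
    _check_digest(digest)
    return f"https://{registry}/v2/{repository}/manifests/{digest}"


if __name__ == "__main__":
    # One of the digests quoted above; OWNER/IMAGE is a placeholder.
    saved = "sha256:8090e11e86f6a5be8138cfc5026e170123497bfa1e72ec75a6cb831135bba36b"
    print(digest_reference("OWNER/IMAGE", saved))
    print(manifest_url("OWNER/IMAGE", saved))
```

This only builds the reference strings. Listing untagged digests is exactly the part that has no equivalent, so the digest still has to be saved at push time.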

This is also how it works on Docker Hub.

How would you want the untagged images to appear? Would you want to see the tags the image was previously tagged with?

Preferably listed via an API call (rationale below), although showing them on the versions page (maybe behind a “show untagged images” button or something) would be nice, too.

Can’t hurt, but I’d be more interested in the timestamp of when the image was pushed.

What I’m thinking of is cleaning up images that aren’t needed after a while, e.g.:

  • Nightly (or even per-commit) builds that could be used for integration testing, or development on related projects.
  • Build environments: I’d like to prepare environments for use with the Actions jobs.<job_id>.container keyword so dependencies don’t have to be installed during every build. That takes about 2 min each time, per distro tested in a matrix build (well, except Alpine, which needs just a few seconds). To get updates of the underlying distro and dependencies I’d want to rebuild those environments frequently, e.g. during the first CI build of the day.

The latter would accumulate in the order of gigabytes pretty quickly, and would hardly be useful for anyone else in old versions, so I’d like to clean up after myself. Basically a sort of docker image prune. :wink:

As long as packages storage is free for open source projects it would not hurt me to leave them around, but for people with private images it could quickly become a billing issue. And evidently it became an issue for Docker Hub, too, given their recent announcement to delete unused images on free accounts after 6 months (that’d do the trick for me, too, but maybe not for people who need to pay for storage). That post also says:

In addition, Docker will also be providing tooling, in the form of a UI and APIs, that will allow users to more easily manage their images.

I haven’t seen any specifics on that yet, but to me it sounds like “tools to help you clean up”, so a compatible API on GHCR might be a thing to consider.

@airtower-luna thanks for the link to the announcement.

  • Example #1: Molly, a free Docker Hub user, pushed a tagged image molly/hello-world:v1 to Docker Hub on January 1, 2019. The image was never pulled since it was pushed. This tagged image will be considered inactive beginning November 1, 2020 when the new policy takes effect. The image and any tag pointing to it, will be subject to deletion on November 1, 2020.

Wow, this is some pretty aggressive cleanup of tagged images!

  • Example #2 : Molly has another untagged image molly/myapp@sha256:c0ffee that was first pushed on January 1, 2018. This image was last pulled on August 1, 2020. This image will be considered an active image and will not be subject to deletion on November 1, 2020.

For untagged images this seems pretty reasonable.

What I’m thinking of is cleaning up images that aren’t needed after a while

So you’d want to know when they were pushed, whether they’re tagged, and maybe when they were last downloaded?

Yes, exactly. I guess the most flexible solution would be to include a list of tags for the image, if any. With that information I could easily set up e.g. a scheduled Actions workflow to clean up old images that are either untagged or tagged in a certain way. :slightly_smiling_face:
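To illustrate the kind of cleanup rule I mean, here is a rough Python sketch of the selection step only. It assumes a version listing that provides a digest, a push timestamp, and a list of tags per image, which is exactly the information that isn’t available yet. The `ImageVersion` record, the `select_for_cleanup` function, the 30-day default, and the `nightly` prefix are all hypothetical choices for the example, not an existing GHCR API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Sequence, Tuple


@dataclass
class ImageVersion:
    """Hypothetical record for one image version as a listing API might return it."""
    digest: str
    pushed_at: datetime
    tags: List[str] = field(default_factory=list)


def select_for_cleanup(
    versions: Sequence[ImageVersion],
    now: datetime,
    max_age: timedelta = timedelta(days=30),
    disposable_prefixes: Tuple[str, ...] = ("nightly",),
) -> List[ImageVersion]:
    """Return versions older than max_age that are untagged or carry only disposable tags."""
    cutoff = now - max_age
    selected = []
    for version in versions:
        if version.pushed_at > cutoff:
            continue  # recent enough, keep regardless of tags
        untagged = not version.tags
        # str.startswith accepts a tuple of prefixes
        only_disposable = all(tag.startswith(disposable_prefixes) for tag in version.tags)
        if untagged or only_disposable:
            selected.append(version)
    return selected


if __name__ == "__main__":
    now = datetime(2020, 9, 1, tzinfo=timezone.utc)
    old = now - timedelta(days=90)
    versions = [
        ImageVersion("sha256:aaa", old),                      # old, untagged
        ImageVersion("sha256:bbb", old, ["nightly-0601"]),    # old, disposable tag
        ImageVersion("sha256:ccc", old, ["v1.0"]),            # old release, kept
        ImageVersion("sha256:ddd", now, []),                  # fresh, kept
    ]
    for version in select_for_cleanup(versions, now):
        print(version.digest)
```

A scheduled Actions workflow could run something like this and then delete the selected digests. Release tags are never selected here because any tag outside the disposable prefixes protects the image.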

And I agree that the Docker Hub policy is rather aggressive for tagged images; in principle I’d like to keep release versions around indefinitely. On the other hand, depending on the image content, I might worry about people using such an old image anyway, at least without vulnerability scanning (which I’d really like to see in GHCR, but that’s a separate thing). I wouldn’t want to keep track of which of the included libraries might have gotten security patches in the last half year, and whether any of the issues might be exploitable in the context of the image. :sweat_smile:
