Does GitHub ever purge commits or files that were visible at some time?

I have created a GitHub permalink to a file inside some commit. E.g. https://raw.githubusercontent.com/kubeflow/pipelines/785d474699cffb7463986b9abc4b1fbe03796cb6/components/ibm-components/commons/config/component.yaml

Under what circumstances can that permalink become unavailable?

Potential hazards are:
Deleting the commit branch (what if there was a PR for that branch?)
Force-pushing the branch (what if there was a PR for that branch?)

Can any of these or other scenarios make the link unavailable?

You can find out how a commit (or the data it refers to) can be removed by referring to our help article on removing sensitive data from a repository. Essentially, if the commit becomes “orphaned” (or “unreachable” in Git terms), then it is possible that the commit or its data may be purged from the repository on GitHub. In practice, we try not to clean up even orphaned items unless requested by a repo owner or admin, but if a Git object becomes orphaned, we can’t guarantee that it will be retained forever.

So, if it is important to you that the commit is not removed from the repository, then I would recommend that you ensure a reference is maintained to it or one of its descendant commits for as long as the repository exists. One easy way to do this would be to tag that commit using the git tag command.
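As a rough illustration of the tagging advice, here is a minimal sketch that runs entirely in a throwaway local repository. The directory, the branch name feature, and the tag name keep-permalink are hypothetical placeholders, and it only shows that a tag keeps a commit reachable after its branch is deleted; it says nothing about GitHub's server-side retention.

```shell
#!/bin/sh
# Sketch: pin a commit with a tag so it stays reachable after its branch is gone.
# All names below (feature, keep-permalink, the demo identity) are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Create a base commit, then a commit on a feature branch.
git commit -q --allow-empty -m "base"
base=$(git symbolic-ref --short HEAD)
git checkout -q -b feature
git commit -q --allow-empty -m "commit the permalink points to"
sha=$(git rev-parse HEAD)

# Pin the commit with a tag.
git tag keep-permalink "$sha"

# Delete the branch; the tag is now the only reference to the commit.
git checkout -q "$base"
git branch -q -D feature

# List the tags that still contain the commit.
refs_containing=$(git tag --contains "$sha")
echo "$refs_containing"

# In a real clone you would also publish the tag to the remote, e.g.:
#   git push origin keep-permalink
```

The tag only protects the commit on GitHub once it has been pushed to a repository where you have push access.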

I hope that helps!

What about squashed commits from a Pull Request?

Let’s say I have a feature branch whose commits went through a PR and were squashed into master. The feature branch was deleted afterwards.

As far as I know, the original commits made on the feature branch are now unreachable. But the PR page still lists the original commits, and it’s even possible to restore the original branch from there.

In that case, are those original commits safe? Or is there a chance of them being garbage-collected and lost forever?

And, whatever it is, is this behavior the same on GitHub and GitHub Enterprise?

As for the behavior on github.com, my original message still reflects our general policy:

we try not to clean up even orphaned items unless requested by a repo owner or admin, but if a git object becomes orphaned, we can’t guarantee that it will be retained forever.

Additionally, if you’re concerned about the behavior of your GitHub Enterprise instance, you may want to consult with your GitHub Enterprise admin because they may have different policies as to how orphaned objects may be treated.

For general Git behavior (e.g. in your local repository): Git occasionally runs garbage collection on its own, or you can run it manually with git gc (usually unnecessary). The documentation for the git gc --prune option says that by default garbage collection removes orphaned objects older than two weeks, and that this value is configurable.
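To see this local behavior in action, the following sketch orphans a commit in a throwaway repository and compares a default git gc with git gc --prune=now. The repository, branch name, and the present_or_pruned helper are hypothetical; the reflog is expired explicitly because reflog entries otherwise keep a deleted branch's commits reachable. This describes plain Git only, not what GitHub does on its servers.

```shell
#!/bin/sh
# Sketch: a recently orphaned commit survives a default gc but not --prune=now.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: report whether an object still exists in the repository.
present_or_pruned() {
  if git cat-file -e "$1" 2>/dev/null; then echo present; else echo pruned; fi
}

# Create a commit on a branch, then delete the branch to orphan the commit.
git commit -q --allow-empty -m "base"
base=$(git symbolic-ref --short HEAD)
git checkout -q -b feature
git commit -q --allow-empty -m "soon to be orphaned"
orphan=$(git rev-parse HEAD)
git checkout -q "$base"
git branch -q -D feature

# Drop reflog entries, which would otherwise still reference the commit.
git reflog expire --expire=now --all

# Show the configured grace period, if one is set explicitly.
git config --get gc.pruneExpire || echo "gc.pruneExpire unset (default: 2.weeks.ago)"

# List objects that no reference can reach.
git fsck --unreachable --no-reflogs

# Default gc keeps unreachable objects younger than the grace period.
git gc -q
before=$(present_or_pruned "$orphan")

# Pruning with no grace period removes the orphaned commit.
git gc -q --prune=now
after=$(present_or_pruned "$orphan")

echo "after default gc: $before, after --prune=now: $after"
```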