Adding large folder to repository without caching?

Hi, I have a repository with about 55GB of content, consisting of binary files that are each under 100MB (so no LFS needed), from a project that has almost filled an entire hard drive. I am trying to add all of the content to a git repo and push it to GitHub, but every time I run

git add .

in the folder with my content after initializing the repository and setting my remote, git starts copying all the files into .git/objects, making the .git folder grow rapidly. All the files are binaries, so git cannot produce meaningful diffs between versions anyway, so there seems to be no reason to cache versions.

Is there any way, such as editing the git attributes or changing how files are staged, to add only indexes or references to the files rather than copying them into the .git folder, while still being able to push all the data to GitHub?


That is not possible, no. The storage inside .git/objects/ is not a cache; it is git adding each object (file) you add to its internal storage (the “real” repository, if you will). Simply put, git makes sure it has every version of every file the repository has ever had, because that is what it is designed to do. Adding files before the actual commit lets git detect changes to the working copy after the add; objects not used in a commit are cleaned up later. For a much deeper explanation of how the object storage works, check the Git Internals - Git Objects chapter of the Git Book. If you do not want to keep versions, git is the wrong tool to use.
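
For illustration, here is a minimal sketch of that behaviour you can try in a scratch repository (the file name sample.bin is just a placeholder):

echo "example data" > sample.bin
git add sample.bin

# git has already written a blob object for the file into .git/objects
git count-objects -vH        # object count and size on disk
git ls-files -s sample.bin   # the blob hash recorded in the index

# if the blob never ends up in a commit, it stays unreferenced and a
# later garbage collection removes it
git gc --prune=now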

However, LFS is a way around that, where only references are stored in the repository itself, and LFS holds the actual data.
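
A minimal sketch of that setup, assuming Git LFS is installed and that the patterns below match your binary file types (adjust them to your data):

git lfs install
git lfs track "*.bin" "*.iso"
git add .gitattributes
git add .
git commit -m "Add project binaries via LFS"
git push origin main    # assuming your branch is named main

With LFS the repository itself only stores small pointer files, and the binary content is uploaded to the LFS store on push. Keep in mind that GitHub applies its own storage and bandwidth quotas to LFS data, which matters with 55GB of content.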
