Sudden huge growth in repo size for no apparent reason

We have a small project, Moonwards, that has just 2 active contributors. Earlier today the other contributor had to do a merge in order to add a small commit of his. He told me that caused a very large download. It really shouldn’t have. The changes since he last pulled would only be maybe a megabyte. 

When I pulled to get his changes, git downloaded 422 MB of data - several times the size of our whole repo. A third contributor tried it and the download was similar.

I checked the changes in the merge commits from his pulls, and they look fine. I hadn’t pulled in a few weeks, and when I check the commit list, I see he made a few other small commits I was unaware of. I thought I was the only active contributor right now, so I never bothered pulling. Could that be the cause of this issue?

Maybe it is relevant that we cleaned the repo a few months ago. We moved all the Blender files for the project to a separate repo, which is now a submodule of Moonwards (moonwards-colony-files). One of our contributors figures that the history with those files present has somehow been added to the repo again. I don’t know how that could have happened, as I am the only person who has them and that folder is in a separate location. I’m trying to check through the commit history to see where the issue started, but I don’t know how to go about that.

Hi @briligg, this sounds like something you might want to email GitHub Support about privately. If you reach out to us here, we can take a closer look at the details of your account and help a little better. Thanks!

Thanks Nadia

Someone on the project looked after it. They created tools to check the repo and its history for unnecessary files, create a list of them, and then eliminate those files from the repo history using BFG Repo Cleaner. Those tools are now in the ‘maintenance’ directory of the repo, with a readme that explains the steps of the process. 
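For anyone landing here with the same problem, the "check the history for unnecessary files" step can be sketched roughly as below. This is not the script from the project's maintenance directory, only a minimal illustration that assumes git is installed and on the PATH. The function names `parse_blob_sizes` and `largest_blobs` are made up for this example. It lists every object reachable from any ref, asks git for each object's type and size, and ranks the blobs (file contents) from largest to smallest.

```python
import subprocess


def parse_blob_sizes(batch_check_output):
    """Parse lines of the form 'type sha size path' and return the blobs
    as (size_in_bytes, sha, path) tuples, largest first."""
    blobs = []
    for line in batch_check_output.splitlines():
        # Split into at most 4 fields so paths containing spaces stay whole.
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees, tags, and malformed lines
        path = parts[3] if len(parts) > 3 else ""
        blobs.append((int(parts[2]), parts[1], path))
    return sorted(blobs, reverse=True)


def largest_blobs(repo_dir=".", top=20):
    """Return the `top` largest blobs anywhere in the history of repo_dir."""
    # Every object reachable from any branch or tag, with its path if known.
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    # Look up type and size for each object; %(rest) echoes the path back.
    sized = subprocess.run(
        ["git", "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        cwd=repo_dir, input=objects, check=True,
        capture_output=True, text=True,
    ).stdout
    return parse_blob_sizes(sized)[:top]


# Example use, run from inside a clone of the repository:
#   for size, sha, path in largest_blobs("."):
#       print(f"{size / 1_000_000:8.1f} MB  {sha[:10]}  {path}")
```

If large files that were supposed to be gone show up in that list, the old history is probably back in the repo. The sizes reported are uncompressed, so they will not match the download size exactly, but they are good enough to spot the offenders before handing a list to a tool such as BFG Repo Cleaner.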

We cleaned the repo this way a few months ago, but one of us somehow pushed the previous history to GitHub, and then we all got it again. I am in the odd position of being ultimately responsible for the project while having no coding skills to manage something like that. Fortunately, other contributors are able to handle this. They also write things like the tools and readme so that I can do it myself, and hopefully that will be useful to someone else later.
