Hi @mikegilchrist! Welcome to the Community!
Git and GitHub are optimized for version control and code collaboration, predominantly on text files, which means each push of data to our servers triggers computation on our end to apply the necessary metadata and structure things efficiently for that purpose.
That means many use cases, such as backups of non-text files or database dumps, are a poor fit for Git and put unnecessary strain on our infrastructure.
You can read more about this here:
To answer your questions:
Git isn’t trying to compress your .pdf files the way an archiving program would, exactly - it’s building pack files: rather than storing each revision of a file separately, it stores one version and then computes “deltas” of what has changed in each subsequent revision. There’s some interesting discussion about this in this reddit thread!
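If you want to see that for yourself, here’s a rough sketch using only standard Git commands - it builds a throwaway repo with two revisions of a file, packs the objects, and inspects the result (exact paths and output vary by Git version):

```shell
set -e
dir=$(mktemp -d)          # throwaway repo so nothing real is touched
cd "$dir"
git init -q
git config user.email you@example.com
git config user.name "you"

seq 1 1000 > data.txt     # revision 1
git add data.txt && git commit -qm "v1"
seq 1 1001 > data.txt     # revision 2: one line appended
git add data.txt && git commit -qm "v2"

git gc -q                 # packs loose objects into a pack file

# The pack index lists each object; delta entries show a small size
# plus the object they were delta-compressed against.
git verify-pack -v .git/objects/pack/pack-*.idx | head
```

After the `git gc`, `git count-objects -v` will show the objects living in a pack rather than as loose files, and in `verify-pack` output one of the two blobs is typically stored as a small delta against the other.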
I’m afraid you can’t prevent files from being processed, no.
We don’t offer that because, fundamentally, large binaries shouldn’t be committed to the Git history in the first place.
I hope that explains things!