I am looking for suggestions, advice, or comments on scanning a Git repository as thoroughly as possible with a virus scanner (most likely either McAfee or Symantec).
This is not a general or daily scan of the drive containing the repo(s). Before I migrate our environment to a new network, I must ensure that all data within a given repo has been proven free of viruses and other threats.
For example, if there are any compressed files within the repo/database that may not be in a scannable format, I’d need to ensure they are extracted and converted into some format the scanner can access.
I’m not familiar with the structure or file types within a repo back end, but I’d assume it is a mix of text, binary, compressed data, etc.
Any thoughts or suggestions are appreciated (perhaps the answer is as simple as: the repo is already scannable by default, and no additional steps are required before running the scan?).
Note that we are currently running GitHub Enterprise 2.16.4 in a closed environment with no external Internet access.
Commercial antivirus solutions are generally not able to deal with the internal Git data structure, so simply scanning the repository on disk is usually not sufficient. You could check out each commit, going back through history, and scan the working directory for each commit. That would likely be a rather time-consuming process, but it is technically possible.
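As a rough illustration of the "check out each commit and scan it" idea, here is a minimal Python sketch. It assumes `git` is on the PATH and that your scanner has a command-line interface that takes a directory to scan. The `scan_history` function, the `git` helper, and the `scanner_cmd` parameter are names made up for this example, and treating a non-zero exit status as "flagged" is an assumption you would need to check against your scanner's documentation.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def scan_history(repo, scanner_cmd):
    """Check out every commit into a throwaway worktree and scan it.

    `scanner_cmd` is a hypothetical argument list for your scanner's CLI;
    the directory to scan is appended as the last argument. A non-zero
    exit status is ASSUMED to mean "flagged".
    """
    flagged = []
    # `rev-list --all` lists every commit reachable from any ref.
    for commit in git(repo, "rev-list", "--all").split():
        with tempfile.TemporaryDirectory() as tmp:
            tree = Path(tmp) / "tree"
            # Materialise the commit's files as ordinary files on disk.
            git(repo, "worktree", "add", "--detach", str(tree), commit)
            try:
                status = subprocess.run([*scanner_cmd, str(tree)]).returncode
                if status != 0:
                    flagged.append(commit)
            finally:
                # Unregister the worktree so the repository stays clean.
                git(repo, "worktree", "remove", "--force", str(tree))
    return flagged
```

You would call it as something like `scan_history("/path/to/clone", ["your-scanner", "--some-flag"])`, with the scanner command replaced by whatever your product actually provides. Note that this only covers commits reachable from a ref, and it scans one full checkout per commit, so it will be slow on a repository with a long history.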
There are two recommendations which can help mitigate the threat of spreading viruses through Git repositories:
- Client-side virus scanning. If an infected file is contained in a repository and pulled onto a client desktop, it should be detected at that time, and a security response would be initiated based on your client-side protections.
- External virus scanning as part of a continuous integration process. This would be similar to any other testing process that you perform upon pushes to your repositories: code or files that contain known viruses would receive a failed test status.
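The continuous integration option could look roughly like the following Python sketch, which scans only the files added or modified by the commits in a push. It assumes `git` is on the PATH and a scanner CLI that accepts a file path. The names `scan_push`, `changed_paths`, `rev_range`, and `scanner_cmd` are hypothetical, and a non-zero scanner exit status is assumed to mean "flagged".

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside `repo` and return its raw stdout bytes."""
    return subprocess.run(
        ["git", "-C", str(repo), *args], check=True, capture_output=True
    ).stdout


def changed_paths(repo, commit):
    """Paths added or modified by `commit` (merge commits are not diffed)."""
    out = git(repo, "diff-tree", "--root", "--no-commit-id", "--name-only",
              "-r", "-z", "--diff-filter=AM", commit)
    return [p.decode("utf-8", "surrogateescape") for p in out.split(b"\0") if p]


def scan_push(repo, rev_range, scanner_cmd):
    """Scan each file version introduced in `rev_range`, e.g. "old..new".

    `scanner_cmd` is a hypothetical argument list for your scanner's CLI;
    the file to scan is appended as the last argument.
    """
    flagged = []
    for commit in git(repo, "rev-list", rev_range).decode().split():
        for path in changed_paths(repo, commit):
            with tempfile.TemporaryDirectory() as tmp:
                target = Path(tmp) / "blob"
                # Write the exact file content stored in this commit.
                target.write_bytes(git(repo, "cat-file", "blob", f"{commit}:{path}"))
                if subprocess.run([*scanner_cmd, str(target)]).returncode != 0:
                    flagged.append((commit, path))
    return flagged
```

A CI job could call `scan_push` with the range of commits in the push and report a failed status if the returned list is not empty. Walking the commits one by one, rather than comparing only the first and last commit, means a file that was added and then deleted within the same push is still scanned.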
Regarding GitHub Enterprise Server specifically: in order to provide the best support experience and the most stable system, we discourage the installation of an antivirus solution or any other extra software on a GitHub Enterprise Server instance itself. If you have any additional questions regarding your specific GitHub Enterprise environment, please reach out via our Enterprise Support Portal.
Thank you very much for the information - I appreciate it!
You could checkout each commit, going back through history, and scan the working directory for each commit. That would likely be a rather time consuming process, but technically possible.
As a proof of concept, I have published a GitHub Action that demonstrates this type of scanning workflow: