Can I search for repositories by type (executable/binary), size (<1MB) and OS (Linux, Windows 10)?

I’m looking for a large number of open source pre-compiled executables (.exe for Windows and /bin for Linux) for a research project. I’ve tried filtering by size, but not many binaries came up in my search, and I’m not sure how to search for those specifically.

GitHub repositories typically contain the source code for libraries or executables, not the libraries or executables themselves. You would have to look to see if the repositories have executable release assets. Unfortunately, there isn’t a system for searching for repository release assets.

Let us know if you have more questions.
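
For anyone who wants to check individual repositories for release assets, here is a minimal sketch, assuming the public GitHub REST endpoint `GET /repos/{owner}/{repo}/releases`, which returns each release with an `assets` list (`name`, `size`, `browser_download_url`). The function names, the 1 MB threshold (taken from the question), and the `.exe` suffix filter are illustrative assumptions, not an official search feature. Unauthenticated requests are rate limited, so this only suits a limited set of repositories.

```python
import json
import urllib.request

# Assumed endpoint: the public GitHub REST API "list releases" route.
API = "https://api.github.com/repos/{owner}/{repo}/releases"
MAX_BYTES = 1_000_000  # the "<1MB" limit from the question


def small_binary_assets(releases, max_bytes=MAX_BYTES, suffixes=(".exe",)):
    """Return (tag, name, size, url) for release assets smaller than max_bytes.

    `releases` is the parsed JSON list from the releases endpoint.
    Pass suffixes=None to skip the file-name filter, e.g. for Linux
    binaries, which usually have no extension.
    """
    hits = []
    for release in releases:
        for asset in release.get("assets", []):
            name = asset.get("name", "")
            size = asset.get("size", 0)
            if size >= max_bytes:
                continue
            if suffixes is not None and not name.lower().endswith(suffixes):
                continue
            hits.append(
                (release.get("tag_name"), name, size,
                 asset.get("browser_download_url"))
            )
    return hits


def fetch_releases(owner, repo):
    """Fetch the first page of releases for one repository (needs network)."""
    req = urllib.request.Request(
        API.format(owner=owner, repo=repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `small_binary_assets(fetch_releases(owner, repo))` for each repository on your list would give you candidate download URLs. You still have to supply the list of repositories yourself, since release assets cannot be searched directly.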


It’s an interesting idea. I am not sure the GitHub search API will expose what you need; depending on what you are looking for, you might be able to get away with it, or you might need to do a lot more manual correlation. What is the scope of the project? Does it need to cover all repositories/organizations, or can it be limited to specific repositories/orgs? How are you defining an executable/binary file? Do files/scripts from interpreted languages count, even though they need an interpreter and other dependencies present? If you are only looking for executables/binaries that are standalone/compiled, these are typically not included in the repo code[1] but are attached to a release on the repo or distributed elsewhere, such as RubyGems, PyPI, OS distro repos, etc. I think to accomplish this you are likely going to need a clear definition of what you want, and then combine/correlate multiple bits of data.

Here are some ideas/indicators:

  • look at files with a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) in the context of scripts in interpreted languages (python, ruby, etc) or shells (sh, bash, ksh, etc)

  • permissions: git keeps track of whether a file is executable. Typical system umasks do not give new files executable permission by default (this is changeable but not recommended), so when a file has the executable bit set, a contributor or maintainer took a manual step to ensure that the file will be executable when the repo is cloned.

  • do you care about vendored dependencies?

  • binary and executable are two different things: archives (zip, tar.gz, etc) are all binary files, but they are not executable. Which types of binary files do you care about?
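
Several of the indicators above can be checked mechanically once you have files on disk (from a clone or a downloaded release asset). This is a rough sketch, not a complete classifier: the function name `classify` and the returned keys are made up for illustration. It relies on two well-known magic numbers: ELF files start with the bytes `\x7fELF`, and Windows PE executables start with the DOS header `MZ`. An `MZ` match only marks a candidate, since other DOS-style files share that prefix.

```python
import os
import stat


def classify(path):
    """Return a dict of rough indicators for one file on disk."""
    with open(path, "rb") as fh:
        head = fh.read(4)  # enough for the ELF, MZ, and shebang checks
    mode = os.stat(path).st_mode
    return {
        # Linux/Unix compiled binary (ELF magic number)
        "elf": head == b"\x7fELF",
        # Possible Windows executable (DOS/PE header starts with "MZ")
        "pe_candidate": head[:2] == b"MZ",
        # Script that names its interpreter on the first line
        "shebang": head[:2] == b"#!",
        # Any of the user/group/other execute permission bits is set
        "exec_bit": bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)),
    }
```

Inside a clone, `git ls-files -s` lists the recorded mode for each tracked file, where `100755` marks an executable file and `100644` a regular one, so the permission indicator can also be read without touching the file system modes.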

[1]: git is really designed to keep track of changes in text. When something is committed in binary form, it can bloat the size of the repository, because each new version basically has to be stored whole rather than as a pointer to the previous version plus a small delta.