Starting today, GitHub Code Search will only index repositories that have had recent activity within the last year. Recent activity for a repository means that it has had a commit or has shown up in a search result. If the repository does not have any activity for an entire year, the repository will be purged from the Code Search index. This change will enable the most relevant content for developers to surface in the code search index as well as keeping code search queries fast for all customers.
I’m disappointed to hear this, as it doesn’t match my common use-cases for GitHub code search.
I often use code search to find examples of how to use an obscure or undocumented feature of a library I am using. The solutions for these searches frequently come from old, abandoned repositories - if a library feature is being actively used by current projects it’s much more likely to have documentation, which means I don’t need to turn to code search.
I also use it for my own code all the time - with the user:simonw prefix. I have a ton of repositories from old projects that I haven’t touched in years which I still like to refer back to for examples of how I solved old problems.
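For anyone who hasn’t used that kind of scoped search, here is a minimal Python sketch of how such a query could be turned into a URL for GitHub’s REST code search endpoint. The function name build_code_search_url and the example search term are hypothetical, made up for illustration. The sketch only builds the URL; actually calling the endpoint requires an authenticated request.

```python
from urllib.parse import urlencode

# Base URL of GitHub's REST code search endpoint.
SEARCH_CODE_ENDPOINT = "https://api.github.com/search/code"


def build_code_search_url(term: str, user: str) -> str:
    """Build a code search URL restricted to one user's repositories.

    The "user:" qualifier narrows results to repositories owned by
    that account, the same way the prefix does in the search box.
    This helper is a hypothetical illustration, not part of any library.
    """
    query = f"{term} user:{user}"
    # urlencode handles the escaping: spaces become "+", ":" becomes "%3A".
    return f"{SEARCH_CODE_ENDPOINT}?{urlencode({'q': query})}"
```

Calling build_code_search_url("register_output_renderer", "simonw") returns a URL whose q parameter decodes to "register_output_renderer user:simonw".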
I recently built my own code search engine, https://github.com/simonw/datasette-ripgrep - which I can use to find examples and solutions in my older projects. I’m still disappointed that I won’t be able to use GitHub code search for that any more.
I understand running a high-quality index on this scale is expensive and complicated. Any chance I might be able to keep code search indexing of my older repositories as part of my paid GitHub plan, or as part of having a paid organization on GitHub?