How does repository indexing work in GitHub?

I searched GitHub for proteu filename:Makefile and got different results within a few minutes.
First, GitHub returned 1 repository: GitHub - jacksonpradolima/proteumIM2.0: Ferramenta Proteum/IM - versão 2.0
Ten minutes later, GitHub returned 2 repositories: GitHub - jacksonpradolima/proteumIM2.0: Ferramenta Proteum/IM - versão 2.0 and GitHub - magsilva/proteum: Tool for mutation testing of C programs (both of them were created several years ago).

The most surprising thing is that I know of another repository that matches the search proteu filename:Makefile. GitHub returned it a month ago, but it no longer appears in the results, even though the repository still exists.

I am mining data for my thesis, and I cannot understand why GitHub does not return all the results that match the search string. Could anyone help me with this? Thanks a lot.


UPDATE: GitHub now shows the third result. I think I caused the second and third results to be indexed by searching for them (please see the indexing dates of the results).

However, the problem persists for other GitHub searches, with thousands of missing results. Another example is filename:mujava extension:jar extension:config. In December 2020 that search returned around 300 results; now it returns 125, so many repositories are missing. Could someone enlighten me? I am very lost on this topic.
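
To make the discrepancy easier to report, the comparison I am doing by hand can be sketched in Python. This is only a minimal sketch: it assumes the repository names from each run of a search have already been saved as lists, and the function name diff_snapshots is hypothetical, not part of any GitHub tooling. The sample data are the two runs of proteu filename:Makefile described above.

```python
def diff_snapshots(earlier, later):
    """Compare two result sets (repository full names) for the same query."""
    earlier, later = set(earlier), set(later)
    return {
        "missing": sorted(earlier - later),  # returned before, absent now
        "new": sorted(later - earlier),      # absent before, returned now
        "stable": sorted(earlier & later),   # returned in both runs
    }


# Results of the two runs of the search, ten minutes apart.
first_run = ["jacksonpradolima/proteumIM2.0"]
second_run = ["jacksonpradolima/proteumIM2.0", "magsilva/proteum"]

report = diff_snapshots(first_run, second_run)
print(report)
```

Keeping a dated snapshot per query and diffing consecutive snapshots would at least show which repositories drop out of the results and when, even if it does not explain why.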

Thanks in advance
