API Search Repositories with no duplicate repos. Is it possible? #24361
-
Hi, my query: Query = language:python
Replies: 6 comments
-
Hello @StephLang99 and welcome to the community. Can you give the exact URL you’re making the request to in order to receive these results? That will help us debug what’s going on here better. Let us know!
-
Hi, thanks for helping me.
-
Thanks for sharing the URL. I tried the following and, at least in the first 100 results, there are no duplicate repositories:
Are you certain you’re collating or parsing the results correctly? It might also be that, depending on how you paginate, the sorted results shift between requests, which could cause the duplicates.
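If the ordering does shift between page requests, one client-side workaround is to drop repeats while merging pages. Below is a minimal sketch, assuming each page has already been parsed into the `items` list of a search response and that each item carries the numeric `id` field. The helper name `dedupe_repos` and the sample data are made up for illustration.

```python
from typing import Iterable


def dedupe_repos(pages: Iterable[list[dict]]) -> list[dict]:
    """Merge pages of search items, keeping the first occurrence of each repo id."""
    seen: set[int] = set()
    unique: list[dict] = []
    for items in pages:
        for repo in items:
            repo_id = repo["id"]
            if repo_id in seen:
                # Same repository returned again on a later page: skip it.
                continue
            seen.add(repo_id)
            unique.append(repo)
    return unique


# Hypothetical sample data: repo 2 slid from page 1 to page 2 between requests.
page_1 = [{"id": 1, "full_name": "a/x"}, {"id": 2, "full_name": "b/y"}]
page_2 = [{"id": 2, "full_name": "b/y"}, {"id": 3, "full_name": "c/z"}]

merged = dedupe_repos([page_1, page_2])
print([repo["full_name"] for repo in merged])
```

This only hides the repeats. A repository that was pushed off a page boundary in the other direction can still be missed entirely, so the merged list may be shorter than the number of results requested.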
-
Yes, you’re right.
-
When I query for 1000 repositories, filtered by the 'jupyter-notebook' language and sorted by descending number of stars, I don't get unique results. I get only 931 unique results, and the rest are repeats.
Because the GitHub API limits search to 1000 results, I receive a 403 status code when accessing the 11th page of results, but that is expected. What I didn't expect was the repeated results. I couldn't find a way to avoid getting duplicates. When I request only 100 results, they are unique.
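The page arithmetic described above can be sketched as follows: with 100 results per page and a 1000-result cap, page 10 is the last one worth requesting, so a client can stop there instead of hitting the error on page 11. This is only a sketch under stated assumptions: `fetch_page` is a hypothetical callable that returns one parsed response with `total_count` and `items`, and `fake_fetch` stands in for real HTTP calls.

```python
import math
from typing import Callable

SEARCH_RESULT_CAP = 1000  # cap on search results, as described in the post above


def last_allowed_page(total_count: int, per_page: int = 100,
                      cap: int = SEARCH_RESULT_CAP) -> int:
    """Highest page number that still falls within the result cap."""
    reachable = min(total_count, cap)
    return math.ceil(reachable / per_page)


def collect_unique(fetch_page: Callable[[int], dict], per_page: int = 100,
                   cap: int = SEARCH_RESULT_CAP) -> tuple[list[dict], int]:
    """Walk the allowed pages; return (unique repos, number of repeats seen)."""
    first = fetch_page(1)
    last_page = last_allowed_page(first["total_count"], per_page, cap)
    seen: dict[int, dict] = {}
    duplicates = 0
    for page in range(1, last_page + 1):
        data = first if page == 1 else fetch_page(page)
        for repo in data["items"]:
            if repo["id"] in seen:
                duplicates += 1
            else:
                seen[repo["id"]] = repo
    return list(seen.values()), duplicates


# Hypothetical stand-in for the real API: 2 items per page, cap of 6 results.
requested: list[int] = []
FAKE_PAGES = {1: [1, 2], 2: [2, 3], 3: [4, 5]}


def fake_fetch(page: int) -> dict:
    requested.append(page)
    return {"total_count": 5000,
            "items": [{"id": i} for i in FAKE_PAGES[page]]}


repos, repeats = collect_unique(fake_fetch, per_page=2, cap=6)
print(len(repos), repeats, requested)
```

Counting the repeats separately makes it easy to report a figure like the 931 unique results mentioned above, rather than silently returning a shorter list.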
-
Hi everyone and @di-press, I have reported this bug to GitHub, and they confirmed that there is a bug in the GitHub Search REST API when fetching multiple pages of results. However, they haven't provided any timeline for a fix. It looks to me like a very old issue that nobody on the GitHub side has been able to fix yet.