How to search in repos of an org for keyword

I’m trying to use the GitHub API to search all repos in an organization for a keyword.

For instance, I want to search the organization called Microsoft, for all repos that contain the word “cognitive” in the repo name or description. I don’t want the search to look in the files in the repo itself. How would I do that?

Everything I try gives me way too many pages back, so I think it must be searching all pages for all instances of the word “cognitive”. But that’s a guess. I don’t really know why I get so many pages back.

When I do a manual search on GitHub with the “cognitive” keyword, it returns only 62 repos:

https://github.com/search?q=org%3Amicrosoft+cognitive&unscoped_q=cognitive

But I must do this search using the API so I can store the list of URLs. Here is what I have so far, but it’s too many pages. Maybe I have some parameters incorrect?

https://api.github.com/search/code?q=cognitive+org:Microsoft&per_page=100

Hi @wiazur, welcome to the community! The following call returns the 62 repositories in the Microsoft org that meet your criteria:

https://api.github.com/search/repositories?q=cognitive%20in:name,description+org:microsoft&per_page=100
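For reference, here is a minimal Python sketch of how that query string could be assembled so the qualifiers are encoded consistently. The helper name build_repo_search_url and its parameters are made up for illustration; it uses only the standard library and does not send a request.

```python
from urllib.parse import urlencode

SEARCH_REPOS = "https://api.github.com/search/repositories"


def build_repo_search_url(keyword, org, fields=("name", "description"), per_page=100):
    # Hypothetical helper: restrict the keyword match to the given fields
    # (the in: qualifier) and scope the search to one organization (org:).
    query = f"{keyword} in:{','.join(fields)} org:{org}"
    # urlencode escapes the spaces, ':' and ',' in the query.
    return f"{SEARCH_REPOS}?{urlencode({'q': query, 'per_page': per_page})}"


print(build_repo_search_url("cognitive", "microsoft"))
```

The key difference from the /search/code attempt is the endpoint: /search/repositories matches repository metadata, and in:name,description keeps the match to those two fields.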

I hope that helps with what you are after. Here is some reading on pagination, which might help explain the concept better:
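As a rough illustration of pagination with this endpoint, the sketch below follows the rel="next" URL from the Link response header until there are no more pages, collecting each repository's html_url. The function names are hypothetical, the fetch step is passed in as a parameter so it can be swapped out, and urllib is used here only to keep the example dependency-free (requests would work just as well). Unauthenticated calls are rate limited, so treat this as a starting point rather than a finished script.

```python
import json
import re
import urllib.request


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None if absent."""
    if not link_header:
        return None
    for url, rel in re.findall(r'<([^>]+)>;\s*rel="([^"]+)"', link_header):
        if rel == "next":
            return url
    return None


def fetch_page(url):
    """Fetch one page of search results; returns (items, next_url)."""
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        return body.get("items", []), next_page_url(resp.headers.get("Link"))


def collect_repo_urls(first_url, fetch=fetch_page):
    """Walk every page starting at first_url and gather repository URLs."""
    urls = []
    url = first_url
    while url:
        items, url = fetch(url)
        urls.extend(item["html_url"] for item in items)
    return urls
```

Calling collect_repo_urls with the search URL above would return the list of repository URLs, ready to be stored.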

Thanks so much Andrea! After much elbow grease I managed to come up with that very call as well! Good to know it’s a solid solution. I actually added a header, since I was searching with a keyword. Is that necessary for this search?

response = requests.get(f'https://api.github.com/search/repositories?q=cognitive in:name,description+org:{org}&per_page=100', headers={'Accept': 'application/vnd.github.mercy-preview+json'})

Wonderful! Sorry about the extra elbow grease :smile: Your call is :+1:

If you’d like to search through the topics, then yes. To view the topics property, you need to provide a special value for the Accept header when calling that endpoint, since it’s currently in preview. The response then includes the list of topics for each repository, so you can find the repositories with a specific topic on your end.
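To make the "on your end" part concrete, here is a small sketch of filtering search results by topic in Python. It assumes items is the "items" list from a repository search response and that each repository object carries a topics list of strings; the function name and the sample data are invented for the example.

```python
def repos_with_topic(items, topic):
    """Return the html_url of each repository whose topics include `topic`."""
    # .get() covers responses where the topics property is missing.
    return [repo["html_url"] for repo in items if topic in repo.get("topics", [])]


# Made-up sample data shaped like entries in a search response.
sample_items = [
    {"html_url": "https://github.com/example-org/repo-a", "topics": ["cognitive-services"]},
    {"html_url": "https://github.com/example-org/repo-b", "topics": []},
    {"html_url": "https://github.com/example-org/repo-c"},
]

print(repos_with_topic(sample_items, "cognitive-services"))
```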

Ah thanks for letting me know! Actually, as far as I know, topics are keywords that the user of the repo must create, so if the user did not add (in my case) the word “cognitive” to their list of topics, then a repo that might have that word in the name or description would not show up. Is that a correct understanding? Thanks again!

P.S. I tried the call without the header and that worked ok too!

You are correct :+1: To utilize topics the repository admin will need to first add a topic to the repository.

From your example, looking for the keyword cognitive in this case, I see that admins of Microsoft’s public repos have used microsoft-cognitive-services and cognitive-services as classifying topics. You can use our API to find only the repos that have these topics as classifiers by adding the topic qualifier to your search query.
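A topic search along those lines might look like the sketch below, which builds one search URL per topic using the topic: qualifier. The helper name build_topic_search_url is hypothetical, and no request is sent; each URL would still need to be fetched and paginated like any other search.

```python
from urllib.parse import urlencode


def build_topic_search_url(topic, org, per_page=100):
    # Hypothetical helper: match repositories in `org` classified with `topic`.
    query = f"topic:{topic} org:{org}"
    return "https://api.github.com/search/repositories?" + urlencode(
        {"q": query, "per_page": per_page}
    )


# One query per topic, since each URL matches repositories with that topic.
for topic in ("microsoft-cognitive-services", "cognitive-services"):
    print(build_topic_search_url(topic, "microsoft"))
```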
