GitHub API: list of users, their languages and their followers

Hi, 

I’m collecting data on GitHub users: the languages they use and how they collaborate, i.e. who they follow, who follows them and who they work with.

I am using a query similar to the one below:

https://api.github.com/search/users?q=repos:%3E12+followers:%3C1000&location:uk+language:python&page=1&per_page=100

Output:

{
  "total_count": 1347162,
  "incomplete_results": false,
  "items": [
    {
      "login": "cyfdecyf",
      "id": 344294,
      "avatar_url": "https://avatars3.githubusercontent.com/u/344294?v=4",
      "gravatar_id": "",
      "url": "https://api.github.com/users/cyfdecyf",
      "html_url": "https://github.com/cyfdecyf",
      "followers_url": "https://api.github.com/users/cyfdecyf/followers",
      "following_url": "https://api.github.com/users/cyfdecyf/following{/other_user}",
      "gists_url": "https://api.github.com/users/cyfdecyf/gists{/gist_id}",
      "starred_url": "https://api.github.com/users/cyfdecyf/starred{/owner}{/repo}",
      "subscriptions_url": "https://api.github.com/users/cyfdecyf/subscriptions",
      "organizations_url": "https://api.github.com/users/cyfdecyf/orgs",
      "repos_url": "https://api.github.com/users/cyfdecyf/repos",
      "events_url": "https://api.github.com/users/cyfdecyf/events{/privacy}",
      "received_events_url": "https://api.github.com/users/cyfdecyf/received_events",
      "type": "User",
      "site_admin": false,
      "score": 1.0
    },
    {
      "login": "robertdavidgraham",
      "id": 3814757,
      "avatar_url": "https://avatars2.githubusercontent.com/u/3814757?v=4",
      "gravatar_id": "",
      "url": "https://api.github.com/users/robertdavidgraham",
      "html_url": "https://github.com/robertdavidgraham",
      "followers_url": "https://api.github.com/users/robertdavidgraham/followers",
      "following_url": "https://api.github.com/users/robertdavidgraham/following{/other_user}",
      "gists_url": "https://api.github.com/users/robertdavidgraham/gists{/gist_id}",
      "starred_url": "https://api.github.com/users/robertdavidgraham/starred{/owner}{/repo}",
      "subscriptions_url": "https://api.github.com/users/robertdavidgraham/subscriptions",
      "organizations_url": "https://api.github.com/users/robertdavidgraham/orgs",
      "repos_url": "https://api.github.com/users/robertdavidgraham/repos",
      "events_url": "https://api.github.com/users/robertdavidgraham/events{/privacy}",
      "received_events_url": "https://api.github.com/users/robertdavidgraham/received_events",
      "type": "User",
      "site_admin": false,
      "score": 1.0
    },

This gives me little beyond each user’s login name. As you can see, the results don’t include the user’s followers or the people they follow, only URLs that point to those lists.
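One thing worth checking in the query above: the `&` before `location:uk` ends the `q` parameter, so `location:uk+language:python` is probably being sent as a separate, unrecognized parameter rather than as search qualifiers. That may be why `total_count` is so large. Below is a minimal sketch that keeps every qualifier inside `q`; the helper name `build_user_search_url` is made up for illustration and is not part of any GitHub library.

```python
from urllib.parse import urlencode


def build_user_search_url(qualifiers, page=1, per_page=100):
    """Join all search qualifiers into the single `q` parameter.

    `build_user_search_url` is a hypothetical helper, not a GitHub API.
    """
    # Qualifiers are separated by spaces; urlencode turns each space into '+'
    # and percent-encodes ':', '<' and '>'.
    query = " ".join(qualifiers)
    params = {"q": query, "page": page, "per_page": per_page}
    return "https://api.github.com/search/users?" + urlencode(params)


url = build_user_search_url(
    ["repos:>12", "followers:<1000", "location:uk", "language:python"]
)
print(url)
```

Even with a corrected query, the search endpoint still returns the same short user objects as in the output above, so it only narrows down which logins to look at.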

I’ve been reading the documentation and searching around, but to no avail. Eventually I would like to turn this into a data set with these columns: user name, location, languages used, number of users followed, number of followers.

I hope you can help.

There isn’t an easy way to extract the information you want. Also, wholesale scraping of the GitHub userbase is generally considered bad form and may be against the GitHub Terms of Service.
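For a short, known list of logins (not the whole userbase), the fields can be pieced together from two REST endpoints: `GET /users/{login}` returns `location`, `followers` and `following` counts, and `GET /users/{login}/repos` returns each repository's primary `language`. The sketch below is only an illustration under those assumptions. The helper names `get_json`, `build_row` and `fetch_row` are invented here, only the first 100 repositories are read, and every user costs two requests against the rate limit, so this does not scale to large samples.

```python
import json
from urllib.request import Request, urlopen

API = "https://api.github.com"


def get_json(url, token=None):
    """Fetch one API URL and decode the JSON body (makes a network call)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:  # optional personal access token, raises the rate limit
        headers["Authorization"] = f"Bearer {token}"
    with urlopen(Request(url, headers=headers)) as resp:
        return json.load(resp)


def build_row(profile, repos):
    """Combine a user profile and a repository list into one flat record."""
    # Repositories with no detected language report null, so skip those.
    languages = sorted({r["language"] for r in repos if r.get("language")})
    return {
        "login": profile["login"],
        "location": profile.get("location"),
        "followers": profile.get("followers"),
        "following": profile.get("following"),
        "languages": languages,
    }


def fetch_row(login, token=None):
    """Hypothetical helper: two requests per user, first page of repos only."""
    profile = get_json(f"{API}/users/{login}", token)
    repos = get_json(f"{API}/users/{login}/repos?per_page=100", token)
    return build_row(profile, repos)
```

Calling `fetch_row("some-login")` for each login from a search result would give one row per user. `build_row` is kept separate so it can be tried on saved JSON without touching the network.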

If you’re doing academic research, we generally recommend you use one of the datasets that GitHub has made public. These are made available on platforms where you can query them to your heart’s content and they don’t put a load on GitHub’s servers, potentially impacting other customers.

I hope that helps!

Thank you for this information. I wasn’t aware of those datasets.