Fetch all stargazers over time of a repository

I’m trying to fetch the starred_at timestamp for every stargazer of a repository via the GitHub REST API.

I’m passing in the following request:

import requests

headers = {"Accept": "application/vnd.github.v3.star+json"}
url = "https://api.github.com/repos/tensorflow/tensorflow/stargazers?per_page=100"
response = requests.get(url, headers=headers)

But the TensorFlow repository page on GitHub shows about 158k stargazers, whereas the REST API returns only 40k in total.

Am I missing any other query parameters to fetch the remaining 118k stargazers?
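For reference, here is a minimal sketch of how the pages could be walked to collect every starred_at value the REST API is willing to return. The function name iter_starred_at is hypothetical, and the sketch assumes a requests.Session (or any object with a compatible get method) is passed in. It relies on the star+json media type wrapping each entry as an object with starred_at and user fields, and on requests exposing the parsed Link header as response.links.

```python
HEADERS = {"Accept": "application/vnd.github.v3.star+json"}


def iter_starred_at(session, url):
    """Yield (login, starred_at) pairs, following rel="next" links.

    `session` is assumed to behave like requests.Session; the name
    iter_starred_at is illustrative, not part of any library.
    """
    while url:
        response = session.get(url, headers=HEADERS)
        response.raise_for_status()
        for item in response.json():
            # With the star+json media type each item carries the
            # timestamp next to the user object.
            yield item["user"]["login"], item["starred_at"]
        # requests parses the Link header into response.links;
        # when there is no "next" entry the loop ends.
        url = response.links.get("next", {}).get("url")


# Possible usage (requires the requests package and network access):
#   import requests
#   start = "https://api.github.com/repos/tensorflow/tensorflow/stargazers?per_page=100"
#   for login, starred_at in iter_starred_at(requests.Session(), start):
#       print(login, starred_at)
```

The loop stops as soon as the API no longer returns a rel="next" link, so it can only collect as many entries as the pagination exposes.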

I don’t know, but I did find another OSS project which might give you the stargazer info you want: spencerkimball/stargazers (Analyze GitHub stars)

It could also be a REST API thing. Switching to GraphQL might help. Again, I don’t know for sure, just throwing out ideas.

Well, I sent the request once and then used the Link response header to check how many items (stargazers) I can get. In my case it is:
<https://api.github.com/repositories/45717250/stargazers?per_page=100&page=2>; rel="next", <https://api.github.com/repositories/45717250/stargazers?per_page=100&page=400>; rel="last"

So that’s 100 items per page and 400 pages in total, which gives 40k items, whereas the actual count is ~160k.
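The arithmetic above can be reproduced from the Link header itself. This is a small sketch, assuming the raw header wraps each URL in angle brackets as Link headers normally do. The helper names parse_link_header and reachable_items are made up for illustration.

```python
import re
from urllib.parse import parse_qs, urlparse


def parse_link_header(value):
    """Return a {rel: url} mapping from a Link header value."""
    pairs = re.findall(r'<([^>]+)>\s*;\s*rel="([^"]+)"', value)
    return {rel: url for url, rel in pairs}


def reachable_items(link_header):
    """Upper bound on items reachable through pagination.

    Multiplies the page number of the rel="last" link by per_page;
    the last page may hold fewer items, so this is a ceiling.
    """
    last_url = parse_link_header(link_header)["last"]
    query = parse_qs(urlparse(last_url).query)
    return int(query["page"][0]) * int(query["per_page"][0])


header = (
    "<https://api.github.com/repositories/45717250/stargazers"
    '?per_page=100&page=2>; rel="next", '
    "<https://api.github.com/repositories/45717250/stargazers"
    '?per_page=100&page=400>; rel="last"'
)
print(reachable_items(header))  # 400 pages * 100 per page = 40000
```

When using requests, the same information is available already parsed as response.links, so the regular expression is only needed when working with the raw header string.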

I don’t think it’s rate limiting; I was just wondering if it was something else. Hmmmm, I’ll ask internally and see what I can find.

Thanks. Looking forward to hearing from you. :smiley:

@mickeygousset
I just checked other repositories that have more than 40,000 stargazers. Sadly, it’s the same: they all stop at page=400.

https://api.github.com/repos/vuejs/vue/stargazers?per_page=100
https://api.github.com/repos/freecodecamp/freecodecamp/stargazers?per_page=100

These have 187k and 329k stargazers, respectively.

@mickeygousset Did you get a chance to ask internally?

Pinging some more people. I’ll let you know if I hear something.

@drshnchndr I found out there is a hard limit of 40k items for REST pagination. It doesn’t seem to be documented, sorry about that. At least, I couldn’t find it.

I don’t know if switching to GraphQL to try and pull that information might work better. Just a thought.
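If someone wants to try the GraphQL route, a rough sketch of the query and the cursor handling might look like the following. It uses the stargazers connection on repository, whose edges expose starredAt. The helper names build_payload and extract_page are hypothetical, and nothing in this thread confirms that GraphQL gets past the 40k mark, so treat it as an experiment rather than a fix.

```python
GRAPHQL_URL = "https://api.github.com/graphql"

STARGAZERS_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    stargazers(first: 100, after: $cursor) {
      totalCount
      edges { starredAt node { login } }
      pageInfo { endCursor hasNextPage }
    }
  }
}
"""


def build_payload(owner, name, cursor=None):
    """Build the JSON body for one page of stargazers (hypothetical helper)."""
    return {
        "query": STARGAZERS_QUERY,
        "variables": {"owner": owner, "name": name, "cursor": cursor},
    }


def extract_page(body):
    """Return ([(login, starred_at), ...], next_cursor) from a response body.

    next_cursor is None when pageInfo reports no further page.
    """
    connection = body["data"]["repository"]["stargazers"]
    stars = [(edge["node"]["login"], edge["starredAt"]) for edge in connection["edges"]]
    page_info = connection["pageInfo"]
    next_cursor = page_info["endCursor"] if page_info["hasNextPage"] else None
    return stars, next_cursor
```

Each payload would be sent with a POST to GRAPHQL_URL, with an Authorization header carrying a personal access token (the GraphQL API requires authentication), feeding the returned cursor back into build_payload until extract_page returns None for it.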
