I’m wondering about the usage of the https://developer.github.com/v3/activity/events API, and mainly the “List public events” section.
According to the docs, querying this API returns the last 300 public events triggered on GitHub, with a 5-minute delay.
The docs also state that the API returns an “X-Poll-Interval” header, which tells you how often you are allowed to poll the API.
This by default is set to 60 seconds, and can only increase during high server load.
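To make the polling behaviour described above concrete, here is a minimal sketch of the kind of loop I have in mind. It is not my exact test code. It assumes the public events endpoint is `https://api.github.com/events`, that each event object carries `id`, `type` and `repo.name` fields, and it only reads the first page of results. The names `next_delay`, `new_events`, `poll_forever` and the `wanted_type` parameter are made up for illustration.

```python
import json
import time
import urllib.request

# Assumed URL of the "List public events" endpoint.
EVENTS_URL = "https://api.github.com/events"


def next_delay(headers, default=60):
    """Return the poll delay in seconds from X-Poll-Interval, else the default."""
    try:
        return int(headers.get("X-Poll-Interval", default))
    except (TypeError, ValueError):
        return default


def new_events(events, seen_ids):
    """Return events whose id has not been seen yet, and record their ids."""
    fresh = [e for e in events if e["id"] not in seen_ids]
    seen_ids.update(e["id"] for e in fresh)
    return fresh


def poll_forever(wanted_type="PushEvent"):
    """Poll the endpoint, honouring X-Poll-Interval, and print matching events."""
    seen_ids = set()  # grows without bound; fine for a sketch only
    while True:
        req = urllib.request.Request(
            EVENTS_URL,
            headers={"User-Agent": "events-poll-sketch"},  # hypothetical client name
        )
        with urllib.request.urlopen(req) as resp:
            events = json.load(resp)
            delay = next_delay(resp.headers)
        # No server-side filter is used here, so filtering happens client-side.
        for event in new_events(events, seen_ids):
            if event["type"] == wanted_type:
                print(event["id"], event["repo"]["name"])
        time.sleep(delay)
```

Any event that enters and leaves the returned window between two iterations of this loop is never seen, which is the core of my question.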
From my short experience testing this API, the number of events received is enormous (there is no filter on the request, so you receive every available event type).
In my tests, the only way to go over ALL public events without missing any was to query the API every second and process all 300 events received each time.
Furthermore, the following research article describes the same problem (see the bottom of the section "What Didn’t Work").
It indicates that querying at a 5-second interval leads to missed events 80% of the time.
It also mentions that crawlers such as GH Archive are able to go over all events by querying the API every 0.75 seconds!
Can you please shed some light on how this API should be used?
Since there is no filter on the polled events, and polling at the rate given by the “X-Poll-Interval” header will surely lead to missed events, what is the suggested procedure for using this API to find specific public events without missing any?
Thanks in advance.