sponsorableOnly: true on topic repositories causing error

Hi folks,

It looks like setting sponsorableOnly: true on the repositories field of a topic that contains a lot of repositories causes the API to return an error, specifically the "Oh snap!" HTML error page:

query {
  topic(name: "api") {
    repositories(first: 1, sponsorableOnly: true) {
      nodes {
        nameWithOwner
      }
      
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}
"<!DOCTYPE html>\n<html>\n  <head>\n    <meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\">\n    <meta http-equiv=\"Content-Security-Policy\" content=\"default-src 'none'; base-uri 'self'; connect-src 'self'; form-action 'self'; img-src data:; script-src 'self'; style-src 'unsafe-inline'\">\n    <meta content=\"origin\" name=\"referrer\">\n    <title>Oh snap! &middot; GitHub</title> ...

Docs for ref: Objects - GitHub Docs

This only seems to affect large topics; if you choose a topic with far fewer associated repositories, the query does work:

query {
  topic(name: "aaaa") {
    repositories(first: 1, sponsorableOnly: true) {
      nodes {
        nameWithOwner
      }
      
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}
{
  "data": {
    "topic": {
      "repositories": {
        "nodes": [
          {
            "nameWithOwner": "yakeing/dnspod_api"
          }
        ],
        "pageInfo": {
          "hasNextPage": false,
          "endCursor": "Y3Vyc29yOnYyOpHOCmsZDw=="
        }
      }
    }
  }
}

Cheers,
Matt.

Hey, good to see you @mattipv4 :wave:

I’ve yet to look at our logs, but so far I’ve not been able to reproduce the behavior in the explorer.

Has the behavior subsided, or are you able to provide some additional steps to reproduce?

I’ll also take a look at logs next, to see if I can identify any related errors and report back.

Edit: I wasn’t able to find any exceptions bubbling up in either of our two logging tools. We have one that logs exceptions only, and another general logging tool. Neither has surfaced events matching what you’ve described.

Since it sounds like it’s happening in-browser, are you able to capture any network details during a failure?

Hiya!

Yes, I’m still able to reproduce this locally using the API query; here’s the failing network call:



The request ID from the header: 90cb94a4-306b-4d7a-9a47-4951b8657861

Let me know if anything else would be useful in digging into this :slight_smile:

I’m still unable to reproduce the error myself, and unfortunately that request_id didn’t get caught by either our logging or our exception-catching tool =(

Is it possible to get another request_id, or two?
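
If it’s easier to grab them from a terminal, something roughly like this against the API endpoint directly should print just the request ID header (untested on my end; swap a personal access token in for TOKEN). It simply dumps the response headers and greps for the ID:

# Dump only the response headers for the failing query and filter for the request ID
curl -sS -D - -o /dev/null \
  -H "Authorization: bearer TOKEN" \
  -d '{"query":"{topic(name:\"api\"){repositories(first:1,sponsorableOnly:true){nodes{nameWithOwner}}}}"}' \
  https://api.github.com/graphql | grep -i 'request-id'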

Here are the full response headers from a run:

HTTP/1.1 500 Internal Server Error
Server: Cowboy
Date: Tue, 23 Nov 2021 17:57:53 GMT
Connection: keep-alive
Content-Type: text/html; charset=UTF-8
X-Request-Id: 8422c019-f412-4e98-b0b7-da6b1c41e4b5
X-Runtime: 10.196246
Strict-Transport-Security: max-age=631138519
X-Frame-Options: ALLOW-FROM https://docs.github.com
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: origin-when-cross-origin, strict-origin-when-cross-origin
Content-Security-Policy: default-src 'none'; base-uri 'self'; block-all-mixed-content; connect-src 'self'; font-src 'self' data:; frame-ancestors docs.github.com; img-src 'self' data:; script-src 'self'; style-src 'self' 'unsafe-inline'
Content-Length: 9362
Via: 1.1 vegur

Here are the request ID headers from a few more runs, all 500s:

X-Request-Id: 1d500be5-ceff-4203-a1de-0c1074094f4a
X-Request-Id: f217b1f4-72d6-4c5e-8a85-e96ad4e44679
X-Request-Id: 19e2c5f9-3bfa-4dbd-a665-ed034cb070ac

Lmk if there’s anything else I can grab for you to help!

Hey Matt,

Unfortunately, none of those three new request IDs came back in either our general logging or our exception-catching tool. Nor was I able to reproduce this behavior with either my GitHub Staff account or my non-staff user account.

Do you happen to be aware of any other users running into the same?

One thing that was missing from your screenshots is the Network tab, which should show us which URL is kicking out that 500, or whether the request is even reaching the site before producing the error.

I can escalate this once we have something a bit more tangible. For now though, still nothing in logs, and still unable to reproduce =(

Hey!

I’m still able to reproduce this on my account, both in Firefox and Chrome using that API topic query.

Here are scrubbed HARs of the failing request in both browsers: chrome.har · GitHub

I isolated just the failing request that occurs when I press the run button in the explorer; let me know if full HARs of the whole page load plus pressing run are needed.

Hopefully these give you a bit more concrete info on the query request that is failing.

Matt.

Just to note, I was also able to reproduce the same failure using an alt account in incognito mode, running the same query: firefox-alt.har · GitHub

Heya,

I was able to reproduce the 500 response Matt describes above and decided to check the response cURL gets too, so here’s that.

curl -H "Authorization: bearer TOKEN" -d "{\"query\": \"{topic(name:\\\"api\\\"){repositories(first:1,sponsorableOnly:true){nodes{nameWithOwner}pageInfo{hasNextPage,endCursor}}}}\"}" https://api.github.com/graphql -v
[...]
> POST /graphql HTTP/1.1
> Host: api.github.com
> User-Agent: curl/7.55.1
> Accept: */*
> Authorization: bearer TOKEN
> Content-Length: 131
> Content-Type: application/x-www-form-urlencoded
>
[...]
< HTTP/1.1 502 Bad Gateway
< Server: GitHub.com
< Date: Thu, 09 Dec 2021 08:06:35 GMT
< Content-Type: application/json
< Transfer-Encoding: chunked
< Access-Control-Expose-Headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset
< Access-Control-Allow-Origin: *
< Vary: Accept-Encoding, Accept, X-Requested-With
< X-GitHub-Request-Id: CD52:28F2:44D2FD4:4635053:61B1B900
<
{
   "data": null,
   "errors":[
      {
         "message":"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `CD52:28F2:44D2FD4:4635053:61B1B900` when reporting this issue."
      }
   ]
}
[...]

Riley.

Hey Riley and Matt!

Thanks so much Riley; that request ID did turn up a captured error in our logging. I’m still so confused why Matt’s haven’t =(

Either way, I appreciate the additional info!

Interestingly, this specific request ID, CD52:28F2:44D2FD4:4635053:61B1B900, was caught as a timeout, similar to other conversations Matt and I have had in this thread.

Since then, work has been done to help us better capture details during timeouts for GQL calls, and thankfully we can see that in this example.

Unlike in that conversation about heavy GQL calls, this call alone would presumably not be considered very heavy. From what I can see in our logs, a single MySQL query accounts for 5.87s of the time spent, which seems actionable.
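
In the meantime, a possible client-side workaround might be to drop sponsorableOnly and filter on whether the repository owner has a sponsors listing. This is untested and I’m going from memory on the schema, but if I recall correctly both User and Organization expose hasSponsorsListing, so a query along these lines ought to return enough to filter on:

query {
  topic(name: "api") {
    repositories(first: 50) {
      nodes {
        nameWithOwner
        owner {
          ... on User {
            hasSponsorsListing
          }
          ... on Organization {
            hasSponsorsListing
          }
        }
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}

You lose the server-side filtering, so you’d have to paginate and discard the non-sponsorable owners yourself, but it should sidestep the slow query path until we get a proper fix in.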

I’m going to elevate this to our dev team and hopefully we can get this sorted!

Thanks again for the details and for your patience :bow:


Out of passing curiosity, did you find out more?