Pilot Lvl 1
Message 1 of 6

[GitHub App] Invalid access token

Hello all,

 

I have a GitHub App with a few thousand users (https://github.com/apps/code-inspector). I am currently facing one major problem: after we get the access token, we sometimes cannot check out the repository. Sometimes we get the error "the repository XXX does not exist", but the repository does exist, since later attempts succeed.

 

The token also seems to be invalid when I try to list the repositories: I get an authentication error.

 

If that helps, I am using PyGitHub to get the token and interact with the API.

 

Any idea where it could come from?

 

Thanks!

 

5 Replies
Pilot Lvl 1
Message 2 of 6

Re: [GitHub App] Invalid access token

Note that it does not seem related to the library I am using.

 

From time to time, when I try to clone the repository, I get the error `Invalid username or password.\nfatal: Authentication failed for ...`

 

I am using a command like this to clone the repository: `git clone https://x-access-token:<token-generated-from-github-app>@github.com/<full_name>.git`

 

If I clone from another machine, I have no problem. And the clone succeeds on the failing machine if it is retried a few minutes later.

 

Is there a mechanism on GitHub's side that blocks checking out a repository too often?

Pilot Lvl 1
Message 3 of 6

Re: [GitHub App] Invalid access token

Bumping this message to see if anybody (maybe GitHub staff?) can indicate whether something is wrong and/or whether clones are throttled.

 

Thanks!

Copilot Lvl 2
Message 4 of 6

Re: [GitHub App] Invalid access token

I'm having the exact same issue in my GitHub App https://github.com/apps/skeema-io , which is written in Golang. For the past few weeks, a portion of my git clone calls (using x-access-token exactly like you) are randomly failing with `remote: Invalid username or password.\nAuthentication failed for ...`

 

According to my logs, the problem started on the night of April 27 and has become more frequent over time, especially this past week.

 

This is a guess, but so far I believe the root cause is an internal technical issue on GitHub's side, specifically either database replication lag or cache inconsistency. My suspicion is that if you create a new access token and then immediately use it to clone a repo, the token is sometimes being checked against a db/cache that is lagging -- i.e. the INSERT corresponding to the access token's row has not yet replicated to the db/cache that is being queried to perform the auth check.

 

Today I added the following work-arounds to my application, and this seems to have solved the problem so far:

  • After creating a new access token, sleep for a few hundred ms
  • If cloning a repo fails with "Invalid username or password", sleep a couple seconds and then retry with the exact same token

 

The idea is to just give the new access token time to replicate.
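
For readers who want something concrete, here is a minimal Python sketch of the retry work-around described above. It is not the code from my app (which is in Go); the function name `clone_with_retry`, its parameters, and the default attempt count and delay are all hypothetical choices for illustration. It retries with the exact same token, and only when the failure looks like the auth error.

```python
import subprocess
import time

AUTH_ERROR = "Invalid username or password"


def clone_with_retry(full_name, token, dest, attempts=3, delay=2.0,
                     run=subprocess.run, sleep=time.sleep):
    """Clone a repo, retrying with the SAME token when auth fails.

    Hypothetical helper: attempts/delay are illustrative defaults, and
    run/sleep are injectable only so the logic can be tested offline.
    Returns the number of the attempt that succeeded.
    """
    url = f"https://x-access-token:{token}@github.com/{full_name}.git"
    for attempt in range(1, attempts + 1):
        result = run(["git", "clone", url, dest],
                     capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        # Give up on unrelated errors, or once the attempts are used up.
        if AUTH_ERROR not in result.stderr or attempt == attempts:
            # Mask the token in case it appears in git's output.
            message = result.stderr.replace(token, "***").strip()
            raise RuntimeError(f"git clone failed: {message}")
        # Give the token time to become visible before retrying.
        sleep(delay)
```

Errors other than the auth failure are raised immediately, so a repository that is genuinely missing is not retried.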

 

One additional mitigation measure that I'd suggest:

  • Cache access tokens in memory (or redis/memcached) for an hour, keyed by installation ID. Allow subsequent clones for the same installation ID to re-use the previous token, rather than fetching a brand new token each time.

 

The idea with this one is to reduce the number of new access tokens you need, reducing the frequency of the entire situation.
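
A rough in-memory version of that cache could look like the sketch below. Everything here is an assumption for illustration: the `TokenCache` class, the `fetch_token` callable (whatever function you already use to mint an installation token), and the defaults. The TTL is set a bit under an hour because installation tokens themselves expire after one hour, and the short sleep after minting a token corresponds to the "sleep for a few hundred ms" work-around.

```python
import time


class TokenCache:
    """Reuse installation tokens instead of minting one per clone.

    Hypothetical sketch: fetch_token(installation_id) -> str is supplied
    by the caller; clock/sleep are injectable so it can be tested.
    """

    def __init__(self, fetch_token, ttl=55 * 60, settle=0.3,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch_token
        self._ttl = ttl          # seconds a cached token is reused
        self._settle = settle    # pause after minting a new token
        self._clock = clock
        self._sleep = sleep
        self._tokens = {}        # installation_id -> (token, expires_at)

    def get(self, installation_id):
        now = self._clock()
        entry = self._tokens.get(installation_id)
        if entry is not None and now < entry[1]:
            return entry[0]      # still fresh: reuse the previous token
        token = self._fetch(installation_id)
        self._tokens[installation_id] = (token, now + self._ttl)
        # Brand-new token: wait briefly before its first use.
        self._sleep(self._settle)
        return token
```

With PyGithub, `fetch_token` could presumably wrap `GithubIntegration.get_access_token(installation_id).token`, but I haven't tested that pairing. This version is per-process only; sharing tokens across workers would need redis/memcached as mentioned above.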

 

Hope this helps! I'm surprised more people aren't hitting this!

Pilot Lvl 1
Message 5 of 6

Re: [GitHub App] Invalid access token

Thanks @evanelias for the details! I implemented a similar strategy (which caches tokens) and it still runs into problems. Sometimes the same token is reused, stops working for a few minutes, and then works again. I also implemented a throttling mechanism so that I do not check out the same repository more than once per minute.

 

However, this is becoming problematic. Can someone from GitHub staff provide some insight here?

Copilot Lvl 2
Message 6 of 6

Re: [GitHub App] Invalid access token

> Sometimes, the same token is used and then, does not work for few minutes and works again

 

Still sounds like DB replication lag to me :) If there are multiple replicas in a region, and one is lagging more than the others, it may just be random luck which replica is queried for any given auth check.

 

For context, the GitHub eng blog had a recent postmortem post indicating that they've been actively moving queries off of an overloaded master db onto replicas. And subsequently there was another outage in late April just a couple days before this error started coming up. Maybe unrelated, I'm just speculating here. But I could certainly understand if GitHub staff can't comment on it yet, if this is something they're still actively working on, e.g. the ongoing sharding efforts mentioned in the post-mortem post.