> Sometimes, the same token is used and then, does not work for few minutes and works again

Still sounds like DB replication lag to me :) If there are multiple replicas in a region, and one is lagging more than the others, it may just be random luck which replica is queried for any given auth check.

For context, the GitHub eng blog had a recent postmortem post indicating that they've been actively moving queries off of an overloaded master db onto replicas. Subsequently there was another outage in late April, just a couple of days before this error started coming up. Maybe unrelated, I'm just speculating here. But I could certainly understand if GitHub staff can't comment on it yet, if this is something they're still actively working on, e.g. the ongoing sharding efforts mentioned in the postmortem post.
I'm having the exact same issue in my GitHub App https://github.com/apps/skeema-io, which is written in Golang. For the past few weeks, a portion of my git clone calls (using x-access-token exactly like you) have been randomly failing with `remote: Invalid username or password.\nAuthentication failed for ...`

According to my logs, the problem started on the night of April 27 and has become more frequent over time, especially this past week.

This is a guess, but so far I believe the root cause is an internal technical issue on GitHub's side, specifically either database replication lag or cache inconsistency. My suspicion is that if you create a new access token and then immediately use it to clone a repo, the token is sometimes being checked against a db/cache that is lagging, i.e. the INSERT corresponding to the access token's row has not yet replicated to the db/cache that is being queried to perform the auth check.

Today I added the following work-arounds to my application, and this seems to have solved the problem so far:

- After creating a new access token, sleep for a few hundred ms
- If cloning a repo fails with "Invalid username or password", sleep a couple seconds and then retry with the exact same token

The idea is to just give the new access token time to replicate.

One additional mitigation measure that I'd suggest:

- Cache access tokens in memory (or redis/memcached) for an hour, keyed by installation ID. Allow subsequent clones for the same installation ID to re-use the previous token, rather than fetching a brand new token each time.

The idea with this one is to reduce the number of new access tokens you need, reducing the frequency of the entire situation.

Hope this helps! I'm surprised more people aren't hitting this!
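In case it's useful, here is a rough Go sketch of the three mitigations above (settle delay after minting a token, retry on the auth error with the same token, and an in-memory token cache keyed by installation ID). It is not the actual code from my app. The names `TokenCache`, `NewTokenCache` and `CloneWithRetry` are made up for illustration, and the `fetch` and `clone` callbacks are placeholders for however you request an installation token and run `git clone`.

```go
package main

import (
	"errors"
	"strings"
	"sync"
	"time"
)

var errNoFetcher = errors.New("token cache: no fetch function configured")

// cachedToken pairs a token with the time after which we stop reusing it.
type cachedToken struct {
	value   string
	expires time.Time
}

// TokenCache (hypothetical helper) reuses access tokens per installation ID.
type TokenCache struct {
	mu     sync.Mutex
	tokens map[int64]cachedToken
	ttl    time.Duration // how long a token is reused
	settle time.Duration // pause after minting a new token
	fetch  func(installationID int64) (string, error)
}

// NewTokenCache builds a cache; fetch is the caller's own token request.
func NewTokenCache(ttl, settle time.Duration, fetch func(int64) (string, error)) *TokenCache {
	return &TokenCache{tokens: map[int64]cachedToken{}, ttl: ttl, settle: settle, fetch: fetch}
}

// Get returns a cached token if one is still fresh, otherwise fetches a new
// one and sleeps briefly so the token has time to replicate before first use.
// The lock is held throughout for simplicity, which serializes all callers.
func (c *TokenCache) Get(installationID int64) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if t, ok := c.tokens[installationID]; ok && time.Now().Before(t.expires) {
		return t.value, nil
	}
	if c.fetch == nil {
		return "", errNoFetcher
	}
	tok, err := c.fetch(installationID)
	if err != nil {
		return "", err
	}
	time.Sleep(c.settle)
	c.tokens[installationID] = cachedToken{value: tok, expires: time.Now().Add(c.ttl)}
	return tok, nil
}

// isAuthFailure reports whether a failed clone printed the auth error
// discussed in this thread. Matching on git's output text is an assumption.
func isAuthFailure(output string, err error) bool {
	return err != nil && strings.Contains(output, "Invalid username or password")
}

// CloneWithRetry calls clone up to attempts times with the SAME token,
// sleeping wait between tries, and retries only on the auth failure.
// clone is expected to return git's combined output and any error.
func CloneWithRetry(token string, attempts int, wait time.Duration,
	clone func(token string) (string, error)) (string, error) {
	if attempts < 1 {
		attempts = 1
	}
	var out string
	var err error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(wait)
		}
		out, err = clone(token)
		if !isAuthFailure(out, err) {
			return out, err
		}
	}
	return out, err
}
```

Something like `NewTokenCache(50*time.Minute, 300*time.Millisecond, fetch)` together with `CloneWithRetry(tok, 3, 2*time.Second, clone)` would roughly match the numbers above. The 50 minutes is my assumption: installation tokens expire after an hour, so reusing one for slightly less than that avoids handing out a token that is about to expire.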