Hosted runners not picking up jobs

We have jobs waiting for hours for a hosted runner and then dying 5 or so hours later.
The message in the logs is:

2021-07-22T19:06:58.3407198Z Can't find any online and idle self-hosted or hosted runner in the current repository, account/organization that matches the required labels: 'ubuntu-18.04'
2021-07-22T19:06:58.3407336Z Found online and busy hosted runner(s) in the current repository's organization account that matches the required labels: 'ubuntu-18.04'. Hit concurrency limits on the hosted runners. Waiting for one of them to get assigned for this job.
2021-07-22T19:06:58.3407366Z Waiting for a hosted runner in 'organization' to pick this job...

A hint might be the warning about excessive artifact usage, but what does that even mean?
Any ideas of what’s going on here?


Same here: Waiting for a hosted runner in 'organization' to pick this job.
I have multiple idle runners all waiting for some jobs :frowning:

Having this issue now on v6.5.0 - fix broadening of ranges, node v16 support · bevry/editions@70d636c · GitHub

Any ETA on when the queue empties? Are we talking hours, or days?

Same problem

Can't find any online and idle self-hosted or hosted runner in the current repository, account/organization that matches the required labels: 'ubuntu-latest'
Waiting for a self-hosted or a hosted runner to pickup this job...

Manually cancelling the workflow, then re-running it seems to do the trick.

I’ve also tried to supplement the GitHub-hosted runners with self-hosted runners, as the queue message suggests, but am having no luck.
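
In case it’s useful to anyone else debugging the self-hosted side, here is a rough sketch (ORG, TOKEN and REQUIRED_LABEL are placeholders, not anything from my real setup) that asks the REST API for the organization’s registered self-hosted runners and reports whether each one is online, idle, and carries the label the job asks for:

```python
# Sketch: list an organization's self-hosted runners and check which ones are
# online, idle, and carry a given label. ORG, TOKEN and REQUIRED_LABEL are
# placeholders; substitute your own values. Assumes a PAT with admin:org scope.
import requests

ORG = "my-org"                  # hypothetical org name
TOKEN = "ghp_..."               # personal access token (assumption: admin:org scope)
REQUIRED_LABEL = "self-hosted"  # the label your job's runs-on asks for

resp = requests.get(
    f"https://api.github.com/orgs/{ORG}/actions/runners",
    headers={
        "Authorization": f"token {TOKEN}",
        "Accept": "application/vnd.github.v3+json",
    },
    params={"per_page": 100},
)
resp.raise_for_status()

for runner in resp.json()["runners"]:
    labels = [label["name"] for label in runner["labels"]]
    idle = runner["status"] == "online" and not runner["busy"]
    usable = idle and REQUIRED_LABEL in labels
    print(f'{runner["name"]}: status={runner["status"]} busy={runner["busy"]} '
          f'labels={labels} -> usable for {REQUIRED_LABEL!r}: {usable}')
```

In my case the runners all show up as online and idle with the right labels, which is what makes the queueing behaviour so confusing.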

Same here

Same here: fix: properly handle delayed vehicles when inserting dropoffs · matsim-org/matsim-libs@d4d0fb4 · GitHub

Manually cancelling does not fix it. Same problem

A duplicate “us too” post was made over here: ArduPilot actions all getting stuck - #2 by peterbarker

I made a dupe post here, and have since received a response from GH support:

Taking a look at the logs on our end, I see Actions was hitting the organization’s concurrency limit of 60 concurrent runs.

This limit is for the entire organization and not repository-specific. In this case, I suspect that since 60 workflow runs were currently running, the remainder of the jobs remain queued. They will remain queued until concurrency frees up. The caveat is that if the job is queued for more than 24 hours, it is automatically cancelled.

I wasn’t aware of the 60-job limit. They must have only recently started enforcing it; I just never noticed the waiting jobs before.
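
For anyone who wants to sanity-check how close their organization is to that limit, here is a rough sketch (ORG and TOKEN are placeholders, and it naively walks every repo in the org rather than using any single org-wide endpoint) that counts the workflow runs currently in progress:

```python
# Sketch: roughly gauge how many workflow runs are currently in progress
# across an organization, to compare against the 60-run limit support quoted.
# ORG and TOKEN are placeholders; pagination is kept minimal for brevity.
import requests

ORG = "my-org"     # hypothetical org name
TOKEN = "ghp_..."  # personal access token (assumption: repo scope)
HEADERS = {
    "Authorization": f"token {TOKEN}",
    "Accept": "application/vnd.github.v3+json",
}

total_in_progress = 0
page = 1
while True:
    repos = requests.get(
        f"https://api.github.com/orgs/{ORG}/repos",
        headers=HEADERS,
        params={"per_page": 100, "page": page},
    ).json()
    if not repos:
        break
    for repo in repos:
        runs = requests.get(
            f"https://api.github.com/repos/{ORG}/{repo['name']}/actions/runs",
            headers=HEADERS,
            params={"status": "in_progress", "per_page": 1},
        ).json()
        total_in_progress += runs.get("total_count", 0)
    page += 1

print(f"{total_in_progress} workflow runs in progress (limit reported as 60)")
```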

They will remain queued until concurrency frees up.

I don’t observe this to be the case. Once a job hits this state it never recovers, even when no other actions are running in my Organization.


In my experience it always used to work how GitHub’s response describes: jobs beyond 60 would queue until concurrency freed up.

Something broke in roughly the past week and what @salexander6 describes has been my experience too since then: once a job queues it now never seems to recover. It will just time out after 24 hours despite the first 60 jobs being done long ago and nothing else currently running.

I’ve been able to unblock things by manually cancelling all queued jobs and immediately re-running, and as long as you do no more than 60 at a time, they will reliably complete.
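
For what it’s worth, that workaround can also be scripted against the REST API. A rough sketch (OWNER, REPO and TOKEN are placeholders, the 60-run batch size just mirrors the limit support quoted, and error handling is intentionally thin):

```python
# Sketch of the manual workaround above: cancel queued workflow runs in a
# repository and re-run them, no more than BATCH at a time.
# OWNER, REPO and TOKEN are placeholders.
import time
import requests

OWNER, REPO = "my-org", "my-repo"  # hypothetical values
TOKEN = "ghp_..."                  # personal access token (assumption: repo scope)
BATCH = 60                         # mirrors the org concurrency limit quoted by support
HEADERS = {
    "Authorization": f"token {TOKEN}",
    "Accept": "application/vnd.github.v3+json",
}
API = f"https://api.github.com/repos/{OWNER}/{REPO}/actions/runs"

queued = requests.get(
    API, headers=HEADERS, params={"status": "queued", "per_page": 100}
).json().get("workflow_runs", [])

for run in queued[:BATCH]:
    run_id = run["id"]
    requests.post(f"{API}/{run_id}/cancel", headers=HEADERS)
    # The rerun endpoint only accepts runs that have finished, so give the
    # cancellation a moment to settle before asking for a re-run.
    time.sleep(10)
    requests.post(f"{API}/{run_id}/rerun", headers=HEADERS)
    print(f"re-queued run {run_id}")
```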


This is working for me now. No changes on my end, so I assume the bug has been fixed :man_shrugging:
