Very slow queuing behavior when idle runners are available

I have 256 self-hosted runners, and a workflow that uses a 4^4 matrix:

    strategy:
      matrix:
        ix1: [ 0, 1, 2, 3 ]
        ix2: [ 0, 1, 2, 3 ]
        ix3: [ 0, 1, 2, 3 ]
        ix4: [ 0, 1, 2, 3 ]
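
For context, here is a minimal sketch of how that matrix sits inside a complete workflow. The file name, workflow name, trigger, and the `echo` step are placeholders I've assumed for illustration; my real build step is not shown here.

```yaml
# Hypothetical file: .github/workflows/matrix-build.yml
name: matrix-build        # assumed name
on: push                  # assumed trigger
jobs:
  build:
    runs-on: self-hosted  # jobs go to the self-hosted runner pool
    strategy:
      matrix:
        ix1: [ 0, 1, 2, 3 ]
        ix2: [ 0, 1, 2, 3 ]
        ix3: [ 0, 1, 2, 3 ]
        ix4: [ 0, 1, 2, 3 ]
    steps:
      # Placeholder step standing in for the real build command
      - run: echo "build ${{ matrix.ix1 }} ${{ matrix.ix2 }} ${{ matrix.ix3 }} ${{ matrix.ix4 }}"
```

The four axes expand to 4 × 4 × 4 × 4 = 256 jobs, one per runner. As far as I know, 256 is also the maximum number of jobs a single matrix can generate per workflow run.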

When no other builds are running (all my runners are idle), I’m seeing a long delay from GitHub Actions before my builds even start.

The UI shows the “X queued checks” count rising at a rate of around 4 per second (i.e., “4 queued checks”, “8 queued checks”, etc.) until it finally reaches 256 queued checks. It takes a full 1min 40sec before the first of my runners even receives a message and starts building.

I don’t even understand why there’s any queuing going on at all – shouldn’t the builds start (almost) immediately? What’s actually queuing here? All of the runners are idle and ready to receive builds.

Is there anything I can do to speed up these builds? Each build itself takes much less time than it takes to queue the checks!

FWIW, creating 4 separate workflows, each with a 4^3 matrix:

    strategy:
      matrix:
        ix1: [ 0, 1, 2, 3 ]
        ix2: [ 0, 1, 2, 3 ]
        ix3: [ 0, 1, 2, 3 ]

Results in a faster queuing time – it comes down from 1min 40sec to around 55 secs before the first runner receives a message – even though it’s the same number of checks (256). So I suspect there’s some sort of throttling happening at the matrix level, or similar.
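
To make the split concrete, here is a rough sketch of what one of the four workflow files could look like. The file name, the workflow name, the trigger, the `IX4` variable used to carry the fixed fourth index, and the `echo` step are all assumptions for illustration, not my exact setup.

```yaml
# Hypothetical file: .github/workflows/build-ix4-0.yml
# One of four near-identical copies; only the IX4 value differs (0, 1, 2, 3).
name: build-ix4-0         # assumed name
on: push                  # assumed trigger
env:
  IX4: 0                  # assumed way to pin the fourth index in this file
jobs:
  build:
    runs-on: self-hosted
    strategy:
      matrix:
        ix1: [ 0, 1, 2, 3 ]
        ix2: [ 0, 1, 2, 3 ]
        ix3: [ 0, 1, 2, 3 ]
    steps:
      # Placeholder step standing in for the real build command
      - run: echo "build ${{ matrix.ix1 }} ${{ matrix.ix2 }} ${{ matrix.ix3 }} ${{ env.IX4 }}"
```

Each file expands to 4 × 4 × 4 = 64 jobs, so the four files together still produce 256 checks.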

Again, this is still quite slow, and it still doesn’t make sense that there’s any queuing at all given that all runners are idle.

Hi @mhart,

Thank you for being here!

Based on your description, I recommend raising a feedback ticket at the link below, where a GitHub product manager will review it:
https://support.github.com/contact/feedback?contact[category]=actions

Alternatively, you can raise an issue here for the self-hosted runner, and the dev team will check and confirm.

Thanks

I have raised a feedback ticket.

I’m not sure that raising an issue against the self-hosted runner repo is appropriate. It’s not a problem with the self-hosted runners themselves, because they don’t even receive the message to build until well after the workflow starts, and they respond very quickly once the builds have finished.

It appears instead to be a problem with the backend GitHub Actions infrastructure.