Are there any recommendations for deploying auto-scaling self-hosted runners? Right now, our 2 runners live on 2 EC2 instances and sit idle most of the time. Right before releases, we tend to queue up 10-15 jobs, which creates a bottleneck for developers. We are using AWS under the hood, and I’d like to find a solution that auto-scales based on load. Some of the ideas I was thinking about exploring were:
setting up a Docker image that scales the number of active instances based on load (I’m not sure if this will work with the token setup)
running multiple runner processes on the same EC2 instance, so that we don’t need an instance per runner
exploring whether a serverless (AWS Lambda) approach could spin up runners only when they are requested
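To make "based on load" concrete, this is roughly the scaling rule I have in mind, as a minimal Python sketch. The function name, its parameters, and the default cap of 15 are all my own placeholders (the cap just mirrors our 10-15 job peak), and it deliberately leaves out how the queue depth would be read or how instances would actually be started and stopped:

```python
def desired_runner_count(queued_jobs: int, busy_runners: int,
                         min_runners: int = 0, max_runners: int = 15) -> int:
    """Return how many runners should exist for the current load.

    Hypothetical helper: one runner per queued job plus the runners
    already working, clamped to the range [min_runners, max_runners].
    """
    if queued_jobs < 0 or busy_runners < 0:
        raise ValueError("queued_jobs and busy_runners must be non-negative")
    if min_runners > max_runners:
        raise ValueError("min_runners must not exceed max_runners")

    # Every queued job needs a free runner; busy runners must be kept alive.
    wanted = queued_jobs + busy_runners

    # Clamp so we never drop below the floor or exceed the cap.
    return max(min_runners, min(wanted, max_runners))


if __name__ == "__main__":
    # Quiet period: nothing queued, nothing running.
    print(desired_runner_count(queued_jobs=0, busy_runners=0))
    # Pre-release rush: 12 jobs waiting while our 2 runners are busy.
    print(desired_runner_count(queued_jobs=12, busy_runners=2))
```

With `min_runners=0` this would scale to zero when idle, which is the behaviour I’m hoping for, but I don’t know yet whether runner registration allows that.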
Has anyone else attempted to solve this, and if so what worked/didn’t work?