One of the steps in my workflow is failing 25% of the time due to an external cause (GitHub Package Registry). Is there a built-in way to retry a few times before giving up?
Currently, GitHub Actions does not support automatically retrying failed steps. However, you can re-run a failed workflow by clicking the Re-run checks button. For now, you could try re-running the whole failed workflow.
That’s nice, but it doesn’t really scale well in the long term.
Right, so I found at least the cause of my failure: it turned out that the PR was coming from Dependabot, which doesn’t have access to my GPR repo 🤦.
@brightran Has support been added for automatically retrying a failed step?
As far as I know, not yet.
Support for this would be really useful for complex workflows, especially where calls to external systems are made that may not always be 100% reliable.
Real-world example use case: building and publishing a Docker image and reporting success or failure to some backend. Building a Docker image may include installing packages or downloading potentially big pieces of data. In my experience, building images of 1~3 GB has a failure rate of about 1~5 percent.
When the job fails, it’s very easy to just retry it, and that would completely solve the problem, as well as prevent reporting false “failure” results to the backend that keeps track of the built images.
Currently, we would have to duplicate 23 lines (the build and publish job) once for every retry we want.
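As a workaround for steps that run a plain command, the retry logic can live in a small wrapper script instead of being duplicated in the workflow. Below is a minimal sketch in Python, not a built-in GitHub Actions feature. The function name `run_with_retries`, the script name `retry.py`, and the default attempt count and delay are all assumptions made for illustration.

```python
import subprocess
import sys
import time


def run_with_retries(command, max_attempts=3, delay_seconds=5.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run `command` until it exits with code 0 or the attempts run out.

    `command` is a list such as ["docker", "build", "."].
    Returns the exit code of the last attempt.
    `runner` and `sleep` are injectable so the logic can be tested
    without starting real processes or waiting.
    """
    returncode = 1
    for attempt in range(1, max_attempts + 1):
        returncode = runner(command).returncode
        if returncode == 0:
            return 0
        print(f"attempt {attempt}/{max_attempts} failed "
              f"with exit code {returncode}", file=sys.stderr)
        if attempt < max_attempts:
            # Linear backoff: wait a little longer after each failure.
            sleep(delay_seconds * attempt)
    return returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage from a workflow `run:` step:
    #   python retry.py docker build -t my-image .
    sys.exit(run_with_retries(sys.argv[1:]))
```

The step then only fails if every attempt fails, so the backend would see a single result per job. This only helps for steps that run a command; it cannot wrap a step that is itself an action.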
A built-in way to retry failed steps would also be very useful to my team.
I have already found an action that allows you to retry a failed command: Retry Step (GitHub Marketplace).
However, the step which commonly fails for me is itself an action, so this approach is not viable (as I believe you cannot nest action calls?).
Specifically, our builds frequently fail when the Aqua Security Trivy (GitHub Marketplace) security vulnerability scanner action fails with a timeout. If we could just configure this action to retry 3 times before failing the job, preferably in a way that works consistently across all actions/steps, that would be ideal.