Graceful job termination

When a job is cancelled, the running software may be holding resources that need to be released. To make this robust, GitHub Actions should send signals to the running process so that the application can tear down its tasks properly. A gentle terminator would first send SIGINT and wait a few seconds; if the app still hasn't exited, it would send SIGTERM, and finally SIGKILL to force termination.
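
To make the proposal concrete, here is a minimal Python sketch of such a gentle terminator. It only illustrates the escalation I have in mind and is not how GitHub Actions works today; the function name `terminate_gracefully` and the `wait_seconds` default are my own assumptions, and it is POSIX-only because it relies on SIGKILL.

```python
import signal
import subprocess


def terminate_gracefully(proc, wait_seconds=5.0):
    """Escalate SIGINT -> SIGTERM -> SIGKILL on a subprocess.Popen object.

    Returns the name of the signal after which the process exited,
    or None if it had already exited before any signal was sent.
    """
    if proc.poll() is not None:
        return None  # nothing to do, the process is already gone

    # Polite requests first: give the app a chance to clean up.
    for sig in (signal.SIGINT, signal.SIGTERM):
        proc.send_signal(sig)
        try:
            proc.wait(timeout=wait_seconds)
            return sig.name
        except subprocess.TimeoutExpired:
            continue  # still alive, escalate to the next signal

    # Last resort: SIGKILL cannot be caught or ignored.
    proc.kill()
    proc.wait()
    return signal.SIGKILL.name
```

For example, `terminate_gracefully(subprocess.Popen(["sleep", "60"]))` should return `"SIGINT"`, since `sleep` does not handle that signal and exits on it.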

GitHub probably already handles this in some way, but I couldn't find any documentation on the subject. It would be good to document how it currently behaves, so that it's easier to propose changes if needed.

Here is a nice document about the same issue in Jenkins: https://gist.github.com/datagrok/dfe9604cb907523f4a2f

@jupe,

Thanks for your feedback.
I have created an internal ticket to pass this question on to the appropriate engineering team for further discussion and evaluation. I will notify you when they have an update; the engineers may also reply to you directly here.

@jupe,

According to the engineering team, this is what happens after a user clicks “Cancel workflow”:

  • The server will re-evaluate the job-level if condition on all running jobs.

  • If the job condition is always(), it will not get canceled.

  • For the rest of the jobs that need cancellation, the server will send a cancellation message to all the runners.

  • Each runner has 5 minutes to finish the cancellation process before the server force-terminates the job.

  • The runner will re-evaluate the if condition on the currently running step.

  • If the step condition is always(), it will not get canceled.

  • Otherwise, the runner will send Ctrl-C to the action's entry process (node for a JavaScript action, docker for a container action, and bash/cmd/pwsh for a run step). If the process doesn’t exit within 7500 ms, the runner will send Ctrl-Break and wait another 2500 ms for the process to exit. If the process is still running after that, the runner will terminate the process tree.

  • The runner will then try to run as many of the following steps whose condition is set to always() as it can within the 5-minute cancellation timeout.
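
On the application side, the sequence above means a process has roughly 7500 ms + 2500 ms to shut down after the first interrupt. Below is a minimal Python sketch of a program that cooperates with this. It assumes that on Linux and macOS runners Ctrl-C arrives as SIGINT and Ctrl-Break as SIGTERM, which is not stated above; the names `run`, `request_stop` and `cleanup` are hypothetical.

```python
import signal

stop_requested = False


def request_stop(signum, frame):
    """Signal handler: only record the request, do the real work elsewhere."""
    global stop_requested
    stop_requested = True


def run(work_items, cleanup):
    """Process items until a stop is requested, then always run cleanup."""
    global stop_requested
    stop_requested = False
    # Assumption: the runner's Ctrl-C / Ctrl-Break map to these two signals.
    signal.signal(signal.SIGINT, request_stop)
    signal.signal(signal.SIGTERM, request_stop)

    done = []
    try:
        for item in work_items:
            if stop_requested:
                break  # stop between items so cleanup can start promptly
            done.append(item)
    finally:
        cleanup()  # release resources on both normal exit and cancellation
    return done
```

Keeping the handler this small and checking the flag between units of work makes it easier to finish the cleanup well inside the window before the process tree is terminated.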

I hope this helps you understand the current behavior better.