[Possible bug]: 1m job timeout triggered immediately


Occasionally, a quick-running job with a 1-minute GHA timeout times out well before the duration set by the timeout-minutes value.

Steps to reproduce

I’ve got a job like this (as part of a larger workflow):

    runs-on: ubuntu-latest
    timeout-minutes: 1
    needs:
      - wait-for-clear-queue
    outputs:
      URL: ${{ steps.ssm-params.outputs.URL }}
      POOL_ID: ${{ steps.ssm-params.outputs.POOL_ID }}
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.DEV_KEY }}
          aws-secret-access-key: ${{ secrets.DEV_SECRET }}
          aws-region: eu-west-1
      - name: Get Params From SSM
        id: ssm-params
        run: |
          URL=$(aws ssm get-parameter \
          --name /irrelevant_for_this_report:${{ env.STAGE }} \
          --query "Parameter.Value" \
          || echo "https://example.com/not_populated")

          echo "::set-output name=URL::${URL}"

          POOL_ID=$(aws ssm get-parameter \
          --name /irrelevant_for_this_report2:${{ env.STAGE }} \
          --query "Parameter.Value" \
          || echo "POOLIDNOTPOPULATED")

          echo "::set-output name=POOL_ID::${POOL_ID}"

There are also 2 other quick-running jobs that run in parallel. This should give you more of an idea of what’s happening.

These run in 1-2s.

Expected result

These jobs should not time out, and the rest of the workflow should be executed.

Actual result

One of these jobs fails after 0 seconds with the timeout being hit. It’s intermittent though, as if the timeout fires when a UTC minute rolls over or something?

The elapsed time since the whole workflow started is 30-40s when this is triggered (I was watching it), so it isn’t just a frontend reporting issue.

Like I say, this is intermittent, but it’s happened a few times, hence I thought I should report it.

Here is one screenshot around the error. I wasn’t permitted to include > 1 screenshot in a post as I’m a new user.

Here is another screenshot around the error. I wasn’t permitted to include > 1 screenshot in a post as I’m a new user.

That sounds like a bug to me! I wonder if it happens when the job starts just before a full minute, when the minute number changes? If yes, setting the timeout to 2 minutes should avoid it. :thinking:
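One way to probe the minute-rollover guess might be a small repro workflow like the sketch below. Everything in it is an assumption for illustration: the workflow name, the job ids `align-to-minute` and `short-job`, and the choice of second 50 as the target. Runner start-up delay can't be controlled, so this only raises the odds that the second job starts close to a minute boundary; it says nothing about how the runner actually computes the timeout.

```yaml
# Hypothetical repro workflow; all names here are illustrative only.
name: timeout-rollover-repro
on: workflow_dispatch

jobs:
  align-to-minute:
    runs-on: ubuntu-latest
    steps:
      - name: Sleep until roughly second 50 of the current UTC minute
        run: |
          # 10# forces base 10 so "08" and "09" are not parsed as octal.
          now=$(( 10#$(date -u +%S) ))
          sleep $(( (110 - now) % 60 ))

  short-job:
    needs:
      - align-to-minute
    runs-on: ubuntu-latest
    timeout-minutes: 1
    steps:
      - name: Print the start time for comparison with the job log
        run: echo "started at $(date -u +%T)"
```

If `short-job` gets cancelled after a few seconds on some runs but not others, comparing the printed start times against the minute boundary would help confirm or rule out the theory.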

I’ve changed all my 1 minute jobs to 2 minutes, to hopefully avoid this going forwards.
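For anyone else hitting this, the workaround is a one-line change per job. A minimal sketch, assuming a job id of `get-params` (hypothetical; the original snippet doesn't show the job id) and a placeholder step standing in for the real ones:

```yaml
jobs:
  get-params:              # hypothetical job id, not shown in the original snippet
    runs-on: ubuntu-latest
    timeout-minutes: 2     # was 1; leaves headroom in case the timeout fires early
    needs:
      - wait-for-clear-queue
    steps:
      - name: Placeholder for the real steps
        run: echo "job body goes here"
```

This doesn't fix the underlying problem; it just makes an early timeout less likely to cut off a job that only needs a couple of seconds.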
