PowerShell steps fail nondeterministically

I’m writing to report a behavior I think I’ve confirmed:

  1. If you use PowerShell in a GitHub Actions step and any of your code writes to stderr, your builds may fail
  2. This behavior is nondeterministic
  3. This behavior cannot be disabled

For evidence of this phenomenon, see this pull request. I have stopped developing on it so it can be used in this discussion.

On that PR, I have two builds which use a PowerShell script to test and build an R package on Windows. One build uses MINGW, and one uses MSVC. The relevant config section looks like this:

- name: Setup and run tests on Windows
  if: matrix.os == 'windows-latest'
  shell: pwsh -NonInteractive -ExecutionPolicy Bypass -Command "& '{0}'"
  env:
    COMPILER: "${{ matrix.compiler }}"
    GITHUB_ACTIONS: "true"
    TASK: "${{ matrix.task }}"
  run: |
    $env:BUILD_SOURCESDIRECTORY = $env:GITHUB_WORKSPACE
    conda init powershell
    & "$env:GITHUB_WORKSPACE/.ci/test_windows.ps1"

That script has been running successfully in Windows environments through Azure DevOps and AppVeyor for months. It calls utilities like Rscript.exe, which write benign messages to stderr. I believe those stderr writes are causing failures on GitHub Actions, which I never observed locally or on other continuous integration platforms, because of the behavior documented in “Workflow syntax for GitHub Actions” (the forum’s rules do not allow me to post a link to those docs :frowning:):

Fail-fast behavior when possible. For pwsh and powershell built-in shell, we will prepend $ErrorActionPreference = 'stop' to script contents.

I have found that either the advice on that page for coding around that behavior is inaccurate, or I have misunderstood it:

Users can always opt out by not using the built-in shell, and providing a custom shell option like: pwsh -File {0} , or powershell -Command "& '{0}'" , depending on need.

I have tried many interpretations of that advice, and other tricks like using -ErrorAction Continue or explicitly setting $ErrorActionPreference = 'Continue'…with no success.

To test my theory, in the pull request linked above I pushed a commit that diverts the output of two commands that were causing failures to $null with something like Rscript -e "..." > $null. I then pushed 10 consecutive empty commits to trigger 10 re-runs of the workflow.

The results were surprising:

Given all of this, I’ve concluded that:

  1. If you use PowerShell in a GitHub Actions step and any of your code writes to stderr, your builds may fail
  2. This behavior is nondeterministic
  3. This behavior cannot be disabled

I really hope that I’m wrong about this. Has anyone else experienced this? Is there a workaround I haven’t considered, or something fundamental that I’ve misunderstood?

Thank you in advance for your time!

Hi @jameslamb,

Thank you for reaching out! I can reproduce the same behavior on my side. I have raised an internal ticket for confirmation and will update here once there’s a response.

Thanks


If you have a small repro, the PowerShell team may also be able to offer debugging suggestions.

The runner is marking the step as failed because the pwsh/powershell exit code is 1.

Regarding the fail-fast behavior, here is the code where the runner wraps your script.

To override the wrapping behavior, you can override the shell command. For example, shell: 'pwsh -command ". {0}"'. Here are the default formats the runner uses.
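As a rough sketch, the override quoted above might sit in a workflow step like the one below. The step name is hypothetical and the command is borrowed from the reproduction in this thread. This only illustrates the syntax; the original post reports that custom shell strings did not stop the failures, so treat it as something to experiment with, not a confirmed fix.

```yaml
steps:
  - name: Run with a custom pwsh invocation  # hypothetical step name
    # A custom shell string replaces the runner's default argument format;
    # {0} is substituted with the path of the temporary script file.
    shell: 'pwsh -command ". {0}"'
    run: |
      cmd.exe /c "echo hello 1>&2"
```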

I was able to reproduce the issue with this simple build:

on: push
jobs:
  build:
    runs-on: windows-latest
    steps:
      - run: |
          $PSVersionTable
      - run: |
          cmd.exe /c "echo hello 1>&2"
          cmd.exe /c "echo world 1>&2"
          cmd.exe /c "echo asdf 1>&2"
          cmd.exe /c "echo aaaa 1>&2"
          cmd.exe /c "echo bbbb 1>&2"

I suspect this issue may have regressed. PSVersion shows 7.0.1 :frowning:


Thank you @ericsciple! Great timing, I was just about to sit down and write a smaller reproducible example.

For my project, I’ll just keep my Windows build on AppVeyor / Azure DevOps and maybe try this again in a few weeks.

To be honest, I’m not that familiar with the PowerShell community…would it be beneficial for me to post a link to this discussion on the issue you linked? Or will the maintainers there just say “that’s a GitHub Actions issue, talk to them”?

no worries, i’ll play around with the sample a bit and re-open the powershell issue if it’s the same thing

filed issue https://github.com/PowerShell/PowerShell/issues/12823

Thank you for posting the issue! Does your example still result in build failures if you override shell: to code around the fail-fast behavior?

If not, I’ll still try to put together an example showing that the workaround is not effective.

I added $erroractionpreference = 'continue' to the top of my script, and the script does not halt execution and does not report any error. Depends on your scenario I guess, but commonly errors should halt the script and cause pwsh to exit 1.
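A sketch of that change applied to the reproduction build above, assuming the line goes at the top of the `run` script. The explicit `$LASTEXITCODE` check is an addition that does not come from this thread; it is shown as one possible way to still fail the step when a native command really does return a non-zero exit code.

```yaml
on: push
jobs:
  build:
    runs-on: windows-latest
    steps:
      - run: |
          # Set the preference explicitly so stderr output does not halt the script.
          $ErrorActionPreference = 'continue'
          cmd.exe /c "echo hello 1>&2"
          # Assumption, not from the thread: fail explicitly on a real non-zero exit code.
          if ($LASTEXITCODE -ne 0) { exit $LASTEXITCODE }
```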

The problem in this case is that stderr output from a child process should not be interpreted as an error. Tools commonly write progress and similar messages to stderr. The behavior is also inconsistent: the first two times it is not treated as an error, and the third time it causes an error.


Hi @jameslamb,

It looks like ericsciple from GitHub staff has already replied here. Please follow those comments and check the PowerShell issue for updates.

Thanks.