Check pushed file changes with git diff-tree in GitHub Actions

My goal is to fetch a list of files that were modified between 2 commits (or in 1 commit) using the git diff-tree plumbing command, and I want to do this inside a GitHub Actions workflow on the ubuntu-latest runtime.

The problem is that the git diff-tree stdout / stderr never appears on screen, and I can’t pipe it to a file. I tried running the command in a step run block, in a Python script, and inside a private Docker container action, all to no avail. Another post describes a solution, but it did not work for me. What’s interesting is that I can see the output of other git commands like git --version or even git help diff-tree, but not the output of git diff-tree ... itself.

My actions workflow configuration is basic:

  • ubuntu-latest runtime
  • push events only
  • private repository

Is GitHub preventing me from using the git diff-tree command? If so, where is this documented? Has anyone else encountered this issue?

Here is a snippet of the bash script I want to execute.

...

tmpfile=$(mktemp)

start=$(jq --raw-output '.commits[0].id' "$GITHUB_EVENT_PATH")
end=$(jq --raw-output '.commits[-1].id' "$GITHUB_EVENT_PATH")

# The spaces inside [ ] are required; without them bash looks for a command named ["..."
if [ "$start" = "$end" ]; then
    git diff-tree --no-commit-id --name-only -r "$start" >"$tmpfile"
else
    git diff-tree --no-commit-id --name-only -r "$start" "$end" >"$tmpfile"
fi

...

Here is an example of what I see in the workflow console.


Please remove or reset the ‘fetch-depth’ setting for ‘actions/checkout’; the output of ‘git diff-tree’ then appears as expected.

You can get/pipe the changed files with the command below:

      - name: get changed files
        id: getfile
        run: |
          echo "::set-output name=files::$(git diff-tree --no-commit-id --name-only -r ${{ github.sha }} | xargs)"

      - name: echo output
        run: |
          echo ${{ steps.getfile.outputs.files }}


Thank you for the quick response and making the extra effort of looking inside one of my repositories. A fetch-depth of 1 is definitely the issue here. Increasing it solved my issue.
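
For anyone wondering why the output is empty rather than an error: in a clone made with a depth of 1, the parent of the checked-out commit is not available locally, so git treats that commit like a root commit, and git diff-tree prints nothing for a root commit unless --root is passed. Below is a minimal sketch of a guard for this, assuming bash and git 2.15 or newer (for --is-shallow-repository). The function name changed_in_commit is hypothetical and does not come from the posts above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: list the files changed by a single commit, but fail
# with a warning when the clone is shallow and the commit's parent is missing.
changed_in_commit() {
    local sha="$1"

    # "true" means the clone was made with a limited depth (git 2.15+).
    if [ "$(git rev-parse --is-shallow-repository)" = "true" ] &&
       ! git rev-parse --verify --quiet "${sha}^" >/dev/null; then
        echo "warning: parent of $sha is not in this shallow clone" >&2
        return 1
    fi

    git diff-tree --no-commit-id --name-only -r "$sha"
}

# Possible use in a run step (GITHUB_SHA is set by the runner):
#   changed_in_commit "$GITHUB_SHA" > "$tmpfile"
```

With this guard, a depth-1 checkout fails the step with a visible warning instead of silently producing an empty file.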

Hi @weide-zhou, thanks for your response.

This, however, does not work for a workflow run where 3 commits were pushed; you will only get the changes from the latest commit.

Is there a way to get the combined changes of all the commits you pushed to the branch?


Solved with:

git diff --name-only ${{ github.event.before }} ${{ github.sha }}
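
One caveat with this approach: when a push creates a new branch, github.event.before is reported as forty zeros, which is not a real commit, so git diff fails on it. Here is a small sketch of a fallback, assuming bash. The helper name pick_diff_base and the origin/main fallback in the usage comment are hypothetical choices, not something from this thread.

```shell
#!/usr/bin/env bash

# The all-zero SHA that a push event reports as "before" for a newly created ref.
NULL_SHA="0000000000000000000000000000000000000000"

# Hypothetical helper: choose the commit or ref to diff against.
#   $1 = value of github.event.before
#   $2 = fallback ref to use when $1 is empty or all zeros
pick_diff_base() {
    local before="$1" fallback="$2"
    if [ -z "$before" ] || [ "$before" = "$NULL_SHA" ]; then
        printf '%s\n' "$fallback"
    else
        printf '%s\n' "$before"
    fi
}

# Possible use in a run step (assumes the fallback ref has been fetched):
#   base=$(pick_diff_base "${{ github.event.before }}" "origin/main")
#   git diff --name-only "$base" "$GITHUB_SHA"
```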

I am trying to compare against the target branch instead of the previous commit. Is there something that would work like this?

git diff --name-only ${{ github.base_ref}} ${{ github.sha }}

I want to always check the diff in absolute terms against the target branch, not just between commits. Thanks.

This almost works for me. I’m trying to find occurrences of a string. Maybe you can help.

echo "::set-output name=grepValue::$( git diff --name-only ${{ github.event.before }}..${{ github.sha }} | grep -c '^proxy' )"

This works fine locally but in the builds I get an error (with dots):

fatal: Invalid revision range 9190c0ff82b6b517961997f75090bf9128eea10d..6a5d620bd047d53f38bd5e2966ef63acb1d2dee6

And without dots:

fatal: bad object 603bf6e0d69a3680413eb48049bb1f9644069965

Can you send a link to the repo/YAML?

In case you’re still looking for an answer @shotor, or if someone else has the same issue: this “bad object” error happens because of the default config for actions/checkout@v2:

Only a single commit is fetched by default, for the ref/SHA that triggered the workflow. Set fetch-depth: 0 to fetch all history for all branches and tags.

This means that you need to do something like this:

# Checkout the source code
- name: 'Checkout source code'
  uses: actions/checkout@v2
  with:
    fetch-depth: 0

# Check for changes
- name: Check for changes
  id: checkForChanges
  run: |
    echo "::set-output name=grepValue::$( git diff --name-only ${{ github.event.before }} ${{ github.sha }} | grep -c '^proxy' )"

The previous answer (by @pozil) fetches the full repository, which may be slow on big repositories. My solution is to check out with depth 1 using actions/checkout and then fetch the commit to compare against manually. It also works for both pull requests and normal pushes to branches.

on: [push, pull_request]

jobs:
  job1:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Check for changes
        id: diff
        run: |
          if [ -n "$GITHUB_BASE_REF" ]; then
            # Pull Request
            git fetch origin $GITHUB_BASE_REF --depth=1
            export DIFF=$( git diff --name-only origin/$GITHUB_BASE_REF $GITHUB_SHA )
            echo "Diff between origin/$GITHUB_BASE_REF and $GITHUB_SHA"
          else
            # Push
            git fetch origin ${{ github.event.before }} --depth=1
            export DIFF=$( git diff --name-only ${{ github.event.before }} $GITHUB_SHA )
            echo "Diff between ${{ github.event.before }} and $GITHUB_SHA"
          fi
          echo "$DIFF"
          # Escape newlines (replace \n with %0A)
          echo "::set-output name=diff::$( echo "$DIFF" | sed ':a;N;$!ba;s/\n/%0A/g' )"
      - run: echo "${{ steps.diff.outputs.diff }}"

CPython uses a similar approach to run tests only if the changed files are not in the Docs directory (for pull requests). For pushes to master, the tests always run.
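
Note for later readers: GitHub has since deprecated the ::set-output workflow command in favor of appending to the file named by the GITHUB_OUTPUT environment variable. That file accepts multi-line values through a heredoc-style delimiter, so the %0A escaping above is not needed there. A minimal sketch, assuming bash; the helper name set_output and the delimiter format are illustrative choices.

```shell
#!/usr/bin/env bash

# Hypothetical helper: append a step output (single-line or multi-line) to the
# file named by $GITHUB_OUTPUT, using the "name<<DELIMITER" multi-line syntax.
#   $1 = output name
#   $2 = output value
set_output() {
    local name="$1" value="$2"
    # Assumption: a timestamp plus the shell PID will not occur in the value.
    local delimiter="EOF_$(date +%s)_$$"
    {
        printf '%s<<%s\n' "$name" "$delimiter"
        printf '%s\n' "$value"
        printf '%s\n' "$delimiter"
    } >> "$GITHUB_OUTPUT"
}

# Possible use in the step above, replacing the ::set-output line:
#   set_output diff "$DIFF"
```

Later steps would read the value the same way as before, through steps.diff.outputs.diff.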

Update:
In response to a Stack Overflow question, I wrote a script that detects the changed directories and generates a matrix for the next job.

name: Build
on: [push, pull_request]

jobs:

  generate-matrix:
    name: Generate matrix for build
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - uses: actions/checkout@v2
      - name: Check changed files
        id: diff
        run: |
          # See https://github.community/t/check-pushed-file-changes-with-git-diff-tree-in-github-actions/17220/10
          if [ -n "$GITHUB_BASE_REF" ]; then
            # Pull Request
            git fetch origin $GITHUB_BASE_REF --depth=1
            export DIFF=$( git diff --name-only origin/$GITHUB_BASE_REF $GITHUB_SHA )
            echo "Diff between origin/$GITHUB_BASE_REF and $GITHUB_SHA"
          else
            # Push
            git fetch origin ${{ github.event.before }} --depth=1
            export DIFF=$( git diff --name-only ${{ github.event.before }} $GITHUB_SHA )
            echo "Diff between ${{ github.event.before }} and $GITHUB_SHA"
          fi
          echo "$DIFF"
          # Escape newlines (replace \n with %0A)
          echo "::set-output name=diff::$( echo "$DIFF" | sed ':a;N;$!ba;s/\n/%0A/g' )"
      - name: Set matrix for build
        id: set-matrix
        run: |
          # See https://stackoverflow.com/a/62953566/11948346
          DIFF="${{ steps.diff.outputs.diff }}"
          JSON="{\"include\":["

          # Loop by lines
          while read -r path; do
            # Set $directory to substring before /
            directory="$( echo "$path" | cut -d'/' -f1 -s )"

            if [ -z "$directory" ]; then
              continue # Exclude root directory
            elif [ "$directory" == docs ]; then
              continue # Exclude docs directory
            elif [[ "$path" == *.rst ]]; then
              continue # Exclude *.rst files (pattern matching needs [[ ]])
            fi

            # Set $os. "ubuntu-latest" by default. If directory starts with windows, then "windows-latest"
            os="ubuntu-latest"
            if [[ "$directory" == windows* ]]; then
              os="windows-latest"
            fi

            # Add build to the matrix only if it is not already included
            JSONline="{\"directory\": \"$directory\", \"os\": \"$os\"},"
            if [[ "$JSON" != *"$JSONline"* ]]; then
              JSON="$JSON$JSONline"
            fi
          done <<< "$DIFF"

          # Remove last "," and add closing brackets
          if [[ $JSON == *, ]]; then
            JSON="${JSON%?}"
          fi
          JSON="$JSON]}"
          echo $JSON

          # Set output
          echo "::set-output name=matrix::$( echo "$JSON" )"

  build:
    name: Build "${{ matrix.directory }}" on ${{ matrix.os }}
    needs: generate-matrix
    strategy:
      matrix: ${{fromJson(needs.generate-matrix.outputs.matrix)}}
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v2
      - name: Build
        run: |
          cd ${{ matrix.directory }}
          echo "${{ matrix.directory }} ${{ matrix.os }}"
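
The matrix-building loop can also be tried locally, outside a workflow, by wrapping the same rules in a function that reads changed paths on stdin. This is only a sketch, assuming bash; build_matrix is a hypothetical name, and directory names are assumed not to contain characters that need JSON escaping (quotes, backslashes).

```shell
#!/usr/bin/env bash

# Hypothetical helper: read changed file paths (one per line) on stdin and
# print a matrix object of the form {"include":[...]} on stdout.
build_matrix() {
    local path directory os entry entries=""
    while IFS= read -r path; do
        # Paths without a slash live in the repository root; skip them.
        [[ "$path" == */* ]] || continue
        directory="${path%%/*}"

        # Same exclusions as the workflow: docs directory and *.rst files.
        [[ "$directory" == docs ]] && continue
        [[ "$path" == *.rst ]] && continue

        # Directories starting with "windows" build on windows-latest.
        os="ubuntu-latest"
        [[ "$directory" == windows* ]] && os="windows-latest"

        entry="{\"directory\": \"$directory\", \"os\": \"$os\"}"
        # Skip directories that are already listed.
        [[ "$entries" == *"$entry"* ]] && continue
        entries="${entries:+$entries,}$entry"
    done
    printf '{"include":[%s]}\n' "$entries"
}

# Possible local check against the last commit:
#   git diff --name-only HEAD~1 HEAD | build_matrix
```

Joining entries with a comma only when the list is non-empty avoids the separate step of stripping a trailing comma.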