Get the sha of the target branch on workflow run

This is part of the script we used on GitLab to get the list of migration files we want to run:

  echo 'BRANCH_NAME='"$BRANCH_NAME"
  # The previous latest commit present on a branch before a merge request.
  # Only populated when there is a merge request associated with the pipeline. 
  echo 'CI_COMMIT_BEFORE_SHA='"$CI_COMMIT_BEFORE_SHA"
  # The commit revision for which project is built 
  echo 'CI_COMMIT_SHA='"$CI_COMMIT_SHA"

  MIGRATIONS=( $(git diff --name-only $CI_COMMIT_BEFORE_SHA $CI_COMMIT_SHA | grep -E 'migrations/[0-9]{4}-[0-9]{2}-[0-9]{2}-.*?\.sql') );

This gave us a list such as:

migrations/2020-05-30-add_deleted_at_to_constraint.sql
migrations/2020-06-04-add_location_id_constraint.sql

At the moment I can’t find a way to retrieve the list of files changed between the target branch and the source branch during a workflow run.

My colleague wrote a response in this thread, could you give their suggestion a try?


I tried "Check pushed file changes with git diff-tree in GitHub Actions" from the same thread, because I need the list of changed files between multiple commits:

release commit (most recent commit)
migration-3 commit
migration-2 commit
new-feature-2 commit
bug-fix-1 commit
migration-1 commit
new-feature-1 commit
previous release commit (a commit from earlier)

So, in this case, I’ll need the 3 SQL files from migration-1, 2, and 3.

I’m obviously doing something wrong because I’m getting this message from the action:

That log looks like you are trying to analyze the repository without first running actions/checkout, so the repository is not available. And since you want to look at the history, you might want to use its fetch-depth parameter, too.


Thanks, I was definitely missing checkout here. My current workflow looks like this:

# Github CI workflow
name: Github CI

on:
  push:
    paths-ignore:
    - '**/*.md'

env:
  CI_FLAG: "true"

jobs:
  build:
    runs-on: self-hosted
    container: my-container
    steps:
    - uses: actions/checkout@v2
      with:
        fetch-depth: 0

    - name: get changed files
      id: getfile
      run: |
        echo "::set-output name=files::$(git diff-tree --no-commit-id --name-only -r ${{ github.sha }} | xargs)"

    - name: echo output
      run: |
        echo ${{ steps.getfile.outputs.files }}

but the output is similar:

EDIT
I upgraded to git 2.27 on the instance that hosts github-runner.

@airtower-luna @github-support I made some progress.
Downgrading to checkout@v1 worked:

What am I missing here? Why doesn’t it work with v2? :thinking:

I dug into the v2 repo. Is it possible that when I first ran the runner my host had no git installed, checkout@v2 downloaded git 2.17, and even though I installed 2.27 later, checkout still works with the old git?

Switching back to v2 after v1 reports the same error as before.

Hi @akoskm,

All of your steps are executed in the container ‘my-container’ on the self-hosted runner. Is git initialized in the container? Typically this error means the .git directory was not found, or there is a lack of permission.
If you remove the container from the workflow yaml, does it succeed with action v2?

Thanks.

Edit

I ended up removing the runner, because switching between checkout v1 and checkout v2 bricked my work folder. I also removed the container section.

After reinstalling the runner, checkout v1 was still thinking that I’m running v1, switching to v2 detected git 2.27 properly.

In the meantime, I manually deleted everything from my work folder _work.

Now I can run git commands during the workflow.

@airtower-luna unfortunately, checkout in combination with fetch-depth still returns only the files changed in the current commit:

while the list of changed files between the target branch and the current commit is the following:

image

Original post below


All of your steps are executed in container ‘my-container’ on self-hosted runner. Is the git initialized in the container?

No, git (2.27) is initialized on the machine that hosts the self-hosted runner.

When I remove the container I get other errors:

Deleting the contents of '/home/ubuntu/actions-runner/_work/momentum/momentum'
##[error]Command failed: rm -rf "/home/ubuntu/actions-runner/_work/momentum/momentum/.github"
rm: cannot remove '/home/ubuntu/actions-runner/_work/momentum/momentum/.github/workflows/main.yml': Permission denied
rm: cannot remove '/home/ubuntu/actions-runner/_work/momentum/momentum/.github/ISSUE_TEMPLATE/migration.md': Permission denied
rm: cannot remove '/home/ubuntu/actions-runner/_work/momentum/momentum/.github/ISSUE_TEMPLATE/new-client.md': Permission denied

Would it help to uninstall the runner from the machine that hosts the self-hosted runner and reinstall it?

That makes sense: when given one commit ID, diff-tree shows the objects changed in that commit. You’ll have to use the version with two <tree-ish> parameters; see the documentation.
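To make the difference concrete, here is a minimal sketch that builds a throwaway repository and runs diff-tree both ways. The file names and the `commit` helper are made up for illustration; only the `git diff-tree` invocations matter.

```shell
#!/bin/sh
set -eu

# Throwaway repository with three commits, one new file per commit.
repo=$(mktemp -d)
cd "$repo"
git init -q

commit() {
  # Hypothetical helper: create one file and commit it.
  echo "-- $1" > "$1"
  git add "$1"
  git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q -m "add $1"
}

commit a.sql
first=$(git rev-parse HEAD)
commit b.sql
commit c.sql
last=$(git rev-parse HEAD)

# One commit ID: only the files changed in that commit.
single=$(git diff-tree --no-commit-id --name-only -r "$last" | xargs)
echo "single: $single"   # prints "single: c.sql"

# Two tree-ish arguments: everything that differs between the two trees.
range=$(git diff-tree --name-only -r "$first" "$last" | xargs)
echo "range: $range"     # prints "range: b.sql c.sql"
```

In a workflow, `$first` and `$last` would be replaced by the two commits you actually want to compare.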

The next problem will be getting the commit ID from before the push. From looking at the context documentation I think github.event.before might work, if I understand the push event payload description correctly, but I haven’t tried that.

Because I want to run the migrations only on branch develop, I realized I can get the hash of that branch with:

git rev-parse origin/develop

So the environment variable $CI_COMMIT_BEFORE_SHA, which was ready to use in GitLab CI but is missing from GitHub, can be set like this if you hardcode it to develop:

CI_COMMIT_BEFORE_SHA=$(git rev-parse origin/develop)

$CI_COMMIT_BEFORE_SHA is available in GitHub runner as well and those are the only two hashes I needed for my old script.

Thanks @github-support, @airtower-luna, @weide-zhou. :bowing_man:


Disregard my previous answer, this doesn’t work.

The workflow runs after I merge the PR into development.

git rev-parse origin/develop and $CI_COMMIT_SHA will be the same.

So again, is there any way to know what the SHA of the branch was before the merge happened?
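A small sketch of why the two values coincide, using a throwaway local repository (a local `develop` branch stands in for `origin/develop`, and the `g` helper and file names are made up). It also shows that, assuming the PR was merged with a real merge commit rather than a squash or rebase, the first parent `HEAD^1` is the branch tip from before the merge. That is only an illustration under that assumption, not something confirmed in this thread.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper: run git with a throwaway identity.
g() { git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

echo base > README
git add README
g commit -q -m "base"
git checkout -q -B develop
before_merge=$(git rev-parse develop)

# Feature branch that adds one migration file.
git checkout -q -b feature
mkdir migrations
echo "-- demo" > migrations/2020-06-04-demo.sql
git add migrations
g commit -q -m "migration"

# Merge into develop with an explicit merge commit.
git checkout -q develop
g merge -q --no-ff -m "merge feature" feature

# After the merge the branch tip and the current commit are the same,
# so diffing them yields nothing.
same_diff=$(git diff --name-only "$(git rev-parse develop)" "$(git rev-parse HEAD)")

# The first parent of the merge commit is the tip from before the merge.
first_parent=$(git rev-parse HEAD^1)
changed=$(git diff --name-only HEAD^1 HEAD)
echo "changed: $changed"
```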

Would it be possible to somehow use:

on:
  pull_request:
    types: [closed]

to retrieve the list of changed files in a PR? Then I could filter those and grab the ones ending with .sql.

This way I wouldn’t rely on SHAs.

FYI I tried the solution mentioned in "Get list of files on pull request merge" and it didn’t work:


Run URL="https://api.github.com/repos/${GITHUB_REPOSITORY}/pulls/76/files"
jq: error (at <stdin>:4): Cannot index string with string "filename"
##[error]Process completed with exit code 5.

Hi @akoskm,

Please remove the jq code from your command first and make sure the REST API (curl) command can get the pull request files content. Then follow the structure of the response to add the jq code that filters the filename.

According to the jq error, there should be an error somewhere in your script. I confirmed the sample code works fine on my side.

BTW, it’s recommended to check the github context to get the sha…you wanted.

      - name: Dump GitHub context
        env:
          GITHUB_CONTEXT: ${{ toJson(github) }}
        run: echo "$GITHUB_CONTEXT"

Thanks.


You’re right, I just made the following test:

while in context I have:

"event": {
    "after": "85f578ec7e036f138b09c93ad1f252da4b7d10a3",
    "base_ref": null,
    "before": "4679d7e9916dcc9a1605d4b12a8b0a86f9c40f51",

This is clearly what I need. Now I just have to find a way to read context.event.before/after (I think it’s just ${{ github.event.before }}) and pass those SHAs to my script.
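As a rough sketch of the script side, the function below takes the two SHAs as arguments and applies the same file-name filter as the GitLab script. `list_migrations` and `save` are hypothetical names, and the repository is a throwaway one built for the demo. In the workflow the two arguments would come from ${{ github.event.before }} and ${{ github.event.after }}. The guard assumes that a push creating a new branch reports an all-zero "before" SHA, which cannot be diffed.

```shell
#!/bin/sh
set -eu

# list_migrations BEFORE AFTER (hypothetical helper, not the original script)
list_migrations() {
  # Refuse an empty or all-zero "before" SHA: there is nothing to diff against.
  case $1 in
    *[!0]*) ;;
    *) echo "no previous commit to compare against" >&2; return 1 ;;
  esac
  # Same pattern as the GitLab script, anchored to the whole path.
  git diff --name-only "$1" "$2" |
    grep -E '^migrations/[0-9]{4}-[0-9]{2}-[0-9]{2}-.*\.sql$' || true
}

# Throwaway repository standing in for the real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q

save() {
  # Hypothetical helper: write one file and commit it.
  mkdir -p "$(dirname "$1")"
  echo "-- $1" > "$1"
  git add "$1"
  git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q -m "add $1"
}

save README.txt                                              # previous release
before=$(git rev-parse HEAD)
save migrations/2020-05-30-add_deleted_at_to_constraint.sql  # migration-1
save src/feature.txt                                         # new-feature
save migrations/2020-06-04-add_location_id_constraint.sql    # migration-2
after=$(git rev-parse HEAD)

migrations=$(list_migrations "$before" "$after")
echo "$migrations"
```

Only the two migration files are printed; the unrelated feature file is filtered out even though it changed in the same range.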

Is there any difference between these:

git diff --name-only ${{ github.event.before }} $GITHUB_SHA
git diff --name-only ${{ github.event.before }} HEAD
git diff --name-only ${{ github.event.before }} ${{ github.event.after }}

If I understand correctly all 3 would point to the latest commit in the branch.


Hi @akoskm,

The three commands can give different results, depending on which event you use.

If you use the push event, GITHUB_SHA, HEAD and ${{ github.event.after }} all point to the latest commit, so they have the same value.

If you use the pull request event, they can differ:
$GITHUB_SHA: the last merge commit on the GITHUB_REF branch, where GITHUB_REF is the fake merge branch refs/pull/:prNumber/merge.

HEAD: same as GITHUB_SHA.

${{ github.event.after }}: points to the latest commit on the compared (source) branch, not the fake merge branch.

A different sha will produce a different file list, so please make sure you use the correct sha for the git diff command.

Hope this is clear and helpful!

Thanks @weide-zhou,

is there a reason why context values set as environment variables are not being forwarded to my script? The syntax seems to be correct, because the secret is visible in the script:

    - name: Run migrations
      shell: bash
      run: ./scripts/gitlab-migration.sh "$INT_DB_URL" "$CI_COMMIT_BEFORE_SHA"
      env:
        INT_DB_URL: ${{secrets.INT_DB_URL}}
        CI_COMMIT_BEFORE_SHA: ${{github.events.before}}

I can also see the values when I dump github context:


Hi @akoskm,

Please use ${{ github.event.before }}; there is no s in event.


Thanks @weide-zhou for catching that typo. :sweat_smile:

Works as expected:

image
