Get old and new contents of a modified file in GitHub workflow

I have written a workflow that runs on the pull_request_target event. It processes all changed files in the pull request using PyGithub. My workflow file looks like this:

on:
  pull_request_target:
    branches: ["master", "dev"]

jobs:
  validate_files:
    name: validate files
    runs-on: ubuntu-latest
    steps:

      - name: Checking out repository
        uses: actions/checkout@v2 
        with:
          ref: refs/pull/${{ github.event.pull_request.number }}/merge

      - name: run validation
        run: python3 validate.py

I’ve removed the irrelevant parts of this workflow file, such as passing environment variables and secrets to the validate.py script.

Inside the validate.py script, I want to access all changed files of a pull request. If a file was modified, I want to get its old content and its new content. Assume that I only do this for JSON files. I achieve it like this:

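import json
import os
from typing import Any, Dict, List

from github import Github
from github.File import File
from github.PaginatedList import PaginatedList
from github.PullRequest import PullRequest
from github.Repository import Repository


class Validator():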
    def __init__(self):
        self.repo_name: str = str(os.environ['REPO_NAME'])
        self.pr_num: int = int(os.environ['PULL_NUMBER'])
        self.github_token: str = str(os.environ['GITHUB_TOKEN'])
        self.github = Github(login_or_token=self.github_token)
        self.repo: Repository = self.github.get_repo(full_name_or_id=self.repo_name)
        self.pull_request_obj: PullRequest = self._get_pull_request_obj()
        self.changed_files: PaginatedList[File] = self._get_changed_files()

    def parse_files(self):
        # A PaginatedList is iterable and fetches further pages on demand.
        # totalCount is the number of files, not the number of pages.
        for _file in self.changed_files:
            file_path = _file.filename
            file_status: str = _file.status

            if file_status == "modified":
                # New content is read from the checked-out working tree.
                new_content : Dict[str, Dict[str, Any]] = self._get_json_from_file_path(file_path)
                # Old content is requested through the GitHub contents API.
                old_content : Dict[str, Dict[str, Any]] = json.loads(self.repo.get_contents(file_path).decoded_content.decode())
                self._validate_json_data(new_content, file_path)
                self._validate_file_modification(old_content, new_content)

However, when this script runs, I get a "Not Found" error on the line old_content : Dict[str, Dict[str, Any]] = json.loads(self.repo.get_contents(file_path).decoded_content.decode()). Is there an alternative solution? I want to get both the old and the new contents of a file in a GitHub workflow triggered by a pull_request_target event.

The error message looks like this:

github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/reference/repos#get-repository-content"}
Error: Process completed with exit code 1.
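
One possible cause, offered as an assumption rather than a diagnosis: when get_contents is called without a ref, the contents API reads from the repository's default branch. If the pull request targets another branch (for example dev) and the file does not exist on the default branch, the request returns 404. A token without access to the repository can produce the same response. A minimal sketch that asks for the file at the pull request's base commit instead is shown below. The helper name load_json_at_ref is hypothetical, and repo is assumed to behave like a PyGithub Repository object.

```python
import json
from typing import Any, Dict


def load_json_at_ref(repo: Any, file_path: str, ref: str) -> Dict[str, Any]:
    """Fetch file_path at the given commit SHA or branch and parse it as JSON.

    `repo` is assumed to behave like a PyGithub Repository, i.e. it offers
    get_contents(path, ref=...) returning an object with decoded_content.
    """
    content_file = repo.get_contents(file_path, ref=ref)
    return json.loads(content_file.decoded_content.decode())
```

Inside parse_files this could be called as load_json_at_ref(self.repo, file_path, self.pull_request_obj.base.sha), so that the old content comes from the commit the pull request is based on rather than from the default branch.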

By default, actions/checkout fetches only a single commit for the ref it checks out (fetch-depth: 1). You can set the fetch-depth parameter to 0 to fetch the full history.
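
With the history available locally, the old version of a file can be read from git instead of the API. The following is only a sketch under stated assumptions: the helper names read_json_at_revision and _run_git are hypothetical, git is assumed to be on the runner's PATH, and the script is assumed to run inside the checked-out repository.

```python
import json
import subprocess
from typing import Any, Callable, Dict, List


def _run_git(args: List[str]) -> str:
    """Run a git command in the current checkout and return its stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


def read_json_at_revision(
    revision: str,
    file_path: str,
    run_git: Callable[[List[str]], str] = _run_git,
) -> Dict[str, Any]:
    """Return the parsed JSON of file_path as stored in the given revision."""
    # "git show <rev>:<path>" prints the blob stored at that path in <rev>.
    return json.loads(run_git(["show", f"{revision}:{file_path}"]))
```

If the checked-out commit is the pull request's merge commit, as with the refs/pull/.../merge ref in the workflow above, its first parent should be the tip of the base branch, so read_json_at_revision("HEAD^1", file_path) would give the old content. That relies on the parent commit actually having been fetched, which is why the fetch depth matters.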

Your code as is invites this:

Be sure to read it and the followup:
