Get a repository's commits along with changed patches and the URL to changed files using GraphQL v4

I am able to get the list of commits (with fields like commit message, oid, commit URL, etc.) along with the number of changedFiles made in a repository on the master branch.
However, I am not able to figure out how to get any information about the changes themselves and the files that were changed.

In v3 of the REST API, the information about the changes was contained in files -> patch, and files -> raw_url or blob_url gave info about the original file itself at that stage.

Q) In v4 of GitHub’s API using GraphQL how do I get the corresponding information?

This is the query I am stuck with right now (showing only 1 commit for brevity) -

query {
  rateLimit{
    cost
    remaining
  }
  repository(owner: "elastic", name: "elasticsearch") {
    name
    defaultBranchRef {
      name
      target {
        ... on Commit {
          history(first:1){
            nodes{
              message
              changedFiles
              id
              oid
              treeUrl
              url
              tree{
                oid
              }
            }
            pageInfo{
              hasNextPage
              startCursor
              endCursor
            }
          }
        }
      }
    }
  }
}

Output:

{
  "data": {
    "rateLimit": {
      "cost": 1,
      "remaining": 4999
    },
    "repository": {
      "name": "elasticsearch",
      "defaultBranchRef": {
        "name": "master",
        "target": {
          "history": {
            "nodes": [
              {
                "message": "Small corrections to HLRC doc for _termvectors (#35221)\n\nRelates to #33447",
                "changedFiles": 2,
                "id": "MDY6Q29tbWl0NTA3Nzc1OmEyYzIyYWQ3YWViMGY4ZDUxNDg2NzdkZDcyMjJhZDQzYWZlZTlhMTc=",
                "oid": "a2c22ad7aeb0f8d5148677dd7222ad43afee9a17",
                "treeUrl": "https://github.com/elastic/elasticsearch/tree/a2c22ad7aeb0f8d5148677dd7222ad43afee9a17",
                "url": "https://github.com/elastic/elasticsearch/commit/a2c22ad7aeb0f8d5148677dd7222ad43afee9a17",
                "tree": {
                  "oid": "4f5f11e0e55aeafc4677800959232726a2cd787c"
                }
              }
            ],
            "pageInfo": {
              "hasNextPage": true,
              "startCursor": "a2c22ad7aeb0f8d5148677dd7222ad43afee9a17 0",
              "endCursor": "a2c22ad7aeb0f8d5148677dd7222ad43afee9a17 0"
            }
          }
        }
      }
    }
  }
}

I tried many ways to extract information about the patches, but I failed every time. I started learning GraphQL just a couple of days ago, so I don’t have any experience with it.

I was thinking that if there is a corresponding object for patch (the one that shows the changes in REST API v3) in the GraphQL API, then I might be able to reverse engineer my way through it; however, the docs have been very cryptic.

Any guidance would be much appreciated.

Hi @armsp

Thank you for being here! We don’t yet support that in the GraphQL API. There has been a schema request for it. I’ll update this post when we get new info about a ship.

Best,

Andrea

Hi Andrea,

should we still be using v3 for getting information about the changes in a commit? I think a comment on the v4 Commit object documentation to say that would be really helpful.

Thanks,

Mike

You are correct, @platy, and thank you for the feedback. I’ll pass it along to the documentation team. Thanks again!

Any news on this? I still don’t see it documented.

Thanks

Still looking for this as well.

Hi Andrea,

can’t the technical team include the information about file changes, i.e., the paths of added, modified, deleted, and renamed files, in GraphQL’s Commit object?

This could save a lot of network resources compared to getting the same information via the REST API endpoint "Get a single commit", which always sends patches over the network.

Thanks

Advertising GraphQL as offering greater flexibility, and then not offering the flexibility already available in v3, does not give me warm fuzzy feelings.

Is it safe to assume that v3 will continue to be supported at least until everything I can do in v3 is possible in v4?

Is it still the case that in v4 with GraphQL you cannot get a list of the files changed in a commit?

And if so, what’s the tl;dr on what you need to do using the v3 API to get around this limitation?

I am writing scripts to analyze commit behavior on student software engineering projects. I need to get a list of the files changed so that we can exclude certain commits that only touch documentation and not code.

+1 on needing this functionality. I need to get a list of file extensions per commit.
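For the two use cases above (skipping documentation-only commits and counting file extensions per commit), the logic is independent of where the list of changed paths comes from. The following is a minimal sketch, assuming the paths have already been obtained from some other source. The function names and the `DOC_EXTENSIONS` set are hypothetical and would need adjusting to a project's own conventions.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumption: which extensions count as "documentation" is project-specific.
DOC_EXTENSIONS = {".md", ".rst", ".txt", ".asciidoc", ".adoc"}


def extension_counts(paths):
    """Count changed files per (lower-cased) extension for one commit."""
    return Counter(PurePosixPath(p).suffix.lower() or "(none)" for p in paths)


def is_docs_only(paths, doc_extensions=DOC_EXTENSIONS):
    """Return True when the commit touched at least one file and all were docs."""
    return bool(paths) and all(
        PurePosixPath(p).suffix.lower() in doc_extensions for p in paths
    )
```

A commit for which `is_docs_only(...)` returns True could then be excluded from the analysis.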

:wave: @mljrg: These are great points to bring up and something our product team would be interested in hearing more about. The best place to share product feedback and feature requests is through our official product feedback form.


:wave: @wbuiejr: Thanks for bringing up that point. I can see how that wording can come across and there’s definitely an opportunity for us to review the current presentation; this is something that our product team is open to hearing about as well.

Feature parity between the GitHub REST API and the GraphQL API is something that the API team is aware of, though I personally can’t speak to if or when there will be an initiative to move that forward.

If it helps, we recently announced the GitHub public roadmap. This is the one place where anyone can learn about what features our team is working on, what stage they’re in, and when they’re expected to be released.

At this stage, we’re going to continue to support both the GitHub REST API and the GitHub GraphQL API.


:wave: @pconrad: The Commit object doesn’t expose a field that lists the files changed. However, it still exposes a changedFiles field indicating the number of changed files.

The REST API exposes an endpoint for fetching a single commit. The API responds with a JSON object with a files attribute listing all of the files changed. This endpoint also supports the diff and patch media types. Does this help?
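As a rough illustration of the REST approach described here, the sketch below requests a single commit and reduces its files array to the fields mentioned earlier in the thread (patch, raw_url, blob_url). It uses only the Python standard library. The helper names are hypothetical, and the token handling is an assumption about how you authenticate.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_commit_request(owner, repo, sha, token=None,
                         accept="application/vnd.github+json"):
    """Build a GET request for the REST 'get a commit' endpoint."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/commits/{sha}"
    headers = {"Accept": accept}
    if token:  # assumption: a personal access token supplied by the caller
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


def summarize_files(commit_json):
    """Keep only the per-file fields discussed in this thread."""
    return [
        {
            "filename": f["filename"],
            "status": f.get("status"),
            "raw_url": f.get("raw_url"),
            "blob_url": f.get("blob_url"),
            "patch": f.get("patch"),  # can be absent, e.g. for binary files
        }
        for f in commit_json.get("files", [])
    ]


def fetch_changed_files(owner, repo, sha, token=None):
    """Perform the network call and return the summarized file list."""
    request = build_commit_request(owner, repo, sha, token)
    with urllib.request.urlopen(request) as resp:
        return summarize_files(json.load(resp))
```

The oid values returned by the GraphQL history query can be passed as `sha`, so one option is to list commits with v4 and look up the files with v3. Passing `accept="application/vnd.github.diff"` to `build_commit_request` should return a raw diff rather than JSON, in which case the response body would be read as text instead of going through `summarize_files`.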

In addition to using the GitHub API to obtain this information, it might also be interesting to use Git directly. One approach is to clone the repository, periodically fetch new commits, and run a script that automatically computes the list of files changed per commit per author using a command like git diff-tree or git show (passing along the appropriate options).
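The Git-based approach could look roughly like the sketch below, which runs git diff-tree on a local clone and parses its --name-status output. The function names are hypothetical, the choice of options is one possibility among several, and it assumes git is installed and `repo_dir` points at an existing clone.

```python
import subprocess


def parse_name_status(output):
    """Parse `git diff-tree --name-status` output into (status, path) tuples."""
    changes = []
    for line in output.splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        # Renames/copies look like "R100<TAB>old<TAB>new"; keep the new path
        # and only the status letter.
        changes.append((parts[0][0], parts[-1]))
    return changes


def changed_files(repo_dir, sha):
    """List the files changed by one commit in a local clone."""
    cmd = [
        "git", "-C", repo_dir, "diff-tree",
        "--no-commit-id",  # do not print the commit id line
        "--name-status",   # status letter plus path, no patch text
        "-r",              # recurse into subdirectories
        "-M",              # detect renames
        "--root",          # also report files for the initial commit
        sha,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_name_status(result.stdout)
```

Merge commits may need extra options (for example -m) to produce any output, so it is worth checking how they should be counted before relying on this.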