Querying all commits in a single repository

Hi :wave:

I’m trying to query all of the commits to a specified repository with the GraphQL API v4.

I only want to pull the dates they were committed at, in order to estimate the total time that was contributed to that repository (something along the lines of git-hours).

Here’s my initial query:

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          nodes {
            committedDate
          }
        }
      }
    }
  }
}

Naturally, it returns only the latest 100 commits, because the API limits each connection to 100 nodes.

However, since the total node limit is much higher (500,000), I should be able to query commits in groups of 100 until I have all of them.

For example, I was able to query the latest 200 commits using this query:

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        total: history {
          totalCount
        }
        first100: history(first: 100) {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
        second100: history(after: "700f17be6752a13a8ead86458e343d2d637ee3ee 99") {
          edges {
            cursor
            node {
              committedDate
            }
          }
        }
      }
    }
  }
}

However, I have to manually pass a cursor string to the after argument of the second connection.

How can I run this connection repeatedly until I have queried the committedDate of every commit in a repository?

Yes, you can query them in batches of 100 until you have all of them. You don't need to work out the cursor value that you placed in the second100 part of the query by hand: the original query can return it for you if you add pageInfo, like this:

{
  repository(owner: "facebook", name: "react") {
    object(expression: "master") {
      ... on Commit {
        history {
          nodes {
            committedDate
          }
          pageInfo {
            endCursor
          }
        }
      }
    }
  }
}

This results in a query response that ends like this:

...
            },
            {
              "committedDate": "2019-03-07T12:39:15Z"
            },
            {
              "committedDate": "2019-03-07T00:39:39Z"
            },
            {
              "committedDate": "2019-03-06T23:50:02Z"
            },
            {
              "committedDate": "2019-03-06T22:41:45Z"
            },
            {
              "committedDate": "2019-03-06T18:17:54Z"
            }
          ],
          "pageInfo": {
            "endCursor": "b93a8a9bb8460a3d582072d3b252ecc15c6ea0f5 99"
          }
        }
      }
    }
  }
}

The endCursor value is the one to pass as the after argument in the next query. Each response gives you a new endCursor, so you can repeat this until you have every commit.
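If you would rather not paste the cursor by hand, a script can drive the loop. Below is a minimal Python sketch of that idea, not an official client. The names fetch_commit_dates, make_github_runner and run_query, and the $cursor variable, are made up for illustration. It assumes a personal access token, the https://api.github.com/graphql endpoint, and that pageInfo also exposes hasNextPage so the loop knows when to stop.

```python
import json
import urllib.request

# Same query as above, with the cursor passed in as a GraphQL variable.
QUERY = """
query($owner: String!, $name: String!, $branch: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    object(expression: $branch) {
      ... on Commit {
        history(first: 100, after: $cursor) {
          nodes { committedDate }
          pageInfo { hasNextPage endCursor }
        }
      }
    }
  }
}
"""


def fetch_commit_dates(run_query, owner, name, branch="master"):
    """Collect committedDate for every commit reachable from `branch`.

    `run_query(query, variables)` is a hypothetical callable that sends the
    query and returns the "data" part of the JSON response as a dict.
    """
    dates = []
    cursor = None  # a null cursor asks for the first page
    while True:
        variables = {"owner": owner, "name": name,
                     "branch": branch, "cursor": cursor}
        history = run_query(QUERY, variables)["repository"]["object"]["history"]
        dates.extend(node["committedDate"] for node in history["nodes"])
        page_info = history["pageInfo"]
        if not page_info["hasNextPage"]:
            return dates
        # Feed endCursor back in as `after` for the next batch of 100.
        cursor = page_info["endCursor"]


def make_github_runner(token):
    """Build a run_query callable that POSTs to the GraphQL endpoint.

    Assumes `token` is a personal access token with read access.
    """
    def run_query(query, variables):
        body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
        request = urllib.request.Request(
            "https://api.github.com/graphql",
            data=body,
            headers={"Authorization": f"bearer {token}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
        if "errors" in payload:
            raise RuntimeError(payload["errors"])
        return payload["data"]
    return run_query
```

You would call it as fetch_commit_dates(make_github_runner(token), "facebook", "react"). It makes one request per 100 commits, so a large repository will take many requests and you may need to watch your rate limit.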

I hope that helps!
