Fetching truncated files

I’ve been trying to fetch a file (e.g. package.json) from each of our organisation’s repositories. It worked OK for package.json, but when we pick package-lock.json, the contents get truncated. I’m having trouble working out how to then fetch the full contents of the truncated file.

Here’s the query I’ve got so far. There are other issues with it (e.g. it only finds the first matching file per repo), but that’s less of a concern for me at the moment.

query {
  organization(login:"myorg") {
    repositories(first:100) {
      pageInfo {
        startCursor
        hasNextPage
        endCursor
      }
      nodes {
        name
        isArchived
        pushedAt
        packageLock: object(expression:"master:package-lock.json") {
          ... on Blob {
            isTruncated
            commitResourcePath
            byteSize
            id
            commitUrl
            isBinary
            text
          }
        }
      }
    }
  }
}

Is there another API endpoint I can call with one of the other properties in order to download the full content?

Hey @dsample!

This is a good question – I think you’ll have to drop down to the v3 API to get a link to the full contents of the file. Given what you’re looking at, I believe you should be able to easily build the URL yourself and use the same Authorization credentials to get what you need by using the Repo Contents API: https://developer.github.com/v3/repos/contents/#get-contents

Given what it looks like you’re doing, I think something like GET /repos/myorg/$REPO/contents/package-lock.json should get you a download_url, which you can fetch.
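
A minimal sketch of that two-step fetch, in Python with only the standard library. The helper names (contents_url, extract_download_url, fetch_full_file) are made up for illustration, the "token" Authorization scheme is an assumption about how you authenticate, and the optional ref parameter is there in case the default branch is not the one you want:

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def contents_url(owner, repo, path, ref=None):
    """Build the v3 Repo Contents URL for a file path in a repository."""
    # quote() keeps "/" intact, so nested paths like "tests/x.json" still work.
    quoted = urllib.parse.quote(path.strip("/"))
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents/{quoted}"
    if ref:
        url += "?" + urllib.parse.urlencode({"ref": ref})
    return url


def extract_download_url(contents_response):
    """Return download_url from a decoded contents response, or None."""
    return contents_response.get("download_url")


def fetch_full_file(owner, repo, path, token, ref=None):
    """Fetch the untruncated file body (performs two network calls)."""
    # Assumption: a personal access token sent with the "token" scheme.
    headers = {"Authorization": f"token {token}"}
    request = urllib.request.Request(
        contents_url(owner, repo, path, ref), headers=headers
    )
    with urllib.request.urlopen(request) as response:
        metadata = json.load(response)
    download_url = extract_download_url(metadata)
    if download_url is None:
        raise ValueError(f"no download_url returned for {path}")
    request = urllib.request.Request(download_url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Something like fetch_full_file("myorg", repo_name, "package-lock.json", token, ref="master") would then be called once for each repository whose GraphQL result reports isTruncated as true.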

This stinks! We should give you that downloadUrl right in the v4 API – let me take that back to the team. 

Thanks.

I thought I might end up having to use v3, but wasn’t sure which ID I’d need to use. I figured there might be a simple ‘get object’ endpoint on v3 which I could pass the OID to and either get the content or be redirected to it.
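
For what it’s worth, the v3 Git Blobs endpoint (GET /repos/:owner/:repo/git/blobs/:sha) looks like it is close to that ‘get object’ idea: it takes the blob’s SHA and returns the content base64-encoded. A rough sketch, assuming the oid field is added to the Blob selection in the query above; blob_url and decode_blob are hypothetical helper names, and the request itself is left out:

```python
import base64


def blob_url(owner, repo, oid):
    """Build the v3 Git Blobs URL for a blob OID taken from the GraphQL result."""
    return f"https://api.github.com/repos/{owner}/{repo}/git/blobs/{oid}"


def decode_blob(blob_response):
    """Decode the 'content' field of a decoded Git Blobs JSON response."""
    content = blob_response["content"]
    if blob_response.get("encoding") == "base64":
        # The API wraps base64 across lines; b64decode skips the newlines.
        return base64.b64decode(content)
    # Otherwise assume the content was returned as plain UTF-8 text.
    return content.encode("utf-8")
```

This avoids needing the file path at all, since the OID identifies the blob directly.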

It would be quite useful if the query could be extended easily to find all occurrences of the file within the repo too. The current query I’ve got finds one in the base directory, but if there are others (e.g. in a ‘tests’ folder) it would be good to be able to find those too. I think I can use ‘search’ instead, but that feels like quite a heavy-handed approach.

Hey @dsample!

Good to know - I believe Search would totally be the right answer for this problem, and it might be relatively easy to do given you can scope the query down to a specific Organization and filename. I don’t think it’s too heavy-handed, and it should be able to fit your needs quite well!
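
As a sketch of what that scoped search could look like against the v3 code search endpoint (GET /search/code), assuming the filename: and org: qualifiers behave as documented. The helper names are hypothetical and only the URL building and result parsing are shown:

```python
import urllib.parse


def code_search_url(org, filename, page=1, per_page=100):
    """Build a v3 code search URL scoped to one organization and filename."""
    query = f"filename:{filename} org:{org}"
    params = urllib.parse.urlencode(
        {"q": query, "page": page, "per_page": per_page}
    )
    return f"https://api.github.com/search/code?{params}"


def file_locations(search_response):
    """Extract (repository full name, file path) pairs from a search result."""
    return [
        (item["repository"]["full_name"], item["path"])
        for item in search_response.get("items", [])
    ]
```

Each returned path includes its directory (for example tests/package-lock.json), so it can be dropped straight into a /repos/:owner/:repo/contents/:path request.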

It’s been a while, but I’m trying this again. Was there any traction on the downloadUrl property? I’m still finding it hard to find the path to the files in order to use the v3 API to retrieve the file. How do I use the commitResourcePath to actually get the file path?

The example you gave was /repos/myorg/$REPO/contents/package-lock.json, but if the file is in another directory (or there are several within the repo), how should I be getting them?

I found the platform-samples repo. Is this a question for there, to get a real-world example of how to achieve this? Doing it completely with the v3 API takes a massive number of API calls.

A year and a half later, @nickvanw: do you know if the downloadUrl property is on the backlog of the GraphQL team, or if there’s even an ETA? I really don’t want to fall back to v3 just for that because, well, “This stinks!”.