GraphQL - abbreviatedOid vs git 'abbreviated commit hash'

For commits, git may return an abbreviated hash longer than 7 characters, which seems to happen with repos that have a large number of commits. Example: rails/rails

It appears that GraphQL always returns seven characters.  Could these be aligned?

Can you give a sample query to reproduce the problem you’re seeing? This would make it much simpler to ensure that I’m seeing the same thing you are.

Explorer query:

{
  repository(name: "rails", owner: "rails") {
    name,
    owner {
      login
    },
    ref(qualifiedName: "master") {
      name,
      target {
        ... on Commit {
          history(first: 3) {
            edges {
              node {
                messageHeadline
                oid
                abbreviatedOid
                committedDate
              }
            }
          }
        }
      }
    }
  }
}

GraphQL returns (abbreviated):

"node": {
  "messageHeadline": "Re-organize `init_internals`",
  "oid": "2c0729d8cb13100ea576337ebb7703320203c548",
  "abbreviatedOid": "2c0729d",
  "committedDate": "2019-04-23T20:14:25Z"
}

git log command (on a local clone):

git log -n1 --format="%h %ci"

Git command returns:

2c0729d8cb 2019-04-24 05:14:25 +0900

Note the difference between abbreviatedOid (7 characters) and the git log %h string (10 characters)…

Thanks for the illustrative example!

I’m not sure they can be aligned. First, the short version of the hash can differ depending on which version of git you’re using, so we can’t guarantee that the hash we return will match whatever version of git you’re using locally. Second, even if we pinned to a “known good” version of git, guaranteeing that they always align would essentially require us to generate that abbreviatedOid with a git call (or a git library call, which amounts to the same thing) for every commit on every call to a query like that. Generating O(n) overhead like that is a quick way to kill a database.
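To make the cost concrete, here is a minimal sketch of what "shortest unambiguous abbreviation" means. It is not git's or GitHub's actual implementation: the function name `shortest_unique_prefix_len` is hypothetical, the second OID is invented for illustration, and real git checks against every object in the repository (not just commits) and, in newer versions, also scales the starting length with the number of objects.

```python
def shortest_unique_prefix_len(target, all_oids, minimum=7):
    """Return the shortest prefix length (at least `minimum`) at which
    `target` no longer shares a prefix with any other OID in `all_oids`."""
    others = [oid for oid in all_oids if oid != target]
    for length in range(minimum, len(target) + 1):
        prefix = target[:length]
        # Each candidate length needs a scan of the other OIDs, which is
        # why the answer depends on the contents of the whole repository.
        if not any(oid.startswith(prefix) for oid in others):
            return length
    return len(target)


# The first OID is the commit from this thread. The second is invented
# (an assumption) so that it shares the first 9 characters.
real_oid = "2c0729d8cb13100ea576337ebb7703320203c548"
fake_oid = "2c0729d8c" + "0" * 31

alone = shortest_unique_prefix_len(real_oid, [real_oid])
crowded = shortest_unique_prefix_len(real_oid, [real_oid, fake_oid])
print(alone, crowded)
```

With only one OID the minimum of 7 is enough, but a single near-collision pushes the required length up, so the result can change whenever new objects are added.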

However, given that git repositories have been growing quite large as git itself has been in use for a longer and longer period, we may want to investigate updating our algorithm. I’ll take your feedback to the developers, but I can’t guarantee when or even if this will get addressed.

Thanks very much for the feedback.

Thanks for taking the time to review the example.  GraphQL vs v3 is messy, as I kind of view GraphQL as SQL with no (or a very limited) ‘where’ clause.

My question is related to the code I’m using to update a Ruby doc site on GitHub.io:

https://msp-greg.github.io/

Several of the repos are using abbreviated SHAs of 8 characters, with Ruby & Rails having 10.

With the current setup, I’m using v3 and git commands, but I need to test/benchmark using GraphQL and possibly hitting Rubygems.org for some of the info…

Abbreviated hashes, no matter what length is used, are not future-proof: as a repository grows, collisions become more likely with anything shorter than the full hash. You should only use the abbreviated hash for the text of a link and use the full hash in the URL.

You can also abbreviate the full hash yourself to whatever arbitrary length you choose.
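As a rough sketch of that advice, the snippet below slices the full `oid` returned by GraphQL to a chosen length for display and keeps the full hash in the URL. The helper names `abbreviate` and `commit_link`, the default length of 10, and the Markdown link output are assumptions made for illustration.

```python
def abbreviate(oid, length=10):
    """Return the first `length` characters of a full commit hash."""
    if not 1 <= length <= len(oid):
        raise ValueError("length must be between 1 and len(oid)")
    return oid[:length]


def commit_link(owner, repo, oid, length=10):
    """Build a Markdown link: abbreviated hash as text, full hash in the URL."""
    url = f"https://github.com/{owner}/{repo}/commit/{oid}"
    return f"[{abbreviate(oid, length)}]({url})"


# Full OID taken from the GraphQL response earlier in this thread.
oid = "2c0729d8cb13100ea576337ebb7703320203c548"
link = commit_link("rails", "rails", oid)
print(link)
```

Because the URL carries the full hash, the link keeps resolving to the same commit even if the short form later becomes ambiguous.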

I hope that helps!