Questions regarding the Node ID migration

Around 3 months ago there was an official blog post regarding the migration to the new GraphQL Node IDs.

There have been no further details since then, but we accidentally encountered a new Node ID in one of the GraphQL responses. It might be a bug, but in any case it’s clear that the transition has started.

There are still a few questions unanswered, though:

  1. When can we expect the first rollout of the new ID format for one of the node types?
  2. What will be the migration path for existing data? Will there be any tooling that can translate old IDs into new ones offline (without requesting all the data again)?
  3. Is there any chance the ID format will be documented? The format was changed for efficiency reasons, and those reasons apply to the community as well, not only to GitHub itself. On our side we’d like to see at least a way to determine the node type by looking at the ID, as the GitHub API does internally.

Hey @dennwc :wave:

Thanks so much for starting this conversation! Right now, the answers to your questions are incomplete. The efforts to update our various APIs with this updated format are ongoing.

I did reach out to the project manager overseeing these tasks, and I want to forward what information I can.

  1. When can we expect the first rollout of the new ID format for one of the node types?

Unfortunately, we can’t provide any firm dates, but I definitely understand the interest! There are some objects that have had this work completed, but end users should have a seamless experience for the time being. Old ID formats will continue to be accepted, and if/when new IDs are noticed, they should not block any existing workflows.


  2. What will be the migration path for existing data? Will there be any tooling that can translate old IDs into new ones offline (without requesting all the data again)?

Again, I can’t speak to the timeline, as the one we originally projected has been pushed back. But yes, there will be migration tools, and while we are in a transitional state, legacy IDs will continue to work, and anyone using them will receive a deprecation notice until the migration is completed.


  3. Is there any chance the ID format will be documented? The format was changed for efficiency reasons, and those reasons apply to the community as well, not only to GitHub itself. On our side we’d like to see at least a way to determine the node type by looking at the ID, as the GitHub API does internally.

Yes! We will update our docs page for Global Node IDs. Though I think the intent of this question is not only to document the format that users will expect, but perhaps also to expose our internal API usage? Maybe I misunderstand, but for that, I couldn’t speak to it. It’s doubtful that we will be publishing, say, how our internal APIs are leveraging this new ID format.


Going forward, we are still planning to issue a new blog post to document the progress made, and what to expect in the future. I’m not going to say that updates are coming “soon,” but after my discussion with the PM, I want to reiterate that efforts are ongoing, even as I type this.

Please feel free to keep the conversation going and maybe consider subscribing to the blog’s RSS feed.


Thanks, that answers most of it, but I wanted to clarify this part:

Though I think the intent of this question is not only to document the format that users will expect, but perhaps also to expose our internal API usage? Maybe I misunderstand, but for that, I couldn’t speak to it. It’s doubtful that we will be publishing, say, how our internal APIs are leveraging this new ID format.

No, sorry, we are not asking about the internals of the API itself, but rather for at least some information that can be decoded from those new IDs. We are mostly interested in:

  • Getting node type directly from node ID without querying GraphQL.
  • Getting SHA and repository ID from Commit node IDs (and back).
  • Getting some “primary key” for other nodes.

The reason I mentioned the GitHub API internals is that the ID format is designed to contain enough information for the backend to locate a node of any type. I believe information about this format/encoding can be useful for other developers like us as well, hence the question regarding the documentation. This is an optimization we already use with the current node ID format to reduce our request rate to the GraphQL API.
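To illustrate the kind of optimization meant here, below is a minimal sketch of reading the node type out of a legacy ID without a GraphQL round trip. It assumes the layout that can be observed by base64-decoding the sample ID from the GitHub docs (`MDEwOlJlcG9zaXRvcnkxMjk2MjY5` decodes to `010:Repository1296269`); that layout is not an official contract, and the function name is ours.

```python
import base64
import re

# Assumed layout of a decoded legacy ID: "<digits>:<TypeName><key>",
# e.g. "010:Repository1296269". Observed behaviour, not a documented format.
_LEGACY_RE = re.compile(r"^(\d+):([A-Za-z]+)(.*)$")


def decode_legacy_node_id(node_id):
    """Return (type_name, key) for a legacy node ID, or None if it does not match."""
    try:
        # Re-add any stripped padding, then decode strictly so that
        # characters outside the standard base64 alphabet are rejected.
        padded = node_id + "=" * (-len(node_id) % 4)
        raw = base64.b64decode(padded, validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    match = _LEGACY_RE.match(raw)
    if match is None:
        return None
    # The key is returned as an opaque string; for some types it seems to
    # hold several fields (for example a repository ID and a SHA).
    return match.group(2), match.group(3)
```

With this sketch, `decode_legacy_node_id("MDEwOlJlcG9zaXRvcnkxMjk2MjY5")` returns `("Repository", "1296269")`, while a new-style ID returns `None` because it does not decode as standard base64.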


and if/when new IDs are noticed, they should not block any existing workflows.

Today our workflow was broken because new IDs were silently introduced for Deployment.

Is it possible to have at least some dates? Or at least know a day or two upfront when the new ID is about to be exposed for existing types?

Ideally, we would get a list of ID v2 prefixes and their corresponding node types. Currently we’ve seen:

PSH_ = Push
DE_ = Deployment

If the same silent change affects any of the core node types, such as Repository, PullRequest, Ref, or Commit, it will severely affect our application. We would really appreciate some specific details about the migration path. Thanks.

Hi @dennwc

Today our workflow was broken because new IDs were silently introduced for Deployment.

Big ouch…

This goes against what I understood about this transitional phase and I’m sorry that this was the experience you’ve had.

To investigate further, could you let me know which repo/workflow/application this happened to?

The application is called Athenian, and these are the kinds of nodes we’ve seen so far:

PSH_lAHOA3oJoM4NjdrwzwAAAAGo-ToA
DE_kwDOAA1TcM4XDzWL
DES_kwDODJFr5M4iuPyg

(DeploymentStatus was encountered a few hours ago as well)
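
For illustration, IDs like the three above can be told apart from legacy ones with a simple split on the first underscore. This is only a sketch: it assumes that new-style IDs always look like an uppercase prefix, an underscore, and an opaque payload, and that legacy IDs never contain an underscore. Neither is documented, and `split_node_id` is a hypothetical helper name.

```python
def split_node_id(node_id):
    """Split a new-style ID into (prefix, payload); return None for anything else."""
    prefix, sep, payload = node_id.partition("_")
    # Assumption: the prefix is a non-empty run of uppercase ASCII letters.
    looks_like_prefix = prefix.isascii() and prefix.isalpha() and prefix.isupper()
    if sep and looks_like_prefix and payload:
        # The payload is deliberately left opaque; its encoding is undocumented.
        return prefix, payload
    return None
```

Here `split_node_id("DE_kwDOAA1TcM4XDzWL")` gives `("DE", "kwDOAA1TcM4XDzWL")`, and an ID without an underscore gives `None`, so the caller can fall back to whatever it does for legacy IDs.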

I think it should be enough to identify the case, but please let me know if there’s any other information that is required.

And thanks for looking into it!


@dennwc do your workflows run in the same repo where your app is hosted?

Edit: I now realize you might not have meant an Actions workflow and that you might just be referring to your literal flow of pushing to your codebase. Let me know if that’s the case, because otherwise, I was totally going down the wrong path.

Yes, sorry, I didn’t mean Actions, but the application itself.

I cannot directly answer the questions about the Actions workflow in the codebase, because these codebases belong to our customers.

Hi @nethgato

Are there any updates on this issue?

Hey @dennwc – was on a bit of holiday, but back now.

So for now, I’m not sure what to say. The migration as a whole is ongoing, and a number of objects have been updated. We were supposed to have had another public announcement/blog post covering the progress and what users should expect, but there were delays.

While I’m still catching up on the progress of the migration, I do now know that users who hard-coded GraphQL IDs in the past will need to update them. So if your application doesn’t have a way to pull IDs dynamically, you will see the kind of breakage you described earlier in the thread.
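
As a rough sketch of what pulling an ID dynamically could look like, the snippet below builds the request body for a GraphQL lookup by owner and name instead of relying on a stored node ID. The `repository(owner:, name:) { id }` field is part of the public GraphQL schema; the helper name and the sample arguments are only for illustration.

```python
import json

# Look a repository up by its natural key and return whatever node ID
# the API currently issues for it, in whichever format that happens to be.
REPO_ID_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    id
  }
}
"""


def build_repo_id_request(owner, name):
    """Return the JSON body for a GraphQL request that fetches a repository's ID."""
    body = {
        "query": REPO_ID_QUERY,
        "variables": {"owner": owner, "name": name},
    }
    return json.dumps(body)
```

The resulting string would be sent as the body of an authenticated POST to the GraphQL endpoint, and the returned `id` stored in place of the hard-coded value.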

I’ll do my best to update when I have more to share.


Recently we’ve seen a huge spike of new node types using new node IDs.

I guess there is still no news about the migration plan? If so, I’d like to ask again whether the dev team could provide a table of ID prefixes mapped to the old types?

{
		"PSH":  "Push",
		"DE":   "Deployment",
		"DES":  "DeploymentStatus",
		"IC":   "IssueComment",
		"A":    "App",
		"T":    "Team",
		"LE":   "LabeledEvent",
		"CS":   "CheckSuite",
		"CE":   "ClosedEvent",
		"ME":   "MergedEvent",
		"CC":   "CommitComment",
		"RR":   "ReviewRequest",
		"AE":   "AssignedEvent",
		"SE":   "SubscribedEvent",
		"DEE":  "DeployedEvent",
		"RRE":  "ReviewRequestedEvent",
		"RDE":  "ReviewDismissedEvent",
		"RTE":  "RenamedTitleEvent",
		"RRRE": "ReviewRequestRemovedEvent",
		"HRDE": "HeadRefDeletedEvent",
		"UNLE": "UnlabeledEvent",
		"ASEE": "AutoSquashEnabledEvent"
}

Something like this for all currently migrated types (or ideally future ones) would help us enormously with the transition.
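
To show how we would use such a table, here is a minimal sketch that resolves node types from prefixes and collects the prefixes it does not recognize, so that newly appearing types are noticed instead of breaking silently. The table is a small subset of the prefixes we have observed, not an official list, and the function name is hypothetical.

```python
# Subset of the prefixes observed so far; incomplete and unofficial.
OBSERVED_PREFIXES = {
    "PSH": "Push",
    "DE": "Deployment",
    "DES": "DeploymentStatus",
    "IC": "IssueComment",
    "CS": "CheckSuite",
}


def classify_node_ids(node_ids, prefixes=OBSERVED_PREFIXES):
    """Return (types_by_id, unknown_prefixes) for a batch of node IDs."""
    types_by_id = {}
    unknown_prefixes = set()
    for node_id in node_ids:
        prefix, sep, _payload = node_id.partition("_")
        if not sep:
            # No underscore: assumed to be a legacy ID, handled elsewhere.
            continue
        node_type = prefixes.get(prefix)
        if node_type is None:
            unknown_prefixes.add(prefix)
        else:
            types_by_id[node_id] = node_type
    return types_by_id, unknown_prefixes
```

Any prefix that ends up in `unknown_prefixes` still needs one GraphQL query to learn its type, after which it can be added to the table.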


If you are wondering why we need this, I would suggest asking the GitHub engineering team, “Why did you decide to add these prefixes?” :slight_smile:

The reason is simple: if you have a ton of node types, it helps a lot to be able to look at the ID and instantly know which table/database to look in for this node.

What I’m trying to explain is that this optimization is not GitHub-specific. An analytical app like Athenian needs a lot of data from the GitHub API, so we also use this optimization in our code.

Thus, we would really appreciate it if you could give us more details about these prefixes. We are currently working hard to migrate our data away from GitHub IDs, but we still need this information to do it efficiently, especially for new node types, which appear quite regularly (with no official timeline mentioning them).