GitActor.user should be type Actor, not User. Cannot distinguish between bots/users otherwise.

When retrieving a commit’s actual GitHub user (if present) through the GraphQL endpoint, the field only allows the type User, when it should in fact be Actor so that bots and users can be told apart.

e.g.

Commit.author(GitActor).user(User)

should probably instead be…

Commit.author(GitActor).actor(Actor)

so that we could then query with inline fragments, à la ... on User or ... on Bot, to tell what type of account it actually is.

Right now, the info returned contradicts other areas of the API: bot accounts are returned here as regular users. Even when __typename is added to the query, it says “User”.

Seems this is now implemented.

{
  nodes(ids: ["MDEyOk9yZ2FuaXphdGlvbjE3NDExODg=", "MDM6Qm90Mjc4NTYyOTc=", "MDM6Qm90MjcyOTUwMDU="]) {
    __typename
    ...on User {
      login
    }
    ...on Organization {
      login
    }
    ...on Bot {
      login
    }
  }
}

gives

{
  "data": {
    "nodes": [
      {
        "__typename": "Organization",
        "login": "Guake"
      },
      {
        "__typename": "Bot",
        "login": "dependabot-preview"
      },
      {
        "__typename": "Bot",
        "login": "artsy-peril"
      }
    ]
  }
}

No, it’s not. Commit.author is still a GitActor, and GitActor still has the following schema:

"Represents an actor in a Git commit (ie. an author or committer)."
type GitActor {
  ...
  "The GitHub user corresponding to the email field. Null if no such user exists."
  user: User
}

Yes, if you make a separate query for the actual actor/user node, you can probably do what you did. But one of the key purposes of GraphQL is to be able to query for exactly what you want, in a single query.

You can access the account type information directly from the author (GitActor) through the user link.

For example, in a single query you can go from a commit to a GitActor (say, the author) and on to information about their GitHub account, such as the account type.

{
  node(id: "MDY6Q29tbWl0MTExNjA0ODEzOmYyODc2NGE3NDIwMGQ5ZDQwMzhhOWY4ZjQ3MzNmNjBjNGI0NTE1Mjg=") {
    ... on Commit {
      id
      oid
      committedDate
      author {
        user {
          login
          __typename
        }
      }
    }
  }
}

{
  "data": {
    "node": {
      "id": "MDY6Q29tbWl0MTExNjA0ODEzOmYyODc2NGE3NDIwMGQ5ZDQwMzhhOWY4ZjQ3MzNmNjBjNGI0NTE1Mjg=",
      "oid": "f28764a74200d9d4038a9f8f4733f60c4b451528",
      "committedDate": "2019-09-25T10:15:42Z",
      "author": {
        "user": {
          "login": "gcushen",
          "__typename": "User"
        }
      }
    }
  }
}

@nosferican that is exactly the problem that this post was originally created for. I’ve always been able to get the author information as type User. The problem is that User is a concrete implementation of the Actor interface. That means you could have a query such as the following:

query {
    organization(login: "moltenbits") {
        repository(name:"renovate-gradle-bug") {
            pullRequest(number: 1) {
                author {
                    __typename
                    login
                }
                commits(first: 10) {
                    nodes {
                        commit {
                            abbreviatedOid
                            author {
                                user {
                                    __typename
                                    login
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

which results in…

{
  "data": {
    "organization": {
      "repository": {
        "pullRequest": {
          "author": {
            "__typename": "Bot",
            "login": "renovate"
          },
          "commits": {
            "nodes": [
              {
                "commit": {
                  "abbreviatedOid": "0f6337c",
                  "author": {
                    "user": {
                      "__typename": "User",
                      "login": "renovate-bot"
                    }
                  }
                }
              }
            ]
          }
        }
      }
    }
  }
}

That’s because the pullRequest.author property is correctly of the Actor interface type. But the same is not true of commit.author. It’s important to understand the distinction between the Actor interface and the GitActor concrete type (which only allows for User concrete types). A commit should be able to be authored by a User, Bot, or Mannequin, which are all implementations of the Actor interface, and you should be able to tell which it is by using inline fragments. But since GitActor.user is the concrete User type, inline fragments cannot tell you anything: it will always (often incorrectly) say that it’s a User, even though there are many instances where it should be a Bot.

Because of this issue, what should be the same actor account is now represented as two separate accounts.
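
To make the mismatch concrete, here is a minimal Python sketch that walks a response shaped like the one above and lists commits whose GitActor.user type differs from the type reported for the pull request author. The function name type_mismatches is made up for illustration, and the response dict is just a trimmed copy of the JSON in this post.

```python
# Trimmed copy of the response shown above (only the fields we inspect).
response = {
    "data": {"organization": {"repository": {"pullRequest": {
        "author": {"__typename": "Bot", "login": "renovate"},
        "commits": {"nodes": [
            {"commit": {
                "abbreviatedOid": "0f6337c",
                "author": {"user": {"__typename": "User",
                                    "login": "renovate-bot"}},
            }},
        ]},
    }}}}
}


def type_mismatches(resp):
    """List commits whose author type differs from the PR author's type.

    Hypothetical helper: returns (oid, pr_author, commit_author) tuples.
    """
    pr = resp["data"]["organization"]["repository"]["pullRequest"]
    pr_author = pr["author"] or {}
    pr_pair = (pr_author.get("__typename"), pr_author.get("login"))
    found = []
    for node in pr["commits"]["nodes"]:
        commit = node["commit"]
        # GitActor.user is null when no GitHub user matches the commit email.
        user = (commit.get("author") or {}).get("user")
        if user is None:
            continue
        if user["__typename"] != pr_pair[0]:
            found.append((commit["abbreviatedOid"], pr_pair,
                          (user["__typename"], user["login"])))
    return found


mismatches = type_mismatches(response)
print(mismatches)
```

A mismatch here only flags that the two fields disagree about the account type. It cannot say the two accounts are "the same" bot, which is the actual gap.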

I see. Indeed, that could and should be better. For now, I would suggest capturing the IDs of the authors and then querying the account type through the first method. You can process 50,000 GitHub users per request, so it should not be too bad. You might also want to defer fetching the user information until you have all the commit data of interest, so that you query the account types once for the distinct set of authors rather than for the many duplicate values.
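
A rough Python sketch of that suggestion, under some assumptions: the commit dicts are shaped like the commit.author.user objects in the responses above (with id requested), the helper names are invented for illustration, and the batch size is a placeholder rather than a documented limit, so check the current API limits before relying on it. It only builds the request payloads for a nodes(ids: ...) query; sending them is left out.

```python
NODES_QUERY = """
query($ids: [ID!]!) {
  nodes(ids: $ids) {
    id
    __typename
  }
}
"""


def distinct_author_ids(commits):
    """Collect unique GitActor.user node IDs, keeping first-seen order."""
    seen = {}
    for commit in commits:
        # user is null when the commit email matches no GitHub account.
        user = (commit.get("author") or {}).get("user")
        if user and user.get("id"):
            seen.setdefault(user["id"], None)
    return list(seen)


def build_payloads(ids, batch_size):
    """Split the IDs into batches and wrap each in a GraphQL request body."""
    return [
        {"query": NODES_QUERY, "variables": {"ids": ids[i:i + batch_size]}}
        for i in range(0, len(ids), batch_size)
    ]


# Sample data: the same author appears twice, and one commit has no user.
commits = [
    {"author": {"user": {"id": "MDQ6VXNlcjI1MTgwNjgx"}}},
    {"author": {"user": {"id": "MDQ6VXNlcjI1MTgwNjgx"}}},
    {"author": {"user": None}},
]

ids = distinct_author_ids(commits)
payloads = build_payloads(ids, batch_size=100)  # 100 is an assumed value
print(len(ids), len(payloads))
```

Deduplicating first keeps the number of follow-up requests proportional to the number of distinct authors rather than the number of commits.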

@nosferican unfortunately that’s still not a workaround, because the issue still results in two separate Actor accounts being given for what is ultimately a single account. E.g. Dependabot has a “dependabot” Bot account and a “dependabot[bot]” User account. Renovate has a “renovate” Bot account and a “renovate-bot” User account. As you can see, the “bot” moniker within their names isn’t even consistent, so you can’t simply do string parsing to match them either.

In our case we have been using the __typename of the User and then merging that with GitActor.user (from Commit.author) using the global node ID. I would avoid doing any merging based on the login (always use the global node ID). We had noticed the inconsistent [bot] and similar suffixes and were puzzled by them; your comment does shed light on why that is happening.
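
A small Python sketch of that merge, keyed on the global node ID rather than the login. The record shapes and the annotate helper are assumptions made for illustration; the IDs are the ones that appear in this thread, and the commit oids are placeholders.

```python
# Account types as returned by a nodes(ids: ...) lookup, keyed by node ID.
account_types = {
    "MDQ6VXNlcjI1MTgwNjgx": "User",
    "MDM6Qm90MjkxMzk2MTQ=": "Bot",
}

# Commit records carrying the GitActor.user node ID (None if no user matched).
commit_authors = [
    {"oid": "commit-1", "user_id": "MDQ6VXNlcjI1MTgwNjgx"},  # placeholder oid
    {"oid": "commit-2", "user_id": None},                     # placeholder oid
]


def annotate(commits, types_by_id):
    """Attach the account type to each commit by node ID, never by login."""
    return [
        {**commit, "account_type": types_by_id.get(commit["user_id"])}
        for commit in commits
    ]


annotated = annotate(commit_authors, account_types)
print(annotated)
```

Note that this still labels the renovate-bot commit as a User, because that is the only node ID the commit exposes.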

I don’t see any good workaround in the case of a PullRequest.author since that one doesn’t provide an account node ID… :neutral_face: Regardless, that PullRequest.author should expose the account info.

@nosferican PullRequest.author does provide a node ID; you just have to use an inline fragment to get at it. But again, they are still two unique accounts, each with unique identifiers. That is why I mentioned string matching, because it’s the only potential way to pair them up at the moment. I’m well aware that it’s a terrible approach; that’s why I created this thread.

For example:

query {
    organization(login: "moltenbits") {
        repository(name:"renovate-gradle-bug") {
            pullRequest(number: 2) {
                author {
                    __typename
                    login
                    ... on Bot {
                        id
                        databaseId
                    }
                    ... on User {
                        id
                        databaseId
                    }
                }
                commits(first: 10) {
                    nodes {
                        commit {
                            author {
                                user {
                                    __typename
                                    id
                                    databaseId
                                    login
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

returns…

{
  "data": {
    "organization": {
      "repository": {
        "pullRequest": {
          "author": {
            "__typename": "Bot",
            "login": "renovate",
            "id": "MDM6Qm90MjkxMzk2MTQ=",
            "databaseId": 29139614
          },
          "commits": {
            "nodes": [
              {
                "commit": {
                  "author": {
                    "user": {
                      "__typename": "User",
                      "id": "MDQ6VXNlcjI1MTgwNjgx",
                      "databaseId": 25180681,
                      "login": "renovate-bot"
                    }
                  }
                }
              }
            ]
          }
        }
      }
    }
  }
}

In that particular case, there seems to be a reason for the distinct entities.

{
  nodes(ids: ["MDM6Qm90MjkxMzk2MTQ=", "MDQ6VXNlcjI1MTgwNjgx"]) {
    id
    __typename
    ...on User {
      email
      url
      createdAt
    }
    ...on Bot {
      url
      createdAt
    }
  }
}

{
  "data": {
    "nodes": [
      {
        "id": "MDM6Qm90MjkxMzk2MTQ=",
        "__typename": "Bot",
        "url": "https://github.com/apps/renovate",
        "createdAt": "2017-06-02T07:04:12Z"
      },
      {
        "id": "MDQ6VXNlcjI1MTgwNjgx",
        "__typename": "User",
        "email": "renovate@whitesourcesoftware.com",
        "url": "https://github.com/renovate-bot",
        "createdAt": "2017-01-17T16:55:44Z"
      }
    ]
  }
}

It seems the Bot is the GitHub App, which was created at a different time. In terms of matching commit authors to Bots, I suspect that would never happen, as Bots don’t have an email address and instead use those “User” accounts for their commit activity. If that’s the case, it does not seem possible to identify bot commits until that is addressed, for example with at least some way to match a Bot to its User. Indeed, not ideal.
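
As a side note, the two node IDs in this thread can be inspected directly, which confirms they are separate records of different types. The sketch below assumes the base64 "NN:TypeDigits" layout that these particular IDs happen to have. Node IDs are meant to be opaque and this layout is an observation, not a documented contract, so treat the helper (whose name is made up) as a debugging aid only.

```python
import base64
import re


def decode_legacy_node_id(node_id):
    """Return (type_name, database_id) for IDs shaped like the ones above.

    Returns None for anything that does not fit the assumed layout.
    """
    try:
        raw = base64.b64decode(node_id).decode("ascii")
    except ValueError:  # covers bad base64 and non-ASCII payloads
        return None
    match = re.fullmatch(r"\d+:([A-Za-z]+)(\d+)", raw)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


bot = decode_legacy_node_id("MDM6Qm90MjkxMzk2MTQ=")
user = decode_legacy_node_id("MDQ6VXNlcjI1MTgwNjgx")
print(bot)   # ('Bot', 29139614)
print(user)  # ('User', 25180681)
```

The decoded numbers agree with the databaseId values returned earlier for “renovate” and “renovate-bot”, and nothing in either ID links one account to the other.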