Committing a resource that already exists, but with a different type (blob, tree, ...), succeeds when it should fail?

Before March 3, when I committed a file at the same level as a path with the same name that was already in the repo, I got the error “GitRPC::BadObjectState” and the commit failed.
Since March 3, the error is no longer returned. GitHub records the commit with 0 changes, and the ref is updated to point to that new commit.

on GitHub: path1/file1
new Commit: path1

Similarly, committing a path that conflicts with a file on GitHub also succeeds, but this time it overwrites the file on GitHub.

on GitHub: file1
new Commit: file1/file2

I use the egit-github v3.6.0 library, which sends the header “application/vnd.github.beta+json”. However, this is also reproducible using curl with the same header, and with “application/vnd.github.v3+json”.

Shouldn’t the request fail when a file or path with the same name already exists and we’re sending a different type of that resource (a blob exists and we commit a tree, or vice versa)?
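As a stopgap on the client side, something like the following could flag the two cases above before anything is sent to the API. This is only a minimal sketch: `paths_conflict` is a hypothetical helper name, not part of any GitHub tooling, and it assumes both arguments are file (blob) paths relative to the repo root.

```shell
# Hypothetical helper (not a GitHub API): exits 0 when two file paths cannot
# coexist in one tree, because one of them would also have to be a directory.
# Identical paths are NOT treated as a conflict (that is a plain overwrite).
paths_conflict() {
  case "$1" in
    "$2"/*) return 0 ;;   # $2 is a file, but $1 needs it to be a directory
  esac
  case "$2" in
    "$1"/*) return 0 ;;   # $1 is a file, but $2 needs it to be a directory
  esac
  return 1
}

# The two scenarios described above:
paths_conflict "path1/file1" "path1" && echo "conflict: path1 is already a tree"
paths_conflict "file1" "file1/file2" && echo "conflict: file1 is already a blob"
```

This only compares path strings, so it needs the list of existing paths from somewhere else (for example, a recursive tree listing).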

Hey there @alhaddad-nasry :wave:

I’ve been doing some digging, trying to find out if there were any updates to our Commits API that might relate to collision/conflict handling and haven’t been able to come up with much. At least not in the timeframe that you’re suggesting.

So that I can reproduce the behavior, would you mind sharing an example curl command that you’re using? Please be mindful to redact any sensitive information (like PATs, etc.) when you do.

As a bit of an aside, I’m also curious to know if you can manually upload similar resources with varying types via the UI to produce the same behavior?

At any rate, I’d love to reproduce this myself and an example would be helpful to make sure I’m following similar steps as you.

Hello @nethgato

The scenario used in the script below is as follows:

  1. Create the file “test1/test2” manually on
  2. Run the script below, which will do the following:
    a. Gets the latest commit from GitHub
    b. Creates a blob with the name “test1”
    c. Creates a tree containing the new blob “test1”
    d. Creates a new commit with the new tree
    e. Updates the branch to point to the new commit
  3. Check the commits on, and you can see that the commit was created with 0 changes.

Doing step 2 on (that is, create the file “test1”) will fail with the message:

A file with the same name already exists. Please choose a different name and try again.

Could it be a change in the Trees API?
When I compared the newly created tree with the tree before the commit, they turned out to be the same: the path “test1” is of type tree in both, even though I called the Trees API with the blob “test1”. The call didn’t fail (shouldn’t it?).
This explains the empty commit. The opposite scenario through the APIs (having a file “test1” on GitHub and then committing a file “test1/test2”) also succeeds, but this time the file “test1” is lost: it is replaced by the committed file “test1/test2”.
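To make that comparison less manual, here is a rough sketch of how the type of a single entry could be pulled out of a tree response. `entry_type` is a hypothetical helper, and it assumes each entry lists "path", "mode" and "type" in that order; like the sed calls in my script, it is a fragile text match rather than a real JSON parser.

```shell
# Hypothetical helper: print the "type" (blob or tree) of the entry named $2
# in the tree JSON given as $1.
# Assumes the fields appear as "path", "mode", "type" in that order, and that
# the path contains no regex metacharacters or '|' characters.
entry_type() {
  printf '%s' "$1" | tr '\n' ' ' |
    sed -n 's|.*"path": *"'"$2"'", *"mode": *"[0-9]*", *"type": *"\([a-z]*\)".*|\1|p'
}

# Made-up sample response, for illustration only:
sample='{"sha":"abc","tree":[{"path":"README.md","mode":"100644","type":"blob","sha":"111"},{"path":"test1","mode":"040000","type":"tree","sha":"222"}]}'
entry_type "$sample" test1
```

With the variables from the script, `entry_type "$currentTree" test1` and `entry_type "$newTree" test1` would be the two calls to compare.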

Thank you.

The Script

## Set the following three variables before running the script

## Switching between the API_VERSION and USER_AGENT does not change the behavior

## Retrieve the current HEAD SHA
echo "Retrieving HEAD on master"
currentHead=`curl -v -H "Accept: $API_VERSION" -H "User-Agent: $USER_AGENT" -u $OWNER:$TOKEN$OWNER/$REPO/git/refs/heads/master`
currentHeadSha=`echo $currentHead | sed -n 's/.*"sha": *"\([a-z0-9]*\)".*/\1/p'`
echo "currentHead: $currentHead"
echo "currentHeadSha: $currentHeadSha"

## Retrieve the latest tree SHA to be used as the base tree of the new tree to be committed
echo "Retrieving tree-sha of HEAD on $REPO"
currentCommit=`curl -s -H "Accept: $API_VERSION" -H "User-Agent: $USER_AGENT" -u $OWNER:$TOKEN$OWNER/$REPO/git/commits/$currentHeadSha`
currentTreeSha=`echo $currentCommit | sed -n 's/.*"tree": *{ *"sha": *"\([a-z0-9]*\)".*/\1/p'`
echo "currentCommit: $currentCommit"
echo "currentTreeSha: $currentTreeSha"

## Just getting the current tree to be compared later to the new tree
echo "Retrieving tree of HEAD on $REPO"
currentTree=`curl -s -H "Accept: $API_VERSION" -H "User-Agent: $USER_AGENT" -u $OWNER:$TOKEN$OWNER/$REPO/git/trees/$currentTreeSha`

## Create the new tree containing the BLOB "test1"
echo "Creating new tree with new blob"
data=$(echo "{\"tree\":[{\"path\":\"test1\",\"mode\":\"100644\",\"type\":\"blob\",\"content\":\"bbbbbbbbbbb\"}],\"base_tree\":\"$currentTreeSha\"}")
newTree=`curl -s -X POST -H "Accept: $API_VERSION" -H "User-Agent: $USER_AGENT" -u $OWNER:$TOKEN$OWNER/$REPO/git/trees -d "$data"`
newTreeSha=`echo $newTree | sed -n 's/{ *"sha": *"\([a-z0-9]*\)".*/\1/p'`
echo "newTreeSha: $newTreeSha"

## Create the new commit with the new tree
echo "Creating new commit with new tree [$newTreeSha]"
data=$(echo "{\"message\":\"message\",\"tree\":\"$newTreeSha\", \"parents\":[\"$currentHeadSha\"]}")
newCommit=`curl -s -X POST -H "Accept: $API_VERSION" -H "User-Agent: $USER_AGENT" -u $OWNER:$TOKEN$OWNER/$REPO/git/commits -d "$data"`
newCommitSha=`echo $newCommit | sed -n 's/{ *"sha": *"\([a-z0-9]*\)".*/\1/p'`
echo "newCommitSha: $newCommitSha"

## Update the branch to point to the new commit
echo "Updating branch to point to new commit [$newCommitSha]"
data=$(echo "{\"sha\":\"$newCommitSha\"}")
newRef=`curl -s -X PATCH -H "Accept: $API_VERSION" -H "User-Agent: $USER_AGENT" -u $OWNER:$TOKEN$OWNER/$REPO/git/refs/heads/master -d "$data"`

## Get the new HEAD SHA
newHeadSha=`echo $newRef | sed -n 's/.*"sha": *"\([a-z0-9]*\)".*/\1/p'`
echo "newHeadSha: $newHeadSha"

## If the new HEAD is the new commit, then the commit was created successfully
if [[ "$newHeadSha" == "$newCommitSha" ]]; then
	echo "Success"
else
	echo "Failure"
	echo "newRef: $newRef"
fi

## Print the tree before the commit (currentTree) and the tree after the commit (newTree)
echo $currentTree | python -m json.tool
echo "###############################################"
echo $newTree | python -m json.tool
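Since Git trees are content-addressed, an unchanged tree gets the same SHA as the base tree, so the 0-change case can be caught before the commit is created. A minimal sketch, assuming the `currentTreeSha` and `newTreeSha` variables from the script above; `is_noop_tree` is a hypothetical helper name.

```shell
# Hypothetical helper: exits 0 when the new tree SHA is non-empty and equal to
# the base tree SHA, meaning a commit of that tree would record 0 changes.
is_noop_tree() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Intended use, right after the new tree has been created:
if is_noop_tree "$newTreeSha" "$currentTreeSha"; then
  echo "New tree is identical to the base tree; the commit would be empty"
fi
```

This catches only the empty-commit case, not the opposite scenario where the existing file is replaced.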

Hi @alhaddad-nasry :wave:

Just wanted to update you and let you know I’m working on reproducing this. Since I am representing GitHub, I need to focus on the API endpoints and their expected behavior.

I spent a few hours manually walking through the steps in your script. So far, I’ve not been able to reproduce the behavior you are mentioning, but I’m not satisfied with my results just yet.

I will update this thread if anything relevant surfaces as I find the time to get back to this.

In the meantime, could you describe what this new/different result means for your actual workflow? It sounds like we might be doing some bug hunting, which I’m happy to do! Though it would also be helpful to know the impact this behavior is having on your day-to-day.

Thanks! :bow:
