Copilot Lvl 2
Message 1 of 5

Contributions page is displaying changes from renamed files as deletions + additions

Hello,

 

I have a question/request regarding the Contributions page of a repository and the way renamed files appear there.

 

I've created a demo repository with 2 commits to demonstrate this:

  1. First commit, adding a README file with 50 lines:
    https://github.com/maksimovic/contributions-demo/commit/4469bb4504289cddabbbd1517a1acb6dc5c092b2
  2. Second commit, renaming the file (note that it says "0 deletions & 0 additions"):
    https://github.com/maksimovic/contributions-demo/commit/4bf98ba6e9bd478bc277258f9e497cbf2a12b059
  3. Now, look at the contributions page, which says "100++ and 50--":
    https://github.com/maksimovic/contributions-demo/graphs/contributors

 

Now, this may not seem to be significant at all, but the way this works can totally screw up contributions in large repositories where larger changes of this type can happen.

 

For example, recently I've moved a lot of files from one place to another in a huge monolith repository, as part of a bigger refactoring process. Before that, my contributions were around 67k "++", and those were real additions. However, after these refactoring changes went to the master branch, my additions went up to 154k "++" (deletions increased by the same amount as well).

 

Is there any way for such changes not to appear like this?

4 Replies
Community Manager
Message 2 of 5

Re: Contributions page is displaying changes from renamed files as deletions + additions

All modern version control systems are based ultimately on detecting and tracking the changes to files. Version control systems for source code are specifically designed for tracking changes to text files. The method that is used to do that is colloquially referred to as a "diff" algorithm, an algorithm that is designed to solve the longest common subsequence problem for the lines of text in the text file. This algorithm, given two sequences of items, finds a longest sequence of items that is present in both original sequences in the same order. Once you have such a sequence then, assuming that you want to determine how to change the first sequence into the second:

 

  1. Anything that is not in the sequence but is present in the first original must have been deleted
  2. Anything that is not in the sequence but is present in the second original must have been added
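
To make those two rules concrete, here is a minimal sketch of a line diff built on the longest common subsequence. It is an illustration only, not the implementation used by `diff`, Git, or GitHub; the function names `lcs_table` and `line_diff` are made up for this example.

```python
def lcs_table(a, b):
    """table[i][j] holds the LCS length of a[i:] and b[j:]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def line_diff(old, new):
    """Return (deleted, added) lines needed to turn old into new."""
    table = lcs_table(old, new)
    deleted, added = [], []
    i = j = 0
    while i < len(old) and j < len(new):
        if old[i] == new[j]:
            # Line is part of the common subsequence: unchanged.
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            # Only in the first original: a deletion.
            deleted.append(old[i])
            i += 1
        else:
            # Only in the second original: an addition.
            added.append(new[j])
            j += 1
    deleted.extend(old[i:])  # leftover old lines were deleted
    added.extend(new[j:])    # leftover new lines were added
    return deleted, added


if __name__ == "__main__":
    print(line_diff(["a", "b", "c"], ["a", "x", "c"]))
```

Changing a single line comes out as one deletion plus one addition, because the changed line is simply not part of the common subsequence.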

 

So, all changes to any text file in source control are expressed as a set of additions and deletions. For example, changing this text:

 

The quick brown
fox jupppped
over the lazy dog.

 

to this text:

 

The quick brown
fox jumped
over the lazy dog.

 

requires one deletion and one addition and is described thusly by the `diff` program on my machine:

 

2c2
< fox jupppped
---
> fox jumped

 

Even if one were to simply reorder functions within a file, it would generate many deletions and additions. Because of this, it is almost certain that your ~67k additions are not all "real additions".
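
As a rough illustration of the reordering case, the sketch below counts added and deleted lines with Python's standard `difflib` module. `difflib.SequenceMatcher` uses its own matching heuristic rather than a strict longest-common-subsequence search, so treat the numbers as indicative; the helper name `count_changes` and the sample file contents are invented for this example.

```python
import difflib


def count_changes(old, new):
    """Return (added, deleted) line counts between two lists of lines."""
    added = deleted = 0
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted += i2 - i1
        if tag in ("insert", "replace"):
            added += j2 - j1
    return added, deleted


# Two functions swap places; no line of code is actually written or removed.
before = ["def a():", "    return 1", "def b():", "    return 2"]
after = ["def b():", "    return 2", "def a():", "    return 1"]

if __name__ == "__main__":
    print(count_changes(before, after))
```

Only one of the two functions can stay in place as the common part, so the other one is reported as two deleted lines and two added lines.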

 

Much like SLOC, these metrics are designed to give an estimate of the churn of code in a project or system, not to compare exactly the contributions of two developers. This is especially true given that two different developers may solve the same problem more or less efficiently than each other, so the one who "added" more lines of code is not necessarily adding more benefit.

 

I realize that's probably not the answer you were wanting. But I hope it helps!

Copilot Lvl 2
Message 3 of 5

Re: Contributions page is displaying changes from renamed files as deletions + additions

No, it's not the answer I was after, and it doesn't help either :) I already knew that (in more or less detail); my question was more along the lines of:

 

If you're able to suppress the diff when file rename/move is detected in a PR/commit, then why is the Contributions page not being "aware" of that?

 

So, there is logic to figure it out, obviously - it's not like it's "ok we can figure out just additions and deletions, but nothing more". GitHub is able to tell (with very high probability) when something has been moved elsewhere, but it doesn't apply that consistently.

 

You are a developer yourself; so, given the ability of one of your sub-systems (Module A) to detect when Event A happens, you'd probably be more than capable of making another sub-system (Module B) aware of that event's existence, and then using it to tweak Module B's output based on Event A's occurrences and their properties. Right? :)
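
To show what I mean, here is a minimal sketch of "Module B" using a rename check before counting lines. This is not how GitHub computes the Contributions page; the function names, the idea of passing a commit as two dicts of `path -> lines`, and the 0.5 similarity threshold are all assumptions made up for illustration.

```python
import difflib


def changed_lines(old, new):
    """Return (added, deleted) line counts between two lists of lines."""
    added = deleted = 0
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted += i2 - i1
        if tag in ("insert", "replace"):
            added += j2 - j1
    return added, deleted


def naive_stats(removed, added):
    """Count every removed file as deletions and every new file as additions."""
    return (sum(len(lines) for lines in added.values()),
            sum(len(lines) for lines in removed.values()))


def rename_aware_stats(removed, added, threshold=0.5):
    """Pair removed and new files that look alike; count only their real changes.

    threshold is a hypothetical similarity cut-off chosen for this sketch.
    """
    total_added = total_deleted = 0
    unmatched = dict(added)
    for old_lines in removed.values():
        best_path, best_ratio = None, threshold
        for new_path, new_lines in unmatched.items():
            ratio = difflib.SequenceMatcher(
                None, old_lines, new_lines, autojunk=False).ratio()
            if ratio >= best_ratio:
                best_path, best_ratio = new_path, ratio
        if best_path is None:
            # No similar new file: a genuine deletion.
            total_deleted += len(old_lines)
        else:
            # Treated as a rename: count only lines that differ.
            a, d = changed_lines(old_lines, unmatched.pop(best_path))
            total_added += a
            total_deleted += d
    # New files that matched nothing are genuine additions.
    total_added += sum(len(lines) for lines in unmatched.values())
    return total_added, total_deleted


# The rename commit from my demo repository: a 50-line file, moved unchanged.
readme = [f"line {n}" for n in range(50)]
removed_files = {"README": readme}
added_files = {"README.md": list(readme)}

if __name__ == "__main__":
    print(naive_stats(removed_files, added_files))
    print(rename_aware_stats(removed_files, added_files))
```

With the naive count the rename alone contributes 50 additions and 50 deletions, which together with the first commit gives the "100++ and 50--" from the demo; with the rename-aware count the same commit contributes nothing.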

 

Of course those 67k additions were not all "true additions", but they were more real than the current figure; they used to provide some sense, now they provide none.

 

Anyway, just wanted to bring it up in case someone is watching. Probably the change would not be trivial to make - even if anyone "above" cared that much :D

 

Edit: God, these smileys are terrible!

Ground Controller Lvl 1
Message 4 of 5

Re: Contributions page is displaying changes from renamed files as deletions + additions

Hi, I also noticed this issue. My additions and deletions on the contributors page are far higher than the actual numbers I can count from every single diff, just because I moved some files. How can we escalate this issue to GitHub?

Ground Controller Lvl 1
Message 5 of 5

Re: Contributions page is displaying changes from renamed files as deletions + additions

CONGRATULATIONS!!! You have done a fantastic job, dear VUAs, and THANK YOU for that. You have secured, and shown to us, how to keep the way and the future of Linux what it once was: free, stable, reliable, - simply the best. I take my hat off to you, also because I can..... I'm a hat wearer