Use nbdime as the diff tool for Jupyter Notebooks

Below are direct quotes from nbdime’s documentation page, which eloquently states the problem and how nbdime solves it.

Jupyter notebooks are useful, rich media documents stored in a plain text JSON format. This format is relatively easy to parse. However, primitive line-based diff and merge tools do not handle well the logical structure of notebook documents. These tools yield diffs like this:

[Image: diff example using a traditional line-based diff tool]

nbdime, on the other hand, provides “content-aware” diffing and merging of Jupyter notebooks. It understands the structure of notebook documents. Therefore, it can make intelligent decisions when diffing and merging notebooks, such as:

  • eliding base64-encoded images for terminal output
  • using existing diff tools for inputs and outputs
  • rendering image diffs in a web view
  • auto-resolving conflicts on generated values such as execution counters

nbdime yields diffs like this:

[Image: example of nbdime's content-aware diff]
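The contrast between the two kinds of diff quoted above can be illustrated with a small toy sketch. It uses only the Python standard library and is not nbdime's actual algorithm: the helper names `make_notebook`, `line_diff`, and `source_diff` are made up for this illustration, and the "content-aware" comparison here simply looks at cell sources and ignores generated values such as execution counters.

```python
import copy
import difflib
import json


def make_notebook(source, execution_count):
    """Build a minimal nbformat-4-style notebook dict with one code cell (illustrative only)."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [],
                "source": source,
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }


def line_diff(a, b):
    """Naive approach: diff the serialized JSON line by line, keeping only changed lines."""
    a_lines = json.dumps(a, indent=1, sort_keys=True).splitlines()
    b_lines = json.dumps(b, indent=1, sort_keys=True).splitlines()
    return [
        line
        for line in difflib.unified_diff(a_lines, b_lines, lineterm="", n=0)
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


def source_diff(a, b):
    """Toy content-aware approach: compare only cell sources, pairing cells by position."""
    changes = []
    for index, (cell_a, cell_b) in enumerate(zip(a["cells"], b["cells"])):
        if cell_a["source"] != cell_b["source"]:
            changes.append((index, cell_a["source"], cell_b["source"]))
    return changes


before = make_notebook("print('hello')", execution_count=1)

# Same code, merely re-executed: only the execution counter differs.
rerun = copy.deepcopy(before)
rerun["cells"][0]["execution_count"] = 7

# A genuine edit to the cell's code.
edited = make_notebook("print('goodbye')", execution_count=7)

noise = line_diff(before, rerun)
print(noise)                        # the line-based diff still reports changed lines
print(source_diff(before, rerun))   # []
print(source_diff(before, edited))  # [(0, "print('hello')", "print('goodbye')")]
```

Re-running a cell without touching its code still shows up in the line-based diff, while the source-only comparison reports nothing until the code itself changes. A real tool like nbdime goes much further (outputs, images, merging), so this should be read only as a sketch of the idea.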

Here are my own words on the problem:

Jupyter Notebook is an extremely popular file format on GitHub; at the time of writing, several million notebooks are hosted there. However, the generic line-based diff that GitHub uses for version control is not very effective on notebooks, as seen above.

GitHub has done a great job of displaying Jupyter Notebooks in rendered form, rather than as source code, when they are viewed in a repository online. Could it do so once more and use nbdime as the default for comparing notebook files instead of diff? It would make life much easier for developers collaborating on Jupyter Notebooks.


It really would make a lot of sense for GitHub to show differences in notebooks using nbdime.

+1 on this one


:wave: I work on the product team at GitHub. I really like the idea of being able to do a context-aware diff of notebooks. I’m very aware of nbdime and how it can do diffs.

I’d love to learn more about how you or anyone else uses notebooks in their workflows and how GitHub can better support that workflow. 

Please feel free to reach out to me [my github handle] @ github.com.

Hi neovintage,

I just sent you an email with a sample workflow that we use for notebook-based experiment collaboration. Let me know if you need further details. I am open to collaborating with GitHub to implement this feature.

– d

You might like to check out reviewnb.com for “content-aware” diffs of Jupyter Notebooks.

ReviewNB is a GitHub App that you can install on your repositories and use for notebook diffing and commenting on GitHub commits and pull requests.

Disclosure: I built ReviewNB.
