Git for Mathematicians (1): Preliminaries

Part 1: Preliminaries
/post/git-1-preliminaries
Part 2: The Theory
/post/git-2-theory
Part 3a: The Practice
/post/git-3-practice

This post is the first in a series in which I will try to explain how to use Git to write papers, with an audience of professional mathematicians in mind. I know that there is a lot of material online about learning Git, but as far as I can tell, none of it is tailored specifically to mathematicians’ needs (which differ a bit from programmers’ needs). Here, I will try to explain why one would even be interested in Git to begin with.

Why do I need a (distributed) version control system?

Git is a distributed version control system. Behind this barbarous name is a rather simple idea. Suppose that you are writing an article. This article is typically composed of a LaTeX file (I did say I was writing for mathematicians), a bibliographical database, sometimes figures, maybe even some code, etc. As your work advances, you are faced with various challenges that need to be solved:

Does this sound like a lot? That’s because it is! There are various ways to solve some or all of these problems. They range from “everyone writes one section of the paper alone and the paper gets merged in the end” to “we have a copy of the paper in a Dropbox folder and we hope that nobody works on it at the same time”, by way of “we send each other versions of the paper by email until the process converges”. This can certainly work, and a combination of these has been the standard way to work collaboratively on math papers since the invention of the Internet, I guess. But each approach has its pitfalls.
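To make one of these pitfalls concrete, here is a toy Python sketch of the shared-folder approach. It is a deliberately simplified model, not a description of how any real sync service works: the `save` function, the `shared_folder` dictionary, the two authors, and the LaTeX lines are all hypothetical, and the only assumption is that the folder keeps whichever version was saved last. Real services behave in more nuanced ways, but the underlying risk is the same.

```python
# Toy model (assumption): a shared folder keeps only the most recent save.
base = [
    r"\section{Introduction}",
    r"Let $X$ be a space.",
    r"\section{Main result}",
    r"We prove the theorem.",
]


def save(shared_folder, filename, lines):
    """Mimic a naive sync: the whole file is replaced by the latest save."""
    shared_folder[filename] = list(lines)


shared_folder = {}
save(shared_folder, "paper.tex", base)

# Both (hypothetical) authors open the same version of the paper.
alice = list(shared_folder["paper.tex"])
bob = list(shared_folder["paper.tex"])

# Alice edits the introduction; Bob edits the main result.
alice[1] = r"Let $X$ be a compact Hausdorff space."
bob[3] = r"We prove the theorem by induction on $n$."

# Alice saves first, Bob saves second, and Bob's copy replaces Alice's.
save(shared_folder, "paper.tex", alice)
save(shared_folder, "paper.tex", bob)

final = shared_folder["paper.tex"]
alice_edit_survived = alice[1] in final
bob_edit_survived = bob[3] in final

print("Alice's edit survived:", alice_edit_survived)
print("Bob's edit survived:", bob_edit_survived)
```

The two authors changed different lines, so nothing about their edits actually conflicts, yet in this model Alice's change is silently lost. Keeping both changes automatically is the kind of bookkeeping a version control system is designed to do.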

Why Git?

Distributed version control systems (DVCSs) are essentially an answer to all these challenges and to the pitfalls of the other approaches. I will not bore you with the details of the history of version control systems. You may have heard of some of them, like the old-school CVS or Subversion, or other DVCSs like Mercurial or GNU Bazaar. Git itself was created in 2005 by Linus Torvalds (who also created the Linux kernel).

While there used to be some contention in the early days (Subversion supplanted CVS a while ago, then DVCSs came along and the battle raged between Git and Mercurial), it is generally accepted today that Git is the most popular one. In 2018, the Stack Overflow developer survey listed Git as the most-used VCS, with 87.2% of respondents using it. Newer editions of the developer survey do not even include the question anymore, as Git is completely dominating the scene. Even Microsoft is using Git to manage the source code of Windows.

So how does a distributed version control system like Git work? There are two operative phrases in the name, each of which can be explained at a high level:

Next steps

Update: Part 2 (“Theory”) is now available at /post/git-2-theory.

In the next post, I will try to explain how one uses Git. In the meantime, here are some links:

And here’s a little teaser of what you will hopefully be able to do by the end of all this!