Taking version control to the next level
After using Subversion for a couple of years, it’s time for me to look to the next generation of source control management systems.
What’s wrong with Subversion?
Before I start with this section: this isn’t meant as a rant. Nor do I want to call Subversion users ugly or stupid. Subversion remains a great improvement compared to CVS. However, there are a couple of things I miss in my daily work.
My main issue with Subversion is that I need the central repository on the server. Not just to make commits, but also when I want to see what happened in the past (review the logs or annotate a file with svn blame). This can be a problem:
- As a consultant I travel frequently. Most of the time I take the train and try to get some work done. But whenever I need to access the repository, I’m dead in the water.
- The communication with the server can be slow. I do not care whether it is because I do not have a broadband connection at that moment or that I am not the only one trying to connect to the server; I just don’t want to wait too long for the result.
- The server could be unreachable. Coincidentally, I’ve run into this twice recently. One time the Apache configuration of our company server had a problem. The other time there was a hardware problem on the server where one of our clients hosts their repository. In both cases I could not continue with the project I was supposed to be working on.
Another annoyance of Subversion is that merging is required before you commit. Assume that I am working on a certain file, and that a co-worker has committed a change to that same file. Before I can commit, Subversion requires me to update the file. This isn’t a big problem if there are no conflicts, but if there are, I can only commit my changes after I have resolved them.
In other words: the changes as I intended them are never committed. I first need to make more changes. The only way to prevent this is by working on a branch. Then I can commit my changes and will only need to resolve the conflicts if I decide to merge my branch back. But while creating branches is easy in Subversion, merging can be painful. I know this is supposed to be better in Subversion 1.5, but I still have to talk to version 1.4 repositories.
Distributed version control systems to the rescue
For quite some time now, distributed version control systems (DVCS) like Bazaar, Git and Mercurial have been available. By design these systems should take care of my number one problem with Subversion. At first glance all three of the DVCSs I just mentioned seem suitable. But which one is the best solution for me?
Mercurial
Mercurial (or “hg”) is one of the contestants. But since there is little to no chance of convincing all my co-workers to switch from Subversion, I need to be able to talk to our Subversion repository. There is a set of scripts to do this, called hgsvn, but it has the limitation that there is no straightforward way to push back changes to the Subversion repository according to the project page.[1] There are also other options, but these seem very laborious. This is a showstopper for me.
Bazaar
On to the next candidate: Bazaar (or “bzr”) does have a plugin to access Subversion repositories: bzr-svn. This keeps Bazaar in the race.
Git
The final DVCS I investigated was Git. Git natively supports bidirectional operation with Subversion.
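To give an impression of what that bidirectional operation looks like, here is a minimal sketch of a typical git-svn round trip. The repository URL and the local directory name `project` are hypothetical placeholders, and `--stdlayout` assumes the Subversion repository uses the conventional trunk/branches/tags layout. The `run` helper and the `DRY_RUN` switch are my own additions for illustration: by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
set -eu

# Hypothetical placeholder URL; replace with your own Subversion repository.
SVN_URL="${SVN_URL:-https://svn.example.com/repos/project}"
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

svn_workflow() {
    # Import the Subversion history into a local Git repository.
    run git svn clone --stdlayout "$SVN_URL" project
    # Later: fetch new Subversion revisions and replay local commits on top.
    run git -C project svn rebase
    # Send each local Git commit to Subversion as its own revision.
    run git -C project svn dcommit
}

plan=$(svn_workflow)
echo "$plan"
```

Setting `DRY_RUN=0` and pointing `SVN_URL` at a real repository would execute the commands instead of listing them. In between the clone and the dcommit, everything is ordinary local Git work.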
The decision
Although both Bazaar and Git seem to provide the most important features I’ll need, I chose Git. The first reason for not choosing Bazaar was the way it handles branches and revision numbers. Although I admit that I’m new to DVCS, it feels more natural to me to consistently use globally unique revision numbers than to have local revision numbers and branches with their own view of history.
The other reasons for selecting Git over Bazaar are speed and repository size. Robert Fendt recently did some research on this, and it confirms the results of other speed and repository size tests.
Git: additional benefits
I have worked with Git for a little while now and there are some additional benefits of it over Subversion:
- Stashing changes allows me to e.g. store local changes and go back to a clean working directory to work on something different for a while.
- Changing history may be a bit controversial in version control, but it can be very useful. It allows me to, for instance, squash commits while merging, rearrange the order of commits with rebase or add something to the previous commit with “git commit --amend”. Obviously you don’t want to do this when you’ve already published your changes, but it has served me well already.
- I can create branches on my local repository to work on features or an experiment, without bothering others with it.
- Committing is really fast. Although I still regularly push my changes to the Subversion repository, a ‘normal’ commit is blazing fast. Where a commit used to be a pause in my workflow, it now hardly has any impact. This makes it easy to commit more often and thus have each commit do only one thing.
- The last two benefits can be combined: since the commits are initially only local, I don’t have to postpone committing until the code is in a workable state. I can for instance create a failing test, commit it and then continue to write the code to make it pass, without having to worry about co-workers running into the failing test.
(Note that other DVCSs also have (most of) these advantages. I’m only comparing Git with Subversion here.)
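A few of the benefits listed above can be tried out safely in a throwaway repository. The following is only a sketch: the file names, commit messages and the branch name `experiment` are made up for illustration, and the identity is configured locally just so the commits succeed on a machine without a global Git configuration.

```shell
#!/bin/sh
set -eu

# Work in a temporary repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Stash: park an unfinished edit and return to a clean working directory.
echo "half-done idea" >> notes.txt
git stash -q
# ... work on something different here ...
git stash pop -q

# Amend: fold the restored edit into the previous (unpublished) commit.
git commit -q -a --amend -m "Add notes (with idea)"

# Local branch: an experiment nobody else sees until it is pushed.
git checkout -q -b experiment
echo "failing test placeholder" > test.txt
git add test.txt
git commit -q -m "Add failing test"

# The amended commit plus the commit on the branch.
commit_count=$(git rev-list --count HEAD)
echo "$commit_count"
```

Because the amend rewrites the first commit rather than adding a new one, the history on the `experiment` branch ends up with two commits, and none of it leaves the local machine.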
All in all I am very enthusiastic! Granted: using Git is more complex than Subversion and there were some problems I had to overcome in my day-to-day work. (I’ll talk about them in a future post.) But the flexibility I gained! Incredible!
[1] Update (2021-07-14): for the record, the limitation about pushing back to Subversion is apparently solved since it is no longer listed in the “limitations” section of the documentation.