<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Posts tagged with “Git” on Mark van Lent’s weblog</title>
  <updated>2026-01-31T00:00:00+00:00</updated>
  <link rel="self" type="application/atom+xml" href="https://markvanlent.dev/tags/git/index.xml" hreflang="en"/>
  <id>tag:markvanlent.dev,2010-04-02:/tags/git/index.xml</id>
  <link rel="alternate" type="text/html" href="https://markvanlent.dev/tags/git/" hreflang="en"/>
  <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
  <rights>Copyright (c) Mark van Lent, Creative Commons Attribution 4.0 International License.</rights>
  <icon>https://markvanlent.dev/favicon.ico</icon>
  <entry>
    <title type="html"><![CDATA[FOSDEM 2026]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2026/01/31/fosdem-2026/" type="text/html" />
    <id>https://markvanlent.dev/2026/01/31/fosdem-2026/</id>
    <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="ansible" />
    <category term="conference" />
    <category term="docker" />
    <category term="infrastructure as code" />
    <category term="git" />
    <category term="python" />
    <category term="security" />
    
    <updated>2026-02-01T14:54:22Z</updated>
    <published>2026-01-31T00:00:00Z</published>
    <content type="html"><![CDATA[<p>January is already almost over, so time for <a href="https://fosdem.org/2026/">FOSDEM</a>,
the yearly <q>free event for software developers to meet, share ideas and
collaborate</q> in Brussels. <a href="/2025/02/01/fosdem-2025/">Last year</a> I
focussed on the Go track; this year I selected a mix of security- and
Python-related talks to attend.</p>
<h2 id="streamlining-signed-artifacts-in-container-ecosystems--tonis-tiigi">Streamlining Signed Artifacts in Container Ecosystems &mdash; Tonis Tiigi</h2>
<p>It&rsquo;s possible to sign Docker images, but at the moment most are actually not
signed. Also, users should understand what the signature is protecting and what
it&rsquo;s <em>not</em> protecting. We should not want signing just to tick a box on the
security checklist, but because of the security it adds. And we need something
simple: integrated with existing tools, without slowing them down.</p>
<p>Buildkit powers &ldquo;<code>docker build</code>&rdquo; but is not limited to Dockerfiles. It&rsquo;s high
performance, handles complex builds and has caching.</p>
<p>A modern build is a graph of images, Git repositories, local files, etc. The
results are images, binaries, archives.</p>
<figure><img src="/images/fosdem2026_tonis_tiigi.jpg"
    alt="Photo of Tonis Tiigi explaining the graph that is modern software building"><figcaption>
      <p>Tonis Tiigi explaining that builds of modern software are a complex graph</p>
    </figcaption>
</figure>

<p>We need Supply-chain Levels for Software Artifacts (SLSA) provenance: what has
actually happened in the build? What was the build config? Et cetera. It&rsquo;s useful to
figure out how an artifact was built.</p>
<p>Buildkit does not sign images by default. GitHub has <a href="https://docs.github.com/en/packages/managing-github-packages-using-github-actions-workflows/publishing-and-installing-a-package-with-github-actions#publishing-a-package-using-an-action">an example in the
documentation</a>
to run a build with Buildkit and generate an artifact. It claims to generate an
<q>unforgeable statement</q>. But if your GitHub credentials are
leaked and the attacker can get their hands on the temporary signing key, they can
use it to sign their own artifacts.</p>
<p>Docker created the <a href="https://github.com/docker/github-builder">github-builder</a>
repository. It contains reusable GitHub Actions to securely build images. If you
use this, your images are signed to prove that they were built from a certain
repository, using the configured build steps. Where Buildkit (among other
things) provides isolation, <code>github-builder</code> provides signing context. It also
protects against build dependency leaks.</p>
<p>So that takes care of the signatures, but how do you verify them?</p>
<ul>
<li>The command &ldquo;<code>docker inspect</code>&rdquo; now shows verified signatures</li>
<li>You can manually verify it with <a href="https://github.com/sigstore/cosign">cosign</a></li>
<li>You can also use sigstore/policy-controller for Kubernetes</li>
</ul>
<p>Buildx also includes experimental Rego (Open Policy Agent) policy support. This
means you can write a matching policy for <code>Dockerfile</code>, e.g. <code>Dockerfile.rego</code>,
which is then automatically loaded. All build sources now need to pass policy
for the build to continue (images, Git repositories, URLs, etc).</p>
<p>You can do very complex stuff in the policies. As a simple example Tonis showed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rego" data-lang="rego"><span class="line"><span class="cl"><span class="kd">package</span><span class="w"> </span><span class="nx">docker</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="n">allow</span><span class="w"> </span><span class="kd">if</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nx">input</span><span class="o">.</span><span class="nx">image</span><span class="o">.</span><span class="nx">repo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">&#34;org/app&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nf">docker_github_builder_tag</span><span class="p">(</span><span class="nx">input</span><span class="o">.</span><span class="nx">image</span><span class="o">,</span><span class="w"> </span><span class="s2">&#34;org/app&#34;</span><span class="o">,</span><span class="w"> </span><span class="nx">input</span><span class="o">.</span><span class="nx">image</span><span class="o">.</span><span class="nx">tag</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"></span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This policy ensures that the image can only be built from this
repository and that the image tag matches the Git tag.</p>
<p>Summary:</p>
<ul>
<li>No reason not to sign</li>
<li>Not all signatures are equal</li>
<li>Software pulling packages should verify pulled content</li>
</ul>
<p><a href="https://fosdem.org/2026/schedule/event/HJAJTU-streamlining_signed_artifacts_in_container_ecosystems/">Link to the conference page</a></p>
<h2 id="sequoia-git-making-signed-commits-matter--neal-h-walfield">Sequoia git: Making Signed Commits Matter &mdash; Neal H. Walfield</h2>
<p>Version control systems (also known as VCSs) track the following:</p>
<ul>
<li>Changes to the code</li>
<li>Authorship</li>
<li>Other metadata</li>
<li>Commit message</li>
</ul>
<p>But the author can be faked: the metadata is set by the author, including the
author&rsquo;s name. After a quick &ldquo;<code>git config</code>&rdquo; command you can commit as anyone you
want, for example <a href="https://en.wikipedia.org/wiki/Linus_Torvalds">Linus Torvalds</a>.
Sure, GitHub could see that the committer (the one pushing the commit) and
author are different. However, this is not necessarily bad because we might
simply want to give proper attribution to the author of the commit.</p>
<p>And in theory the forge might also be compromised, or someone may have gotten
permission to push to the project.</p>
<p>To prevent impersonations, we can cryptographically prove who the author is by
signing the commits. But now the problem shifts to the certificates. Because
anyone can create a key with any name (again, for example Linus) attached to it.
So what does a signed commit mean now?</p>
<p>How can we be sure that the author is who they say they are? There are ways:</p>
<ul>
<li>You could talk to the developer to verify their identity</li>
<li>You could go to <a href="https://en.wikipedia.org/wiki/Key_signing_party">key signing parties</a></li>
<li>You can use a central authority that you trust (e.g.
<a href="https://keys.openpgp.org/">keys.openpgp.org</a>, the Linux developer keyring,
the <code>distributions-gpg-keys</code> package, or, if you trust GitHub, use
<code>github.com/&lt;username&gt;.gpg</code>)</li>
</ul>
<p>You can use the following command to show the Git log and the signatures on them:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">git log --show-signature
</span></span></code></pre></div><p>But now you need to actually check that the signatures are indeed made by the
certificates you trust.</p>
<p>It&rsquo;s up to the maintainers of the software to curate a list of contributors and
track when contributors join and leave (yes, there is a temporal element as
well). This is hard, and maintainers need tooling. And you would want to detect
unauthorized commits (impersonation, a malicious forge, a machine in the middle
or, for instance, a project being handed to a new maintainer by a forge/registry).</p>
<p>What does the solution look like?</p>
<ul>
<li>Clear semantics</li>
<li>The project itself maintains signing policy</li>
<li>Third party uses maintainers&rsquo; policy to authenticate project</li>
<li>Verification, not attestation: do not rely on any external authority</li>
</ul>
<p>(Note that the maintainers can still be socially engineered to include the key
of an attacker in their policy. So they still have to be careful about who is
added to the policy.)</p>
<p>Sequoia git provides:</p>
<ul>
<li>Specification</li>
<li>Config</li>
<li>Tooling</li>
</ul>
<p>With <a href="https://gitlab.com/sequoia-pgp/sequoia-git">Sequoia git</a> (which is part of
the <a href="https://sequoia-pgp.org/">Sequoia PGP project</a>) you can have a signing
policy in an <code>openpgp-policy.toml</code> file in the project&rsquo;s Git repository. It
specifies users, their keys and their capabilities. You can use <code>sq-git</code> to help
maintain this file.</p>
<p>For instance to add user Alice and then describe the current policy, you can use
the following commands:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">sq-git policy authorize alice --committer &lt;cert&gt;
</span></span><span class="line"><span class="cl">sq-git policy describe
</span></span></code></pre></div><p>A commit is &ldquo;authenticated&rdquo; if at least one parent commit says the commit is
acceptable (via the policy). To verify that there is an authenticated path from
the current state back to a certain commit we trust, use this command:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">sq-git log --trust-root &lt;sha of trusted commit&gt;
</span></span></code></pre></div><p>Projects may have contributions from others that are not included in the policy.
To maintain an authenticated path when accepting the contribution, a trusted
author needs to merge the contribution via a merge commit that <em>is</em>
authenticated. (You may need to use the &ldquo;<code>--no-ff</code>&rdquo; flag on the merge to make sure
there is a merge commit though.)</p>
<p><a href="https://fosdem.org/2026/schedule/event/KFSUCW-sequoia-git/">Link to the conference page</a></p>
<h2 id="an-endpoint-telemetry-blueprint-for-security-teams--victor-lyuboslavsky">An Endpoint Telemetry Blueprint for Security Teams &mdash; Victor Lyuboslavsky</h2>
<p>With open source we can inspect something that is broken, we can change the
defaults. With security we are used to the opposite; it&rsquo;s a black box. We are
not used to owning the data. The data exists on the endpoints, but ownership is
transferred to a different team. How can we add more security in a way engineers
understand and can use?</p>
<p>Victor presents a blueprint with the following layers:</p>
<ul>
<li>Endpoint agents</li>
<li>Control layer</li>
<li>Ingestion, streaming &amp; storage</li>
<li>Detection</li>
<li>Correlation, intelligence and response</li>
</ul>
<p>The value is not in the layers themselves, but in the boundaries. For example, the
ingestion layer should move the data reliably but should not care which tool collected
it. This makes the layers loosely coupled.</p>
<p>For endpoint agents Victor suggests
<a href="https://github.com/osquery/osquery">osquery</a>, which allows asking basic questions about
endpoints. Data is structured and consistent. It aligns with open source values.
(Alternatives: scripts &amp; cron, log shippers like filebeat or tools like auditd
or Event Tracing for Windows.)</p>
<p>Controlling the data (the next layer) means that you want to have:</p>
<ul>
<li>Central config</li>
<li>Live queries</li>
<li>Consistent schemas</li>
</ul>
<p><a href="https://github.com/fleetdm/fleet">Fleet</a> (disclaimer: Victor works here) is
built to manage <code>osquery</code> at scale and is a good candidate for this layer.</p>
<p>The control layer needs to work hand-in-hand with the ingestion layer. The ingestion
layer moves data to downstream systems. E.g. <a href="https://github.com/vectordotdev/vector">Vector</a> or
<a href="https://www.elastic.co/logstash">Logstash</a> can be used here.</p>
<blockquote>
<p>Ingestion isn&rsquo;t where you get clever. It&rsquo;s where you get reliable.</p></blockquote>
<p>Streaming decouples producers from consumers and e.g. allows replay. Note that this
is an optional step and it would come <em>after</em> ingestion, not <em>in place of</em> it.
For instance <a href="https://kafka.apache.org/">Apache Kafka</a> can be used in this
layer. Ingestion absorbs the mess. Streaming preserves flexibility.</p>
<p>The storage layer is where telemetry becomes durable. It&rsquo;s about being able to
ask hard questions later. Examples of useful tools:
<a href="https://github.com/ClickHouse/ClickHouse">ClickHouse</a>,
<a href="https://www.elastic.co/elasticsearch">Elasticsearch</a> (which is better at text
search) and <a href="https://github.com/apache/iceberg">Iceberg</a> (which is slower for
active investigation).</p>
<p>For the detection layer you might want to use
<a href="https://github.com/SigmaHQ/Sigma">Sigma</a>. It provides portability. Rules are
translated to native SQL running on ClickHouse. Intent (Sigma signatures)
becomes execution (SQL query to get the data).</p>
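<p>To make the &ldquo;intent becomes execution&rdquo; idea concrete, here is a toy sketch of my own (not the actual Sigma tooling &mdash; real backends such as pySigma also handle field mappings, value modifiers and logsource routing) that turns a Sigma-style selection into a SQL query:</p>

```python
def sigma_to_sql(table, selection):
    """Toy translation of a Sigma-style selection into a SQL query.

    `selection` maps field names to a value (exact match) or a list
    of values (any-of match). Illustrative only; real Sigma backends
    do far more than this.
    """
    clauses = []
    for field, value in selection.items():
        if isinstance(value, list):  # any-of list becomes IN (...)
            vals = ", ".join(f"'{v}'" for v in value)
            clauses.append(f"{field} IN ({vals})")
        else:
            clauses.append(f"{field} = '{value}'")
    return f"SELECT * FROM {table} WHERE " + " AND ".join(clauses)

print(sigma_to_sql("process_events",
                   {"Image": ["/bin/nc", "/usr/bin/ncat"], "User": "root"}))
```

The detection intent (suspicious process images run as root) becomes a plain query the storage layer can execute.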
<p>Finally the correlation layer: <a href="https://github.com/grafana/grafana">Grafana</a>
can be used for correlation and visualisation. Grafana can query ClickHouse.
Grafana also has alerting.</p>
<p>Note that response isn&rsquo;t just about automation. It&rsquo;s also to pause and ask
better questions. The correlation layer should focus on enabling humans to act.</p>
<p>Open endpoint telemetry is <strong>not</strong> an &ldquo;EDR killer&rdquo;. It does not replace it. It adds
diversity and complements other tools. It provides a second set of eyes.</p>
<p><a href="https://fosdem.org/2026/schedule/event/HYXTPH-endpoint-telemetry-blueprint/">Link to the conference page</a></p>
<h2 id="the-bakery-how-pep810-sped-up-my-bread-operations-business--jacob-coffee">The Bakery: How PEP810 sped up my bread operations business &mdash; Jacob Coffee</h2>
<p>Python loads imports eagerly by default. This leads to memory bloat and cold
start issues. Explicit lazy imports (see
<a href="https://peps.python.org/pep-0810/">PEP 810</a>) only import a module when it&rsquo;s
first accessed, not when the import statement is executed.</p>
<p>Lazy import is scheduled to be included in Python 3.15 and looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="n">lazy</span> <span class="kn">from</span> <span class="nn">bar</span> <span class="kn">import</span> <span class="nn">foo</span>
</span></span></code></pre></div><p>The design principles applied are that lazy imports are:</p>
<ul>
<li>Explicit</li>
<li>Local</li>
<li>Granular</li>
</ul>
<p>When the Python code is parsed, a proxy module is created. Only when the module is
actually used is the proxy transparently replaced by the real module. You will
not always see improvements, so do not blindly replace all imports with lazy
imports.</p>
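<p>Until Python 3.15 lands you can approximate this proxy behaviour with the standard library&rsquo;s <code>importlib.util.LazyLoader</code>. This sketch follows the recipe from the <code>importlib</code> documentation; it is not the PEP 810 mechanism itself:</p>

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module proxy that only executes `name` on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # defers actual execution until first use
    return module

json = lazy_import("json")         # no real import work done yet
print(json.dumps({"lazy": True}))  # first attribute access loads json here
```

PEP 810 does the same kind of proxy swap natively in the interpreter, without the boilerplate.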
<p>PEP 810 also eliminates the need for <code>TYPE_CHECKING</code> guards. (See the <a href="https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING">typing
docs</a>, in
short: importing a module that is expensive and only contains types used for
type checking in an &ldquo;<code>if TYPE_CHECKING:</code>&rdquo; block.) It also helps with faster test
discovery and collection, lower memory usage and reduced cold start slowness in
e.g. AWS Lambda functions, CLI applications, etc.</p>
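<p>For reference, the guard pattern that PEP 810 makes unnecessary looks like this (<code>price_label</code> and the <code>Decimal</code> import are just illustrative):</p>

```python
from __future__ import annotations  # annotations stay strings at runtime
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers execute this import; at runtime it is skipped,
    # so the (potentially expensive) module is never actually loaded.
    from decimal import Decimal

def price_label(amount: Decimal) -> str:
    return f"EUR {amount}"

print(price_label(10))  # works at runtime although Decimal was never imported
```

With lazy imports, the same deferral happens without the conditional block, and the import is still available if it <em>is</em> needed at runtime.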
<p>Meta (with Cinder) saw a 70% startup time reduction and 40% memory savings.
PySide has a 35% startup improvement.</p>
<p>About CLI tools: when using lazy imports you might notice the difference when
using <code>--help</code>. There&rsquo;s no need to load all dependencies to just output the help
text of a tool.</p>
<p>Some notes:</p>
<ul>
<li>Import time side effects (e.g. logging configuration, DB connections) are also
delayed!</li>
<li>Type checkers need to be updated</li>
<li>Import errors move to first use (so in runtime, not at launch). Keep that in
mind when debugging</li>
<li>It&rsquo;s not always faster, so profile your application before migrating and see
where you can potentially benefit</li>
<li>Document your lazy imports!</li>
<li>You cannot do lazy imports in functions</li>
</ul>
<p>Circular imports are probably still a problem, but they just show up later.</p>
<p><a href="https://github.com/JacobCoffee/breadctl">Link to the repo for this talk</a></p>
<p><a href="https://fosdem.org/2026/schedule/event/HAAABD-the_bakery_how_pep810_sped_up_my_bread_operations_business/">Link to the conference page</a></p>
<h2 id="modern-python-monorepo-with-uv-workspaces-prek-and-shared-libraries--jarek-potiuk">Modern Python monorepo with <code>uv</code>, <code>workspaces</code>, <code>prek</code> and shared libraries &mdash; Jarek Potiuk</h2>
<p>Jarek is, besides his other roles, the number 1 Apache Airflow contributor. The
<a href="https://github.com/apache/airflow">Apache Airflow repo</a> is the monorepo he
talks about today. There is also a series of blog posts about this topic: see
<a href="https://medium.com/apache-airflow/modern-python-monorepo-for-apache-airflow-part-1-1fe84863e1e1">part 1</a>,
which links to the other parts.</p>
<p>Airflow drove early requirements for
<a href="https://docs.astral.sh/uv/concepts/projects/workspaces/">uv workspaces</a>. They now
manage 120+ distributions seamlessly with it. It allows them to combine
distributions to work together in a workspace, and to import from one
distribution in another.</p>
<p>The project shares a single virtual environment, used by <code>uv</code>, in the root of the project.
If you run &ldquo;<code>uv sync</code>&rdquo; from the top level you get everything. If you run it in a
subdirectory (e.g. <code>airflow-core</code>) you only get what is needed for that
distribution.</p>
<p>Benefits of the <code>uv</code> workspaces:</p>
<ul>
<li>Isolated</li>
<li>Explicit</li>
<li>Flexible</li>
</ul>
<p><a href="https://hatch.pypa.io/1.12/">Hatch</a> has (or will have, at the time of writing)
largely compatible workspaces.</p>
<p>However <a href="https://pre-commit.com/">pre-commit</a> became a bottleneck. They needed
to run 170+ pre-commit hooks <strong>on every commit</strong>.
<a href="https://github.com/j178/prek">Prek</a> is a drop-in replacement for pre-commit and
works fantastically. It is optimized for speed and monorepos.</p>
<p>Airflow uses symlinked shared libraries (where a shared lib is also a
distribution). The Hatchling build backend needs to replace links with physical
copies during packaging. They use Prek to maintain consistency.</p>
<p><code>uv sync</code> detects conflicts between merged requirements files, and Prek hooks
enforce relative imports in shared code to prevent cross-coupling issues (IIRC).</p>
<p><a href="https://fosdem.org/2026/schedule/event/WE7NHM-modern-python-monorepo-apache-airflow/">Link to the conference page</a></p>
<h2 id="pyinfra-because-your-infrastructure-deserves-real-code-in-python-not-yaml-soup--loïc-wowi42-tosser">PyInfra: Because Your Infrastructure Deserves Real Code in Python, Not YAML Soup &mdash; Loïc &ldquo;wowi42&rdquo; Tosser</h2>
<p>Loïc is a Frenchman (which, as he himself states, means he <strong>must</strong> have
opinions) and, to put it mildly, not a YAML fan. That is: YAML as a programming
language, e.g. how it is used in <a href="https://github.com/ansible/ansible">Ansible</a>.</p>
<figure><img src="/images/fosdem2026_loic_tosser.jpg"
    alt="Photo of Loïc Tosser showing a complex Ansible task in YAML"><figcaption>
      <p>Loïc Tosser demonstrating what happens when you ask a config file to be a programming language</p>
    </figcaption>
</figure>

<p><a href="https://pyinfra.com/">PyInfra</a> is an infrastructure as code library to write
Python code which is then translated to shell scripts to run on the target
hosts. So, in contrast to Ansible, you do not need Python on the target. The
target machine only needs SSH and a POSIX shell. You can also configure Docker
containers with PyInfra.</p>
<blockquote>
<p>If it has SSH, PyInfra can talk to it.</p></blockquote>
<p>PyInfra has idempotent operations and built-in diff checking: declarative
infrastructure with actual code instead of YAML. You can use inventory from
Terraform, Coolify or any API.</p>
<p>You can leverage the entire Python packaging ecosystem. Slack integration? Just
use the right Python package.</p>
<p>PyInfra is not only a CLI tool, you can also use it as a library.</p>
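<p>A minimal deploy file could look something like this (a sketch, assuming <code>pip install pyinfra</code>; the <code>apt</code> and <code>files</code> operations below come from PyInfra&rsquo;s documented operations modules):</p>

```python
# deploy.py -- a minimal PyInfra deploy sketch.
# PyInfra turns these declarative operations into shell commands
# that are run over SSH on each target host.
from pyinfra.operations import apt, files

apt.packages(
    name="Install nginx",
    packages=["nginx"],
    update=True,   # run `apt update` first
    _sudo=True,    # global argument: execute with sudo
)

files.line(
    name="Ensure a marker line in the motd",
    path="/etc/motd",
    line="Managed by pyinfra",
    _sudo=True,
)
```

You would run this against an inventory with something like <code>pyinfra inventory.py deploy.py</code>; PyInfra shows a diff of what would change before applying it.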
<p>PyInfra is 10 times faster than Ansible, uses 70% less code, has proper code
reuse via <code>import</code> and proper loops instead of <code>with_items</code>. It can have actual
unit tests and can scale to thousands of servers. Also you no longer have error
messages stating that <q>the error appears to be in &hellip; <strong>but may be
elsewhere in the file</strong> &hellip;</q> (looking at you Ansible). PyInfra has
clear error messages without having to specify <code>-vvvv</code> and wading through
hundreds of lines of output.</p>
<p>The suggested migration path:</p>
<ul>
<li>Start small, one playbook at a time</li>
<li>Use your IDE for autocomplete and refactoring</li>
<li>Leverage Python&rsquo;s standard library and the ecosystem with all its packages</li>
<li>Sleep better because you don&rsquo;t have to debug at 3 AM.</li>
</ul>
<p>Is PyInfra production ready? Yes! It has a stable API, is already in use in
production, it&rsquo;s actively maintained and is MIT licensed (so no commercial
entity behind it to steer its direction).</p>
<p>You can get started today with a simple &ldquo;<code>pip install pyinfra</code>&rdquo;.</p>
<p><a href="https://fosdem.org/2026/schedule/event/VEQTLH-infrastructure-as-python/">Link to the conference page</a></p>
<p>(Note from me, Mark, I found Loïc a great speaker: he has lots of energy, is
funny and can transfer his enthusiasm to the room. If the topic interests you
and the video becomes available, I would recommend watching this talk as a great
sales pitch to get started with PyInfra.)</p>
<h2 id="ducks-to-the-rescue---etl-using-python-and-duckdb--marc-andré-lemburg">Ducks to the rescue - ETL using Python and DuckDB &mdash; Marc-André Lemburg</h2>
<p>ETL stands for Extract, Transform, Load. Nowadays we usually do Extract, Load,
Transform instead, because databases are efficient at processing data.</p>
<p>DuckDB is an open source, in-process analytics database (OLAP). It is similar
to SQLite, but for OLAP workloads. It has great Python support and uses SQL as
its standard query language. It&rsquo;s pip installable and column based
(<a href="https://arrow.apache.org/">Apache Arrow</a>). It&rsquo;s single writer but allows for
multiple readers, so it&rsquo;s not a distributed database.</p>
<p><a href="https://github.com/pola-rs/polars">Polars</a>&rsquo; streaming can help with processing
your data as a line-by-line stream so you don&rsquo;t have to load the whole file in
memory at once.</p>
<p>Example to load a CSV file into DuckDB extremely fast:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql"><span class="line"><span class="cl"><span class="k">SELECT</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">read_csv</span><span class="p">(...)</span><span class="w">
</span></span></span></code></pre></div><p>You can load the data into staging tables first to prepare everything without
messing up existing data. You can then transform the data in DuckDB, e.g. filter
out unneeded and duplicate data, validate data, fill in missing data, convert
data types, etc. You can do the transforms in SQL. You can even use native
integrations to write to PostgreSQL, MySQL, etc. Or, worst case, stream to Python.</p>
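<p>The staging-table flow can be sketched as follows. To keep the example self-contained I use the standard library&rsquo;s <code>sqlite3</code> here; the same SQL-first pattern (load raw, transform, insert clean) applies to DuckDB&rsquo;s Python API:</p>

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Extract + Load: raw rows go into a staging table first,
# so existing data is never touched by a half-finished import.
con.execute("CREATE TABLE staging_orders (id INTEGER, amount TEXT)")
con.executemany("INSERT INTO staging_orders VALUES (?, ?)",
                [(1, "10.50"), (1, "10.50"), (2, None)])

# Transform in SQL: deduplicate, drop rows with missing data, cast types.
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
con.execute("""
    INSERT INTO orders
    SELECT DISTINCT id, CAST(amount AS REAL)
    FROM staging_orders
    WHERE amount IS NOT NULL
""")
print(con.execute("SELECT * FROM orders").fetchall())  # [(1, 10.5)]
```

In DuckDB you would additionally get fast bulk loading straight from files (e.g. the <code>read_csv(...)</code> call shown above) before the same kind of SQL transforms.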
<p>Guidelines:</p>
<ul>
<li>Know your queries, that is: know how your data is going to be used</li>
<li>Use the Pareto principle (80/20 rule): optimize for queries that are used
often</li>
<li>Keep a healthy balance between performance and space requirements (which are
often trade-offs)</li>
</ul>
<p>Huge datasets: use the <a href="https://github.com/duckdb/ducklake">DuckLake</a> extension.</p>
<p>To get started: &ldquo;<code>uv add duckdb</code>&rdquo;. Do some experiments and see how it works for
you.</p>
<p><a href="https://fosdem.org/2026/schedule/event/S7RELZ-ducks_to_the_rescue_-_etl_using_python_and_duckdb/">Link to the conference page</a></p>
<h2 id="my-takeaways">My takeaways</h2>
<ul>
<li>Yes, FOSDEM is crowded and you may not be able to get into every talk you want
to see in person, but it&rsquo;s still nice to be there. It&rsquo;s well organised and
there&rsquo;s a friendly atmosphere. Lots of interesting projects to see and people
to talk to. And it&rsquo;s convenient if you want to sponsor your favorite projects
by buying some merchandise.</li>
<li>It&rsquo;s worth investigating signing Docker images (in the right way) further.</li>
<li>Lazy imports look useful! Once Python 3.15 lands it&rsquo;s worth doing profiling on
the projects I work on to see if we can use those to speed things up on
startup and save some memory.</li>
<li>At work we recently decided to go for a monorepo for a project. I want to see
if/how <code>uv</code> workspaces and <code>prek</code> can help us.</li>
<li>I&rsquo;ve written a bunch of Ansible roles to configure my humble homelab and
laptop. Perhaps it&rsquo;s time to switch to PyInfra? It sounds promising and might
be worth the investment of migrating to.</li>
</ul>
<h2 id="about-the-trip">About the trip</h2>
<p><figure class="float-right"><img src="/images/fosdem2026_atomium.jpg"
    alt="Picture of the Atomium at night" width="200px"><figcaption>
      <p>The <a href="https://en.wikipedia.org/wiki/Atomium">Atomium</a> at night</p>
    </figcaption>
</figure>

Last year I drove to Brussels on Friday and stayed in the city center at the
<a href="https://cityboxhotels.com/hotels/brussels/citybox-brussels">Citybox Brussels
hotel</a> for one
night, since I had to be home on Sunday. The upside: it was just a short (15
minute?) tram ride to the FOSDEM location. Unfortunately it did mean I had to
drive home that evening.</p>
<p>This year I had more time, so I booked a room at
<a href="https://www.falkohotel.be/">Falko Hotel</a> for two nights. It&rsquo;s about a 20&ndash;30
minute drive (depending on traffic) to the <a href="https://www.interparking.be/en/parkings/brussels/toison-d-or/">parking
garage</a> I used.
And from there about 20 minutes with public transport to the Université libre de
Bruxelles.</p>
<p>Staying another night meant I had more time for sightseeing, had the time to
write this post from my notes and could drive home well rested the next day.</p>
<p>As for tech: besides a phone and laptop, I also brought along two items that
made the trip more comfortable:</p>
<ul>
<li>A <a href="https://mojogear.eu/en/products/mojogear-mini-evo-10-000-mah-power-bank-22-5w">MOJOGEAR Mini
Evo</a>
powerbank to give my phone extra juice to make it through the day. With 10,000
mAh and up to 22.5W of power it&rsquo;s more than sufficient for a day at a
conference. With its small size and less than 175 grams in weight, it&rsquo;s also
easy to carry around.</li>
<li>A <a href="https://www.gl-inet.com/products/gl-sft1200/">GL.iNet Opal (GL-SFT1200)</a>
travel router. I plug it in, hook it up to the hotel internet, start a VPN
connection and all my other devices automatically connect to it and can use
the internet without the hotel snooping on my traffic. (Not that I have an
indication that my hotel would do that, but theoretically they could if I
would not use a VPN.)</li>
</ul>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Open tabs]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2018/05/12/open-tabs/" type="text/html" />
    <id>https://markvanlent.dev/2018/05/12/open-tabs/</id>
    <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="backups" />
    <category term="devops" />
    <category term="git" />
    <category term="python" />
    <category term="security" />
    <category term="tabs" />
    <category term="terraform" />
    
    <updated>2025-09-13T21:07:32Z</updated>
    <published>2018-05-12T00:00:00Z</published>
<content type="html"><![CDATA[<p>Currently I have about 30 tabs open in the browser on my phone. Quite
a few of them are open because I want to read the article in the
future, or have already read the article and want to reread or act on
it, or a combination of the above. In this article I list the open tabs
(and some notes) so I can close them on my phone, but still have a
reference to them.</p>
<h2 id="development">Development</h2>
<dl>
<dt><a href="https://blog.scottnonnenberg.com/better-git-configuration/">Better Git configuration</a></dt>
<dd>Some tips from Scott Nonnenberg to improve your Git configuration.</dd>
<dt><a href="https://jacobian.org/2018/feb/21/python-environment-2018/">My Python Development Environment, 2018 Edition</a></dt>
<dd>A good description by Jacob Kaplan-Moss of how he uses
<a href="https://github.com/pyenv/pyenv">pyenv</a>,
<a href="https://pipenv.pypa.io/en/latest/">pipenv</a> and
<a href="https://github.com/mitsuhiko/pipsi">pipsi</a> for Python development.</dd>
</dl>
<h2 id="operations">Operations</h2>
<dl>
<dt><a href="https://borgbackup.readthedocs.io/en/stable/">BorgBackup documentation</a></dt>
<dd>Something I want to play around with&mdash;and perhaps use&mdash;to make
backups.</dd>
<dt><a href="https://www.opsschool.org/">Ops School Curriculum</a></dt>
<dd>A very comprehensive resource to learn to be an operations engineer.</dd>
<dt><a href="https://www.serverlessops.io/blog/serverless-ops-what-do-we-do-when-the-server-goes-away">Serverless Ops: What do we do when the server goes away?</a></dt>
<dd>Tom McLaughlin writes about the changing role of DevOps/Operations
engineers in a &lsquo;serverless&rsquo; world.</dd>
<dt><a href="https://news.ycombinator.com/item?id=12672797">Ask HN: How do you back up your site hosted on a VPS such as Digital Ocean?</a></dt>
<dd>A bunch of comments with suggestions on how to arrange backups for a
VPS. (I need some inspiration for my own VPS.)</dd>
<dt><a href="https://steemit.com/technology/@taoteh1221/securely-using-amazon-s3-buckets-for-server-backups">Securely Using Amazon S3 Buckets For Server Backups</a></dt>
<dd>See above; this is one of the candidates.</dd>
<dt><a href="https://github.com/kahun/awesome-sysadmin/blob/master/README.md">Awesome Sysadmin</a></dt>
<dd><q>A curated list of amazingly awesome open source sysadmin resources.</q></dd>
</dl>
<h2 id="security">Security</h2>
<dl>
<dt><a href="https://dev-sec.io/">Automatic Server Hardening</a></dt>
<dd>Server hardening tips plus Chef, Puppet and Ansible modules. (Source:
<a href="https://ma.ttias.be/cronweekly/issue-94/">Cron weekly, issue 94</a>)</dd>
<dt><a href="https://decentsecurity.com/">Decent Security</a></dt>
<dd>Information on how to secure your devices (Windows, routers).</dd>
</dl>
<h2 id="devops">DevOps</h2>
<dl>
<dt><a href="https://github.com/chris-short/DevOps-README.md">DevOps README.md</a></dt>
<dd><q>A curated list of things to read to level up your DevOps skills and
knowledge</q> by Chris Short. (Source: <a href="https://devopsish.com/043/">DevOps&rsquo;ish, issue 043</a>)</dd>
<dt><a href="https://copyconstruct.medium.com/monitoring-and-observability-8417d1952e1c">Monitoring and Observability</a></dt>
<dd>A great post by Cindy Sridharan explaining the difference between
monitoring and observability.</dd>
<dt><a href="https://www.contino.io/insights/a-model-for-scaling-terraform-workflows-in-a-large-complex-organization">A Model for Scaling Terraform Workflows in a Large, Complex Organization</a></dt>
<dd>An article by Ryan Lockard and Hibri Marzook about scaling your Terraform working practices.</dd>
<dt><a href="https://mybinder-sre.readthedocs.io/en/latest/">Site Reliability Guide for mybinder.org</a></dt>
<dd>This might contain useful information about how mybinder.org sets
things up and how to write this kind of documentation.</dd>
<dt><a href="https://charity.wtf/2016/03/30/terraform-vpc-and-why-you-want-a-tfstate-file-per-env/">Terraform, VPC, and why you want a tfstate file per env</a></dt>
<dd>Another Terraform article, this time by Charity Majors.</dd>
<dt><a href="https://copyconstruct.medium.com/testing-in-production-the-safe-way-18ca102d0ef1">Testing in Production, the safe way</a></dt>
<dd>Lots of information in this article by Cindy Sridharan.</dd>
<dt><a href="https://medium.com/statics-and-dynamics/working-with-terraform-10-months-in-c15ade10c9b9">Working with Terraform: 10 Months In</a></dt>
<dd>Perhaps this article by J.D. Hollis will save me some headaches (if I get around to reading it in time :) ).</dd>
</dl>
<h2 id="miscellaneous">Miscellaneous</h2>
<dl>
<dt><a href="https://www.goodreads.com/book/show/27833670-dark-matter">Dark Matter</a></dt>
<dd>A book recommendation that I still need to check out. This was the
first link that popped up when I Googled the title.</dd>
<dt><a href="https://engineer.john-whittington.co.uk/2016/11/raspberry-pi-data-logger-influxdb-grafana/">Raspberry Pi Data Logger with InfluxDB and Grafana</a></dt>
<dd>An article by John Whittington as input for my (almost dead) side
project to collect and graph data from my smart meter.</dd>
</dl>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Merge a separate Git repository into an existing one]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2013/11/02/merge-a-separate-git-repository-into-an-existing-one/" type="text/html" />
    <id>https://markvanlent.dev/2013/11/02/merge-a-separate-git-repository-into-an-existing-one/</id>
    <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="development" />
    <category term="git" />
    
    <updated>2021-08-20T20:23:14Z</updated>
    <published>2013-11-02T15:35:00Z</published>
    <content type="html"><![CDATA[<p>When I started on a project it seemed to make sense to put a part of
the project in a separate Git repository. In hindsight that wasn&rsquo;t
such a smart move. Here&rsquo;s how I fixed it.</p>
<h2 id="the-old-situation">The old situation</h2>
<p>In the old situation I had two Git repositories: <code>&lt;project&gt;</code> and
<code>&lt;package&gt;</code>. In this case <code>&lt;project&gt;</code> was the project repository and
<code>&lt;package&gt;</code> contained only one part of it. (For those interested:
<code>&lt;package&gt;</code> is a Python package which I included into the project
using <a href="https://pypi.org/project/mr.developer/">mr.developer</a>.) A
simplified version of the situation looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-plaintext" data-lang="plaintext"><span class="line"><span class="cl">&lt;project&gt;
</span></span><span class="line"><span class="cl">├── bootstrap.py
</span></span><span class="line"><span class="cl">├── buildout.cfg
</span></span><span class="line"><span class="cl">└── src
</span></span><span class="line"><span class="cl">    └── &lt;package&gt;
</span></span></code></pre></div><p>For several reasons I wanted to merge the <code>&lt;package&gt;</code> repository into
the <code>&lt;project&gt;</code> repository in the <code>src/package</code> path.</p>
<p>There are several ways to approach this. I wanted to end up in a
situation where, in my day-to-day work, I would not notice that the two
repositories had been separate up until a certain point.</p>
<h2 id="what-i-did-not-do">What I did <em>not</em> do</h2>
<p>At first I tried the approach outlined by Jason Karns in his article
<a href="http://jasonkarns.com/blog/merge-two-git-repositories-into-one/">Merge Two Git Repositories Into One</a>. That
is, I did not create a new empty repository to merge the two existing
repositories in. I just merged one existing repository into the other,
essentially only doing the second set of steps he described.</p>
<p>After I finished, I discovered that I could not easily use &ldquo;<code>git log</code>&rdquo;
to see the history of a file. Sure, I could use the &ldquo;<code>--follow</code>&rdquo; option
but that only works for a single file&mdash;not a complete
directory. Apparently this is caused by the &ldquo;<code>git read-tree</code>&rdquo; step. And
although
<a href="https://stackoverflow.com/a/19402332/122661">you can fix this</a>, I
wanted to avoid the situation.</p>
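<p>To make the limitation concrete, here is a sketch (with hypothetical
paths): &ldquo;<code>--follow</code>&rdquo; can trace a single file across the
move, but there is no directory-wide equivalent:</p>

```shell
# Inside a repository where files were moved into src/package/
# (the paths below are placeholders for illustration):

# Full history of one file, across the rename/move:
git log --oneline --follow -- src/package/module.py

# History for the whole directory; without --follow this stops at
# the move, and --follow does not accept a directory:
git log --oneline -- src/package/
```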
<p>In his article
<a href="http://scottwb.com/blog/2012/07/14/merge-git-repositories-and-preseve-commit-history/">Merge Git Repositories and Preserve Commit History</a>,
Scott W. Bradley describes a way to do the merge without using the
&ldquo;<code>git read-tree</code>&rdquo; command. However, the result is similar due to the
&ldquo;<code>git mv</code>&rdquo; step that is in there.</p>
<h2 id="the-method-i-used">The method I used</h2>
<p>What I wanted was apparently a bit more complex. As a result the
process is also a little more involved. Thankfully I could combine the
previously mentioned articles with a
<a href="https://stackoverflow.com/a/13060513/122661">helpful answer on Stack Overflow</a>. This
resulted in the following &lsquo;recipe&rsquo;:</p>
<p>First clone the <code>&lt;package&gt;</code> repository and go to that directory:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ git clone ssh://&lt;package-repo&gt; /tmp/package
</span></span><span class="line"><span class="cl">$ <span class="nb">cd</span> /tmp/package
</span></span></code></pre></div><p>Just to be sure we do not commit something in the original repo,
remove the remote:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ git remote rm origin
</span></span></code></pre></div><p>Then use &ldquo;<code>git filter-branch</code>&rdquo; to rewrite the existing commits so that
the files are already in the right directory (<code>src/package</code> in my case):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ git filter-branch --index-filter <span class="se">\
</span></span></span><span class="line"><span class="cl"><span class="se"></span>      <span class="s1">&#39;git ls-files -s | sed &#34;s-\t\&#34;*-&amp;src\/package/-&#34; |
</span></span></span><span class="line"><span class="cl"><span class="s1">        GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
</span></span></span><span class="line"><span class="cl"><span class="s1">        git update-index --index-info &amp;&amp;
</span></span></span><span class="line"><span class="cl"><span class="s1">        mv &#34;$GIT_INDEX_FILE.new&#34; &#34;$GIT_INDEX_FILE&#34;
</span></span></span><span class="line"><span class="cl"><span class="s1">      &#39;</span> HEAD
</span></span></code></pre></div><p>(Note that
<a href="https://stackoverflow.com/questions/13060356/git-log-shows-very-little-after-doing-a-read-tree-merge/13060513#comment44550628_13060513">according to Frederik</a>
you have to replace the <code>\t</code> in the <code>sed</code> command with <code>Ctrl-V + tab</code> when
using OS X.)</p>
<p>You can now verify that everything is still all right: the history is
preserved and all files are located in the new directory.</p>
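<p>A quick way to do that check, assuming the <code>src/package</code>
path from this example:</p>

```shell
# All commits should show up for the new location:
git log --oneline -- src/package/
# And the files should actually live there:
ls src/package/
```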
<p>Now make a fresh clone of the <code>&lt;project&gt;</code> repo:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ git clone ssh://&lt;project-repo&gt; /tmp/project
</span></span><span class="line"><span class="cl">$ <span class="nb">cd</span> /tmp/project
</span></span></code></pre></div><p>Add the <code>&lt;package&gt;</code> clone as a remote:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ git remote add -f package /tmp/package
</span></span></code></pre></div><p>Next, merge the new remote:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ git merge --allow-unrelated-histories package/master
</span></span></code></pre></div><p>Cleanup time: you can remove the temporary <code>&lt;package&gt;</code> remote:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ git remote rm package
</span></span></code></pre></div><p>By now all code should be in the same place as it was before we
started, but in a single repository. This would be a good time to run
your tests to verify that everything went well.</p>
<p>If everything checks out, don&rsquo;t forget to push the result to the
<code>&lt;project&gt;</code> repository. (What you do with the <code>&lt;package&gt;</code> repository
is up to you. I would probably remove it.)</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Update (2017-05-03): I have added <code>--allow-unrelated-histories</code>, which is
needed since Git 2.9. Thanks to Josef, Maurits and Duncan for pointing
this out.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Distributed Version Control Systems (presentation)]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2010/02/25/distributed-version-control-systems-presentation/" type="text/html" />
    <id>https://markvanlent.dev/2010/02/25/distributed-version-control-systems-presentation/</id>
    <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="development" />
    <category term="git" />
    <category term="tools" />
    
    <updated>2021-07-18T14:38:11Z</updated>
    <published>2010-02-25T06:42:00Z</published>
    <content type="html"><![CDATA[<p>On 19 February I gave a presentation to my colleagues about
distributed version control systems (DVCS). My main goal was to inform
them about what I think is the next logical step in source control.</p>
<p>My presentation can be found
<a href="https://www.slideshare.net/markvl/distributed-version-control-systems-3270524">on slideshare</a>
(or as a <a href="/files/distributedversioncontrolsystems.key">Keynote file</a> on this site), a summary
can be read on
<a href="https://maurits.vanrees.org/weblog/archive/2010/02/presentations-at-zest#mark-dvcs">Maurits&rsquo; weblog</a>.</p>
<p>A small disclaimer: my original plan was to create an implementation
agnostic introduction to DVCS for my co-workers at
<a href="https://zestsoftware.nl/">Zest</a>. However, while creating the
presentation I found it easier to compare DVCS to Subversion.</p>
<p>Also note that the last couple of slides talk about
<a href="https://git-scm.com/">git</a> and
<a href="https://git-scm.com/docs/git-svn">git-svn</a>
specifically. This is because my colleagues were interested in the way
I currently use Git to work on our projects. The <code>git-svn-clone-externals</code>
command I refer to in slide 39 can be found on
<a href="https://github.com/markvl/git-svn-clone-externals">github</a>.</p>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Git in action (feature branch after the fact)]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2009/05/20/git-in-action-feature-branch-after-the-fact/" type="text/html" />
    <id>https://markvanlent.dev/2009/05/20/git-in-action-feature-branch-after-the-fact/</id>
    <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="git" />
    <category term="tools" />
    
    <updated>2021-07-16T07:25:56Z</updated>
    <published>2009-05-20T08:59:00Z</published>
    <content type="html"><![CDATA[<p>This blog entry is about a real-life example of how the flexibility of
Git made my life easier. It&rsquo;s a story about how I stopped developing a
feature halfway through to try out an alternative, without throwing away
anything or cluttering up the (Subversion) repository.</p>
<p>Last week I was working on a set of features for one of our
clients. In an attempt to be a proper agile developer, I was
refactoring while coding. Halfway through developing a feature I
realised that my approach may not be the best solution. Still,
throwing away the work that already had been done wasn&rsquo;t an option
because the alternative could also have turned out to be a bad
idea. To make matters more complicated, the history made it hard to
create a branch somewhere in the past: I would either have to throw
away useful code or mess up the history with code I would never
use.</p>
<figure><img src="/images/repository-start-story.png"
    alt="Repository at the start of this story" width="400" height="140"><figcaption>
      <p>Repository at the start of this story</p>
    </figcaption>
</figure>

<p>Luckily I was using Git and I hadn&rsquo;t pushed the relevant code to the
Subversion repository yet. (A simplified graph of the history is shown
in the image. The blue boxes represent the commits and the green the
references.)</p>
<p>The first action was to get the history sorted out. Since I had made
small commits and thus had not mixed features across commits, I could
easily reorder them. Running &ldquo;<code>git rebase --interactive &lt;sha1&gt;</code>&rdquo; with the
SHA1 of the right commit popped up the editor, where I changed the
order of the commits and was done.</p>
<p>The next step was creating a branch from the current HEAD. Since, as
far as I understand, a branch is just a reference pointing to a
certain commit, this action made sure my first attempt to implement
feature Y was saved. Still, I wanted to work on the code without my
first attempt being there. By resetting the current HEAD to an earlier
commit without the feature Y changes, this was possible.</p>
<figure><img src="/images/repository-after-rebasing-branching-and-resetting.png"
    alt="The repository after rebasing, branching and resetting" width="400" height="140"><figcaption>
      <p>The repository after rebasing, branching and resetting</p>
    </figcaption>
</figure>

<p>Now, not falling into the same trap twice, I created a new branch from
master to try out the new way of implementing the new feature. Happily
committing away on this new branch I was able to make up my mind about
which approach would be the best and quickest solution. In the end I
decided to go with the new approach and merged it with master.</p>
<figure><img src="/images/repository-right-merging.png"
    alt="The repository right before merging" width="400" height="199"><figcaption>
      <p>The repository right before merging</p>
    </figcaption>
</figure>

<p>Now for the anticlimax of the story&hellip; The whole exercise was about
trying out a new way of implementing a feature without messing up the
Subversion repository. Although Git helped me all the way, the human
again proved to be the weakest link. By mistake I pushed the branch of
the half-baked implementation to the repository. A quick &ldquo;<code>svn merge</code>&rdquo;
restored the situation and I pushed the master branch to the
Subversion repository after all. (I probably could also have used Git
to undo the commits, but unfortunately I was in a hurry and didn&rsquo;t know
how off the top of my head.)</p>
<p>Lessons learned:</p>
<ul>
<li>Git is really flexible and, as
<a href="https://tomayko.com/blog/2008/the-thing-about-git">Ryan Tomayko states</a>,
it means never having to say &ldquo;you should have&rdquo;.</li>
<li>You still have to do the thinking yourself. :)</li>
</ul>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Using Git when developing Plone applications]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2009/05/03/using-git-when-developing-plone-applications/" type="text/html" />
    <id>https://markvanlent.dev/2009/05/03/using-git-when-developing-plone-applications/</id>
    <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="git" />
    <category term="plone" />
    <category term="tools" />
    
    <updated>2021-07-16T07:25:56Z</updated>
    <published>2009-05-03T10:20:00Z</published>
    <content type="html"><![CDATA[<p>While I&rsquo;m enthusiastic about Git, I still have to communicate with
Subversion repositories like the Plone Collective. I also like my
editor (Emacs) to help me interact with Git. In this blog entry I&rsquo;ll
explain how I set up my work environment.</p>
<p>Choosing a distributed version control system was <a href="/2009/04/30/taking-version-control-to-the-next-level/">step one</a>. Step two is
incorporating it in my working life. This starts with retrieving and storing the
source code for the projects I&rsquo;m working on.</p>
<h2 id="git-svn">Git-svn</h2>
<p>One of the reasons I chose Git was the &ldquo;bidirectional flow of
changes&rdquo; that will be necessary. The Git repository on my computer
will have to pull in the changes from the Subversion
repository. Likewise, I have to make my changes available to others
by pushing them back to the central repo.</p>
<p><a href="https://git-scm.com/docs/git-svn">Git-svn</a> allows me to clone the necessary
part of a Subversion repository. For instance, to clone the buildout of project
X I can easily do:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">git svn clone https://svn..../projectX/buildout -s
</span></span></code></pre></div><p>This will clone (checkout) the project X buildout. By adding the &ldquo;<code>-s</code>&rdquo;
parameter I tell Git that the buildout directory has the standard
Subversion layout. (In other words: it contains trunk, branches and
tags directories.) There is plenty git-svn documentation out there, so
I won&rsquo;t describe it any further here. For more information see for
example
the documentation I linked to above or blog posts like
<a href="https://flavio.castelli.me/2007/09/04/howto-use-git-and-svn-together/">Howto use Git and svn together</a> and
<a href="https://www.viget.com/articles/effectively-using-git-with-subversion/">Effectively Using Git With Subversion</a>.</p>
<h2 id="svnexternals">svn:externals</h2>
<p>Okay, we&rsquo;ve got the buildout. Now at <a href="https://zestsoftware.nl/">Zest</a>
we basically have two types of buildout configurations. We either
include the products for the policy, theme, et cetera by using the
<code>svn:externals</code> property in the src directory, or we include those
products by using
<a href="https://pypi.org/project/infrae.subversion/">infrae.subversion</a>.</p>
<p>I haven&rsquo;t found a proper solution for projects that use the latter
approach (other than restructuring the buildout that is). At the
moment I just use Subversion instead of Git. However if the project
collects all the products with the <code>svn:externals</code> property, there are
options&hellip;</p>
<p>Personally I use the <code>git-svn-clone-externals</code> script that can be
found on GitHub. To be precise, I use the fork by
<a href="https://github.com/pjstevns/">Paul J Stevens</a>. When run in
the root directory of the Git repository (in my case the buildout
directory), it finds the products in <code>src</code> and clones each of them.</p>
<p>Since I have a couple of buildouts with more than five products as
<code>svn:externals</code>, I got tired of manually making sure all changes in them
are committed <em>and</em> pushed back to the Subversion
repository. Therefore I
<a href="https://github.com/markvl/git-svn-clone-externals">forked the git-svn-clone-externals repository</a>
and added two scripts that help me with this. By running the
<code>git-svn-externals-check</code> script in the <code>src</code> directory I can be pretty
sure everything is back in Subversion so my co-workers can access it.</p>
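<p>The check that <code>git-svn-externals-check</code> automates boils
down to visiting every clone under <code>src</code> and flagging
uncommitted or unpushed work. A rough, hypothetical sketch of such a
loop (the actual script is more thorough and may work differently):</p>

```shell
# Flag every clone under src/ that has uncommitted changes, or
# commits that have not been dcommitted to Subversion yet
# (the git-svn ref only exists in git-svn clones):
for dir in src/*/; do
  (cd "$dir" &&
    { [ -n "$(git status --porcelain)" ] ||
      [ -n "$(git log --oneline git-svn..HEAD 2>/dev/null)" ]; } &&
    echo "needs attention: $dir") || true
done
```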
<h2 id="emacs">Emacs</h2>
<p>I use Emacs to code, so I also wanted it to help me with the
version control side of things. For Subversion I use
<a href="http://www.xsteve.at/prg/emacs/psvn.el"><code>psvn.el</code></a> and I was looking
for something similar. I first tried <code>git.el</code> (which comes with Git)
because the key bindings were similar. But although it got me started
quickly, it didn&rsquo;t feel quite right. For instance, I could not find a
way to work with staged changes. And this is a feature I really
started to like and use.</p>
<p>To make a long story short: I switched to
<a href="https://magit.vc/">Magit</a> for the moment. Although it took me a
while to get used to the key bindings, I actually really like it! It
allows me to work with Git from Emacs and the command line in a
similar fashion. Actions taken in one of them do not get in the way of
the other.</p>
<p>I&rsquo;m not completely settled yet, but I do love working with Git. I hope
to be able to use it on more and more projects.</p>]]></content>
  </entry>
  <entry>
    <title type="html"><![CDATA[Taking version control to the next level]]></title>
    <link rel="alternate" href="https://markvanlent.dev/2009/04/30/taking-version-control-to-the-next-level/" type="text/html" />
    <id>https://markvanlent.dev/2009/04/30/taking-version-control-to-the-next-level/</id>
    <author>
      <name>Mark van Lent</name>
      <uri>https://markvanlent.dev/about/</uri>
    </author>
    <category term="git" />
    <category term="subversion" />
    
    <updated>2021-08-20T19:33:29Z</updated>
    <published>2009-04-30T18:20:00Z</published>
    <content type="html"><![CDATA[<p>After using Subversion for a couple of years, it&rsquo;s time for me to look
to the next generation of source control management systems.</p>
<h2 id="whats-wrong-with-subversion">What&rsquo;s wrong with Subversion?</h2>
<p>Before I start with this section: this isn&rsquo;t meant as a rant. Nor do I
want to call Subversion users
<a href="https://www.youtube.com/watch?v=4XpnKHJAok8">ugly or stupid</a>. Subversion
remains a great improvement compared to CVS. However, there are a
couple of things I miss in my daily work.</p>
<p>My main issue with Subversion is that I need the central repository on
the server. Not just to make commits, but also when I want to see what
happened in the past (review the logs or annotate a file with <code>svn blame</code>). This can be a problem:</p>
<ul>
<li>As a consultant I travel frequently. Most of the time I take the train and
try to get some work done. But whenever there is a need to access the
repository, I&rsquo;m dead in the water.</li>
<li>The communication with the server can be slow. I do not care whether it is
because I do not have a broadband connection at that moment or that I am not
the only one trying to connect to the server; I just don&rsquo;t want to wait too
long for the result.</li>
<li>The server could be unreachable. Coincidentally, I&rsquo;ve encountered this
twice recently. One time the Apache configuration of our company server had a
problem. The other time there was a hardware problem on the server where one
of our clients hosts their repository. In both cases I could not continue to
work on the project I had to work on.</li>
</ul>
<p>Another annoyance of Subversion is that merging is required before you commit.
Assume that I am working on a certain file. Let us further assume a co-worker
committed a change that also updated that same file. Now <em>before</em> I can commit,
Subversion requires me to update the file. This isn&rsquo;t a big problem if there are
no conflicts, but if there are, I can only commit my changes after I resolved
them.</p>
<p>In other words: the changes as I intended them are never committed. I first need
to make more changes. The only way to prevent this is by working on a branch.
Then I can commit my changes and will only need to resolve the conflicts if I
decide to merge my branch back. But while creating branches is easy in
Subversion, merging can be painful. I know this is supposed to be better in
Subversion 1.5, but I still have to talk to version 1.4 repositories.</p>
<h2 id="distributed-version-control-systems-to-the-rescue">Distributed version control systems to the rescue</h2>
<p>For quite some time now, distributed version control systems (DVCS)
like Bazaar, Git and Mercurial have been available. By design these systems
should take care of my number one problem with Subversion. At first
glance all three of the DVCSs I just mentioned seem suitable. But
which one is the best solution for me?</p>
<h3 id="mercurial">Mercurial</h3>
<p><a href="https://www.mercurial-scm.org/">Mercurial</a> (or &ldquo;hg&rdquo;) is one of the contestants.
But since there is little to no chance of convincing all my co-workers to switch
from Subversion, I need to be able to talk to our Subversion repository. There
is a set of scripts to do this, called <a href="https://pypi.org/project/hgsvn/">hgsvn</a>,
but it has the limitation that <q>there is no straightforward way to push
back changes to the Subversion repository</q> according to the project
page.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> There are also <a href="https://www.mercurial-scm.org/wiki/WorkingWithSubversion">other options</a>,
but these seem very laborious. This is a showstopper for me.</p>
<h3 id="bazaar">Bazaar</h3>
<p>On to the next candidate: <a href="https://bazaar.canonical.com/en/">Bazaar</a> (or &ldquo;bzr&rdquo;)
does have a plugin to access Subversion repositories:
<a href="https://pypi.org/project/bzr-svn/">bzr-svn</a>. This
keeps Bazaar in the race.</p>
<h3 id="git">Git</h3>
<p>The final DVCS I investigated was <a href="https://git-scm.com/">Git</a>. Git
natively supports bidirectional operation with Subversion.</p>
<h3 id="the-decision">The decision</h3>
<p>Although both Bazaar and Git seem to provide the most important
features I&rsquo;ll need, I chose Git. The first reason for not choosing
Bazaar was the way it handles
<a href="http://doc.bazaar.canonical.com/bzr.dev/en/user-guide/zen.html">branches and revision numbers</a>. Although
I admit that I&rsquo;m new to DVCS, it feels more natural to me to
consistently use globally unique revision numbers than having local
revision numbers and branches with
<a href="http://doc.bazaar.canonical.com/bzr.dev/en/user-guide/zen.html#each-branch-has-its-own-view-of-history">their own view of history</a>.</p>
<p>The other reasons for selecting Git over Bazaar are speed and
repository size. Robert Fendt recently did some
<a href="https://web.archive.org/web/20090426191029/http://ldn.linuxfoundation.org/article/dvcs-round-one-system-rule-them-all-part-3">research</a>
and this confirms the results of
<a href="https://www.infoq.com/articles/dvcs-guide/">other</a>
<a href="https://laserjock.wordpress.com/2008/05/09/bzr-git-and-hg-performance-on-the-linux-tree/">speed</a>
and
<a href="https://vcscompare.blogspot.com/2008/06/git-mercurial-bazaar-repository-size.html">repository size</a> tests.</p>
<h2 id="git-additional-benefits">Git: additional benefits</h2>
<p>I have worked with Git for a little while now and there are some
additional benefits of it over Subversion:</p>
<ul>
<li><a href="https://git-scm.com/docs/git-stash">Stashing changes</a>
allows me to, for example, store local changes and go back to a clean
working directory to work on something different for a while.</li>
<li>Changing history may be a bit controversial in version control, but
it can be very useful. It allows me to, for instance, squash commits
while
<a href="https://git-scm.com/docs/git-merge">merging</a>,
rearrange the order of commits with
<a href="https://git-scm.com/docs/git-rebase">rebase</a>
or add something to the previous commit with
&ldquo;<a href="https://git-scm.com/docs/git-commit"><code>git commit --amend</code></a>&rdquo;. Obviously
you don&rsquo;t want to do this when you&rsquo;ve already published your
changes, but it has served me well already.</li>
<li>I can create branches on my local repository to work on features or
an experiment, without bothering others with it.</li>
<li>Committing is <em>really fast</em>. Although I still regularly push my
changes to the Subversion repository, a &lsquo;normal&rsquo; commit is blazing
fast. Where a commit used to be a pause in my workflow, it now
hardly has any impact. This makes it easy to commit more often and
thus have commits do only one thing at a time.</li>
<li>The last two benefits can be combined: since the commits are
initially only local, I don&rsquo;t have to postpone committing until the
code is in a workable state. I can for instance create a failing
test, commit it and then continue to write the code to make it pass,
without having to worry about co-workers running into the failing
test.</li>
</ul>
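<p>As a small sketch of the first two points (the file name is made
up):</p>

```shell
# Park uncommitted work and get a clean working directory:
git stash
# ...switch context, commit something else, then take the work back:
git stash pop

# Slip a forgotten change into the last, not yet published, commit:
git add forgotten_file.py
git commit --amend --no-edit
```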
<p>(Note that other DVCSs also have (most of) these advantages. I&rsquo;m only comparing
Git with Subversion here.)</p>
<p>All in all I am very enthusiastic! Granted: using Git is more complex
than Subversion and there were some problems I had to overcome in my
day-to-day work. (I&rsquo;ll talk about them in
<a href="/2009/05/03/using-git-when-developing-plone-applications/">a next post</a>.)
But the flexibility I gained! Incredible!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Update (2021-07-14): for the record, the limitation about pushing back to
Subversion is apparently solved since it is no longer listed in the
&ldquo;limitations&rdquo; section of the documentation.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content>
  </entry>
</feed>
