quilt instead git for maintaining forks

Marcos Dione

2015-12-30 18:37

Update

At the end this methd didn't really work, because to fix a patch that broke becaus of a conflict with upstream you need to pop all the patches above it, which means that when rendering your attempts at fixing it either don't have all the changes in, or you push them back in, test, pop back up and change, rinse repeat. Still, I'll keep this post for historic reasons and because I think that quilt is a useful tool to learn.

Ever since I started working in Elevation I faced the problem that it's mostly a openstreetmap-carto fork. This means that each time osm-carto has new changes, I have to adapt mine. We all know this is not an easy task.

My first approach was to turn to osm-carto's VCS, namely, git. The idea was to keep a branch with my version, then pull from time to time, and merge the latest released version into my branch, never merging mine into master, just in case I decided to do some modifications that could benefit osm-carto too. In that case, I would work on my branch, make a commit there, then cherry pick it into master, push to my fork in GitHub, make a Pull Request and voilà.

All this theory is nice, but in practice it doesn't quite work. Every time I tried to merge the release into my local branch, I got several conflicts, not to mention modifications that made some of my changes obsolete or at least forcing me to refactor them in the new code (this is the developer in me talking, you see...). While I resolved these conflicts and refactorings, the working copy's state was a complete mess, forcing me to fix them all just to be able to render again.

As this was not a very smooth workflow, I tried another approach: keeping my local modifications in a big patch. This of course had the same and other problems that the previous approach, so I gained nothing but more headaches.

Then I thought: who else manages forks, and at a massive scale? Linux distributions. See, distros have to patch the packaged software to compile on their environments. They also keep security patches that also are sent upstream for inclusion. Once a patch is accepted upstream, they can drop their local patch. This sounds almost exactly the workflow I want for Elevation.

And what do they use for managing the patches? quilt. This tool is heavily used in the Debian distro and is maintained by someone working at SuSE. Its documentation is a little bit sparse, but the tool is very simple to use. You start doing quilt init just to create the directories and files that it will use to keep track of your changes.

The modification workflow is a little bit more complex that with git:

You mark the beginning of a new patch with quilt new <patch_name>;
Then either tell quilt to track the files you will modify for this patch with quilt add <file> ... (in fact it just needs to be told before you save the new version of each file, because it will save the current state of the file for producing the diff later),
Or use quilt edit <file> to do both things at the same time (it will use $EDITOR for editing the file);
Then do your changes;
Check everything is ok (it compiles, passes tests, renders, whatever);
And finally record the changes (in the form of a patch) with quilt refresh.

In fact, this last 4 items (add, edit, test, refresh) can be done multiple times and they will affect the current patch.

Why do I say current? Because quilt keeps a stack of patches, in what it calls a series. Every time you do quilt new a new patch is put on top of the stack, and in the series just behind the current patch; all the other commends affect the patch that is currently on top. You can 'move' through the series with quilt pop [<patch_name>] and quilt push [<patch_name>]. if a patch name is provided, it will pop/push all the intermediate patches in the series.

How does this help with my fork? It actually does not save me from conflicts and refactorings, it just makes these problems much easier to handle. My update workflow is the following (which non incidentally mimics the one Debian Developers and Maintainers do every time they update their packages):

I quilt pop -a so I go back to the pristine version, with no modifications;
I git pull to get the newest version, then git tag and git checkout <last_tag> (I just want to keep in sync with releases);
quilt push -a will try to apply all the patches. If one fails, quilt stops and lets me check the situation.
quilt push -f will try harder to apply the troubling patch; sometimes is just a matter of too many offset lines or too much fuzzyness needed.
If it doesn't apply, a .rej wil be generated and you should pick up from there.
In any case, once everything is up to date, I need to run quilt refresh and the patch will be updated[1].
Then I try quilt push -a again.
If a patch is no longer useful (because it was fixed upstream or because it doesn't make any sense), then I can simply quilt delete <patch>; this will remove it from the series, but the patch file will still exist (unless I use the option -r[2]).

As long as I keep my patches minimal, there are big chances that they will be easier to integrate into new releases. I can even keep track of the patches in my own branch without fearing having conflicts, and allowing me to independently provide fixes upstream.

Let's see how this works in the future; at least in a couple of months I will be rendering again. Meanwhile, I will be moving to a database update workflow that includes pulling diffs from geofabrik.

Being an old timer VCS user (since cvs times), I wish they would call this command update. ↩
Why not --long-options? ↩