strugee.net

Posts from October 2016

GitHub's 'squash and merge' default considered harmful

Recently GitHub launched two new ways to merge Pull Requests: "rebase and merge" and "squash and merge". That means that projects now have three ways to merge PRs:

  • Merge - creates a merge commit merging the branch (even if it's fast-forwardable)
  • Rebase and merge - rebases on top of the target branch and fast-forwards
  • Squash and merge - rebases on top of the target branch, squashes all commits into a single commit, and fast-forwards

Now, the default is "squash and merge", because apparently that's what people find to be the "prettiest" history. This bothers me for one simple reason: a squash and merge default means a history destruction default[1].

The whole point of commit squashing is to destroy history. Sometimes that's fine - for example, I might squash a typo fix into an earlier commit, because who cares that I typo'd vare instead of var? However, anything less trivial than typo fixes is valuable information about how the project evolved. Even if all of the commits just add stuff, and don't change what's happened earlier in the branch (i.e. even if the direction the implementation's taking doesn't change part-way through) the history contained in the branch is still valuable, because the branch's shortlog will give you a nice overview of exactly what changes happened in the branch. Now, you could of course make the argument that commit squashing shows that same information because by default, in both Git and GitHub, the commit messages being squashed are included in the suggested final commit message. I prefer keeping the individual commits, but that's a valid argument.

However, that doesn't change the fact that in cases where the implementation direction does change part-way through, GitHub's default is actively promoting the irrevocable[2] destruction of valuable history. Lots and lots of people use the GitHub Merge Button, especially those who are new to Git. This default is causing those people to unwittingly destroy valuable information. Sure, it looks nicer in the commit log, and I totally advocate for using squash and merge when it makes sense. But those cases are few and far between - it's basically just a small changes, plus a couple typo fixes or additions. And besides, I think it's far better to have a default of an ugly history rather than a default of an incomplete history. The former may not be the prettiest to look at, but the latter has the potential to actively stop people from doing their jobs[3].

For those curious, here's when I use each mode of the GitHub Merge Button:

  • Merge - when I have a long-running branch that made significant changes and/or diverged significantly from the target branch. In this case, it's valuable to clearly distinguish what's part of the project and what isn't. Rebase and merge is no good because then it's not clear in the history when the branch started and ended. This is particularly evident when looking at git log --graph.
  • Rebase and merge - what I use most of the time. I use this when there were a couple small commits that were interesting enough to preserve the difference, but the overall change wasn't so huge that it needs to be clearly distinguished in the history. This provides a nice and pretty commit graph.
  • Squash and merge - I rarely use this. When I do, it's because all of the commits on some branch are so trivial, they really don't matter. Mostly this means that the overall change is tiny, and the only additional commits that are added are small additions to the first.

So there you have it. How I use GitHub's Merge Button, and why I think the "squash and merge" default should be considered harmful.

Footnotes:

[1]: I'd like to point out that this is only a problem in Git. Mercurial has (or will have shortly) Changeset Evolution, which keeps track of how changesets evolve over time. I.e. when you rewrite history, you're not losing any information.

[2]: I'm sure some of you are about to excitedly tell me about a fantastic tool called the reflog, and I really should read Pro Git because it's a fantastic book and has an entire chapter on data recovery. I know. The reflog is not the right answer for this; not only is it local to (likely) a single developer's machine, but it only stretches back a couple months and only works if the old, dangling commits aren't garbage-collected. By the time someone might be interested in looking at the history that was lost, it's probably far, far too late.

[3]: Another rarely-encountered but very serious problem with both "squash and merge" and "rebase and merge" is when people merge upstream changes in a PR. This is a perfectly legitimate workflow - PRs are great for discussing changes, etc. (although a lot of people think they're the only way to merge things, so they open PRs and then immediately merge them - this, IMHO, is very much not a legitimate workflow) - but if you do anything but merge (which includes both GitHub's "Merge" option and fast-forwarding locally on the CLI), you may have a Very Bad Time the next time you go to merge upstream changes to your fork. Why? Because in Git's view, the changes you merged the first time haven't actually been merged. After all, commit SHAs are nowhere to be found in the tree, since when you rewrote history you changed those IDs! Git has no way of knowing that your rewritten commits and the supposedly "unmerged" commits are basically equivalent. (Mercurial, on the other hand, would've kept track of this information and would have no problem at all.)


Android freedom

Recently I backed up and restored both my Android phone and my Android tablet. There were a couple reasons for both of these. The tablet had been borked for quite a long time (any time I tried to upgrade it from the Android 5.0 build it was running, it failed - also the thing just froze randomly). The phone was on the CyanogenMod nightly channel and I wanted to switch to the snapshot channel, plus within the past couple days both WiFi and the cell network data connection straight up stopped working so it was pretty unusable. At first I wanted to switch to CopperheadOS on both devices. CopperheadOS doesn't support GApps and will probably never support GApps (for very good reasons), and I said, maybe I can make it work. Sadly, I couldn't - I still regrettably need stuff from the Google Play Store. (The tablet ended up back on stock because I want fast upgrades, and the phone ended up on CyanogenMod because they have the fastest upgrades while still offering root.)

The whole experience made me think, though - what would it take to create something that functioned like GApps, but respected your freedom? I'm sure some people reading are already scrambling to link me to their favorite Google Play Services reimplementation, but this isn't the only thing that's in GApps. You gotta think about the user experience, too. Such a system should be able to:

  • Provide the nice APIs that Google Play Services does
  • Store your photos in the cloud, like Google Photos
  • Related to the above, automatically backup and restore apps and their data
  • Transfer from other devices, similar to the above item
  • Support functionality like Google Now
  • Ditto for Google Assistant
  • Integrate into the initial device setup to configure all this stuff

I'm sure there are more that I've missed.

Honestly, we're actually quite close to this. The first bullet can be mostly accomplished with something like microG. Automatic backup and photo storage needs a UI, but fundamentally can be accomplished with any generic WebDAV implementation. F-Droid can be used as the app store. Imagine this: you take a bunch of photos and install a bunch of freedom-respecting apps on your phone. Then, you get a new one. When you set it up, the phone prompts you to sign in to your WebDAV account (which could be e.g. ownCloud, or a WebDAV implementation on Sandstorm) and then automatically reinstalls all your apps from F-Droid, as well as retrieving their data from ownCloud. When you open the built-in gallery app, all your photos are already there because they're seamlessly backed up to the cloud. Your cloud. Transfer can be accomplished in a lot of ways, but I can easily see it building on the above.

Supporting something like Google Now is non-trivial, but I've even already proposed a feature for Huginn that would make this possible. Google Assistant would be very, very difficult, but even without that, we'd have come a long way.

This reality is not that far off. What's missing is some UI pieces and a nice ZIP that can be flashed on top of ROMs, similar to how GApps are flashed today. So who's going to put it all together?

(I suppose I've just volunteered myself - oh well... I'll just add it to my endless list of projects.)


~