Zero-downtime restarts have landed

I'm thrilled to announce that zero-downtime restarts, which I've been hacking on for the past week or two, have just landed in master!

Zero-downtime restarts require at least two cluster workers and MongoDB as a Databank driver (we'll eventually relax the latter requirement as we continue to test the feature). Here's how it works:

  1. An administrator sends SIGUSR2 to the master process (note that SIGUSR1 is reserved by Node.js)
  2. The master process builds a queue of worker processes that need to be restarted
  3. The master process picks a random worker from the queue and sends it a signal asking it to gracefully shut down
  4. The worker process shuts down its HTTP server, which causes it to stop accepting new connections - it will do the same for the bounce server, if applicable
  5. The worker shuts down its database connection once the HTTP server is completely shut down, meaning that it's done servicing in-flight requests
  6. The worker closes its connection with the master process and Node.js automatically terminates due to there being no listeners on the event loop
  7. The master recognizes the death of the worker process, replaces it, waits for the new worker to signal that it's listening for connections, and repeats from step 3 until the queue is empty
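
To make steps 4 through 6 a bit more concrete, here's a minimal sketch of the worker's side of a graceful shutdown. This is not the actual implementation: the function name, the shape of the `resources` object, and the `name` labels are all made up for illustration, and each resource is simply assumed to expose a `close(callback)` method the way Node's `http.Server` does.

```javascript
// Hypothetical sketch: none of these names come from the real codebase.
// Each resource is assumed to expose close(callback), like http.Server.
function shutDownWorker(resources, onDone) {
  const { httpServer, bounceServer, database, masterChannel } = resources;
  const order = []; // records the order in which things were closed

  // Step 4: stop accepting new connections on the HTTP server and,
  // if there is one, the bounce server.
  const servers = bounceServer ? [httpServer, bounceServer] : [httpServer];
  let pending = servers.length;

  servers.forEach((server) => {
    // close() calls back only once the server is completely shut down.
    server.close(() => {
      order.push(server.name);
      pending -= 1;
      if (pending > 0) return;

      // Step 5: every server is closed, so the database can go too.
      database.close(() => {
        order.push(database.name);

        // Step 6: drop the connection to the master. With nothing left
        // on the event loop, Node.js exits on its own.
        masterChannel.disconnect();
        order.push('master');
        onDone(order);
      });
    });
  });
}
```

In a real worker, `masterChannel` would presumably be `process` itself, whose `disconnect()` method closes the IPC channel to the master. The point of the nesting is the ordering: the database connection is only closed after the servers have finished with their existing connections.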

This works because only one worker is shut down at a time, allowing the other workers to continue servicing requests while the one worker is restarted. We wait until the new worker actually signals it's ready to process requests before beginning the process for another worker.
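
The master's side of that loop can be sketched as a small queue-driven function. Again, this is a simplified illustration under assumptions rather than the real code: `rollingRestart` and the `restartWorker` callback are hypothetical names, and the callback is assumed to invoke `ready` only once the replacement worker has reported that it's listening.

```javascript
// Hypothetical simulation: worker IDs and callbacks are illustrative only.
function rollingRestart(workerIds, restartWorker) {
  // Step 2: queue every worker that needs to be restarted.
  const queue = [...workerIds];
  const log = [];

  function next() {
    if (queue.length === 0) {
      log.push('done');
      return;
    }

    // Step 3: pick a random worker from the queue.
    const index = Math.floor(Math.random() * queue.length);
    const [oldId] = queue.splice(index, 1);

    // Steps 4-7: restartWorker must call ready() only after the
    // replacement is listening, so at most one worker is down at a time.
    restartWorker(oldId, (newId) => {
      log.push(`${oldId}->${newId}`);
      next();
    });
  }

  next();
  // Complete on return only if restartWorker calls back synchronously;
  // otherwise it fills in as workers are replaced.
  return log;
}
```

In a real master process, `restartWorker` would presumably be built on the `cluster` module, waiting for the old worker's `exit` event, forking a replacement, and then waiting for the new worker's `listening` event before calling `ready`.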

Such a feature requires careful error handling, so there are a lot of built-in checks to prevent administrators from shooting themselves in the foot:

  • If there's a restart already in progress, SIGUSR2 is ignored
  • If there's only 1 cluster worker, the restart request is refused (because there would be downtime and you should just restart the master)
  • The master process will load a magic number from the new code and compare it with the old magic number loaded when the master process started - if they don't match, SIGUSR2 will be refused. This number will be incremented for things that would make zero-downtime restarts cause problems, for example:

    • The logic in the master process itself changing
    • Cross-process logic changing, such that a new worker communicating with old workers would cause problems
    • Database changes
  • If a worker process doesn't shut itself down within 30 seconds, it will be killed
  • If a zero-downtime restart fails for any reason, the master process will refuse SIGUSR2 and will not respawn any more cluster workers, even if they crash - this is because something must have gone seriously wrong, either with the master, the workers, or the new code, and it's better to just restart everything. Currently this condition occurs when:

    • A new worker died directly after being spawned (e.g. from invalid JSON in
    • A new worker signaled that it couldn't bind to the appropriate ports
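
The checks the master runs before accepting SIGUSR2 can be sketched as a single function. This is only an illustration: the field names on `state`, the reason strings, and the order in which the checks run are assumptions, not taken from the real code.

```javascript
// Hypothetical sketch: field names and reason strings are assumptions.
function checkRestartRequest(state) {
  // A previously failed restart means everything should be restarted.
  if (state.restartFailed) {
    return { ok: false, reason: 'previous restart failed' };
  }
  // A restart that is already in progress wins; this request is ignored.
  if (state.restartInProgress) {
    return { ok: false, reason: 'restart already in progress' };
  }
  // With a single worker there would be downtime.
  if (state.workerCount < 2) {
    return { ok: false, reason: 'need at least two workers' };
  }
  // The magic number in the new code must match the one loaded at startup.
  if (state.newMagic !== state.oldMagic) {
    return { ok: false, reason: 'magic number changed' };
  }
  return { ok: true };
}
```

For example, calling it with `{ restartFailed: false, restartInProgress: false, workerCount: 1, oldMagic: 1, newMagic: 1 }` would refuse the restart because there is only one worker.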

While these checks do a lot to catch problems, they're not a silver bullet, and we strongly recommend that administrators watch their logs as they trigger restarts. However, this is still a huge win for the admin experience - the most exciting part of this for me is that it's the first step we need to take towards having fully automatic updates, which has been a dream of mine for a long while now.

Admins running from git master can start experimenting with this feature today, and it will be released during the next release cycle - i.e. with the 5.1 beta and stable, not the current 5.0 beta. Since this is highly experimental, we want this to have as much time for testing as possible. You can also check out the official documentation on this feature.

I hope people enjoy this! And as always, feel free to report any bugs.

5.0 beta released

I'm excited to announce that 5.0.0 is now officially in beta!

This is another big release and makes a wide variety of improvements. Here are some highlights from the changelog:

  • More complete documentation
  • Small improvements to the administrator experience
  • A better web UI, including some user experience polishing as well as an upgrade to more performant and better-licensed libraries
  • A fix for crashes related to "login with remote account" (although this one was backported in 4.1.1)
  • Significant security improvements in the systemd service shipped with the package
  • Lots of internal refactoring and simplification made possible by dropping Node 0.10/0.12 support

Many of these changes - particularly the systemd changes and the fact that (as previously announced) Node 0.10 and 0.12 are no longer supported - will require administrator intervention. Be sure to read our upgrade guide for details on how to deal with these changes.

All of these features add up to make 5.0 beta the most stable and secure release yet. As always, it will go through our beta period for about a month before being released as a fully stable version. If you try it out, the community would love to hear about it - and be sure to report any bugs you encounter!

Graduation 2017: reflections on 365 days of gap yearing

Tonight marks the end of the high school careers for everyone in Seattle Academy's Class of 2017. Congratulations to everyone who graduated tonight - you deserve it, seriously. To my friends in particular, I'm so proud of you guys! You're completely amazing, you've done such incredible things and I love you very much.

Attending the Class of 2017's graduation was super strange for me too, honestly, because I'm such a radically different person than back when I was on that stage.

This year I put out several major releases (and a couple minor ones too) of pump.io, the decentralized social networking software I maintain; I became an Invited Expert at the World Wide Web Consortium and I wrote the software that powers this blog - Stratic - from scratch. I spoke at some major technical conferences on pump.io and Stratic, too. But mostly what I was thinking about tonight was all the personal development I went through. As some who are close to me in real life know (particularly those who were there), a year ago I was relatively seriously depressed. There were actually a lot of reasons for this, but one of the most important was the trouble I had dealing with change - the biggest change, of course, being leaving high school: somewhere familiar, somewhere with friends.

I vividly remember sitting in the seats just under the stage in McCaw Hall the day of graduation as the SAAS people running the show showed us where to walk and what to do. I leaned over to my friend and said, "I don't think I'll understand what's happening here for a very long time," to which she said, "what, like where we walk and stuff?" I chuckled and said never mind, because what I was really referring to was what was happening in a grander sense - what this event really meant and was for, on the scale of years and decades (I couldn't find words for this at the time).

I don't really know what it felt like to walk out on stage for other people (both this year and last year) but for me, it seemed almost trivial, like a non-event. It didn't feel nearly as momentous as it seemed like it should. Based on my description my therapist would later refer to it as a foregone conclusion, a description that stuck with me given how accurately it seemed to verbalize what I was feeling. It just didn't feel big, but I knew that it was. In the grand sense, I just really didn't understand what was happening.

Tonight, watching the Class of 2017, I think I started to get it. The crux of my personal development was being at the Recurse Center. In fact, applying to and attending the Recurse Center was without a shadow of a doubt the highlight of not just my year but my entire life. I love my friends and teachers at SAAS very deeply, and I still think going to SAAS was a great choice. But at the Recurse Center, I felt at home, like I belonged, in a way I just never felt in high school. Plus, it felt pretty great to live on my own in New York City, feeling like I knew my way around the subway system[1], how to get food for myself (whether at the supermarket to cook or from nearby restaurants), and just what it felt like to live in such an amazing place.

There was a moment in senior year, in Jason's English class, when I was thinking about my then-sophomore friends and wondering if I'd see them in ten or even twenty years. After all, they'd probably have separate class reunions. And right as I started to wonder what we'd all even be like then, it hit me that the idea of "growing up" is bullshit. No one is ever truly "grown up"; people just slide along a scale from toddler to wise elder. Every human always will be and always has been a work in progress - always growing, always changing. I am incredibly proud of all the technical work I did this year. I am unbelievably grateful and happy to have made so many amazing friends at the Recurse Center, and I feel very lucky to have such good mental health - mentally I'm probably in the best place I've ever been in my whole life. But even with all that, I know I'm still young. I still have lots of room to grow and there are more exciting opportunities ahead of me than ever.

When everyone's in the middle of something, I think they get lost in the moment. In my senior year, my whole life was structured around being in senior year; at the Recurse Center, my whole life revolved around the Recurse Center. Essentially, I'm describing the act of putting your head down and concentrating on something. So maybe what graduation (and important events like it) is really about is a chance to suspend time; to not be lost in the moment. A chance to, just for a second, not have your life revolve around anything in particular and instead, look at yourself and the way you've changed and continue to change over your lifetime. Like a character arc. It really is amazing, and in a way, isn't that implicitly what graduation's saying anyway? Graduation is an event designed to celebrate everything that the people on stage have accomplished - and in order to celebrate something, you have to sit back and look at it.

It was honestly wild to watch the people on stage talk about each other and their lives for the past four years, recalling both the time when I felt the exact same way they do now as well as just how much I, and my perspective, have changed. To the Class of 2017, as someone who was in your shoes a year ago - I know I sound stupid and clichéd, but your world is about to become so much bigger than you can imagine. SAAS, which once seemed like such a monumental, immovable part of your life, will instead become small (though still important). At least, that's what happened to me. I hope it does for you, too. And I'm really excited for you guys.

Congratulations once again. You guys freaking did it.


[1]: the operative word being "feeling", because I almost certainly didn't know my way around nearly as well as I felt I did

How I accidentally started maintaining a social network with thousands of users

As some of my readers (particularly Recursers) know, a couple of weeks ago I became an Invited Expert at the Social Working Group at the W3C (World Wide Web Consortium). The W3C is a standards body. That means it's responsible for defining how things work on the web, such as how web pages are styled using CSS and how web developers can protect their apps from security vulnerabilities using Content Security Policy.

My first thought when I got the email that my application had been accepted was, "WHOOOOOOOOO!" It was probably one of the most thrilling moments of my whole life. My second thought was, "how in the world did I get here!?" The truth is, it was almost an accident.

It started when I got involved in the pump.io project. pump.io, for those who haven't heard me talk about this endlessly (e.g. at RC), is a decentralized social network. That means that there can be multiple servers run by different people that are part of the social network, but the users on those servers can interact with each other in just the same way they could if it was just one big centralized server[1]. I first got involved in the project in August 2015. I was experimenting with different social networking software and decided to deploy pump.io on my server. When I did I realized that pump... well, it didn't work very well. The web UI was kinda basic[2], everything was pretty buggy, and there were a lot of problems with the overall user experience. In fact, I know the exact day I set up pump.io (August 12th) because all throughout the experience I was filing bugs on things needing improvement. It was a shame, I thought, because this software seemed really neat. I thought it had a lot of potential.

After about two weeks it became clear that there was no activity in the upstream project. So after some deliberation, I ended up forking it (briefly). You can watch this talk around 16:00 to hear me talk about this a bit, though to be honest it's kind of just a footnote in the project's history. In the end Evan Prodromou, pump.io's author, ended up handing off some commit rights to community members.

Well, I thought, that was the end of that. Everything's smooth sailing from here on out! There were some big problems, though: the people who now had commit rights all were involved in other things and, more importantly, none of them knew JavaScript or Node.js! This makes me chuckle to this day, honestly.

So I started triaging issues. When people sent Pull Requests, I'd review them since it seemed like no one else was going to do it. #1114 was, as far as I can tell (or remember), the very first of these "unofficial" PR reviews. I kept going; I even reviewed Menno Vossen's epic PR which fixed all the tests (fixing the tests being a feat which, having tried to start that work myself, I am to this day in awe of and incredibly thankful for). For that last one in particular, you'll note that I merged it, not Chris Webber. At some point in January(?), he asked me on IRC if I'd like write access to the repository, to which I said (paraphrased) "heck yes!" So he made it happen.

I never really intended for that to happen. However, I was the one doing almost all of the work. After a while it just made sense. This is what, among other things, I find so incredible about freedom-respecting software: you can just do things. I didn't ask anyone for permission to do those reviews. I just saw the need for a reviewer, and decided I'd help out.

Fast-forward to today, and I'm now an owner of the pump.io organization on GitHub. I make technical decisions about what to prioritize and what should go into core. I do a lot of the day-to-day work running the project, too, and setting up technical and policy infrastructure (with a lot of help from the community, of course, plus input from Evan). That, too, just made sense, as did my becoming an Invited Expert - I was pretty deeply engaged with the SocialWG's ActivityPub specification already since it's based on the pump.io protocol, and I was really excited about said protocol being standardized. So I was participating pretty heavily and I think it just made sense to people in the Working Group for me to join. In fact, that also kinda happened by accident. I couldn't get edit access to the W3C wiki so we were speculating in #social on the W3C IRC server that it might be because I wasn't a "W3C member" or something. So some people at W3C were pinging the sysops team, etc., trying to mark me as a "trusted" user when someone - Sandro Hawke, I believe - said, "the other option is for you to just join the Working Group." To which I said, "well, but I'd have to join as an Invited Expert, and I don't think I qualify as an expert." Chris Webber's response? "You're just as much of an expert as me when I joined!"

tl;dr how in the world did I get here? I tried some software and got annoyed at it, so I just kind of "did some stuff" that led to me doing code reviews. That led to me getting involved in the decentralized social web which led to me "doing some more stuff" that got me involved in standards. Then because of that, I tried to edit a wiki and ended up being invited to apply as a W3C Invited Expert.

I mean, what the hell? Honestly. I can't emphasize enough that I didn't plan ANY of this. It just sort of... happened. And that, I think, is what's so cool about the free software community. It isn't about who you are, where you come from, or what your goals are. It's only about, do you show up? Do you show up and do awesome stuff?

I showed up, kind of by accident, and I now run a decentralized social network with thousands of users called pump.io.

What will happen if you show up?

Thanks so much to Anja and Julia for providing feedback on a draft version of this post.

[1]: I really hope this explanation makes sense and if it doesn't, I apologize - I use diagrams to explain this in real life.

[2]: Still is, but that should improve now that the technical debt work I've been focusing on for the past year is now basically done!