New self-hosting experiments

posted | about 8 minutes to read

tags: self-hosting fediverse mastodon pleroma matrixchat jitsi

As I’ve detailed previously on this blog, I mostly maintain my own web services. Stuff like my website, mail server, so on and so forth - I’m handling it with my own preferred architecture rather than outsourcing to a managed provider. Recently, I’ve been on a kick looking at what else I could host on my domain, or what makes sense for me to explore. Some stuff has worked out very well. Some has not.

One of the things that I do not handle for myself is synchronous chat; I have a Discord account that’s my primary driver for that need, since most of my friends are on there. Still, I thought it would be nice to spin up a self-hosted chat server so I could be reached outside of a corporate service - and especially since Keybase’s recent acquisition by Zoom, there just weren’t that many options left in the space, especially for an encrypted chat. Last week, I tried installing a Synapse server for Matrix chat. It ended up working out pretty well, but unfortunately since the rest of my database stuff was MySQL, I could really only test the server using SQLite[1]. I tried migrating my existing database workloads from MySQL to Postgres, and queries slowed down so much that I just couldn’t justify making the switch. It could have been a tuning thing, but I’m not familiar enough with Postgres to make the effort.

Anyway, Synapse worked - kind of. There was some trial and error involved, certainly, especially around domain delegation, but I got there. I also finally got IPv6 up and running on my hosting server as part of troubleshooting federation with my friend’s server, which honestly was probably well overdue anyway - so as a side benefit, a lot of my websites got updated to have AAAA records during this time as well. I’m considering writing up everything I discovered as well as the “gotchas” I ran into[2], but unfortunately I’m not sure if I can justify the effort because I ended up shutting the darn thing down after a couple days. Without a dedicated database backend - at least, I’m assuming this was the reason - every time I tried to join a channel on another Matrix server, I was putting myself at risk of crashing my entire hosting box due to out-of-memory issues. At the end of the day, it just wasn’t worth it.
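For anyone curious what the domain delegation piece involves, here is a minimal sketch, not my actual setup. Matrix lets a domain hand off to a homeserver on another host by serving two small JSON documents under `/.well-known/matrix/`. The hostname `matrix.example.com` and the helper name `delegation_documents` are made up for illustration; the JSON keys are the ones the Matrix spec defines.

```python
import json


def delegation_documents(homeserver_host: str, federation_port: int = 443) -> dict[str, str]:
    """Return the two well-known JSON bodies, keyed by the URL path they are served from.

    homeserver_host is a hypothetical host where Synapse is actually reachable.
    """
    # Other homeservers fetch this to learn where to send federation traffic.
    server = {"m.server": f"{homeserver_host}:{federation_port}"}
    # Clients fetch this to discover the homeserver's base URL.
    client = {"m.homeserver": {"base_url": f"https://{homeserver_host}"}}
    return {
        "/.well-known/matrix/server": json.dumps(server, indent=2),
        "/.well-known/matrix/client": json.dumps(client, indent=2),
    }


if __name__ == "__main__":
    for path, body in delegation_documents("matrix.example.com").items():
        print(path)
        print(body)
```

Both files need to be served over HTTPS from the domain that appears in your Matrix IDs, and how you do that depends entirely on your web server.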

While I was working with Matrix, I also looked into self-hosting a Jitsi video chat box since Matrix can integrate with Jitsi for videoconferencing. I won’t even bother going in depth on this one except to say that some of the requirements that the documentation listed were way, way too onerous for me to really consider hosting the server to be a realistic thing. Stuff like “you must make your box’s hostname match the hostname that you want to use for Jitsi”, instead of just having a configuration option, just felt ludicrous to me - so this effort really died before it got off the ground.

That said, I did have one success. I recently spun down my Jenkins CI server in favor of leveraging GitHub Actions instead[3], and that meant I had an AWS EC2 instance reservation lying around unused. I decided to take advantage of it and try spinning up my own fediverse instance; while I’d called home on the fediverse for a long time, self-hosting was always on my eventual roadmap. Of course, when I went looking at interoperable server software, you guessed it - everything required Postgres again. I still felt it was worth giving things a shot, though - having a separate instance meant I could take the time and play around with things and not worry about blowing up my existing hosting. After researching, I started out giving Pleroma a shot - what stood out to me the most the first time I looked at it was how lightweight it was. The other major option, Mastodon, just seemed like it might be too much for the compute capacity I was hoping to use.

Spinning up Pleroma ended up being a breeze. Sure, I had to put Postgres locally on the server, but all of the configuration and the initial setup was very smooth[4], and once I had it up and running the web administration felt pretty good. The problem that I ran into, though, was that I was very used to how Mastodon federated, which feels a lot more aggressive about populating the federated timeline and makes connecting with people on other instances a lot easier. Spinning up a fresh instance to a completely empty timeline just felt wrong, so despite how well Pleroma ran on even an AWS t3.nano instance, I decided to try Mastodon instead.

Sure enough, Mastodon did take more juice to get started. I expected that much. I ended up scaling the server up to do the actual build[5], just because otherwise it would completely lock up when I tried to even install Ruby. That said, once I did scale it up, I had no problem doing the initial install[6]. I set up the account migration - much easier on Mastodon than on Pleroma - and everything just kind of worked. Stuff populated correctly in the federated timeline, and I was able to migrate my followers and my following list over; it really was easy. I did notice something very strange, however - none of the page images were loading! I could upload an avatar - I checked the logs, and definitely saw stuff getting pushed up to S3 - but nothing was coming through on page loads. I tried another computer and it worked fine, so I must have been doing something wrong, but I couldn’t quite figure it out - until I remembered that I was sending the Strict-Transport-Security header, including the preload directive, on my website. Sure enough, the other computer hadn’t visited my website, so it never picked up the preload - once I loaded my website on it for the first time, everything stopped working, which made perfect sense. Still, the fix was easy enough, once I figured out what was actually going wrong; all I had to do was issue myself an SSL certificate, spin up a CloudFront distribution, and stick my S3 bucket behind that. As soon as I did that, sure enough, everything started working immediately.
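To make the failure mode concrete, here is a rough sketch of how a browser decides whether a stored HSTS policy applies to a given hostname. It rests on two assumptions that the post doesn't spell out: that the header also carried `includeSubDomains`, and that the media was being requested from a subdomain without working HTTPS. The hostnames and the function names `parse_hsts` and `forces_https` are hypothetical.

```python
def parse_hsts(header: str) -> dict:
    """Split a Strict-Transport-Security header value into its directives."""
    policy = {"max_age": 0, "include_subdomains": False, "preload": False}
    for part in header.split(";"):
        directive = part.strip().lower()  # directive names are case-insensitive
        if directive.startswith("max-age="):
            policy["max_age"] = int(directive.split("=", 1)[1].strip('"'))
        elif directive == "includesubdomains":
            policy["include_subdomains"] = True
        elif directive == "preload":
            policy["preload"] = True
    return policy


def forces_https(policy: dict, policy_host: str, request_host: str) -> bool:
    """True if a browser that stored this policy for policy_host would
    upgrade a plain-HTTP request to request_host to HTTPS."""
    if policy["max_age"] <= 0:
        return False  # max-age=0 tells the browser to forget the policy
    if request_host == policy_host:
        return True
    # Subdomains are only covered when includeSubDomains was sent.
    return policy["include_subdomains"] and request_host.endswith("." + policy_host)


if __name__ == "__main__":
    stored = parse_hsts("max-age=31536000; includeSubDomains; preload")
    # A browser that has visited example.com now insists on HTTPS here too.
    print(forces_https(stored, "example.com", "media.example.com"))
```

A browser that never visited the main site has no stored policy, which would explain why the second computer loaded images fine until it did.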

All that was left was to see if, after the install, I could downsize the instance to run at less cost. Turns out the answer was yes (I’m currently running on a t3.micro), but with caveats; for the last couple days, I had still been running into OOM issues where the box would just go dead. I looked deeper, and the default configuration of Postgres on the server was for a much larger box than anything I was running at the time. Fortunately, after tuning Postgres a bit to better match my actual capacity, I’m seeing much better memory usage and hopefully I’ve now dispensed with the server just disappearing on me. This one ended up being very positive at the end of the day, and I definitely think I’ll keep it around - so with that in mind, I’ve updated my Mastodon link in the site footer and you can now follow me at @alli@fedi.ajl.io[7]. I still have some tweaking to do - among other things, I want to look into glitch-soc - but I don’t have any plans to get rid of the instance or migrate anywhere else. I really like the idea of managing my own data for this.
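I haven't listed the exact values I settled on, so treat the following as a hedged sketch of the general approach rather than my configuration. It derives a few starting points for `postgresql.conf` from the box's RAM using common rules of thumb (roughly a quarter of Postgres's memory budget for `shared_buffers`, about half for `effective_cache_size`). The function name, the `postgres_share` parameter and the fractions are all assumptions for illustration; the setting names themselves are real Postgres parameters.

```python
def suggest_postgres_settings(
    total_ram_mb: int,
    postgres_share: float = 0.5,  # assumed slice of RAM left for Postgres next to Mastodon
    max_connections: int = 20,    # assumed; small single-user instances need few connections
) -> dict[str, str]:
    """Return rough starting values for a small box that Postgres shares with other services."""
    budget_mb = int(total_ram_mb * postgres_share)
    shared_buffers_mb = budget_mb // 4
    effective_cache_size_mb = budget_mb // 2
    # work_mem is allocated per sort/hash operation, so keep it small when connections multiply.
    work_mem_mb = max(1, budget_mb // (4 * max_connections))
    return {
        "max_connections": str(max_connections),
        "shared_buffers": f"{shared_buffers_mb}MB",
        "effective_cache_size": f"{effective_cache_size_mb}MB",
        "work_mem": f"{work_mem_mb}MB",
    }


if __name__ == "__main__":
    # A t3.micro has 1 GiB of RAM.
    for name, value in suggest_postgres_settings(1024).items():
        print(f"{name} = {value}")
```

Whatever numbers you start from, watch actual memory usage for a few days and adjust, since Mastodon's own processes compete for the same RAM.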

More to come soon. I really want to dig deep on the Jenkins CI > GitHub Actions migration I did, because it went so well and has really made the CI pipelines way more flexible for a couple of my projects, so expect that post in the next couple months.

  1. A concerning trend I’ve run into with these new services is that most of them seem to have first-class support for Postgres, and … that’s it. I really wish there was more abstraction of their database stuff so I didn’t feel so constrained by those decisions - maintaining two separate database backends costs more money. ↩︎

  2. The biggest gotcha: If your mail server doesn’t allow TLSv1.0, the Synapse server (at least at the time I built it) won’t connect to it to send mail (see the GitHub issue for more details on this). It was an underlying library issue, but it was a real frustration for me since I didn’t want to compromise my mail server’s configuration for just this one thing. ↩︎

  3. I have a separate blog post planned on this, and it should be a doozy. Suffice to say it was really fun and not having to maintain a whole Jenkins server for one CI pipeline is just much more relaxing for me generally. ↩︎

  4. The docs were pretty good; I used this page and it got me most of where I needed to go. I had to do a little more research to make Pleroma use my mailserver to send notifications, but it wasn’t particularly complicated at the end of the day - just a couple of lines in the config file. ↩︎

  5. I did the build on a t3.small. I would recommend that anyone else thinking about this do their build tasks on something like a c5.large. You’re gonna be scaling it down very quickly anyway so the additional costs are negligible. ↩︎

  6. Mastodon actually has a fantastic install guide, which I followed line by line. Not a single issue to be found - and the initial configuration generation was fantastic, even configuring my S3 asset caching and SMTP server info for me. Really a smooth experience. ↩︎

  7. I’m still federating all my Mastodon posts to Twitter, so they can be seen there, but direct interaction is much more likely to be found on the Fediverse. ↩︎