An outline for designing multi-layer system architecture


Recently, at one of my positions, I was tasked with improving site performance and page load speed by refining our server architecture. I had given this some thought in the past, but this gave me an opportunity to reconsider the way I look at infrastructure. In one of the projects I’m working on, I had to think a lot about purpose differentiation – having different processes on different servers that communicate with each other, as opposed to having one big behemoth of a program that requires a monster server to run on. I came up with a solution that I’m very happy with, not only because it makes troubleshooting much easier, but also because it allows for easy scaling out of any bottlenecks and maximizes server efficiency. You can think of it like service-oriented architecture, but in the sense of server roles instead of just application services; actually, working with SOA principles in another one of my projects originally gave me the idea for what I did here.

Starting Out

To start thinking about this, let’s talk a little about server architecture. If you’re just starting out in the Linux world, or if you’re not working on a large scale, it’s very possible that you’re going to be running a LAMP, LEMP or similar stack, meaning a web server, PHP, and a database running on one Linux server.

Now, this is all well and good, especially when you’re operating on a smaller scale. In fact, this is still how I run my personal sites – since I’ve only got 10 or so, a unified server works just fine. However, running a larger-scale infrastructure requires something slightly more complex. To that end, let’s think a bit about that discrete-service concept I touched on at the beginning of the post, and talk about how it applies to the first step that most sysadmins will take when scaling out their operation.

When a basic LAMP stack starts to run into performance issues and no amount of tuning will help, most people will move MySQL onto its own server. This tends to help with performance quite a bit, because MySQL and other database servers can be pretty memory intensive, which makes it an easy decision about which process to spin off first. Once you’ve done that, you can start doing fun stuff like MySQL clustering to improve database performance as well. I think of this as the first step – now, you’re treating MySQL as a discrete service which you can scale without having to touch your web server.

Scaling Further

The problem arises when you outgrow that. At some point, the rest of your stack is going to run into a problem, especially if you (as I did) start implementing more technology like Varnish or Redis in addition to everything else. Sure, you can scale the whole box – but what problem does that really solve? You’re duplicating a lot of processes that don’t need to be duplicated just to give more juice to the one that’s slowing you down – plus, it’s a lot harder to figure out exactly which process is bottlenecking you.
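To put rough numbers on that duplication, here is a minimal Python sketch. The memory figures are invented for illustration, not measurements from any real server, and the sketch ignores the extra OS overhead each additional server carries.

```python
# Hypothetical per-process memory footprints (MB) on one all-in-one server.
# These values are made up for illustration only.
STACK_MB = {
    "apache": 512,
    "php": 2048,
    "mysql": 4096,
    "varnish": 1024,
    "redis": 512,
}


def full_clone_cost(stack):
    """Memory added by cloning the entire box once."""
    return sum(stack.values())


def single_role_cost(stack, role):
    """Memory added by one extra server that runs only the slow role."""
    return stack[role]


bottleneck = "php"  # assumed bottleneck for this example
clone = full_clone_cost(STACK_MB)
targeted = single_role_cost(STACK_MB, bottleneck)
wasted = clone - targeted

print(f"Cloning the whole box adds {clone} MB.")
print(f"Scaling only {bottleneck} adds {targeted} MB.")
print(f"{wasted} MB of the clone goes to processes that were not the problem.")
```

With these made-up figures, three quarters of the cloned box is spent on processes that were never the bottleneck.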

Instead of going down the path of infinite horizontal scaling for a full stack, let’s think about how we could apply the discrete service philosophy to our remaining processes. In most cases, the caching stuff like Varnish or Redis is going to be very easy to spin off, so we may as well do that – leaving us with just Apache and PHP processing local files.

There are still two more pieces that we can make discrete, though. First, we can move the storage onto another server. I’m using DRBD in my setup (thanks to Toki Winter for a wonderful writeup on this), which allows for high-availability storage accessible over a network. If you’d rather use a SAN, that’s also a great solution.

We’ve still got one last piece, though, and it’s the one that I just recently spun off: PHP. In a web environment, your web server really isn’t what’s causing the load in most cases. Rather, it’s PHP that is eating your resources. With that in mind, it’s worth looking into spinning off your PHP process into an FPM cluster. (Thanks have to go out to Jamie Alquiza’s incomparable writeup on spinning off FPM. He mentioned using rsync and local file storage on the PHP-FPM cluster, which would certainly be a performance boost as well.)
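One way to keep track of a layout like this is to write it down as plain data: each role maps to the hosts that serve it. The sketch below is only an illustration; the role names and hostnames are hypothetical, and the two helper functions are not part of any tool mentioned in this post.

```python
from collections import Counter

# Hypothetical inventory: role -> hosts serving that role.
# All hostnames are invented for this example.
LAYOUT = {
    "cache": ["cache1"],
    "web": ["web1"],
    "php_fpm": ["fpm1", "fpm2"],
    "storage": ["store1", "store2"],
    "database": ["db1"],
}


def shared_hosts(layout):
    """Return hosts that appear under more than one role, i.e. roles that are not yet discrete."""
    counts = Counter(host for hosts in layout.values() for host in set(hosts))
    return sorted(host for host, n in counts.items() if n > 1)


def scale_out(layout, role, new_host):
    """Return a copy of the layout with one more host added to a single role."""
    updated = {r: list(hosts) for r, hosts in layout.items()}
    updated[role].append(new_host)
    return updated


print("Hosts doing double duty:", shared_hosts(LAYOUT))
bigger = scale_out(LAYOUT, "php_fpm", "fpm3")
print("PHP-FPM hosts after scaling out:", bigger["php_fpm"])
```

The point of `scale_out` is that adding capacity touches exactly one role and leaves every other entry alone, which is the whole appeal of keeping the services discrete.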

Optimizing Your Environment

Now that we’ve got all of our different processes on their own servers, it becomes trivial to find performance bottlenecks. Site running slow? Check your Zabbix dashboard (shameless plug!) and see which server’s getting pegged. From there, it’s a simple task to spin up another server and stick everything behind a HAProxy load balancer (preferably set up with high availability using heartbeat or pacemaker/corosync, depending on your environment) – and voilà, instant bottleneck reduction! Not only that, but this allows you to more efficiently allocate your resources in a virtualized environment. If you spin off all your services and find that only certain servers are really hitting their upper limits on CPU or memory, you can reallocate excess resources directly to the processes that need them.
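The "which server is pegged?" check is simple enough to sketch in a few lines of Python. This is a hedged illustration only: the utilization numbers are typed in by hand rather than pulled from Zabbix or any other monitoring system, the hostnames are hypothetical, and the 85% threshold is an arbitrary assumption.

```python
# Hypothetical utilization snapshot (percent), entered by hand for this example.
# In practice these readings would come from your monitoring system.
METRICS = {
    "web1": {"role": "web", "cpu": 22.0, "mem": 35.0},
    "fpm1": {"role": "php_fpm", "cpu": 96.0, "mem": 71.0},
    "cache1": {"role": "cache", "cpu": 10.0, "mem": 58.0},
    "db1": {"role": "database", "cpu": 41.0, "mem": 88.0},
}


def pegged(metrics, threshold=85.0):
    """Return (host, role, resource, value) for each reading at or above threshold, worst first."""
    hits = [
        (host, m["role"], resource, m[resource])
        for host, m in metrics.items()
        for resource in ("cpu", "mem")
        if m[resource] >= threshold
    ]
    return sorted(hits, key=lambda hit: hit[3], reverse=True)


for host, role, resource, value in pegged(METRICS):
    print(f"{host} ({role}) is at {value:.0f}% {resource}")
```

Because each host carries exactly one role, a hit in this list points straight at the service to scale out or the virtual machine to give more CPU or memory.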

Now, this is all something where your mileage might vary, and solutions might vary too. For example, instead of just spinning off MySQL, I’ve been working on getting a Galera solution working in our development environment – or, if you’re on AWS, moving to Aurora might be right for you. In other cases, problems might be solved via tuning; maybe a PHP memory limit needs to be adjusted, or Apache needs to be tuned. Finally, if you’re already pressed for system resources, spinning off into more systems and adding slightly more OS overhead may not be the best option for you. In a typical situation, though, I think it’s possible to really increase your efficiency with a philosophy like this one.