Building a comment system for a static site, part 1: Requirements, technologies, and database design
I recently promised a writeup on adding dynamic content to a static website. One of the more useful applications is a comment system for a personal blog. Since I didn’t have one after moving to my Hugo site, I decided this would be a fruitful area of exploration - and sure enough, it’s been really great and I’ve learned a lot (you can now see the results below each post on my site, including this one). I’m going to take the next couple of posts to discuss exactly how this can be set up using serverless technologies like AWS Lambda and DynamoDB, but to start, I want to talk about how I decided what my comment system should look like, what it needed to accomplish, and how I picked the technologies I used to build it.
My design goals for the comment system were very simple. I’ve listed them out below, along with my initial thoughts on how to accomplish those goals. Some of them are super obvious, others less so, but I feel that it’s worth outlining my thought process here.
Quick digression: the concept of “minimum viable product” is core to a lot of product development, and I think it’s important to keep in mind when you start actually building stuff. At the end of the day, what I needed was a comment system - nothing more. That meant there were a few features I could strip out for my first iteration, and to that end I divided my list up into “must haves” and “nice to haves”. All I worried about for the first build were two things: storing and displaying comments. Anything else could wait for a later iteration.
“Must haves”:

- Users should be able to comment on posts
  - Web form to accept comment input and save to a database
- Users should be able to view other comments on posts
  - Retrieve existing comments from the database and sort them appropriately

“Nice to haves”:

- Users should be able to reply to other comments
  - Comment threading or another system (@ mentions?)
- Users should be safe when visiting my site
  - Comment text needs to be web-safe
  - Provide effective moderation options to avoid spam and other malicious comments
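To make the “web-safe” item a little more concrete, here is a minimal sketch of one common approach: escaping HTML special characters before comment text is rendered. The helper name `escapeHtml` is hypothetical and this is only an illustration of the idea, not necessarily how the finished system handles it.

```javascript
// Hypothetical helper: escape the characters that would let comment text
// break out of an HTML text node or attribute value.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(text) {
  // Replace each special character with its entity in a single pass.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<script>alert("hi")</script>'));
// prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

Escaping only covers the “web-safe text” part; moderation and spam control would still need their own handling.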
When I design an application, my first thought is almost always the data structure. I knew I needed to balance flexibility with cost effectiveness, and after thinking a little bit about it I settled on Amazon’s DynamoDB. It was a natural choice, since I’m already pretty tightly integrated with AWS, and I felt that despite its limitations I could work within them to build a system that met my requirements.
Staying in the vein of cost-effectiveness, I knew I wanted to leverage AWS Lambda to do the heavy lifting of handling and retrieving comments. That would keep me off of servers but still allow me to pull in dynamic data. I chose Node as my language for this project because I had some familiarity with it already, there were supplemental packages I could leverage for some of the heavier technical lifts, and I didn’t want to try to learn too many new technologies at once. I decided to split the work into two Lambda functions: one to retrieve data from the database, and one to handle input. It could be done as one function with some extra logic to do different things on a GET versus a POST, but I find that having each function do only one thing keeps the code much clearer than one big omnibus function. I’m sure if I looked hard enough it’d probably be an actual best practice with Lambda, too, since part of optimizing serverless costs is minimizing your processing time.
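The two-function split could be sketched roughly like this. Everything here is an assumption made for illustration: the handler and store names are hypothetical, the event and response shapes follow the API Gateway proxy format (`queryStringParameters`, `body`, `statusCode`), and the database is replaced by an injected in-memory `store` so the sketch runs without AWS.

```javascript
// Sketch only: `store` stands in for the database layer and is injected so
// the handlers can run locally. All names are hypothetical.
function makeGetCommentsHandler(store) {
  // Function #1: read-only, returns the comments for one post.
  return async (event) => {
    const postId = event.queryStringParameters?.postId;
    if (!postId) {
      return { statusCode: 400, body: JSON.stringify({ error: "postId is required" }) };
    }
    const comments = await store.listByPost(postId);
    return { statusCode: 200, body: JSON.stringify(comments) };
  };
}

function makePostCommentHandler(store) {
  // Function #2: write-only, validates and saves one comment.
  return async (event) => {
    let input;
    try {
      input = JSON.parse(event.body ?? "");
    } catch {
      return { statusCode: 400, body: JSON.stringify({ error: "body must be JSON" }) };
    }
    if (!input || !input.postId || !input.text) {
      return { statusCode: 400, body: JSON.stringify({ error: "postId and text are required" }) };
    }
    const comment = {
      postId: input.postId,
      text: input.text,
      createdAt: new Date().toISOString(),
    };
    await store.save(comment);
    return { statusCode: 201, body: JSON.stringify(comment) };
  };
}

// In-memory stand-in for the real table, for local experimentation only.
function makeMemoryStore() {
  const items = [];
  return {
    async save(comment) { items.push(comment); },
    async listByPost(postId) { return items.filter((c) => c.postId === postId); },
  };
}
```

Each handler has one job and one failure mode to reason about, which is the main appeal of the split; a single function would need to branch on the HTTP method before doing any of this.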
This is where DynamoDB came in. The question of data structure is so important when figuring out how to build a solid system, and this was my first time working with a NoSQL database, so I was a little overwhelmed starting out. I did eventually figure out what I was doing, though - once I did enough reading, I got to a pretty good place.
Another digression: a DynamoDB table’s primary key has up to two parts - the “partition key”, which determines how data is distributed across partitions, and an optional “sort key”, which determines how items that share a partition key are ordered when you retrieve them. The one thing that took me a while to grasp is that if you want to sort data based on multiple fields, a common practice is to concatenate those fields together and use that concatenation as your sort key. If I wanted to incorporate comment threading, I knew there were two fields that I would need to sort on - the ID of the parent blog post and the timestamp - so I made sure to incorporate that into the starter code I was putting together. I didn’t have to worry about any other fields, since this was NoSQL - only the key fields have to be defined when creating the table.
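As a rough illustration of the concatenation idea, the sketch below joins a post ID and a timestamp into one string. The helper name `buildSortKey`, the `#` delimiter, and the sample post IDs are all assumptions made for the example, not details from the actual table.

```javascript
// Hypothetical helper: join the fields to sort on into one string.
// "#" is an assumed delimiter; it should not appear in the post ID.
function buildSortKey(postId, createdAt) {
  // toISOString() yields fixed-width UTC timestamps, so comparing the
  // strings gives the same order as comparing the dates.
  return `${postId}#${new Date(createdAt).toISOString()}`;
}

const keys = [
  buildSortKey("post-b", "2019-05-02T09:00:00Z"),
  buildSortKey("post-a", "2019-05-03T12:30:00Z"),
  buildSortKey("post-a", "2019-05-01T08:15:00Z"),
];

// DynamoDB orders string sort keys by their UTF-8 bytes; for ASCII keys
// like these, a plain sort() produces the same order.
keys.sort();
console.log(keys);
// [
//   'post-a#2019-05-01T08:15:00.000Z',
//   'post-a#2019-05-03T12:30:00.000Z',
//   'post-b#2019-05-02T09:00:00.000Z'
// ]
```

The result is grouped by the first field and ordered by the second within each group, which is exactly what sorting on “multiple fields” needs from a single key.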
One gotcha that I noticed when I was setting up my DynamoDB table - it defaults to provisioned mode. If you’re following along and building your own stuff, you may want to switch it over to on-demand - that way you’re only billed for the reads and writes you actually make, which may be better for certain use cases, especially when you’re still in development. (In my case, my blog traffic’s low enough that I’m not going to get much benefit out of provisioning.)
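If you create the table from code rather than the console, the billing mode is one field in the table definition. The sketch below is a plain object in the shape of DynamoDB’s CreateTable request; the table name and the `pk`/`sk` attribute names are placeholders, not the names used on this site. An object like this could be handed to the AWS SDK’s `CreateTableCommand`, which is left out here so the snippet runs on its own.

```javascript
// Placeholder names throughout: "comments", "pk", and "sk" are assumptions.
const createTableParams = {
  TableName: "comments",
  // Only the key attributes are declared; every other field is schemaless.
  AttributeDefinitions: [
    { AttributeName: "pk", AttributeType: "S" },
    { AttributeName: "sk", AttributeType: "S" },
  ],
  KeySchema: [
    { AttributeName: "pk", KeyType: "HASH" },  // partition key
    { AttributeName: "sk", KeyType: "RANGE" }, // sort key
  ],
  // On-demand capacity. The alternative value is "PROVISIONED", which also
  // requires a ProvisionedThroughput setting.
  BillingMode: "PAY_PER_REQUEST",
};

console.log(JSON.stringify(createTableParams, null, 2));
```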
This left me with all of my “must haves” checked for an MVP: write comments to the database, view them on the page. At this point, I had a plan to move forward, and potential solutions for each of my defined requirements. All that was left was to…actually build things. Next up in part 2: I’ll start diving deeper into the technical details for that and start posting code.