Leveraging GitHub Actions for CI/CD

posted | about 6 minutes to read

tags: github actions, continuous integration, continuous delivery, devops, django

I did promise a couple of posts ago that I’d write about this, and while I know it’s been almost a month since then, I wanted to make sure I took the time to really do the topic justice.

Until very recently, I was using Jenkins for the continuous integration/continuous delivery needs of the web projects I maintain. It worked fine, but it was far more than I needed, and I wanted to stop paying for a separate CI server1. I was already using serverless CI for my blog via AWS CodeCommit, but I wanted to keep my primary project publicly accessible, and the best way to do that was to leave it on GitHub. Since GitHub has a free, built-in CI option in the form of GitHub Actions, that seemed like the best choice here - plus, being able to leverage other people’s prebuilt actions let me break the deployment process into granular pieces and build more robust workflows than what I’d had before.

The project in question is written in Django, a framework I’ll admit I’m not as familiar with as I’d like to be. I only really came to understand writing Python for the web as a direct result of becoming this project’s maintainer, so my adoption of the features built into Django itself was very piecemeal. Take migrations: instead of using Django’s built-in migration functionality, I was writing raw SQL queries and applying them with some patched-together Bash script. There wasn’t any testing2, there wasn’t any code linting - nothing but a simple deploy and a few quick curls to act as a half-baked “smoke test”.
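To make that old state concrete, here’s a reconstructed sketch of what that kind of script looks like. The real script isn’t in this post, so everything below is illustrative: a plain file stands in for the database-side version table, and `cat` stands in for piping the SQL to the actual database client.

```shell
#!/usr/bin/env bash
# Reconstructed sketch of the old migration approach (illustrative only):
# apply every .sql file whose numeric prefix exceeds the recorded version.
# A plain file stands in for the database-side version table, and `cat`
# stands in for the database client.
set -e
mkdir -p migrations
printf 'CREATE TABLE a (id INT);\n' > migrations/001_init.sql
printf 'ALTER TABLE a ADD COLUMN b INT;\n' > migrations/002_add_b.sql
echo 1 > current_version   # pretend the database says we're at version 1

current=$(cat current_version)
for f in migrations/*.sql; do
  v=$(basename "$f" | cut -d_ -f1 | sed 's/^0*//')   # numeric prefix of the filename
  if [ "$v" -gt "$current" ]; then
    cat "$f"                                         # real script: pipe to the DB client
    echo "$v" > current_version
  fi
done
```

Only the 002 file gets applied here, since the recorded version starts at 1 - which is the whole trick, and also the whole problem: there’s no rollback story at all.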

At the very least, the existing state being so bare-bones provided both room and incentive to improve, and to do so in a modular way. When I started the migration, I decided that instead of ripping Jenkins out wholesale, I’d move one piece at a time - leaving some responsibilities in Jenkins while I transferred others over to GitHub. Under that philosophy, my first step was just to implement code linting - and then work through the 2,000-plus-line report the CI run generated to get the code to a passing state before touching any of the actual deployment steps. This took about a month, but at the end of the day it really improved code quality and readability - even though I ignored some of the linter’s warnings, I followed most of its recommendations and genuinely made things better.

Anyway, this was one of the steps where I used a prebuilt action - being able to leverage prewritten work was excellent, and the results were really comprehensive. It also kept the workflow at this point very simple; in fact, this was the entirety of the first iteration of my GitHub Actions YAML file:

on: push
name: Nonprod CI
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Lint
        uses: wemake-services/wemake-python-styleguide@0.13.3

Once I had this squared away, the next step was to port the existing CI over “as-is”. A lot of this was pretty easy - since these are web files, I just have Git on my webservers and do git checkout $revision (or, in GitHub Actions, the built-in $GITHUB_SHA) for a deploy, then run pip install -r requirements.txt in case there were requirements updates. That’s a simple shell script I didn’t feel much need to modify - simpler than transferring build artifacts, anyway. The twist was that I didn’t have a way to get from my GitHub Actions runner to the webservers, and the official guidance amounted to “eh, just whitelist all of Azure”, which I wasn’t really about. Fortunately, there are workarounds! I configured an API key on the AWS side with permissions scoped to modifying the security group my webservers sit in, created a user on the server for SSH3, and added the AWS access key and secret key to the repository secrets. I then used another prebuilt action to grab the public IP of the runner, and ran some AWS CLI commands to whitelist that IP for the duration of the deployment, e.g.:

      - name: Get runner IP
        # a prebuilt action that exposes the runner's public IP; the exact
        # action used here is an example, not necessarily the original one
        id: ip
        uses: haythem/public-ip@v1.2
      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}
      - name: Add IP to deployment SG
        if: ${{ success() }}
        env:
          SECURITY_GROUP_ID: ${{ secrets.SECURITY_GROUP_ID }}
        run: |
          aws ec2 authorize-security-group-ingress --protocol tcp --port 22 --cidr ${{ steps.ip.outputs.ipv4 }}/32 --group-id $SECURITY_GROUP_ID
          if [ "${{ steps.ip.outputs.ipv4 }}" != "${{ steps.ip.outputs.ipv6 }}" ]; then
            aws ec2 authorize-security-group-ingress --protocol tcp --port 22 --cidr ${{ steps.ip.outputs.ipv6 }}/128 --group-id $SECURITY_GROUP_ID
          fi
          sleep 10
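With the runner’s IP whitelisted, the deploy step itself is just the checkout-and-pip routine described earlier, run over SSH. A minimal sketch of what that step might look like - the `WEB_HOST` and `APP_PATH` secret names and the `deploy` user are hypothetical stand-ins, not names from the real workflow:

```yaml
      - name: Deploy
        if: ${{ success() }}
        run: |
          # WEB_HOST / APP_PATH are placeholder secret names for illustration;
          # $GITHUB_SHA expands on the runner before the command is sent over SSH
          ssh -o StrictHostKeyChecking=accept-new deploy@${{ secrets.WEB_HOST }} \
            "cd ${{ secrets.APP_PATH }} \
             && git fetch origin \
             && git checkout $GITHUB_SHA \
             && pip install -r requirements.txt"
```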

Once the deploy ran, I just closed the door behind me with aws ec2 revoke-security-group-ingress. Simple enough, and it worked flawlessly in testing. I ported the raw SQL database migrations over the same way, added an explicit step at the end for my “smoke test” curls (to be replaced eventually), and everything was good! I split off the production deployment workflow as well, and with that I was able to shut down my Jenkins server in favor of operating entirely off of GitHub Actions. The next step was process improvement.
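One refinement worth calling out on that cleanup: conditioning the revoke step on always() makes it run even when the deploy itself fails, so a broken run never leaves the rule in place. A sketch of such a step, mirroring the authorize commands (step and secret names are illustrative):

```yaml
      - name: Remove IP from deployment SG
        if: ${{ always() }}   # run even on failure so the rule never lingers
        env:
          SECURITY_GROUP_ID: ${{ secrets.SECURITY_GROUP_ID }}
        run: |
          aws ec2 revoke-security-group-ingress --protocol tcp --port 22 --cidr ${{ steps.ip.outputs.ipv4 }}/32 --group-id $SECURITY_GROUP_ID
          if [ "${{ steps.ip.outputs.ipv4 }}" != "${{ steps.ip.outputs.ipv6 }}" ]; then
            aws ec2 revoke-security-group-ingress --protocol tcp --port 22 --cidr ${{ steps.ip.outputs.ipv6 }}/128 --group-id $SECURITY_GROUP_ID
          fi
```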

Top of my list for “make things better” was trashing the old system for database migrations. The shell script I’d written took raw SQL, checked a table in the database for the current migration version, and applied any new files with a version number higher than the one in the database. Simple? Yes. Worked? Yes. But it wasn’t robust, and it didn’t allow for easy rollback. Django migrations, on the other hand, meet those criteria4, so I got my databases up to snuff, added some environment variables so Django could run the migrations from the CI container, and started right in using them. It turned out to be very easy to make work, and while the project-specific implementation of Django migrations is a bit outside the scope of this guide, at the end of the day I had replaced my shell script with the simple one-liner pip install -r requirements.txt && python manage.py migrate --noinput. It even worked the first time - as long as I remembered to actually run python manage.py makemigrations and commit the results5.
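In workflow terms, that one-liner is just another step. A sketch of how it might look - the environment variable names here are hypothetical stand-ins for however your Django settings actually read database credentials:

```yaml
      - name: Run migrations
        env:
          # hypothetical names - use whatever your settings.py actually reads
          DJANGO_DB_HOST: ${{ secrets.DB_HOST }}
          DJANGO_DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
        run: |
          pip install -r requirements.txt
          python manage.py migrate --noinput
```

Since this runs from the CI container rather than the webserver, the database has to be reachable from the runner - the same IP-whitelisting trick covers that.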

Obviously, there’s still plenty of room for improvement - this isn’t the end. Next steps for me include implementing an actual test framework with both unit tests and smoke tests, migrating away from SSH connections for the git checkouts on the webservers, and continuing to work on code-style improvements and enforcing those standards in CI. Still, even getting to this level of automation was huge for me, and I’m excited to push the limits of what’s possible with this technology. If you’re interested in seeing what I’ve got in place now, or want to follow along as I continue to improve processes, this is the current dev workflow for the project, which runs on non-release branches; there’s a separate but very similar workflow for production releases.

  1. I ended up repurposing it for a Mastodon instance instead, but that’s not the point. ↩︎

  2. There still isn’t, technically. One thing at a time. ↩︎

  3. Of course, there’s room for improvement here leveraging Systems Manager, but at this point I was focused on migrating the existing workflow. ↩︎

  4. Much like the many other web frameworks that implement migrations which I also completely neglected to learn about when developing in them. ↩︎

  5. I do remember, most of the time! Adjusting processes I’ve had for 3 or 4 years is challenging sometimes. ↩︎