Using Docker to Parallelize Rails Tests

Written by: Nick Gauthier

I previously published this article on my personal blog, but thought I would share it today with our Codeship readers. Enjoy!

Docker is a new way to containerize services. The primary use so far has been for deploying services in a very thin container. I experimented with using it for continuous integration so that I could run our Rails tests in a consistent environment, and then realized that Docker's encapsulation also makes it a good fit for parallelizing Rails test suites.

There are three things we'll have to do to run a Rails test suite in parallel using Docker:

  1. Create a Dockerfile for a system that can run our Rails tests, and build the container

  2. Create a script that breaks our tests across multiple containers

  3. Create a script to run the Rails tests using Docker

Dockerfile and Container

Dockerfiles are quite simple. Most of the commands are usually RUN commands that run instructions on the container. Let's take a look:

from tianon/debian:sid
maintainer Nick Gauthier
run apt-get update
run apt-get -qy dist-upgrade
run apt-get install -yq postgresql-9.3 libpq-dev nodejs ruby2.0 ruby2.0-dev build-essential
run gem install bundler --no-ri --no-rdoc --pre

The FROM line bases our container off of debian unstable. I'm using this source because it has ruby 2.0 and postgresql 9.3 right in apt, so the installation is minimal and fast.

Then we update the system and install postgresql, node (for assets), ruby, and the build libraries for gem extensions.

Finally, we install bundler.

Now, we can build our container via:

docker build -t username/appname .

That will build the current directory's container and tag it with username/appname (so you should replace that with your name and your app's name). I am not sure yet how to do this in a more portable and anonymous fashion.

Parallelization Script Using Docker

Next, we're going to write bin/docker-ci. The goal of this script is to split our rails tests across multiple containers, and ultimately call bin/ci within the containers using docker.

#!/usr/bin/env bash
set -e
# Make our tmp directory for gems
mkdir -p /tmp/docker
# Docker options:
# Mount the current directory to /data/code
# Mount the temp directory to /data/gems
# Set GEM_HOME to the data directory
# Set the working directory to the code directory
# Use our built container
opts="-v `readlink -f .`:/data/code
   -v /tmp/docker:/data/gems
   -e GEM_HOME=/data/gems
   -w /data/code
   username/appname"
# Bundle the gems (once, serially)
docker run $opts bundle --quiet
# Spread test files in large groups, and pass them into the
# container's bin/ci method
ls test/**/*_test.rb | parallel -X docker run $opts /data/code/bin/ci

The -v options allow us to share the current machine's code directory with the container. One issue here is that any file system operations within the code folder could conflict across containers.
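
To make the option list easier to read, here is a minimal Ruby sketch that assembles the same docker run arguments as an array. The helper name docker_run_args and its keyword arguments are my own illustration, not part of the original script; the image name is the username/appname tag used in the build step.

```ruby
# Sketch: build the `docker run` argument list described above.
# The helper name and its defaults are illustrative assumptions.
def docker_run_args(code_dir:, gems_dir: "/tmp/docker", image: "username/appname", command: [])
  [
    "docker", "run",
    "-v", "#{File.expand_path(code_dir)}:/data/code", # share the code directory
    "-v", "#{gems_dir}:/data/gems",                   # share the gem directory
    "-e", "GEM_HOME=/data/gems",                      # install gems into the shared directory
    "-w", "/data/code",                               # start in the code directory
    image,
    *command
  ]
end

args = docker_run_args(code_dir: ".", command: ["bundle", "--quiet"])
puts args.join(" ")
```

Passing the array to system(*args) would run the command without going through a shell, so paths containing spaces would not be split.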

We're using GNU parallel with the -X flag, which will spread the test files into larger chunks, as opposed to one job per test file. I don't think this perfectly utilizes all the cores on my machine, so some more tweaking could be done here.
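
If you want more control over how files are grouped, the split can be done by hand. The following is a rough sketch of a round-robin split in Ruby; it is not GNU parallel's actual algorithm, and the method name and file names are hypothetical.

```ruby
# Sketch: spread test files across a fixed number of groups, one group
# per container. This is a simple round-robin split, not what GNU
# parallel does internally.
def split_into_groups(files, group_count)
  raise ArgumentError, "group_count must be positive" unless group_count.positive?

  # Never create more groups than there are files.
  groups = Array.new([group_count, files.size].min) { [] }
  files.each_with_index { |file, i| groups[i % groups.size] << file }
  groups
end

# Hypothetical file names, for illustration only.
files = %w[
  test/models/user_test.rb
  test/models/post_test.rb
  test/controllers/users_controller_test.rb
  test/controllers/posts_controller_test.rb
  test/helpers/application_helper_test.rb
]

split_into_groups(files, 2).each_with_index do |group, i|
  puts "container #{i + 1}: #{group.join(' ')}"
end
```

Each group could then be passed to one docker run invocation. Splitting by file count ignores how long each file takes to run, so the groups may still finish at different times.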

At this point, bin/ci is run with one or more test files as parameters.

Test running script

The bin/ci script needs to run a set of test files, and it will also need to initialize the container so that the suite can run.

#!/usr/bin/env bash
set -e
# Start postgresql
service postgresql start
# Create the db
su -c "createuser root -s" postgres
# prep the db
bundle exec rake db:test:prepare
# require test files from the arguments given to this script
ruby -I.:test -e "ARGV.each{|f| require f}" "$@"

We have to boot up and initialize postgresql because containers don't preserve running services; they are simply file systems that can be booted up. We also do this on every run because we'd rather load the db each time than build it into the container and have to rebuild the container whenever our schema changes.

I'm using ruby with require, but here you could substitute anything that says "run the following test files". rspec's binary would work well, as would the m binary. I just stuck a simple ruby script here that should be suite-agnostic.
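
For reference, the one-liner can be written out as a small Ruby file. This is only a sketch: the method name require_test_files is my own, and it assumes each test file registers its tests when loaded (for example through an autorun-style test framework).

```ruby
# Sketch of the one-liner as a method: require each test file by
# absolute path so loading does not depend on the load path.
def require_test_files(paths)
  paths.map do |path|
    full_path = File.expand_path(path)
    require full_path
    full_path
  end
end

# When run as a script, load every file named on the command line.
require_test_files(ARGV) if $PROGRAM_NAME == __FILE__
```

Using absolute paths is a design choice for the sketch; the original relies on -I.:test to resolve relative paths instead.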


And that's it! It's actually fairly simple, but it took me a while to stick everything together. I think there are certainly some refinements to be made to generalize it a bit better. For example, you could use any container from the docker index that provides a good rails base for your app. That way you wouldn't have to maintain a Dockerfile in the project.

Also, I'm not currently seeing any performance improvements due to the parallelization, but that's because it's a very short suite, so the overhead of doing a bundle check and db initialization outweighs the savings of parallelism.
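
A back-of-the-envelope model shows why a short suite sees no gain. The sketch below assumes each container pays a fixed setup cost and that test time divides evenly across containers; the method name and all numbers are hypothetical, not measurements from my suite.

```ruby
# Sketch: a simple cost model. Each container pays a fixed setup cost
# (bundle check, database setup), and test time divides evenly.
def estimated_wall_time(suite_seconds:, setup_seconds:, containers:)
  raise ArgumentError, "containers must be positive" unless containers.positive?

  setup_seconds + suite_seconds.to_f / containers
end

# Hypothetical numbers, not measurements.
short_suite = estimated_wall_time(suite_seconds: 20, setup_seconds: 15, containers: 4)
long_suite  = estimated_wall_time(suite_seconds: 600, setup_seconds: 15, containers: 4)

puts "short suite: #{short_suite}s in containers vs 20s serially"
puts "long suite:  #{long_suite}s in containers vs 600s serially"
```

With these made-up numbers the 20-second suite takes about as long either way, while the 600-second suite drops to under three minutes. Real suites rarely split this evenly, so treat the model as an upper bound on the benefit.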

Try it on your app; I'd love to hear the results.

Disclaimer: This article was originally written in October of 2013, and a lot has changed in Docker since then. The exact examples listed in this blog post may no longer work but the technique is still quite relevant. That's why I decided to post it on the Codeship Blog. If you find a bug or have an update please let me know in the comments! Thanks, Nick.
