Orchestrating Workflows with Jenkins and Docker

As part of the rich support for the Docker container system appearing in CloudBees Jenkins Enterprise (CJE) 15.05, we are bundling a new CloudBees Docker Workflow plugin. This plugin combines the flexibility and power of the Jenkins Workflow system with the convenience and efficiency of Docker for building and deploying applications.

In a nutshell, this new plugin adds a special entry point named (surprise!) docker that you can use in any Workflow Groovy script. It offers a number of functions for creating and using Docker images and containers. Broadly speaking, there are two areas of functionality: using Docker images of your own, or created by the worldwide community, to simplify build automation; and creating and testing new images. Some projects will need both aspects, and you can follow along with a complete project that does use both: see the demonstration guide.

Before getting into the details, it is helpful to know the history of configuring build environments in Jenkins. Most project builds have some kind of restrictions on the computer which can run the build. Even if a build script (say, an Ant build.xml) is theoretically self-contained and platform-independent, you have to start somewhere, and say what tools you expect to use. If you are lucky, a responsible developer documented this clearly.

To build me, you need to use Ant 1.9.0 or later, with JDK 7.

If you are less lucky, that documentation may be missing, or run to several pages of dense instructions.

Make sure you have installed the developer package of libqwerty before running ./configure. But not the 2.0 version that comes with newer Ubuntu distributions! That one is broken. Use 1.3.

Well, OK, your long-suffering development tooling engineer sets up a computer with libqwerty-devel-1.3.deb and starts a Jenkins slave agent on it, giving this slave the label libqwerty and making sure the project building these sources requests that label expression.

What about Ant 1.9.0 and JDK 7? Well, these tools are easy enough to download and install anywhere. On the other hand, you probably do not want to keep complete copies of them in your project’s source code. So you want to tell Jenkins, “hey, wherever you wind up building this, download http://archive.apache.org/dist/ant/binaries/apache-ant-1.9.0-bin.tar.bz2 and unpack and run from there”. Since a lot of people needed to do this, back in 2009 I worked with Kohsuke Kawaguchi and Tom Huybrechts to add a facility for “tools” to what was then called Hudson. Now the Jenkins administrator can go to the system configuration page, and say that Ant 1.9.0 and JDK 1.7.0_67 should be offered to projects which want them, downloading and installing from public sites on demand. From a traditional job, this becomes a pulldown option in the project configuration screen, and from a Workflow, you can use the tool step:

node('libqwerty') {
  withEnv(["PATH=${tool 'Ant 1.9.0'}/bin:${env.PATH}"]) {
    sh 'ant dist-package'
  }
  archive 'app.zip'
}

A little better, but this still leaves a lot of room for error. What if you suddenly need Ant 1.9.3—do you need to wait for a Jenkins administrator? Or for libqwerty 1.4, are you going to need (gulp) shell access to the build machine? If you want to scale up to hundreds of builds a day, who is going to maintain all those machines?

A more insidious problem becomes apparent after you have run hundreds of builds a day, for weeks. It turns out your test cycle starts up some test servers, and shuts them down at the end. Well, is supposed to shut them down at the end, except for a few hours last week when someone accidentally deleted the “shut down” part of the script, and a few builds left some processes running until someone noticed the mistake and fixed it. Fortunately Jenkins tries to identify and kill every process a build spawned when it is done, but unfortunately these did not respond to the shutdown signal. So your build server ran out of memory and needed to be rebooted.

As your tests grow more involved, the two-minute build you used to have is now two hours, so you ask Jenkins to run multiple builds in parallel so developers do not need to wait so long for results. This works great for a while, except once in a blue moon the tests fail mysteriously. After some digging, you find out that the test server was being asked to listen on a random TCP port, and one day two builds running at the same time on the same computer happened to ask for the same port, so the second died. Fine, you change your test code to ask the server to find an available port first, then inform the test which port it picked, rather than the other way around.

Oh, and it turns out that a test library you are using scans /tmp/doodles for XML files, which a part of your build creates, and once in a while one build will be in the middle of writing one of these files at the exact instant another build is doing the scanning, which then fails with a nasty message about malformed content. This happens three hours before your release deadline.

The traditional fix for all these nasty problems of reproducibility and interference is to use virtualization. You create a virtual machine image that has all the tools and libraries you need, and add in Java and SSH so that Jenkins can connect to it as a slave. Now the Jenkins administrator just creates a “cloud” based on this image, and ideally sets it to a so-called “one shot” mode: every time a build requesting this cloud is scheduled, a virtual machine is booted from a pristine snapshot, Jenkins starts a slave agent on it when it is ready, the build runs, and then the machine is destroyed. CJE includes a VMware plugin to create such a cloud based on a vCenter installation, or you can use the popular Amazon EC2 plugin, and more.

The oft-mentioned drawback of virtualization is performance: each VM typically consumes a fixed amount of RAM, most of which is overhead or unused; booting a VM, even “resuming”, takes on the order of seconds or minutes; and since the VM image could be hundreds of megabytes, updating it to use that slightly newer libqwerty release can be a bear. On top of that, creating these images can require pretty complicated tooling.

Containers to the rescue! Docker makes it very easy for the project developer to try a stock development-oriented image on Docker Hub, or write a customized one with a short Dockerfile:

FROM webratio/ant:1.9.4
RUN apt-get update && apt-get install -y libqwerty-devel=1.4.0

So how do we get our build to run in this thing? Well, whoever last edited that Dockerfile can build an image and push it to the company Docker registry whenever they make an edit. Then you can use the Docker plugin to define a Docker-based cloud using this image. (CloudBees Jenkins Operations Center 15.05 lets you set up this cloud just once and share it among several Jenkins servers.)

Now the project developer is in full control of the build environment. Gone are the days of “huh, that change compiled on my machine”: anyone can run the Docker image on their laptop to get an environment identical to what Jenkins uses to run the build. Unfortunately, if other projects need different images, the Jenkins administrator will have to get involved again to set up additional clouds. Also there is the annoyance that before using an image you will need to tweak it a bit to make sure it is running the SSH daemon with a predictable user login, and a version of Java new enough to run Jenkins slaves.

What if all this hassle just went away? Let us say the Jenkins administrators guaranteed one thing only:

If you ask to build on a slave with the label docker, then Docker will be installed.

and proceeded to attach a few dozen beefy but totally plain-vanilla Linux cloud slaves. Finally with CloudBees Docker Workflow you can use these build servers as they come.

// OK, here we come
node('docker') {
  // My project sources include both build.xml and a Dockerfile to run it in.
  git 'https://git.mycorp.com/myproject.git'
  // Ready?
  docker.build('mycorp/ant-qwerty:latest').inside {
    sh 'ant dist-package'
  }
  archive 'app.zip'
}

That is it. Click Build Now and you get a slot on one of the shared slaves. (It is already up and running, so that is pretty much immediate.) Your latest source code is checked out, a complete virtual Linux environment is created based on the Dockerfile, and within seconds the right version of Ant is building your program. When Ant is done, everything is cleaned up, and concurrent builds will never interfere with one another (maybe the machine will slow down a little).

Want to try upgrading to libqwerty 1.9.5 at two in the morning? Edit the Dockerfile, build it on your laptop, try the Ant build locally in that container, and if everything looks good commit and push. Your Jenkins administrator remains sound asleep in bed.

Embedded in a few lines of Groovy instructions is a lot of power. First we used docker.build to create a fresh image from a Dockerfile definition. If you are happy with a stock image, there is no need for even this:

node('docker') {
  git 'https://git.mycorp.com/myproject.git'
  docker.image('webratio/ant:1.9.4').inside {
    sh 'ant dist-package'
  }
  archive 'app.zip'
}

docker.image just asks to load a named image from a registry, here the public Hub. .inside asks to start the image in a new throwaway container, then run other build steps inside it. So Jenkins is really running docker exec abc123 ant dist-package behind the scenes. The neat bit is that your single project workspace directory is transparently available inside or outside the container, so you do not need to copy in sources, nor copy out build products. The container does not need to run a Jenkins slave agent, so it need not be “contaminated” with a Java installation or a jenkins user account.
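
To make the shared workspace concrete, here is a minimal sketch reusing the stock image and repository from the example above. The file name build-info.txt is made up for illustration, and the sketch assumes the image provides a standard shell with cat. A file written on the slave is readable inside the container, and the build product written inside the container is visible to the steps that follow outside it.

```groovy
node('docker') {
  git 'https://git.mycorp.com/myproject.git'
  // Runs directly on the slave, outside any container.
  // build-info.txt is a hypothetical file used only for this illustration.
  sh 'date > build-info.txt'
  docker.image('webratio/ant:1.9.4').inside {
    // Runs inside the container, in the same workspace directory,
    // so the file written above is visible here.
    sh 'cat build-info.txt'
    sh 'ant dist-package'
  }
  // Back on the slave: the app.zip produced inside the container
  // is already in the workspace, with no copy step in between.
  sh 'ls -l app.zip'
  archive 'app.zip'
}
```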

The power of Workflow is that structural changes to your build are just a few lines of script away. Need to try building the same sources twice, at the same time, in different environments?

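// The parameter is named image rather than env so it does not shadow Workflow's global env variable.
def buildIn(image) {
  node('docker') {
    git 'https://git.mycorp.com/myproject.git'
    docker.image(image).inside {
      sh 'ant dist-package'
    }
  }
}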
parallel older: {
  buildIn 'webratio/ant:1.9.3'
}, newer: {
  buildIn 'webratio/ant:1.9.4'
}
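
If the list of environments grows, the same idea can be driven from a list rather than spelled out branch by branch. The following is only a sketch: it relies on the parallel step accepting a map of branch names to closures, and the list of image names is an assumption you would replace with your own.

```groovy
// Build the project inside the named image on any Docker-capable slave.
def buildIn(image) {
  node('docker') {
    git 'https://git.mycorp.com/myproject.git'
    docker.image(image).inside {
      sh 'ant dist-package'
    }
  }
}

// Assumed list of images to try; edit to taste.
def images = ['webratio/ant:1.9.3', 'webratio/ant:1.9.4']
def branches = [:]
for (int i = 0; i < images.size(); i++) {
  // Fresh variable per iteration, so each closure captures its own image name.
  def imageName = images[i]
  branches[imageName] = { buildIn imageName }
}
// One parallel branch per image, each labeled with the image name.
parallel branches
```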

So far everything I have talked about assumes that Docker is “just” the best way to set up a clear, reproducible, fast build environment. But the main use for Docker is of course to simplify deployment of applications to production. We already saw docker.build creating images, but what then? You would want to test them from Jenkins too. To that end, you can .run an image while you perform some tests against it. And you can .push an image to the public or an internal, password-protected Docker registry, where it is ready for production systems to deploy it.
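
As a rough sketch of that build, test, and push cycle (not the demo script itself), the following uses withRun, the block form of .run that stops and removes the container when the block ends. The image name mycorp/myapp, the integration-test Ant target, the app.url property, port 8080, the registry URL, and the credentials ID are all placeholders invented for this example.

```groovy
node('docker') {
  git 'https://git.mycorp.com/myproject.git'
  // Assumption: the repository root holds the Dockerfile for the application image.
  def app = docker.build('mycorp/myapp')

  // Start a throwaway container from the freshly built image.
  app.withRun { c ->
    // Run the tests in a second container linked to the first, so no host
    // port is published and concurrent builds cannot collide.
    docker.image('webratio/ant:1.9.4').inside("--link=${c.id}:app") {
      // Hypothetical Ant target and property; the application is assumed
      // to listen on port 8080 inside its container.
      sh 'ant -Dapp.url=http://app:8080/ integration-test'
    }
  }

  // Placeholder registry URL and Jenkins credentials ID.
  docker.withRegistry('https://docker.mycorp.com/', 'docker-registry-login') {
    // Only reached if the tests above passed.
    app.push('latest')
  }
}
```

Because a failing sh step aborts the build, the push happens only when the tests pass.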

This is getting a little long to write here, so look at the demo script to see all of those things happening. The demo highlights that you can use multiple containers running concurrently to test the interaction between systems. In the future we may want to build on Docker Compose to make it even easier to set up and tear down complex assemblies of software, all from a simple Jenkins workflow script making use of freestanding technologies. You can even keep that flow script in source control, too, so everything interesting about how the project is built is controlled by a handful of small text files.

By this point you should see how Jenkins and Docker can work together to empower developers to define their exact build environment and reproducibly produce application binaries ready for operations to use, all with minimal configuration of Jenkins itself. Download CJE 15.05 or install CloudBees Docker Workflow on any Jenkins 1.596+ server and get started today!

 

Jesse Glick
Developer
CloudBees

 
