UPDATE: As of January 1, 2017, we rebranded our hosted CI platform for Docker from “Jet” to what is now known as “Codeship Pro”. Please be aware that the name “Jet” now refers only to our local development CLI tool. The Jet CLI is used to locally debug and test builds for Codeship Pro, as well as to assist with several important tasks like encrypting secure credentials.
For some, the practice of continuous integration (CI) and continuous delivery/deployment (CD) is part of daily life and comes as second nature. However, as I learned while attending a couple of conferences recently, there are still many who aren't using any form of automated testing, let alone CI/CD. Those not already practicing CI/CD expressed a desire to do so but either didn't know where to start or lacked the support of their employers to invest time in it.
This article is intended to serve as an introduction to CI/CD, exploring the definitions of the various terms and providing helpful strategies that will make CI/CD easier and more effective.
If you’re already practicing CI/CD, this may be a refresher, but please add any additional tips to the comments below, and together we can make this a more valuable resource for everyone.
Continuous Integration and Automated Testing
Continuous integration (CI) is the automated process of integrating code from potentially multiple sources in order to build and test it. Exactly what happens during this phase depends on the type of application, but it most often refers to running one or more types of automated tests.
Automated testing comes in many flavors, such as unit tests, service/API tests and functional/GUI tests. I’ll provide a brief overview of each here, but you can easily find more information about them elsewhere online.
Unit: Performs tests at a low level in the code, typically testing individual functions. The goal of unit testing is to ensure that individual functions do what they are supposed to. Some professionals argue that unit tests should never touch other systems such as databases; others, myself included, are comfortable letting them do so. Most of my unit tests involve database models, and testing that they interact with the database and with each other as expected is important. In an MVC (model-view-controller) application, unit tests cover the model layer as well as any helper/utility functions.
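To make that concrete, here is a minimal sketch of a unit test using pytest. The `calculate_discount` function is a hypothetical helper used only for illustration, not part of any real codebase:

```python
# test_pricing.py -- a minimal unit test sketch (pytest).
# calculate_discount is a hypothetical helper used only for illustration.
import pytest


def calculate_discount(subtotal, percent):
    """Return the subtotal reduced by the given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(subtotal * (1 - percent / 100), 2)


def test_applies_percentage_discount():
    assert calculate_discount(100.00, 25) == 75.00


def test_rejects_invalid_percentage():
    with pytest.raises(ValueError):
        calculate_discount(100.00, 150)
```

Running `pytest` in the project directory picks up and executes both tests in milliseconds, which is exactly why unit tests form the fast inner loop of a CI build.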
Services/APIs: Tests code at integration points such as APIs to ensure the API logic is sound, functions as expected, and does not expose any security risks. In an MVC application, controllers typically provide the integration points, such as RESTful APIs or the interaction between forms in a browser and the server-side application. API tests differ from unit tests in that they are not aware of what code is being executed; they only verify that calling a particular endpoint, which may execute any number of backend functions, returns the expected response.
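For example, here is a small sketch of an API test using pytest and requests. The base URL and the `/api/users` endpoint are hypothetical stand-ins for your own application:

```python
# test_users_api.py -- a minimal API test sketch (pytest + requests).
# API_BASE_URL and the /api/users endpoint are hypothetical; point them
# at a running test instance of your own application.
import os

import requests

BASE_URL = os.environ.get("API_BASE_URL", "http://localhost:8000")


def test_list_users_returns_json_collection():
    resp = requests.get(f"{BASE_URL}/api/users", timeout=5)
    assert resp.status_code == 200
    assert resp.headers["Content-Type"].startswith("application/json")
    assert isinstance(resp.json(), list)


def test_unauthenticated_create_is_rejected():
    # A security-minded check: creating a user without credentials should fail.
    resp = requests.post(f"{BASE_URL}/api/users", json={"name": "Eve"}, timeout=5)
    assert resp.status_code in (401, 403)
```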
Functional/GUI: Most applications provide some sort of interface for end users to interact with, whether that is a web page, a desktop application, or a mobile application. Functional/GUI tests simulate users interacting with that interface to ensure it works under a variety of conditions. For web applications, this often means using something like Selenium to simulate users on multiple browser and operating system combinations. In an MVC application, these tests primarily cover the view layer; however, depending on how you build your views, they may exercise the controller and model layers as well.
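A bare-bones functional test might look like the sketch below, which assumes a locally installed Chrome browser and a hypothetical `/login` page:

```python
# test_login_page.py -- a minimal functional/GUI test sketch (Selenium).
# Assumes a local Chrome install; the /login page and field names are
# hypothetical placeholders for your own application.
import os

from selenium import webdriver
from selenium.webdriver.common.by import By

BASE_URL = os.environ.get("APP_BASE_URL", "http://localhost:8000")


def test_login_form_is_present():
    driver = webdriver.Chrome()
    try:
        driver.get(f"{BASE_URL}/login")
        assert driver.find_element(By.NAME, "email").is_displayed()
        assert driver.find_element(By.NAME, "password").is_displayed()
    finally:
        driver.quit()
```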
Having automated tests alone is great, but their real value comes into play when they are connected with your source control platform to automatically run every time you push updated code. This connection and automation is what makes continuous integration possible.
Occasionally, I run tests locally before I push updated code to GitHub, but I usually don’t because I know they’ll be run automatically anyway and I’ll be notified if they fail. Not having to run them locally every time I make changes saves me a lot of time.
My first experience with automated testing and continuous integration was in 2006, using CruiseControl to execute PHPUnit tests and refresh development and QA environments. At the time, automatically deploying to production as well was a pie-in-the-sky notion that only a few crazy startup companies were attempting. Now I can't imagine working in an environment without CI/CD. The thought of manually testing and deploying something just brings back painful memories.
The integration between source control and a continuous integration service is usually done using webhooks, which allow the source control system to notify CI services any time the code is updated. Some CI systems can also be configured to poll the source repo for changes or be scheduled to run on a given interval, such as daily.
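To illustrate the mechanics, here is a bare-bones sketch of the kind of push-webhook receiver a CI service exposes, written with Flask. The `trigger_build` function is a hypothetical stand-in for queueing a build, and the payload fields follow GitHub's push event format; other providers use different field names:

```python
# webhook_listener.py -- a bare-bones sketch of a CI push-webhook receiver.
# Real CI services also verify a shared-secret signature on the payload.
# trigger_build() is a hypothetical stand-in for "queue a build".
from flask import Flask, request

app = Flask(__name__)


def trigger_build(repo, branch, commit):
    print(f"Queueing build for {repo}@{branch} ({commit[:7]})")


@app.route("/hooks/push", methods=["POST"])
def on_push():
    payload = request.get_json(force=True)
    # Field names follow GitHub's push event payload.
    trigger_build(
        repo=payload["repository"]["full_name"],
        branch=payload["ref"].removeprefix("refs/heads/"),
        commit=payload["after"],
    )
    return "", 204


if __name__ == "__main__":
    app.run(port=8080)
```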
Choosing CI services
When it comes to choosing a CI service, there are many options and considerations to take into account. Some allow you to run the service on your own servers in your own data center, but there are also many hosted CI/CD services that provide compelling benefits.
For example, most hosted CI services run builds/tests in dedicated single-use containers, which guarantees a consistent and fresh environment for every build. They also take over the burden of maintaining the services, which is really helpful for small teams who don’t have the resources or the desire to manage additional services themselves.
Full disclosure: I use Codeship and love it. I use both Codeship’s classic infrastructure for executing tests and deployments for non-Docker based applications as well as Codeship Pro, Codeship’s new Docker infrastructure to build, test, push, and deploy my Docker-based applications.
Each form of automated testing that I talked about earlier has different implications for continuous integration. For example, if unit tests require a database to perform their work, the CI environment must provide that database. Thankfully, most hosted CI services have all the major database platforms available by default.
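A common pattern is to read the database connection details from environment variables so the same tests run against a local database or the one the CI service provides. Here is a sketch using pytest and psycopg2; the `DATABASE_URL` variable is a convention of this example, not a requirement of any particular CI service:

```python
# test_db.py -- a sketch of a database-backed test that works locally and on CI.
# DATABASE_URL is a convention used here; set it to whatever Postgres instance
# your CI environment provides.
import os

import psycopg2
import pytest


@pytest.fixture()
def db_connection():
    conn = psycopg2.connect(
        os.environ.get("DATABASE_URL", "postgresql://localhost/myapp_test")
    )
    yield conn
    conn.rollback()  # keep tests isolated from each other
    conn.close()


def test_can_query_database(db_connection):
    with db_connection.cursor() as cur:
        cur.execute("SELECT 1")
        assert cur.fetchone() == (1,)
```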
For API testing, the CI service needs to allow you to run a server that can receive and process the API calls. This may require the ability to install additional packages and services into the CI environment; when you're choosing a provider, be sure you understand how to run your software in it and whether or not you can install missing packages.
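One simple approach is to boot the application as a background process, wait for its port to open, run the API tests, and then shut it down. The sketch below uses a hypothetical `python -m myapp.server` start command as a placeholder for however you launch your app:

```python
# run_api_tests.py -- a sketch of booting the app before API tests in CI.
# "python -m myapp.server" is a placeholder for your own start command.
import socket
import subprocess
import sys
import time


def wait_for_port(host, port, timeout=30):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(0.5)
    return False


server = subprocess.Popen([sys.executable, "-m", "myapp.server"])
try:
    if not wait_for_port("localhost", 8000):
        raise RuntimeError("app server never came up")
    exit_code = subprocess.call(["pytest", "tests/api"])
finally:
    server.terminate()
    server.wait()

sys.exit(exit_code)
```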
Functional testing can be the most awkward form of testing to perform in a CI environment, considering the wide range of software and platforms needed to perform the tests. My team has been running Selenium tests locally for some time now to do browser testing, but recently we took the time to learn how to use Sauce Labs, a hosted Selenium testing service, to execute functional browser tests against our code during CI.
Sauce Labs has a product called Sauce Connect that establishes a tunnel between your CI environment and their servers to allow them to make web requests to your application. So when our tests are running on Codeship, we start a web server for our app and then use Sauce Connect to establish a tunnel with Sauce Labs, and finally we run our Selenium tests.
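The test code itself changes very little: instead of driving a local browser, you point Selenium's Remote WebDriver at Sauce Labs through the tunnel. The sketch below is illustrative only; check Sauce Labs' documentation for the current endpoint URL and capability names, which have changed over time:

```python
# test_home_page_saucelabs.py -- a sketch of running a Selenium test remotely
# on Sauce Labs through a Sauce Connect tunnel. The endpoint URL and the
# capability names are illustrative; confirm them against Sauce Labs' docs.
import os

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def test_home_page_title():
    options = Options()
    options.browser_version = "latest"
    options.platform_name = "Windows 11"
    options.set_capability("sauce:options", {
        "username": os.environ["SAUCE_USERNAME"],
        "accessKey": os.environ["SAUCE_ACCESS_KEY"],
        "tunnelName": os.environ.get("SAUCE_TUNNEL_NAME", "ci-tunnel"),
    })
    driver = webdriver.Remote(
        command_executor="https://ondemand.us-west-1.saucelabs.com/wd/hub",
        options=options,
    )
    try:
        # The local app server is reachable through the Sauce Connect tunnel.
        driver.get("http://localhost:8000/")
        assert driver.title  # replace with a real assertion for your app
    finally:
        driver.quit()
```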
This was a great enhancement for us. Now, whatever OS a developer is working on, their pushed code will be tested on several versions of Windows, macOS, and Linux in half a dozen browsers, all automatically and without tying up their computer while the tests run.
Speeding things up with parallel testing
Running all these forms of automated tests can be time consuming, especially the Selenium tests. A core value in CI is that builds are fast so that developers get feedback as quickly as possible.
Planning for parallel execution of tests can help speed up the testing process significantly. Rather than running unit tests, then API tests, then functional tests in serial, some CI providers support parallel testing, which allows all three to run simultaneously.
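Even if your provider doesn't offer built-in parallel pipelines, you can capture some of the benefit by launching independent suites concurrently yourself. The sketch below runs three suites at once and fails the build if any of them fails; the suite commands and directory layout are placeholders for your own:

```python
# run_suites_in_parallel.py -- a sketch of running independent test suites
# concurrently. The suite commands and paths are placeholders.
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

SUITES = {
    "unit": ["pytest", "tests/unit"],
    "api": ["pytest", "tests/api"],
    "functional": ["pytest", "tests/functional"],
}


def run_suite(name, cmd):
    result = subprocess.run(cmd)
    return name, result.returncode


with ThreadPoolExecutor(max_workers=len(SUITES)) as pool:
    results = dict(pool.map(lambda item: run_suite(*item), SUITES.items()))

for name, code in results.items():
    status = "passed" if code == 0 else f"failed with exit code {code}"
    print(f"{name}: {status}")

sys.exit(0 if all(code == 0 for code in results.values()) else 1)
```

Be aware that the console output from the suites will interleave; a real CI provider's parallel pipelines keep each suite's logs separate.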
At DockerCon this year, I was able to attend a talk by Codeship’s own Laura Frank on “Efficient parallel testing with Docker” in which Laura walks through the principles and challenges of parallel testing. While she talks about how Docker helps in this process, the principles involved aren't limited to Docker alone and would be useful for anyone interested in parallel testing.
Okay, that was a lot about continuous integration, so let’s move on to continuous deployment.
Continuous Delivery/Continuous Deployment
The CD in CI/CD can refer to either continuous deployment or continuous delivery. In some contexts, what is being built, tested, and delivered does not involve deploying an application, so the term "deployment" doesn't make sense.
For example, libraries and artifacts are typically not deployed to running systems but rather built and pushed into a repository for other applications to consume. So if you maintain a PHP library, you would “deliver” it to packagist.org after a successful CI run. If you’re writing a NodeJS library, you would “deliver” it to npmjs.org.
Automatically deploying or delivering something can be scary at first. The key to making it less scary is having a robust testing and integration process that you can trust, as well as a strategy for what gets deployed and when.
Achieving 100 percent test coverage for your application can be really difficult. However, you can make testing sustainable by ensuring that you cover the most important components of your application and that you're ready to write new tests as issues are discovered.
I’ve found that testing my API can be more valuable than just unit testing. By testing my APIs, I’m also testing all the supporting functions behind them. I still test both, but I focus more on the API tests and any permutation of them I can think of for functionality and security.
Exploring a sample deployment strategy
Another way to reduce risk and increase confidence is through peer code review and deployment strategies based on source control branching. We use a version of the Git Flow workflow by Vincent Driessen.
In our environment, the develop branch maps to our staging environment and the master branch maps to our production environment. This means that any time code is pushed to the develop or master branch, a CI phase is initiated. Once that phase succeeds, the CD phase begins and moves the code to the appropriate environment.
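The branch-to-environment mapping can live in a small script that the CD phase runs. Here is an illustrative sketch; the `CI_BRANCH` variable and the `deploy_to` helper are placeholders rather than any particular provider's API:

```python
# deploy.py -- a sketch of mapping Git branches to deployment targets.
# CI_BRANCH and deploy_to() are illustrative; most CI services expose the
# current branch name through some environment variable.
import os
import sys

BRANCH_TO_ENV = {
    "develop": "staging",
    "master": "production",
}


def deploy_to(environment):
    print(f"Deploying to {environment}...")  # replace with real deploy logic


branch = os.environ.get("CI_BRANCH", "")
target = BRANCH_TO_ENV.get(branch)

if target is None:
    # Feature branches get CI (tests) but no CD (deployment).
    print(f"Branch {branch!r} is not mapped to an environment; skipping deploy.")
    sys.exit(0)

deploy_to(target)
```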
We do all of our development in feature branches that come off of develop. Whenever we push to a feature branch, only the CI process is kicked off to run the tests; the code is not automatically deployed anywhere. When we believe the code in a feature branch is ready to be integrated into develop and go through the release process, we create a pull request to initiate and facilitate a peer review.
This way we're never deploying code to production that others in the team have not seen and reviewed. We’ve been able to identify many potential bugs in this process, write tests to watch for them in the future, and save ourselves from the embarrassment of those bugs making it to production.
Of course, we’re not perfect. We still release bugs to production. But when we fix them, we also put a test in place to prevent regression bugs in the future.
Depending on the size of your team, who is contributing code to your project, and your overall level of trust, you may also want to restrict who can push to certain branches. Both Bitbucket and GitHub support branch permissions that limit who is allowed to push code into specific branches. This allows a smaller group of release managers to take on that responsibility and prevents developers from accidentally pushing to a CD-enabled branch and deploying code.
Planning for CI/CD
If you haven’t planned for CI/CD in the process of developing your application, you may have a harder time implementing it. What I mean by this is that applications today rarely stand alone without dependencies on any external services. Integration with external services from a CI environment may be dangerous or even prohibited.
For example, several years ago I worked on an application that integrated with a couple of LDAP servers, Google Apps APIs, and Gmail. When I wrote my unit tests, they were able to run locally and interact with these external services. However, I hadn't planned ahead. When I tried to move the app to a hosted CI service, I quickly learned that some of my tests wouldn't work and others were just a really bad idea to run.
The test environment LDAP servers I needed to connect to were only accessible from the LAN at my office, so I couldn't connect to them from a hosted service. The tests running against Google Apps and Gmail were a bad idea because both of those environments are production. I ended up creating lots of test accounts in our Google Apps instance and got our account blocked a couple of times due to rate limits.
Thankfully, I’ve learned from that. Now we follow better design patterns in our applications and utilize dependency injection strategies so that in a CI environment, we can inject mocked versions of services to prevent any calls to external services.
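Here is a minimal sketch of that pattern: the production code receives its external client as a constructor argument, and the test injects a mock so nothing ever reaches a real LDAP server. The `UserDirectory` class and its client interface are hypothetical:

```python
# test_user_directory.py -- a sketch of dependency injection plus mocking,
# so tests never reach a real LDAP server. UserDirectory is hypothetical.
from unittest.mock import Mock


class UserDirectory:
    """Looks up users through an injected LDAP-style client."""

    def __init__(self, ldap_client):
        self._ldap = ldap_client

    def email_for(self, username):
        record = self._ldap.find_user(username)
        return record["mail"].lower() if record else None


def test_email_lookup_uses_injected_client():
    fake_ldap = Mock()
    fake_ldap.find_user.return_value = {"mail": "Jane.Doe@Example.com"}

    directory = UserDirectory(fake_ldap)

    assert directory.email_for("jdoe") == "jane.doe@example.com"
    fake_ldap.find_user.assert_called_once_with("jdoe")
```

In production we construct `UserDirectory` with the real client; in CI we construct it with a mock, and no network calls ever leave the build container.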
This has the added benefit of speeding up our tests because we don't have to wait on external API calls. For a while we even used a hosted API mocking service, but we eventually had to replace it with local mocks because it would occasionally block us for rate limits or fail for its own reasons. That in turn would cause our builds to fail, which isn't good: I only want builds to fail if it's our fault.
Keeping environmental and secret information out of our source code is always a good idea, but you may need to include some default values for test execution. We use a combination of environment variables and other secret-injection techniques to provide configuration information to our applications in each environment. For testing, however, we include files in the source control repository that we use only during automated tests. We’ve found this to also be helpful in getting others up and running with a copy of the application quickly using test configuration with mocks and stubs.
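One way to structure this is a small loader that prefers environment variables and falls back to a test configuration file committed to the repo. The sketch below is illustrative only; the file name and keys are placeholders:

```python
# config.py -- a sketch of layered configuration: environment variables win,
# and a checked-in test config (config/test.json) provides safe defaults for
# CI runs and new developers. File names and keys are illustrative.
import json
import os
from pathlib import Path


def load_config(env=None):
    env = env or os.environ.get("APP_ENV", "test")
    defaults = {}
    config_file = Path("config") / f"{env}.json"
    if config_file.exists():
        defaults = json.loads(config_file.read_text())

    # Environment variables override anything in the file; real secrets
    # should only ever arrive this way in staging and production.
    overrides = {
        "database_url": os.environ.get("DATABASE_URL"),
        "api_base_url": os.environ.get("API_BASE_URL"),
        "use_mocks": os.environ.get("USE_MOCKS"),
    }
    return {**defaults, **{k: v for k, v in overrides.items() if v is not None}}


if __name__ == "__main__":
    print(load_config())
```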
Planning ahead and building out this configuration as you build your app helps prepare for CI/CD, provides documentation for needed configuration, and gives you something to run with before integrating with real services.
The Business Value of CI/CD
As I mentioned earlier, automatically deploying updates can be scary. However, the flip side is that when things do go wrong and bugs are deployed, getting a fix out is quick and easy.
I came from a very large enterprise that did everything manually and only released updates quarterly. For them, problems in production were a major effort to address. And since their processes were so manual, they were also more error-prone. The time to deploy a fix was several times longer than fixing the code itself.
With an automated process, there's less chance for human error and results are more consistent. With a fully automated CI/CD process, getting updated code to production can happen in minutes, not hours, not days, and certainly not weeks.
Automated deployments also mean the cost to perform a deployment is very small. By cost, I'm referring to the time, effort, and resources required to perform the work. With a low cost for deployment, you're free to deploy as often as you like. When deployments can happen at will, it's easy to deploy very small changes at a time. The fewer the changes, the lower the risk.
CI/CD can also cut down on the time-consuming process of onboarding new team members. I can remember a time when it would take several days (or more) to get a new developer fully set up with their own development environment, contributing code, and learning the deployment process.
On my current team, we have a goal of Day One deployments. That means when a new member joins our team, we want them to be able to deploy code to production on their first day. Of course, this extends beyond the scope of CI/CD as it involves processes before committing code, but CI/CD is essential to reach this goal. You can imagine how encouraging it is for a new person to see their work go live on their first day and the sense of value they feel knowing they're already contributing.
On that note, the final benefit I’ll attribute to CI/CD is how much happier developers and operations people are when they don’t have to mess around with tedious manual processes. Encouraging teams to automate these things sends the message that their time and energy is valued, and when employees feel valued, they perform better. When employees perform better, companies perform better. In the end, CI/CD helps foster a culture of continuous improvement that benefits employees, companies, and customers alike.
So what's holding you back?