Cluster Orchestration (The DevOps 2.0 Toolkit)

Written by: Viktor Farcic

When I was an apprentice, I was taught to treat servers as pets. I treated them with care. I made sure that they were healthy and well fed. If one of them got sick, finding the cure was of the utmost priority. I even gave them names. One was Garfield, and the other was Gandalf. Most companies I worked for had a theme for naming their servers: mythical creatures, comic book characters, animals, and so on. Today, when working with clusters, the approach is different. The cloud changed it all. Pets became cattle. When one of them gets sick, we kill it. We know that there is an almost infinite number of healthy specimens, so curing a sick one is a waste of time. When something goes wrong, destroy it and create a new one. Our applications are built with scaling and fault tolerance in mind, so the temporary loss of a single node is not a problem. This approach goes hand in hand with a change in architecture.

If we want to deploy and scale easily and efficiently, we want our services to be small. Smaller things are easier to reason about. Today, we are moving towards smaller, easier-to-manage, and shorter-lived services. The main excuse for not defining our architecture around microservices used to be operations: the more things there are to deploy, the more problems the infrastructure department has trying to configure and monitor everything. That excuse is gone. With containers, each service is self-sufficient and does not create infrastructure chaos, making microservices an attractive choice for many scenarios.

With microservices packed inside containers and deployed to a cluster, there is a need for a different set of tools. There is a need for cluster orchestration. Hence, we got Mesos, Kubernetes, and Docker Swarm (just to name a few). With those tools, the need to manually SSH into servers disappeared. We got an automated way to deploy and scale services that will be rescheduled in case of a failure. If a container stops working, it will be deployed again. If a whole node fails, everything running on it will be moved to a healthy one. And all that is done without human intervention. We design a behavior and let machines take over. We are closer than ever to widespread use of self-healing systems that do not need us.
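At the heart of this behavior is a reconciliation loop: the orchestrator continuously compares the state we declared with the state it observes, and corrects the difference. As a rough illustration (a toy Python sketch, not the code of any real orchestrator; the names `DESIRED_REPLICAS` and `reconcile` are mine), the idea looks like this:

```python
# Toy sketch of the desired-state reconciliation loop used by tools
# such as Kubernetes or Docker Swarm. We declare how many replicas of
# a service should run; the loop decides what to start or stop.

DESIRED_REPLICAS = 3

def reconcile(running):
    """Given the IDs of currently running containers, return the
    actions needed to converge on the desired number of replicas."""
    actions = []
    # Too few replicas (e.g. a container crashed or a node died):
    # schedule replacements on healthy nodes.
    missing = DESIRED_REPLICAS - len(running)
    for i in range(missing):
        actions.append(("start", f"web-{len(running) + i}"))
    # Too many replicas: stop the surplus.
    for container in running[DESIRED_REPLICAS:]:
        actions.append(("stop", container))
    return actions

# A node failure took out one replica; the loop schedules a replacement
# without any human intervention.
print(reconcile(["web-0", "web-1"]))  # [('start', 'web-2')]
```

Real orchestrators run this loop continuously, which is why a failed container simply reappears: nobody restarts it, the system just converges back to the declared state.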

While solving some of the problems, cluster orchestration tools created new ones. Namely, if we don't know in advance where our services will run, how do we configure them?

The DevOps 2.0 Toolkit

If you liked this article, you might be interested in The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices book.

The book is about different techniques that help us architect software in a better and more efficient way: with microservices packed as immutable containers, tested and deployed continuously to servers that are automatically provisioned with configuration management tools. It's about fast, reliable, and continuous deployments with zero downtime and the ability to roll back. It's about scaling to any number of servers, the design of self-healing systems capable of recovering from both hardware and software failures, and about centralized logging and monitoring of the cluster.

In other words, this book envelops the full microservices development and deployment lifecycle using some of the latest and greatest practices and tools. We'll use Docker, Ansible, Ubuntu, Docker Swarm and Docker Compose, Consul, etcd, Registrator, confd, Jenkins, nginx, and so on. We'll go through many practices and, even more, tools.

The book is available from Amazon (and other worldwide sites) and LeanPub.
