Resilient Routing and Discovery at Shopify

Written by: Codeship

Codeship was at DockerCon 2015! This week, we’ll be providing summaries on our blog of some of the talks we attended at this two-day conference in San Francisco.

Monday afternoon, Simon Eskildsen spoke at DockerCon about how Shopify landed on a simple infrastructure solution that allows for a distributed and resilient routing and discovery system at scale.

Eskildsen, an infrastructure engineer at Shopify, explained that the company rejected complex service discovery protocols and distributed microservices. Instead, it runs larger, resilient applications that are discovered via DNS.

[Photo: Simon Eskildsen at DockerCon 2015]

Microservices don’t solve everything

Eskildsen stressed that, first of all, microservices are not the be-all and end-all of infrastructure design. Application resiliency, he pointed out, is far more important, and a distributed application made up of microservices can be vulnerable.

He suggested taking a look at what he called a resiliency maturity pyramid. Tests of increasing severity can be applied to your application to gauge its level of resiliency. The pyramid should be climbed as infrastructure grows, as the following diagram demonstrates.

A key metric for applications should be resiliency, Eskildsen stated, more so than readability, separation of concerns, granular scalability, or any of the other main arguments for microservices. An application’s design should be driven by empirical evidence rather than arbitrary choices.

It’s perhaps no great surprise then that Shopify doesn’t use microservices. Instead, Eskildsen said that the company uses fewer, but larger, applications.

Focus on Discovery

Discovery, Eskildsen said, is your infrastructure’s source of truth for services, metadata, and orchestration.

You should consider whether you need regional or global discovery. Either way, a discovery backbone should have the following characteristics:

  • No single point of failure

  • Stale reads > no reads

  • Reads an order of magnitude more frequent than writes

  • Fast convergence
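To make the "stale reads > no reads" idea concrete, here is a minimal Python sketch, not Shopify's implementation. The class name `StaleTolerantResolver`, the `lookup` callback, and the 30-second TTL are all assumptions made for illustration. The resolver caches answers and, when the backend lookup fails, hands back the last known answer instead of an error.

```python
import time
from typing import Callable, Dict, List, Tuple


class StaleTolerantResolver:
    """Caches lookups and falls back to stale data when the backend fails.

    Hypothetical helper for illustration; not taken from the talk.
    """

    def __init__(
        self,
        lookup: Callable[[str], List[str]],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup      # the real discovery call (DNS, HTTP, ...)
        self._ttl = ttl_seconds    # how long an answer counts as fresh
        self._clock = clock        # injectable so the behavior can be tested
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def resolve(self, service: str) -> List[str]:
        now = self._clock()
        cached = self._cache.get(service)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh entry, no backend call needed
        try:
            addresses = self._lookup(service)
        except Exception:
            if cached is not None:
                return cached[1]  # backend is down: a stale answer beats no answer
            raise  # nothing cached yet, so there is nothing to fall back on
        self._cache[service] = (now, addresses)
        return addresses
```

Because most calls are served from the cache, this also fits a workload where reads vastly outnumber writes. The trade-off is that callers may briefly see addresses that are no longer valid, so they still need to handle connection failures.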

Service discovery is often overly complicated, Eskildsen pointed out. DNS as a service discovery protocol works just fine for the majority of use cases. Shopify decided to implement service discovery using DNS for a few specific reasons:

  • It’s resilient

  • It’s simple

  • It has API access in most cases

  • It’s globally supported
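As a rough illustration of how little client code DNS-based discovery needs, the Python sketch below resolves a service name with the standard library's `socket.getaddrinfo` and picks one of the returned addresses. The function names and the hostname in the usage note are hypothetical, and the talk did not show this code.

```python
import random
import socket
from typing import Callable, List


def resolve_service(
    hostname: str,
    port: int,
    getaddrinfo: Callable = socket.getaddrinfo,
) -> List[str]:
    """Return the unique IP addresses that DNS currently lists for a service."""
    infos = getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for info in infos:
        # Each entry is (family, type, proto, canonname, sockaddr);
        # the first element of sockaddr is the IP address.
        ip = info[4][0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def pick_backend(addresses: List[str]) -> str:
    """Pick one address at random, a simple form of client-side balancing."""
    if not addresses:
        raise LookupError("no addresses returned for service")
    return random.choice(addresses)
```

A caller might run `pick_backend(resolve_service("db.internal.example", 5432))`, where the hostname is a placeholder for whatever record your DNS provider serves. Registering or removing a backend then becomes a DNS record update, which most providers expose through an API.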

The decision to work with DNS was also partially based on waiting for Docker to release something like Docker network, which it announced last Monday.

That being said, Eskildsen’s talk on Monday outlined a specific solution to a fairly specific problem. Many Codeship users will be testing with simple Docker Compose stacks, which won’t need anything as complicated as out-of-the-box service discovery. In terms of containers, a microservices approach is well suited to running many smaller services rather than fewer larger ones.

Of course, Eskildsen’s main point about slowing down and considering a simple solution applies to any infrastructure design.

[youtube https://www.youtube.com/watch?v=ZDeAEZHby_A&w=560&h=315]

