Previously, I have written about serverless; in that case I was focusing on Functions as a Service: a way to break down your application into “atoms” that naturally run in a “server less” fashion. In this post I want to cover how Kubernetes itself can run in a server-less fashion.
What I mean by that is that you don’t manage servers, and resources are only consumed as needed.
In an ideal world, you wouldn’t care about this lower-level stuff: you would just pick up the itemized bill for what your application was doing, paying for what it needed when it needed it (and hopefully not paying for CPU, memory and other things that aren’t needed). But for now, I believe it pays to be aware of where things are heading. Also, note that if you make use of things like Jenkins X, you shouldn’t have to dive this deep; read on to satisfy your curiosity.
Martin Fowler has a great article that teases apart what serverless is; it is worth a read for some broad background if you are extra curious.
There are two interesting enabling technologies that can “make Kubernetes serverless.” What I mean by this is both not managing servers as well as potentially reducing the total cost of running. I define the total cost of running as both the cost of cloud infrastructure (most are billed per minute now) as well as the cost of managing (administration).
Firstly, there is Amazon Fargate which, as I have mentioned before, takes away the managing of servers that run containers. You tell it what you need to run and it works out how. Currently it is integrated with ECS. I view Fargate as a work-in-progress that is showing the way to make container workloads as easy as more traditional serverless workloads (i.e. functions as a service). Fargate is “coming soon” to Amazon’s Kubernetes service (EKS), but we can still look at how the two may work together, based on what is available today.
Secondly, there is something called a Virtual Kubelet. A kubelet can be thought of as the smallest unit of a Kubernetes cluster: each node in a cluster runs a kubelet agent, which makes that node (server) available to take on tasks. A virtual kubelet looks like a kubelet to the cluster (it has a well-defined API), but the “virtual” bit means that there doesn’t have to be an actual server behind it; it can pretend there is, and then take on task workloads (which is, by definition, serverless).
You can see how combining a Virtual Kubelet with something like Fargate gives you a Kubernetes experience, but without servers sitting there, being managed, waiting to take on workloads.
I think this diagram from the Virtual Kubelet repo does a pretty great job of explaining how it works:
When the virtual kubelet is installed in a cluster, an additional virtual node appears alongside the existing nodes. This virtual node is the “serverless” part: it isn’t really there, it is just pretending to be. The Kubernetes API and constructs are still used; the virtual nature of the kubelet is opaque to its users.
When the virtual node gets assigned a task, say to run a Pod, it delegates to the configured provider to actually run the Pod and do the work. Virtual Kubelet has a provider abstraction, with implementations for Azure, AWS Fargate and Hyper.sh clouds.
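To make the scheduling side of this concrete, here is a minimal sketch of a pod manifest that steers a workload onto the virtual node. The node name (`virtual-kubelet`) and the taint key are assumptions based on the project’s common defaults; the exact values depend on how your provider registers the virtual node, so check its documentation.

```python
import json

# Hypothetical pod manifest targeting a virtual-kubelet node.
# The hostname and taint key below are assumptions; real values
# depend on how the virtual kubelet provider registers itself.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "hello-serverless"},
    "spec": {
        "containers": [{
            "name": "hello",
            "image": "nginx:alpine",
            # Serverless backends bill on requested CPU/memory,
            # so explicit resource requests matter here.
            "resources": {
                "requests": {"cpu": "250m", "memory": "512Mi"},
            },
        }],
        # Schedule onto the virtual node rather than a real worker.
        "nodeSelector": {"kubernetes.io/hostname": "virtual-kubelet"},
        # Virtual nodes are typically tainted so that ordinary
        # workloads do not land on them by accident; tolerate it.
        "tolerations": [{
            "key": "virtual-kubelet.io/provider",
            "operator": "Exists",
            "effect": "NoSchedule",
        }],
    },
}

print(json.dumps(pod, indent=2))  # pipe into `kubectl apply -f -`
```

The point is that, to the user, this is just a normal pod spec: the only hints that it will run “serverlessly” are the node selector and the toleration.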
Read more about virtual kubelets here.
AWS Fargate and pricing
Fargate manages the underlying VMs and infrastructure for you, even more so than EKS does (at the time of writing). It is billed as you use it, by the second. This makes it legitimately a “serverless” construct, as there are no servers to manage, nor a static cost of running servers. Fargate means your cluster can rapidly scale with load, if and when it is needed. With EC2 you can also use autoscaling to accommodate some variability in load, but it is more coarse-grained. Fargate, at least in theory, means that scaling can happen at the last possible moment (aside: this implies it is even more important to have lean, small, fast-starting containers; but that is always good practice anyway, as you don’t want more than you need in containers you deploy to production).
Let’s look at some pricing for Fargate vs the regular on-demand pricing of EC2. Fargate is priced per second, per CPU and per GB of RAM needed (with some limitations on the permutations you can choose). EC2 is priced per hour (but can now be billed per minute), with fixed instance sizes. To make it an apples-to-apples comparison, let’s use a fixed configuration equivalent to an ‘m5.large’ virtual machine: 2 CPUs and 8 GB of RAM (a modest machine, but not a bad worker-node size):
- EC2: $0.096 per hour
- Fargate: $0.2018 per hour
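As a sketch of where the Fargate number comes from, the cost is the sum of a per-vCPU rate and a per-GB rate. The unit rates below are assumptions (they vary by region and change over time), chosen to roughly reproduce the figures quoted above:

```python
# Rough cost model. The unit rates are assumptions that vary by
# region and over time; they approximately match the list above.
FARGATE_PER_VCPU_HOUR = 0.0506   # USD per vCPU-hour (assumed)
FARGATE_PER_GB_HOUR = 0.0127     # USD per GB-hour (assumed)
EC2_M5_LARGE_HOUR = 0.096        # USD per hour, on-demand (assumed)

def fargate_hourly(vcpus: float, gb_ram: float) -> float:
    """Fargate prices CPU and memory independently, per second;
    this returns the equivalent hourly rate."""
    return vcpus * FARGATE_PER_VCPU_HOUR + gb_ram * FARGATE_PER_GB_HOUR

cost = fargate_hourly(2, 8)
print(f"Fargate (2 vCPU, 8 GB): ${cost:.4f}/hour")   # roughly $0.20/hour
print(f"Premium over m5.large:  {cost / EC2_M5_LARGE_HOUR:.1f}x")
```

Note that because CPU and memory are priced separately, a workload that needs lots of RAM but little CPU (or vice versa) can be sized more precisely on Fargate than with fixed EC2 instance shapes.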
So there is a premium on Fargate: it is about double. But wait, it obviously isn’t that simple. Firstly, Fargate is metered per second. If your workload can start up and finish in sub-minute time, Fargate will obviously be great value. Fargate is also far more elastic, in the sense that if your workload is lumpy rather than steady over an hour, it could end up cheaper.
So let’s say your traffic is fairly lumpy. Ideally, autoscaling with EC2 would make short work of that, but it often works better when the load varies in a smoother pattern. Fargate would shine here, as it could launch tasks (pods) on demand; when they aren’t needed, resources are immediately released and billing stops that second. Autoscaling can take some time before VMs are released from the cluster. So, in practice, Fargate could end up cheaper. However, if your traffic is smoother (it can still vary, but in a less lumpy fashion), then EC2 could be cheaper.
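One way to see this trade-off: EC2 bills for the instance whether it is busy or not, while Fargate only bills for the seconds a task actually runs, so there is a utilization level below which Fargate wins. A back-of-the-envelope sketch using the two prices above (this deliberately ignores real-world details like pod packing and scale-down lag):

```python
# Prices from the comparison above (2 vCPU / 8 GB equivalents).
EC2_HOURLY = 0.096       # m5.large on-demand
FARGATE_HOURLY = 0.2018  # equivalent Fargate task

# Fargate's effective cost scales with the fraction of the hour
# the task is actually running; EC2 costs the same regardless.
def cheaper_on_fargate(busy_fraction: float) -> bool:
    return busy_fraction * FARGATE_HOURLY < EC2_HOURLY

break_even = EC2_HOURLY / FARGATE_HOURLY
print(f"Break-even utilization: {break_even:.0%}")  # about 48%

for busy in (0.25, 0.50, 0.75):
    winner = "Fargate" if cheaper_on_fargate(busy) else "EC2"
    print(f"busy {busy:.0%} of the hour -> {winner} is cheaper")
```

In other words, with roughly a 2x hourly premium, Fargate comes out ahead whenever your capacity would otherwise sit idle more than about half the time.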
The other costs
Humans! One of the key advantages of Fargate and Virtual Kubelets is that the administrative overhead of configuring cluster settings, or (in some cases) VMs, is non-existent; you just think in terms of the resources needed, and the containers that make up your application pods. As always, there are tradeoffs.
When to use
If you were to graph your system load, say in number of users of your app (or whatever you think represents a good measure of the system’s load), and it looks like this:
Then perhaps virtual kubelets and a serverless backend like Fargate would be great for you.
However if the graph looks more like:
Perhaps a less virtual-kubelet approach, with autoscaling, would be more economical. Often patterns like this are predictable by time of day, so more traditional autoscaling can work quite well (and purchasing reserved instances can yield further savings). I have seen many workload graphs over the years of larger-scale systems that show almost sinusoidal patterns representing the bulk of users and the workday. These work quite well with autoscaling.
Well, I like to say “laziness pays off right now”, and that may be true both now and in the future with Fargate and EKS, if you can wait. If you can’t, read this post by Amazon on how to do this with Fargate and Virtual Kubelets.
Links to Additional Resources