At this point it is certainly no secret that I am fond of the work the Microsoft Azure team have been doing over the past couple years. While I was excited to announce the Jenkins project had partnered to run project infrastructure on Azure. Things didn't start to get really interesting until they announced Azure Container Service. A mostly-turn-key Kubernetes service alone was pretty interesting, but then "AKS" was announced, bringing a, much needed, managed Kubernetes resource into the Azure ecosystem. Long story short, thanks to Azure, I'm quite the fan of Kubernetes now too.
Kubernetes is brilliant at a lot of things. It's easy to use, has some really great abstractions for common orchestration patterns, and is superb for running stateless applications. Stateful applications also run fairly well on Kubernetes, but the challenge usually has much more to do with the application, rather than Kubernetes. Jenkins is one of those challenging applications.
Since Jenkins is my jam, this post covers the ins-and-outs of deploying a Jenkins master on Kubernetes, specifically through the lens of Azure Container Service (AKS). This will cover the basic gist of running a Jenkins environment on Kubernetes, evaluating the different storage options for "Persistent Volumes" available in Azure, outlining their limitations for stateful applications such as Jenkins, and will conclude with some recommendations.
Jenkins and the File System
To understand how Jenkins relates to storage in Kubernetes, it's useful to first review how Jenkins utilizes its backing file system. Unlike many contemporary web applications, Jenkins does not make use of a relational database or any other off-host storage layer, but rather writes a number of files to the file system of the host running the master process.
These files are not data files, or configuration files, in the traditional sense. The Jenkins master maintains an internal tree-like object model, wherein generally each node (object) in that tree is serialized in an XML format to the file system. This does not mean that every single object in memory is written to an XML file, but a non-trivial number of "live" objects representing Credentials, Agents, Projects, and other configurations, may be periodically written to disk at any given time.
A concrete example would be: when an administrator navigates to
http://JENKINS_URL/manage
and changes a setting such as "Quiet Period" and
clicks "Save", the config.xml
file (typically) in /var/lib/jenkins
will be
rewritten.
These files aren't typically read in any periodic fashion, they're usually only read when objects are loaded into memory during the initialization of Jenkins.
Additionally, XML files will span a number of levels in the directory
hierarchy. Each Job or Pipeline will have a directory in
/var/lib/jenkins/jobs/<jobname>
which will have subfolders containing files
corresponding to each Run.
In short, Jenkins generates a large number of little files across a broad, and sometimes deep, directory hierarchy. Combined with the read/write access patterns Jenkins has, I would consider it a "worst-case scenario" for just about any commonly used network-based storage solution.
Perhaps some future post will more thoroughly profile the file system performance of Jenkins, but suffice it to say: it's complicated.
Kubernetes Storage
With a bit of background on Jenkins, here's a cursory overview storage in Kubernetes. Kubernetes itself provides a consistent, cross-platform, interface primarily via three "objects" if you will: Persistent Volumes, Persistent Volume Claims, and Storage Classes. Without diving too deep into the details, workloads such as Jenkins will typically make a "Persistent Volume Claim", as in "hey give me something I can mount as a persistent file system." Kubernetes then takes this and confers with the configured Storage Classes to determine how to meet that need.
In Azure these claims are handled by one of two provisioners:
- Azure Disk: an abstraction on top of Azure's "data disks" which are attached to a Node within the cluster. These show up on the actual Node as if a real disk/storage device has been plugged into the machine.
- Azure File: an abstraction on top of Azure Files Storage, which is basically CIFS/SMB-as-a-Service. CIFS mounts are attached to the Node within the cluster, but rapidly attachable/detachable like any other CIFS/SMB mount.
Both of these can be used simultaneously to provide persistence for stateful applications in Kubernetes running on Azure, but their performance and capabilities are not going to be interchangeable.
Azure Disk
In AKS, two Storage Classes are pre-configured by default, yet neither one is configured to actually be the default Storage Class:
default
: utilizes the "Standard" storage (as in, hard drive, spinning magnetic disks) model in Azure.managed-premium
: utilizes the "Premium" storage (as in, solid state drives).
The only real distinctions between the two which I have observed are going to be speed and cost.
Limitations
Regardless of whether "Standard" or "Premium" storage is used for Azure Disk-backed Persistent Volumes in Kubernetes (AKS or ACS) the limitations are the same.
In my testing, the most frustrating limitation is the fixed number of data disks which can be attached to a Virtual Machine in Azure.
As of this writing, the default Virtual Machine size used when provisioning AKS
is: Standard_D1_v2
. One vCPU and 3.5GB of memory and a data disk limit of
four. Fortunately the default node count for AKS is current 3, but this
means that a default AKS cluster cannot currently support more than 12
Persistent Volumes backed by Azure Disk at once.
An easy way to change that is to provision larger Virtual Machine sizes with
AKS, but this cannot be changed once the cluster has been provisioned. For
my research clusters I have stuck with a minimum size of Standard_D4_v2
which
provides up to 32 data disks per Virtual Machine, e.g.:
az aks create -g my-resource-group -n aks-test-cluster -s Standard_D4_v2
The Azure Disk Storage Class in Kubernetes also only supports the
ReadWriteOnce
access mode.
In effect a Persistent Volume can only be mounted read/write by a single Node
within the Kubernetes cluster. By understanding how Azure Disk volumes are
provisioned and attached to Virtual Machines in Azure, this makes total sense.
The impact of this means that the only allowable replica
setting for any
given workload which might use this Persistent Volume is 1.
This has one further limitation on scheduling and high-availability for workloads running on the cluster. Detaching and attaching disks to these Virtual Machines is a slow operation. In my experimenting this varied from approximately 1 to 5 minutes.
For a "high availability" stateful workload, this means that a Pod dying or being killed by a rolling update, may incur a non-trivial outage if Kubernetes schedules that Pod for a different Node in the cluster. While there is support specifying node affinity in Kubernetes, I have not figured out a means of encouraging Kubernetes to keep a workload scheduled on whichever Node has mounted the Persistent Volume. Though it would be possible to explicitly pin a Persistent Volume to a specific Node, and then pin a Pod to that Node, a lot of workload flexibility would be lost.
Benefits
It may be tempting to think at this point "Azure Disk is not good, so everything should just use Azure File!" But there are benefits to Azure Disk which should be considered. Azure Disk is, for lack of a better description, a big dumb block store. In that simplicity are its strengths.
While Persistent Volumes backed by Azure Disk can be slow to detach or reattach to a Node, once they're present, they're fast. Operations like disk scans, small reads and writes, all feel like trivially fast operations from the Jenkins standpoint. In my testing the difference between a Jenkins master running on local instance storage (the Virtual Machine's "main" disk) and running a Jenkins master on a partition from a Data Disk, is imperceptible.
Another benefit which I didn't realize until I evaluated Azure
File backed Persistent Volumes is that, as a big dumb block
store, Azure Disks are essentially whatever file system format you want them to
be. In AKS they default to ext4
which is perfectly happy and native to me,
meaning my Linux-based containers will make the correct assumptions about the
underlying file system's capabilities.
Azure File
AKS does not set up an Azure File Storage Class by default, but the Kubernetes
versions which are available (1.7.7, 1.8.1) have the support for Azure File
backed Persistent Volumes. In order to add the storage class, pass something
like the following to Kubernetes via kubectl create -f azurefile.yaml
:
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: azurefile
annotations:
labels:
kubernetes.io/cluster-service: 'true'
provisioner: kubernetes.io/azure-file
parameters:
storageAccount: 'mygeneralpurposestorageaccount'
reclaimPolicy: 'Retain'
# mountOptions are passed into mount.cifs as `-o` options
mountOptions:
According to the Azure File documentation
it's not necessary to specify the storageAccount
key, but I had some
difficulty coaxing AKS to provision an Azure Storage Account on its own, so I
manually provisioned one within the "hidden" AKS Resource Group"
(MC_<group>_<aks-name>_<location>
) and entered the name into
azurefile.yaml
.
Full disclosure: I hate Storage Accounts in Azure. Where nearly everything else in Azure rather enjoyable to use, and neatly tucked into Resource Groups, and have reasonable naming restrictions, Storage Accounts are crummy and live in an Azure global namespace so if somebody else chooses the same name as what you want, tough luck. The reason this is somewhat relevant to the current discussion is that Storage Accounts feel old when you use them. Everything about them feels as if it's from a by-gone era in Azure's development (ASM).
The feature used by the Azure File Storage Class is what I would describe as "Samba/CIFS-as-a-Service." Kubernetes is basically utilizing the Microsoft-technology-equivalent of NFS.
But it's not NFS, it's CIFS. And that is important to Linuxy container folks.
Limitations
The biggest limitations with Azure File backed Persistent Volumes in Kubernetes
are really limitations of
CIFS, and frankly,
they are infuriating. An application like Jenkins will make, what were at one
point, reasonable assumptions about the operation system and underlying
file system. "If it looks like a Linux operating system, I am going to assume
the file system supports symbolic links" comes to mind. Jenkins will attempt to
create symbolic links when a Pipeline Run or Build completes, to update a
lastSuccessfulBuild
or lastFailedBuild
symbolic link, which are useful for
hyperlinks in the Jenkins web interface.
Jenkins should no doubt be more granular and thoughtful about file system capabilities, but I'm willing to bet that a number of other applications which you might consider deploying on Kubernetes are also making assumptions along the lines of "it's a Linux, so it's probably a Linuxey file system" which Azure File backed Persistent Volumes invalidate.
Volumes which are attached to the Node, are attached with very strict
permissions.
On a Linux file system level, an Azure File backed volume attached at /mnt/az
would be attached with 0700
permissions allowing only root access. There
are two options for working around this, as far as I am aware of:
- Adding a
uid=1000
to themountOptions
specified for the Storage Class in theazurefile.yaml
referenced above. Unfortunately this would require that every container attempting to utilize Azure File backed volumes use the same uid. - Specifying a
securityContext
for the container with:
runAsUser: 0
. This makes me feel exceptionally uncomfortable, and I would absolutely not recommend running any untrusted workloads on a Kubernetes cluster with this setting.
The final, and for me the most important, limitation for Azure File backed storage is the performance. Presently there is no Premium model offered for Azure Files Storage, which I would presume means that Azure File volumes are backed by spinning hard drives, rather than solid state.
The performance bottleneck for Jenkins is not theoretical however. With a totally fresh Persistent Volume Claim for a Jenkins application, the initialization of the application took upwards of 5-15 minutes, namely:
- 2-3 seconds to create the Persistent Volume and bind it to a Node in the Kubernetes cluster.
- 3-4 minutes to "extract [Jenkins] from war file". When
jenkins.war
runs the first time, it unpacks the.war
file intoJENKINS_HOME
(usually/var/lib/jenkins
) and populates/var/lib/jenkins/war
with a number of small static files. Basically, unzipping a 100MB archive which contains hundreds of files. - 5-10 minutes from "Starting Initialization" to "Jenkins is ready." In my observation this tends to be highly variable depending on the size of Jenkins environment, how many plugins are loaded, and what kind of configuration XML files must be loaded at initialization time.
The closest comparison to Azure File backed storage and the performance challenges I have observed with it, are similar to challenges the CloudBees Engineering team observed with Amazon EFS when it was first announced. The disk read/write patterns exhibited by Jenkins caused trouble on EFS as well, but that has seen marked improvement over the last 6 months, whereas Azure Files Storage doesn't appear to have had significant performance improvements in a number of years.
Benefits
Despite performance challenges, Azure File backed Persistent Volumes are not
without their benefits. The most notable benefit, which is what originally
attracted me to the Azure File Storage Class, is the support for the
ReadWriteMany
access mode.
For some workloads, of which Jenkins is not one of them, this would enable a
replicas
setting greater than 1 and concurrent Persistent Volume access
between the running containers. Even for single container workloads, this is a
valuable setting as it allows for effectively zero-downtime rolling updates and
re-deployments when a new Pod is scheduled on a different underlying Node.
Additionally, Azure File volumes can be simultaneously mounted by other machines in the resource group, or even across the internet, which can be very useful for debugging or forensics when something goes wrong (things usually go wrong). Compared to an Azure Disk volume which would require a container to be successfully running in the Kubernetes environment before you could dig into the disk.
Conclusions
Running a highly available Jenkins environment is a non-trivial exercise. One which requires a substantial understanding of both the nuances of how Jenkins interacts with the file system but also how users expect to interact with the system. While I was optimistic at the outset of this work that Kubernetes, or more specifically AKS, might significantly change the equation; it has not.
To the best of my understanding, this work applies evenly to Azure Container Service (ACS) and Azure Container Service (AKS) (naming is hard), since both are using the same fundamental Kubernetes support for Azure via the Azure Disk and Azure File Storage Classes. Unfortunately I don't have time to do a serious performance analysis of Data Disks using Standard storage, Data Disks using Premium Storage, and Azure File mounts. I would love to see work in that area published by the Microsoft team though!
At this point in time, those seeking to provision Jenkins on ACS or AKS, I strongly recommend using the Azure Disk Storage Class with Premium storage. That will not help with "high availability" of Jenkins, but at least once Jenkins is running, it will be running swiftly. I also recommend using Jenkins Pipeline for all Jenkins-based workloads, not just because I fundamentally think it's a better tool than classic Freestyle Jobs, but it has built-in durability. Using Jenkins in tandem with the Azure VM Agents plugin, workloads are kicked out to dynamically provisioned Virtual Machines, and when the master goes down, from which recovery can take 5-ish minutes in the worst case scenario, the outstanding Pipeline-based workloads will not be interrupted during that window.
I still find myself excited about the potential of AKS, which is currently in "public preview." My recommendation to Microsoft would be to spend a significant amount of time investing in both storage and cluster performance to strongly differentiate AKS from Kubernetes provided on other clouds. Personally, I would love to have: faster stateful applications, auto-scaled Nodes based on compute (or even Data Disk limits!), and cross-location Federation for AKS.
Maybe in 2018!
Configuration
Below is the configuration for the Service, Namespace, Ingress, and Stateful Set I used:
---
apiVersion: v1
kind: "List"
items:
- apiVersion: v1
kind: Namespace
metadata:
name: "jenkins-codevalet"
- apiVersion: v1
kind: Service
metadata:
name: 'jenkins-codevalet'
namespace: 'jenkins-codevalet'
spec:
ports:
- name: 'http'
port: 80
targetPort: 8080
protocol: TCP
- name: 'jnlp'
port: 50000
targetPort: 50000
protocol: TCP
selector:
app: 'jenkins-codevalet'
- apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: 'http-ingress'
namespace: 'jenkins-codevalet'
annotations:
kubernetes.io/tls-acme: "true"
kubernetes.io/ingress.class: "nginx"
spec:
tls:
- hosts:
- codevalet.io
secretName: ingress-tls
rules:
- host: codevalet.io
http:
paths:
- path: '/u/codevalet'
backend:
serviceName: 'jenkins-codevalet'
servicePort: 80
- apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: "jenkins-codevalet"
namespace: "jenkins-codevalet"
labels:
name: "jenkins-codevalet"
spec:
serviceName: 'jenkins-codevalet'
replicas: 1
selector:
matchLabels:
app: 'jenkins-codevalet'
volumeClaimTemplates:
- metadata:
name: "jenkins-codevalet"
namespace: "jenkins-codevalet"
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
template:
metadata:
labels:
app: "jenkins-codevalet"
annotations:
spec:
securityContext:
fsGroup: 1000
# https://github.com/kubernetes/kubernetes/issues/2630#issuecomment-344091454
runAsUser: 0
containers:
- name: "jenkins-codevalet"
image: "rtyler/codevalet-master:latest"
imagePullPolicy: Always
ports:
- containerPort: 8080
name: http
- containerPort: 50000
name: jnlp
resources:
requests:
memory: 384M
limits:
memory: 1G
volumeMounts:
- name: "jenkins-codevalet"
mountPath: "/var/jenkins_home"
env:
- name: CPU_REQUEST
valueFrom:
resourceFieldRef:
resource: requests.cpu
- name: CPU_LIMIT
valueFrom:
resourceFieldRef:
resource: limits.cpu
- name: MEM_REQUEST
valueFrom:
resourceFieldRef:
resource: requests.memory
divisor: "1Mi"
- name: MEM_LIMIT
valueFrom:
resourceFieldRef:
resource: limits.memory
divisor: "1Mi"
- name: JAVA_OPTS
value: "-Dhudson.DNSMultiCast.disabled=true -Djenkins.CLI.disabled=true -Djenkins.install.runSetupWizard=false -Xmx$(MEM_REQUEST)m -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85"