When building continuous delivery pipelines, teams are often faced with the Fan-In challenge.
Background
What do we mean by this? The diagram below shows a typical scenario. You have multiple components or applications that form part of the system. Each has its own build/test pipeline, which may also go as far as performing deployments and running various component-level tests.
Ideally, your architecture principles have resulted in loosely coupled components with well-defined interfaces. You will have an approach for ensuring backwards compatibility of interfaces, or, if you need to make a breaking change, a mechanism to version your API and support concurrent versions. This allows each component to have its own independent lifecycle, including production deployments.
However, sometimes this is not possible, and you have a need to bring together (assemble if you like) a collection of components with specific versions, and execute some form of aggregate / integrated test phase. In one particular customer engagement, there was a requirement to continually run a Non-Functional Test Suite against a baseline set of components. As they were updated, the baseline needed to be updated, and the tests re-run. In working on this problem, we developed an implementation for this common Fan-In pattern. In the rest of this article I will show you how to build a set of jobs to implement this pattern.
Requirements Addressed
Each component already has a CI / CD pipeline of its own resulting in the publishing of tested and versioned artifacts
Each component has its own version numbering strategy
The system wide test cycle will take many hours to run
On completion of the test cycle, it should be run again
At the start of each test run, any new component versions should be deployed
Each test run must have an audit of the component versions upgraded and included in the tests
Job Implementation
The solution implements the requirements using two workflow jobs:
Add/Merge Manifest Job
Deploy and Test Job
These are shown in the diagram below, with the key steps that they implement.
Why Two Jobs?
The simple answer is "due to concurrency".
The more detailed answer:
We want to ensure all triggers from the upstream components are processed - we don't want to drop any of them
We need to ensure updates are processed sequentially
We need to ensure that only one update at a time is made to the manifest (i.e. synchronized writes) - a build concurrency of 1
We also need to ensure that only a single build can be running either the Deploy or the NFT test - i.e. we cannot start re-deploying new component versions while an NFT run is happening
We looked at whether a single workflow job would suffice, but workflow's concurrency restriction means multiple queued builds become coalesced, so there is a risk of dropping some of the triggers and their associated parameters. Secondly, it is not possible to wrap more than one stage into a single concurrency unit (at least until JENKINS-29892 is implemented).
Add/Merge Manifest Job
Let's start by implementing the Add/Merge Manifest Job - this is a Workflow job type.
The first problem that we need to solve is where to persist the state of the required component versions that need to be deployed - i.e. a System or Application level manifest. A number of options were considered:
Storing the versions in a file in an SCM
Storing the versions in a shared file system
Storing the versions in a file associated with the build
Looking up from an external repository - i.e. always taking latest "published" versions
We took the decision to store the versions in a file associated with the build, as this provides the greatest portability (relying only on Jenkins itself rather than external systems) and also allows the history to be maintained.
Our implementation retrieves the previous Manifest (i.e. from the last successful build) and then allows updates/additions to be made before it is re-published as part of the build.
Let's look at how we implement the file handling steps within workflow.
Obtaining the previous Manifest:
try {
    step([$class: 'CopyArtifact', filter: 'manifest',
          projectName: env.JOB_NAME,
          selector: [$class: 'StatusBuildSelector', stable: false]])
    // Do something with the file
} catch (Exception e) {
    echo e.toString()
    // Do something to create the first version of the file
}
We have chosen to take the file only from the last successful build, to ensure that if the job abends and the publish fails, we don't lose continuity.
On completion of our update processing we need to publish the updated manifest:
archive 'manifest'
Next we need to actually read and write the component version numbers from / to the manifest. Fortunately, Groovy (via the JDK) provides an easy class to handle this: java.util.Properties.
If we don't already have a Manifest (i.e. 1st run scenario), we define a new Properties() object:
versions = new Properties()
Otherwise, we load the properties object from the file:
def str = readFile file: 'manifest', charset: 'utf-8'
def sr = new StringReader(str)
def props = new Properties()
props.load(sr)
Writing the file out is similar, however, due to serialization restrictions, we need to wrap the use of StringWriter in a @NonCPS function:
writeFile file: 'manifest', text: writeProperties(props)

@NonCPS
def writeProperties(props) {
    def sw = new StringWriter()
    props.store(sw, null)
    return sw.toString()
}
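Taken together, the read and write halves can be wrapped as a pair of helper functions. This is a sketch only - the function names are illustrative, not taken from the original job:

```groovy
// Retrieve the manifest from the last successful build and load it into
// a Properties object; returns an empty Properties on the first run.
def readManifest() {
    def props = new Properties()
    try {
        step([$class: 'CopyArtifact', filter: 'manifest',
              projectName: env.JOB_NAME,
              selector: [$class: 'StatusBuildSelector', stable: false]])
        props.load(new StringReader(readFile(file: 'manifest', charset: 'utf-8')))
    } catch (Exception e) {
        echo "No previous manifest found, starting fresh: ${e}"
    }
    return props
}

// Serialization of the Properties object happens outside the
// CPS-transformed code, hence the @NonCPS annotation.
@NonCPS
def writeProperties(props) {
    def sw = new StringWriter()
    props.store(sw, null)
    return sw.toString()
}

// Write the manifest to the workspace and archive it with the build.
def publishManifest(props) {
    writeFile file: 'manifest', text: writeProperties(props)
    archive 'manifest'
}
```

Factoring the steps this way also makes them candidates for the shared library approach discussed under Further Improvements.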
And then this job performs its main function which is that of taking the job parameters and updating the manifest:
versions[app]=revision
and performing an asynchronous trigger of the build to run the tests:
build job: downstreamJob, propagate: false, wait: false
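Putting the update stage together, the core of the Add/Merge job might look like the following sketch (downstreamJob is assumed to be a variable set earlier in the script, and versions is the Properties object loaded above):

```groovy
stage 'Update Manifest'
// app and revision are bound from the build parameters.
versions[app] = revision
writeFile file: 'manifest', text: writeProperties(versions)
archive 'manifest'

stage 'Trigger Deploy and Test'
// Fire-and-forget: don't block this build on the long-running test cycle.
build job: downstreamJob, propagate: false, wait: false
```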
Note: This job is set up as a parameterized build with two string parameters: app and revision. It is also set up to not perform concurrent builds. Should it be triggered concurrently with different parameters, the triggers will not be coalesced - multiple builds will be queued. This guarantees that we process every update to the manifest in turn.
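For completeness, an upstream component pipeline could trigger this job along the following lines. The job name and parameter values here are illustrative:

```groovy
// Notify the Add/Merge Manifest job of a newly published component version.
// Asynchronous is fine: the manifest job queues and processes builds in turn.
build job: 'fan-in-add-to-manifest',
      parameters: [
          [$class: 'StringParameterValue', name: 'app', value: 'inventory-service'],
          [$class: 'StringParameterValue', name: 'revision', value: '1.4.2']
      ],
      wait: false
```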
The full source code of this job can be downloaded from:
https://github.com/harniman/workflow-demos/blob/controller/fan-in-add-to-manifest.groovy
To aid comprehension, the job has been set up with stages, which allows progress to be clearly seen in the CloudBees Jenkins Platform Stage View:
Note the link to the manifest which is highlighted.
The manifest shows the versions of the apps required as per:
Deploy and Test Job
The Deploy and Test job is also a workflow job type, and is likewise configured to execute one build at a time. Because it is not parameterized, multiple triggers are safely coalesced; instead of taking parameters, it reads the required components and versions from the manifest published by the upstream build.
This job uses similar steps to read both the required manifest from the upstream build, and also the current manifest from the last run of this job. Again, we need to handle scenarios where this is the first run, and the file does not exist.
Having created the current and required Properties() objects, it is an easy chunk of Groovy code to compare them and return the list of updates required:
def compareVersions(requiredVersions, currentVersions) {
    currentapps = currentVersions.stringPropertyNames().toArray()
    reqapps = requiredVersions.stringPropertyNames().toArray()
    Properties updatedVersions = new Properties()
    for (i = 0; i < reqapps.size(); i++) {
        def app = reqapps[i]
        if (currentVersions.getProperty(app) == requiredVersions.getProperty(app)) {
            log "Calculating Deltas", "Correct version of $app already deployed"
        } else {
            log "Calculating Deltas", "Adding $app for deployment"
            updatedVersions.setProperty(app, requiredVersions.getProperty(app))
        }
    }
    return updatedVersions
}
The above code compares the set of currently deployed versions (as retrieved from the previous build) with those required (from the add-to-manifest job) and generates a list of the updates required. Instead of retrieving the current versions from the output of the last successful build, it would be possible to query each of the running apps for its version as a stronger validation. Implementing a means of removing apps that are no longer required could also be considered. These enhancements are left to the reader.
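As a hint for the removal enhancement, detecting apps that are deployed but no longer present in the required manifest could be sketched as follows (this is an assumed extension, not part of the published jobs):

```groovy
// Return the list of apps that are currently deployed but absent from
// the required manifest - candidates for undeployment.
def findRemovals(requiredVersions, currentVersions) {
    def removals = []
    for (app in currentVersions.stringPropertyNames()) {
        if (requiredVersions.getProperty(app) == null) {
            removals << app
        }
    }
    return removals
}
```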
If changes are detected, we need to perform the necessary un-deploy and re-deploys to the environment. In order to reduce delays in the execution of the build, we will use the ability to run sub-workflows in parallel:
if (appsToUpdate.size() > 0) {
    log "Update Apps", "The following apps require updating: ${appsToUpdate.toString()}"
    def branches = [:]
    for (i = 0; i < appsToUpdate.size(); i++) {
        def app = appsToUpdate[i]
        def revision = updatedVersions.getProperty(app)
        branches[app] = {
            decom(app, revision)
            deploy(app, revision)
        }
    }
    parallel branches
}
decom and deploy are functions defined in the script. It is left to the reader to implement the necessary steps to perform these tasks - either inside the workflow, or by calling existing mechanisms.
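As a starting point, decom and deploy could be stubbed along these lines - the shell scripts invoked here are placeholders for whatever deployment tooling you actually use:

```groovy
// Undeploy the currently running version of the app.
// Replace the sh step with your real undeploy mechanism.
def decom(app, revision) {
    log "Update Apps", "Undeploying ${app}"
    sh "./undeploy.sh ${app}"
}

// Deploy the required version of the app.
def deploy(app, revision) {
    log "Update Apps", "Deploying ${app} at revision ${revision}"
    sh "./deploy.sh ${app} ${revision}"
}
```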
On completion of all the deployments, the test steps are invoked. Again, the reader is tasked with filling in the details.
The final step, on completion of the test step, is to re-trigger the job so that it runs again. This need was specific to the customer concerned - on successful testing you may instead choose to publish the results, perform further deployments (such as to a staging environment), or trigger another job.
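The re-trigger itself can be a single asynchronous build step, so the current run completes cleanly while the next cycle queues behind it:

```groovy
// Queue the next test cycle without blocking this build's completion.
build job: env.JOB_NAME, propagate: false, wait: false
```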
The execution can be monitored from Stage View:
Again, note the links to the manifest of tested versions and the updates that were included in the run:
manifest contents:
updates contents:
The full source code can be downloaded from https://github.com/harniman/workflow-demos/blob/controller/fan-in-deploy-and-test.groovy .
Further Improvements
These two workflow jobs share some common functions, such as reading and writing the Properties objects to/from files. Rather than duplicating them, consider extracting re-usable functions into the shared CPS Global Library. Please see https://cloudbees.com/blog/jenkins-workflow-using-global-library-implement-re-usable-function-call-secured-http-endpoint for more details.
Nigel Harniman
Senior Solution Architect
CloudBees