Lessons Learned from Build-Flow Development

Advanced Jenkins usage requires chaining jobs together, and in most cases not as a pure sequence. The classic approach relies on a bunch of plugins: parameterized build trigger, join, downstream-ext, promoted builds, retry failed builds, conditional build step, etc. This has significant drawbacks:

  1. Configuration can quickly become complex, especially if you want to introduce some dynamic decisions into your build pipeline. Interactions between those plugins can have strange side effects that are hard to diagnose and fix.
  2. There’s no global view of the whole software process. The workflow configuration is spread across the various jobs involved, which makes it hard to understand and maintain.

Some plugins have already tried to improve this. For example, the very popular build-pipeline plugin renders a nice global overview of a pipeline - but it is limited to mostly sequential, static workflows.

The requests I got from customers when I joined CloudBees included choosing the next job to trigger based on the current job’s status or output, running in loops, making decisions based on a combination of various parameters, and some very complex orchestration requirements.

I then developed the build-flow Jenkins plugin as a spare-time project. The initial goal was to design a DSL that lets Jenkins users express, in a very concise way, the orchestration of the jobs involved in a complex software development process.

As this was my first DSL implementation, I used Groovy, known to be a very efficient and simple way to create a DSL. This made development easy, but had a major and unexpected negative impact on the plugin.

Because the Groovy DSL is not restricted, most users of this popular plugin use build flow as a script console. Groovy is also the language of the Jenkins script console, so most advanced Jenkins users already know how to use it to access the Jenkins API. Some started using Groovy scripts to compute intermediate results to be passed as parameters to other jobs, sometimes even storing results in files for later use, and defining methods to keep the flow manageable.

To me, this makes no sense. The DSL was supposed to define an orchestration model, and I had in mind to convert it into a state machine under the hood (comparable to the way Gradle converts your build script into a task execution plan). With advanced Groovy scripting involved, this becomes impossible.

Lesson #1: if you give users a tool, they will use and abuse it. Strictly define boundaries and constraints on your APIs, even when the API is a DSL. Groovy can be used to restrict a DSL to a subset of keywords and constructs, but that’s another topic and I’m not enough of a Groovy guru to explain it.
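To make the idea of a bounded DSL concrete, here is a minimal sketch in Python rather than Groovy. It is not how build-flow works; it only illustrates the principle of rejecting anything outside a small set of orchestration keywords before a flow is run. The keyword names in `ALLOWED_CALLS` and the function `validate_flow` are assumptions made up for this illustration.

```python
import ast

# Hypothetical whitelist of DSL keywords; the names are illustrative only.
ALLOWED_CALLS = {"build", "parallel", "retry"}


def validate_flow(source):
    """Return a list of violations; an empty list means the flow is accepted."""
    errors = []
    tree = ast.parse(source)
    for stmt in tree.body:
        # Top-level statements must be bare calls: no assignments,
        # function definitions, imports or loops.
        if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
            errors.append(f"line {stmt.lineno}: only DSL calls are allowed")
            continue
        errors.extend(_check_call(stmt.value))
    return errors


def _check_call(call):
    """Check one call node and, recursively, any DSL calls nested in its arguments."""
    # The callee must be a plain whitelisted name, not an attribute access
    # or an arbitrary function.
    if not (isinstance(call.func, ast.Name) and call.func.id in ALLOWED_CALLS):
        return [f"line {call.lineno}: call is not part of the DSL"]
    errors = []
    args = list(call.args) + [kw.value for kw in call.keywords]
    for arg in args:
        if isinstance(arg, ast.Call):
            errors.extend(_check_call(arg))  # e.g. parallel(build(...), build(...))
        elif not isinstance(arg, ast.Constant):
            errors.append(f"line {arg.lineno}: arguments must be literals or DSL calls")
    return errors
```

With such a check in place, a flow like `parallel(build("unit"), build("integration"))` is accepted, while variables, imports, method definitions and calls into arbitrary APIs are refused. The accepted subset is small enough that it could be turned into an execution plan.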

The other major issue is that the Groovy shell used to run the flow can’t be interrupted and later resumed. If a job fails, you can’t re-trigger it and have the flow resume from its previous state. If Jenkins has to restart, you have to wait for the whole flow to complete first. Some CloudBees customers have build processes that take hours to complete…

Groovy is a compiled language, converted into bytecode, so you can’t capture a scripting context and serialize it. The options I had in mind are unnecessarily complex and would probably break existing DSL scripts one way or another.

Lesson #2: higher-level tooling on Jenkins needs to be asynchronous and idempotent. Sometimes a Jenkins job itself is not fine-grained enough and already suffers from this issue.
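As an illustration of what a resumable flow could look like, here is a minimal Python sketch of the state-machine idea. It is not the plugin’s implementation: the `FLOW` table, the job names and the `run_job` callback (a stand-in for triggering a Jenkins job) are all assumptions. The point is that the flow’s state is plain data, so it can be saved after each step and reloaded after a restart.

```python
import json

# Hypothetical flow definition: each step names a job and the step that follows it.
FLOW = {
    "compile": {"job": "compile", "next": "test"},
    "test": {"job": "run-tests", "next": "deploy"},
    "deploy": {"job": "deploy-staging", "next": None},
}


def new_state():
    """Initial state: positioned on the first step, no results yet."""
    return {"current": "compile", "results": {}}


def advance(state, run_job):
    """Execute the current step once and move on if it succeeded.

    `run_job` is a stand-in for triggering a job; it must return
    "SUCCESS" or "FAILURE". On failure the flow stays on the same step,
    so a later call re-triggers only that step.
    """
    step = state["current"]
    if step is None:
        return state  # flow already finished, nothing to do
    result = run_job(FLOW[step]["job"])
    state["results"][step] = result
    if result == "SUCCESS":
        state["current"] = FLOW[step]["next"]
    return state


def save(state):
    """Serialize the flow state, e.g. before a restart."""
    return json.dumps(state)


def load(text):
    """Restore a previously saved flow state."""
    return json.loads(text)
```

Because nothing lives in an interpreter’s call stack, a failed step can be retried without replaying the steps before it, and a restart only loses the step that was in progress.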

Last but not least, Groovy is so powerful that it lets you do anything on Jenkins, so a flow requires the RUN_SCRIPTS (i.e. Administrator) permission, which makes no sense for the original job-orchestration use case. A possible option is to use groovy-sandbox, which we already use to secure the CloudBees template plugin.

Regarding Jenkins security: the DSL lets you run a job by name, and the name can even be computed by the script, so without the Administrator permission we have to check that you only trigger jobs you have BUILD permission on. For this, the plugin would have to remember the permissions of the user who configured the build-flow. It could rely on the recently introduced QueueItemAuthenticator, but either way this is not a trivial change.
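The check described above can be sketched as follows, in Python and with no real Jenkins API involved. The `BUILD_PERMISSIONS` table, the user and job names, and the `start_build` callback are hypothetical. The sketch only shows that the check has to happen at trigger time, against the user who configured the flow, because the job name may be computed by the script.

```python
# Hypothetical ACL: user -> set of jobs that user has BUILD permission on.
BUILD_PERMISSIONS = {
    "alice": {"compile", "run-tests"},
    "bob": {"compile", "run-tests", "deploy-production"},
}


class FlowPermissionError(Exception):
    """Raised when the flow tries to trigger a job its owner may not build."""


def trigger(job_name, flow_owner, start_build):
    """Trigger `job_name` only if the user who configured the flow may build it.

    `start_build` is a stand-in for the call that actually schedules the job.
    """
    allowed = BUILD_PERMISSIONS.get(flow_owner, set())
    if job_name not in allowed:
        raise FlowPermissionError(
            f"{flow_owner} lacks BUILD permission on {job_name}"
        )
    return start_build(job_name)
```

Here `trigger("deploy-" + "production", "alice", ...)` is refused even though the name was assembled at run time, while the same call on behalf of `bob` goes through.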

Lesson #3: don’t neglect security! groovy-sandbox seems to be a good option, and could apply to other Groovy-based plugins.

Based on this, I’m not sure I’ll be able to get to a “production ready” build-flow plugin based only on my spare-time contributions, especially as I’m also maintaining git-plugin, which consumes most of my spare time.

On the other hand, workflow is becoming a hot topic, and during the last CloudBees engineering meeting in Los Altos, 3 hours were spent discussing opportunities to create a new Enterprise plugin to cover it. More to come … soon :)


Nicolas De Loof

