Oleg Nenashev and I will be speaking at DevOps World | Jenkins World in San Francisco this year about "Scaling Network Connections from the Jenkins controller".
Over the years there have been many efforts to analyze, optimize and fortify the Jenkins remoting channel that allows a controller to orchestrate agent activity and receive build results. Techniques such as tuning the agent launcher can improve service, but qualitative change can only come from fundamentally reworking what gets transmitted and how.
In March, JENKINS-27035 introduced a framework for inspecting the traffic on a remoting channel at a high level. Previously, developers could only use generic low-level tools such as Wireshark, which cannot identify the precise piece of Jenkins code responsible for traffic.
Over the past few months, the Cloud Native SIG has been making progress in addressing root causes:
The Artifact Manager on S3 plugin has been released and integrated with Jenkins Evergreen, allowing upload and download of large artifacts to happen entirely between the agent and Amazon servers.
Prototype plugins allow all build log content generated by an agent (such as in sh steps) to be streamed directly to external storage services such as AWS CloudWatch Logs.
Work has also begun on uploading JUnit-format test results, which can sometimes get big, directly from an agent to database storage.
All these efforts can reduce the load on the Jenkins controller and local network without requiring developers to touch their Pipeline scripts.
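To make the difference in data paths concrete, here is a toy model rather than real Jenkins code. The `Controller` and `ExternalStore` classes, the `store://` location scheme, and the metadata record layout are all hypothetical names invented for illustration. The sketch only counts how many bytes the controller has to handle when an artifact is relayed through it, compared with when the agent uploads directly and reports back a small metadata record.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Toy stand-in for a controller; it only counts the bytes it handles."""
    bytes_handled: int = 0
    artifact_index: dict = field(default_factory=dict)

    def receive_artifact(self, name: str, payload: bytes) -> None:
        # Relayed path: the whole payload crosses the channel to the controller.
        self.bytes_handled += len(payload)
        self.artifact_index[name] = len(payload)

    def record_metadata(self, name: str, size: int, location: str) -> None:
        # Direct path: only a small metadata record reaches the controller.
        record = f"{name}|{size}|{location}".encode()
        self.bytes_handled += len(record)
        self.artifact_index[name] = size


@dataclass
class ExternalStore:
    """Toy stand-in for an external blob store (hypothetical, in-memory)."""
    blobs: dict = field(default_factory=dict)

    def put(self, key: str, payload: bytes) -> str:
        self.blobs[key] = payload
        return f"store://{key}"


def archive_via_controller(controller: Controller, name: str, payload: bytes) -> None:
    """Agent sends the artifact bytes to the controller."""
    controller.receive_artifact(name, payload)


def archive_direct(controller: Controller, store: ExternalStore,
                   name: str, payload: bytes) -> None:
    """Agent uploads to the store itself, then reports metadata only."""
    location = store.put(name, payload)
    controller.record_metadata(name, len(payload), location)


artifact = b"x" * 10_000_000  # a 10 MB artifact produced on the agent

relayed = Controller()
archive_via_controller(relayed, "app.tar.gz", artifact)

direct = Controller()
store = ExternalStore()
archive_direct(direct, store, "app.tar.gz", artifact)

print("relayed path, controller bytes:", relayed.bytes_handled)
print("direct path, controller bytes:", direct.bytes_handled)
```

In the direct case the controller still knows the artifact exists and how large it is, but the payload itself never passes through it, which is the effect the efforts above aim for.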
On the horizon
While “one-shot” agents run in fresh virtual machines (VMs) or containers greatly improve reproducibility, they suffer from the need to transmit megabytes of Java code for every build, so Jenkins features will need to be built to precache most or all of it. Work is underway to use Apache Kafka to make channels more robust against network failures.
Most dramatically, the proposed Cloud Native Jenkins MVP would eliminate the bottleneck of a single Jenkins controller service handling hundreds of builds.
Large Jenkins installations should use agents to distribute the build load. Yet the controller can still receive and send lots of data over the network channel to agents, causing scalability issues as build logs, artifacts and test results are streamed.
New tools can help you identify protocol-specific load issues coming from the Jenkins core or various plugins. Jenkins core developers are also working on alternate cloud storage for some of this data, permitting it to be streamed directly to or from the agent so that the controller needs to handle only metadata.
Come learn how these tools and features can help you manage performance-critical installations in our talk!