As ecosystems such as the Yocto Project and OpenEmbedded become more popular for embedded device development, usage of the underlying bitbake build tool is increasing. Bitbake is a powerful tool used to manage, build and integrate complete operating system images, through package and distribution management activities such as fetching of source code, configuration, cross-compilation and installation. As with all software build systems, the speed and performance of bitbake is of utmost importance and a key enabler for productivity and quality. Key contributors and users in the Yocto and bitbake communities have confirmed this to me multiple times: there is a lot of focus and ongoing discussion about what can be done to improve performance. So a valid question is, can we understand how a bitbake build currently behaves and, from that, determine whether and where there are opportunities for improvement? In this post I will explain my process of understanding bitbake build performance and provide some useful utilities that I hope the bitbake community will benefit from as the never-ending quest for additional build performance continues!
What's available today in terms of bitbake build visualization and performance analysis?
After following the Yocto Project Quick Start guidelines and some exploration of the bitbake build process and its artifacts, I ran into a folder called buildstats under /tmp. This folder has the structure below:
buildstats/
    (e.g. core-image-sato-qemux86)/
        /
            /
                do_compile
                do_configure
                :
                :
                do_unpack

Looking at one of these do_* files (e.g. do_compile) reveals a lot of interesting data:
>cat do_compile
Event: TaskStarted
Started: 1367381613.31
xkbcomp-1.2.4-r8.0: do_compile: Elapsed time: 3.25 seconds
CPU usage: 23.3
EndIOinProgress: 0
EndReadsComp: 0
:
:
StartTimeWrite: 1725266628
StartWTimeIO: 1725869092
StartWritesComp: 0
Status: PASSED
Ended: 1367381616.57

So for each task, we can get the start time, the end time and a bunch of other potentially useful data. Interesting! Has anyone in the bitbake or Yocto Project communities already done some profiling and analysis using this data? It turns out there is a utility out there called pybootchart that can generate a static SVG visualization as a vertical listing of all tasks in a bitbake build. I am used to analyzing build performance with the power of ElectricInsight, and by comparison the pybootchart visualization does not scale with the amount of data presented. It also provides very few actionable metrics or reports that would help me understand where my bottlenecks and opportunities for improvement are.

So an interesting question is, is there a way to leverage ElectricInsight to visualize and understand bitbake build behaviour and performance, using the bitbake buildstats data? It turned out to be fairly trivial to implement a script that transforms all this data into an ElectricInsight-compatible annotation file, which we can use to draw actionable takeaways such as the effects of bitbake concurrency and overall task-by-time reporting. Details on how to access this script, available at the public CloudBees GitHub Repository, follow further below.
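To make the shape of this data concrete, here is a minimal Python sketch that parses the text of one buildstats task file into a dictionary. It is not the conversion script mentioned above. The function name `parse_task_stats` and the `SAMPLE` string are my own illustrative assumptions, and the parsing rules are inferred only from the example output shown here, so other bitbake versions may need adjustments.

```python
import re

# Abbreviated copy of the do_compile example above (assumed representative).
SAMPLE = """\
Event: TaskStarted
Started: 1367381613.31
xkbcomp-1.2.4-r8.0: do_compile: Elapsed time: 3.25 seconds
CPU usage: 23.3
Status: PASSED
Ended: 1367381616.57
"""

# Matches "<recipe>: <task>: Elapsed time: <n> seconds".
ELAPSED_RE = re.compile(
    r"^(?P<recipe>\S+): (?P<task>do_\w+): Elapsed time: (?P<secs>[\d.]+) seconds$"
)


def parse_task_stats(text):
    """Parse the text of one buildstats task file into a dict (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        match = ELAPSED_RE.match(line)
        if match:
            stats["recipe"] = match.group("recipe")
            stats["task"] = match.group("task")
            stats["elapsed"] = float(match.group("secs"))
            continue
        # Every other line is treated as a plain "Key: value" pair.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            stats[key.strip()] = value.strip()
    # Start and end times are epoch seconds; convert them to floats.
    for key in ("Started", "Ended"):
        if key in stats:
            stats[key] = float(stats[key])
    return stats


record = parse_task_stats(SAMPLE)
print(record["recipe"], record["task"], record["elapsed"], record["Status"])
```

Applying the same function to every do_* file under buildstats would yield one record per task, which is the raw material for any timeline or report.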
Using ElectricInsight to visualize and understand effects of bitbake concurrency
Bitbake supports at least two levels of parallelism: multi-threading within the bitbake task execution mechanism (BB_NUMBER_THREADS), and the -j flag passed to the underlying calls to make (PARALLEL_MAKE). The three screenshots below are from bitbake builds run on a physical 8-core server with 20GB of RAM and decent disk performance, using varying levels of concurrency (concurrent threads on the y-axis, time on the x-axis, and individual tasks represented by the various colored boxes):

[Screenshot: BB_NUMBER_THREADS=8 / PARALLEL_MAKE=8]
[Screenshot: BB_NUMBER_THREADS=12 / PARALLEL_MAKE=12]
[Screenshot: BB_NUMBER_THREADS=16 / PARALLEL_MAKE=16]

As you can see, of these three configurations for this particular build on this particular box, 8-way concurrency delivers the best performance at roughly 74 minutes, and the distribution across the threads seems pretty packed. As you scale up the concurrency, there are two phases in the build, at roughly the 32m and 50m marks, where the gaps (idle threads) indicate serialization. These are obviously the areas where I would start my analysis if I were to attempt to optimize this build. When viewing such an annotation in ElectricInsight, it is easy to filter out which tasks are the likely culprits for the serialization. Beyond showcasing the ElectricInsight bitbake build visualizations through the three screenshots above, I don't aim to make an exhaustive analysis at this point.
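The idle gaps visible in the screenshots can also be located numerically. The sketch below is my own illustration, not something ElectricInsight or bitbake provides: it sweeps over task start and end times to count how many tasks run at each moment, then reports the windows where that count drops below a threshold. The function names and the sample intervals are made up for the example.

```python
def concurrency_profile(intervals):
    """Return [(time, running)]: the number of tasks running from each time onward."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a task starts
        events.append((end, -1))    # a task ends
    # Sorting tuples puts ends (-1) before starts (+1) at equal timestamps.
    events.sort()
    profile = []
    running = 0
    for time, delta in events:
        running += delta
        if profile and profile[-1][0] == time:
            # Collapse several events at the same timestamp into one step.
            profile[-1] = (time, running)
        else:
            profile.append((time, running))
    return profile


def low_concurrency_windows(profile, threshold):
    """Return merged (start, end) windows where fewer than `threshold` tasks run."""
    windows = []
    for (time, running), (next_time, _) in zip(profile, profile[1:]):
        if running < threshold:
            if windows and windows[-1][1] == time:
                # Extend the previous window when the two are adjacent.
                windows[-1] = (windows[-1][0], next_time)
            else:
                windows.append((time, next_time))
    return windows


# Hypothetical task intervals in seconds (start, end); not real build data.
tasks = [(0, 10), (0, 4), (2, 6), (10, 12), (12, 20), (12, 18)]
profile = concurrency_profile(tasks)
windows = low_concurrency_windows(profile, threshold=2)
print(windows)
```

Fed with the Started and Ended values from the buildstats files, the reported windows would correspond to the sparse regions in the timeline, and the tasks overlapping them are natural candidates for closer inspection.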
Using ElectricInsight to understand relative task by time distribution
ElectricInsight has a built-in report called "Job Time By Type" that visualizes a heat map of where your build is spending its time. The automatic categorization into different job types is done by some clever identification and mapping happening under the hood of the tool. Unfortunately, at this point these categories in ElectricInsight are fixed and not customizable. To enable this categorization, I built some mapping logic into the conversion script; the most significant task-to-job mappings are shown below. Let's take a look at what we get: As you can see, the do_configure, do_compile and do_package tasks combine for roughly 77% of the total runtime. I must admit the relative significance of the do_configure task, with an average runtime of 25s, was a bit surprising to me, and would be interesting to explore further. For context, other significant bitbake tasks in this build are:
Filesystem I/O: do_install
Exist: do_package_write_rpm
Code gen: do_populate_sysroot
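The relative share of each task type can also be computed directly from the per-task elapsed times, independent of ElectricInsight. The following is a minimal sketch under my own assumptions: the function name `time_share_by_task` and the sample records are hypothetical, and the numbers are not taken from the build discussed in this post.

```python
from collections import defaultdict


def time_share_by_task(records):
    """Return {task: percent of total elapsed time}, largest share first."""
    totals = defaultdict(float)
    for task, elapsed in records:
        totals[task] += elapsed
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    shares = {task: 100.0 * secs / grand_total for task, secs in totals.items()}
    # Dicts preserve insertion order, so sorting here fixes the report order.
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))


# Hypothetical (task, elapsed seconds) pairs; not real build data.
records = [
    ("do_configure", 25.0),
    ("do_compile", 40.0),
    ("do_configure", 15.0),
    ("do_install", 20.0),
]
shares = time_share_by_task(records)
for task, percent in shares.items():
    print(f"{task}: {percent:.1f}%")
```

Note that this sums task durations, so with concurrent execution the percentages describe the share of total task time rather than of wall-clock time.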
This is cool! How can I use this tool to visualize and better understand my own bitbake build?
It's really simple:
ElectricInsight is bundled as part of the free CloudBees Accelerator Developer Edition download.
The script to convert the bitbake buildstats data into an ElectricInsight compatible annotation is available for download at the public CloudBees GitHub Repository.
In your bitbake build environment, simply run the following script downloaded from GitHub and open up the resulting annotation-file in ElectricInsight:
What else could be done with this tool?
ElectricInsight is a very powerful tool for build optimization, troubleshooting and analysis. There are a number of possible further capabilities that could be built into the bitbake-to-annotation conversion script:
Embed the stdout from each bitbake task, for easy search and troubleshooting
Leverage additional data-points from the bitbake buildstats task files for further metrics and data visualization
Are you finding this useful, or do you have any feedback from using this tool? Don't hesitate to let us know!