The Electronic Arts DevOps Tale

This blog post is by Josh Nixdorf, Technical Director, Electronic Arts 

Electronic Arts is all about games. It’s no surprise that software, and software developers, are at the heart of every feature in every game we create, including EA SPORTS FIFA, EA SPORTS Madden, and Star Wars Battlefront. When we release our games, they are immediately enjoyed by millions of players around the world. At that scale, after just the first few hours of play (sometimes less), our players have exercised the game more than our QA efforts ever could. So, we relentlessly pursue ways to ensure that QA and development can focus on creating fun and authenticity rather than on finding bugs. Before each release, my development and release engineering teams do everything we can to make sure that our developers and testers can make the most of their available time adding new features and ensuring quality. That’s why when it comes to DevOps, we don’t play games.

Without DevOps, we would not be able to do many of the things we do today, at least not at the scale we do them.

What kind of scale? On our big projects, we have about 1,000 contributors performing about 100 daily commits on a code base with 25 million lines of code. The executables we build are about 300 MB, while the artifacts are closer to 50 GB. It’s not uncommon for EA to put in two million hours of testing time per year on our games.

Many of our project metrics have been increasing almost exponentially over the past 12 years – including team size, commits, artifacts and the number of titles we are working on.

From my team’s perspective, one of the most important metrics we track is years of effort saved. That too has increased exponentially as we progressed on our DevOps journey, starting at about 20 years of effort with our first crack at DevOps and surpassing 1,000 years today. Our journey was not a smooth step-by-step progression, but rather one with some setbacks, a few hard lessons learned and a humbling experience or two. When we think back on how we got where we are today, we see three broad stages of maturity: DevOps 1.0 (2006-2010), DevOps 2.0 (2011-2015) and DevOps 3.0 (2016-present).

DevOps 1.0 – Tentative First Steps

When we first began in 2006, we didn’t do DevOps or operations mindfully – in fact, we didn’t even know about the term “DevOps.” We did know that our game development teams had fallen into an unhelpful pattern. Typically, each team would place a junior engineer in charge of the build system – at that point, it wasn’t continuous integration/continuous delivery (CI/CD), just builds. This person was usually the first one in each day, and the last to go home. After about a year, he or she would get pulled onto a dev team, a new junior engineer would be in charge of builds and the cycle would repeat.

In our first generation of DevOps, we sought to break this pattern by transitioning to meaningful CI/CD. We wanted to banish the phrase “it works on my machine” from our hallways and meeting rooms. We took some tentative steps toward automation-as-code and began writing our own CI engine. We had some early successes but also saw some significant downsides. On the plus side, we saw improvements in reproducibility, quality and scalability. We were also saving time and it was easier for developers to move to a new team because there was now a common continuous integration experience. Ultimately, however, we found the system we wrote was too complex for our teams and our users. It made it more difficult to onboard engineers because they had to learn our custom tool. We had underestimated the support burden of maintaining our own tool. And most importantly, we realized that as an entertainment company, building our own CI engine was never going to give us a competitive advantage.

DevOps 2.0 – Bridging Dev, IT, QA and More

Around eight years ago, we began the second phase of our journey. We still weren’t calling what we were doing DevOps, but we made significant strides in several areas, including moving from physical to virtual infrastructure and bringing in industry-standard tools, including Jenkins, rather than developing our own. Around this time, we recognized that development teams had little idea what IT was doing and vice versa, and we began looking for ways to bridge that gap.

The move from physical blades in our datacenters to virtualization was born out of necessity – we simply had no more room in our datacenters to keep up with the demand for more builds. The move to virtualization and Infrastructure-as-a-Service boosted reliability and consistency, but it did require a few cultural changes, particularly in IT.

The transition to Jenkins from our custom tool provided several advantages. First, obviously, we had fewer problems related to our custom engine. In addition, it simplified hiring, because many developers had prior experience with the same modern frameworks, modern languages and modern tools that we were now using. Even the engineers coming out of school these days have some experience with them. We also took another run at automation-as-code (as we had done previously with DevOps 1.0). This time we were a bit more successful, but our problem was that we implemented our system before pipelines were widely available in Jenkins. So, once again, we had a custom system that deviated from what has become the industry standard, and we had to go about redoing automation-as-code using pipelines.

DevOps 3.0 – The Present and Future

In 2015, we started DevOps 3.0. This time, we were all aware of DevOps and we recognized the need to become active participants in the larger DevOps community that had been growing for some time. In fact, we needed to play a little catch-up. We started looking to the cloud and embracing open source, which was a big step for EA. Soon, my team began providing the following guidance to development teams:

Before you write anything, check to see if there is an existing open source or off-the-shelf solution. We shouldn’t be solving problems that are already solved. If you insist that this is not a solved problem, then you’ll need to push what you write back to the community.

One of the ways we are working with and contributing back to the open source community is via Jenkins plugins. After about one year of doing that, we saw that getting on board with CloudBees was an interesting opportunity to further our engagement and get some additional benefits, such as enterprise support and added scalability and stability. We saw the CloudBees solution as a great way to reduce developer burden without having to do everything ourselves or wait for the community to do it.

Looking back on our journey so far, I am happy to admit that we made mistakes. Many of them stemmed from unseen tradeoffs that we did not fully recognize until we had started down a particular path. When we thought we absolutely had the right answers, it was usually because of a tradeoff we had yet to entirely grasp.

Along the way, we learned the importance of culture and how difficult it can be to change. Going forward, we are addressing this by applying one of the biggest lessons we’ve learned on our journey. We must frequently ask ourselves:

Whose life are we making better?

Our team had to recognize that our job wasn’t just to save time for ourselves or even necessarily to reduce costs for EA. Instead, it was more about saving time for developers and for QA, because the time that we save them is reinvested by the developers to write better games and by QA to test for fun and authenticity, rather than for defects.

That ultimately results in us delivering a better experience to our players, which is what will continue to give EA a competitive advantage down the road and keep gamers playing for millions of hours, from the first hours after release until the next release is available.

Josh Nixdorf is a technical director at Electronic Arts in Burnaby, BC. He has worked on numerous games including NBA, FIFA, NHL, Madden and SSX in varying roles such as build engineer, programmer and automation engineer. In his current role as technical director, software engineering, Josh oversees a worldwide team of automation and build engineers working on most of the Electronic Arts catalog.