Josh Nixdorf - Winning the CD Game
In this episode of DevOps Radio, we hear from Josh Nixdorf, technical director at Electronic Arts. We’ll hear about his start with software development, the difference between developing gaming software and business software and how to win the game of continuous delivery.
Andre Pino: You’re listening to DevOps Radio, the podcast series that dives into what it takes to successfully develop, deliver and deploy software in today’s ever-changing business environment. In this episode our featured guest is Josh Nixdorf, technical director at Electronic Arts. Welcome, Josh. It’s great to have you with us today.
Josh Nixdorf: Hi, Andre. How’s it going?
Andre: It’s going great. So Josh, I think just about everybody knows Electronic Arts as the gaming company with very popular titles like NBA and Madden. Which of those have you had a hand in?
Josh: That’s a good question to start on. When I started at the company it was directly doing some build engineering on NBA, but the team that I’ve been on is generally central to the CI/CD space for the whole of the company. We started off in sports. So I had a chance to work with the FIFA soccer title, football for all our European listeners that might be annoyed that I just called it soccer. Basketball, hockey, Madden football, as you called out, and then recently we’ve extended more globally within the company. So there’s more interaction with Need for Speed, Star Wars and some of our other titles in that space, Battlefield.
Andre: Very cool. It’s nice. Do you tell all your friends that you’ve got your hand into those titles?
Josh: It’s a hard one. The CI/CD stuff tends to be so behind the scenes. When I first started here, I definitely was telling my friends that and they would want me to point at the thing that I did. “Well, you know that box that’s right there, the fact that the game is in that box, that was me.” That has a lot less glory than some of the feature development.
Andre: Understood, but you’re the wind beneath the developers’ wings, right?
Josh: That’s the goal.
Andre: Cool. So for our audience, Josh, give us a little sense for your background. How did you get into software development to start with?
Josh: That – I got to try not to be too long winded here. So you know what? Let me start at the very beginning. I really, really, really loved video games when I was a kid. I’m sure no one is gonna be surprised to hear that. So from a very early age I knew that when I grew up I wanted to make video games. It was – just seemed like the thing to do. I enjoyed playing them. There wasn’t this huge market to be able to survive playing them yet. So making them seemed the logical thing to do. And I made the mistake of getting a computer science degree thinking that would help me, and it did in many ways get into the industry, but making games has more to do with – how do I put it – it’s a design type thing. And as someone with very little art skills, the science aspect was a way in the door but not necessarily doing what people would imagine that game makers actually do. So I seemed to be pretty good with computers as a kid. We didn’t get one in my house until I was almost a teenager, but I just seemed to have a knack for that stuff. So given as a kid I always wanted to make my own games, when I got a computer it seemed, “Well, I can just start doing this. This should be trivially easy,” and it wasn’t. So I spent all sorts of time learning obscure programming languages, ‘cause they weren’t really teaching this stuff at least in high school yet where I grew up. So I started writing code in Visual Basic and ended up actually finding a need for people to get custom software. So actually when I was 15, 16 I was selling custom applications to people and then I got into web stuff ‘cause obviously that was big around the turn of the millennium. And then, at that point, I had entered into a computer science degree and the rest is sort of history from there.
But for the most part, it was I had always known that I had wanted to make games and so I wanted to do whatever I thought I could to get into the industry and computers seemed the easiest way to do that for the skill set that I have.
Andre: So as a technologist I know that a lot of technologists have their pet technologies and projects and upcoming things that they like working on. What are some of yours?
Josh: That’s interesting. I’ll have to put some thought into that. There’s a fair amount of interest that I have outside of work, but in this space it’s always amazed me how not lazy everybody is. That’s probably not the answer you’re looking for, but where I had come from was when I had started here everybody was working really hard. And it’s not that it didn’t make sense. People are really passionate, but they weren’t really working smart, is probably the best way to put it, and so for me where my passion, at least in this space comes from, is how do I make computers do things that people can do. And that’s pretty easy when everybody is just getting started and maybe aren’t aware of what’s available in that CI/CD space, but as things progress it’s, “How do you get more complicated techniques? What are the things that computers can do that people don’t yet realize computers can do?” There are lots of great papers, articles, and videos available on the Internet of games being played by AI and not the in-game AI playing it, but actually external AIs playing the games and learning how to play the games. That type of stuff is of particular interest.
Andre: That’s really interesting. So sort of automation is – and the need to automate a lot of the more mundane processes associated with software development and delivery is what really first got you involved in continuous integration and delivery.
Josh: Absolutely. And I would say that my passion for automation extends into my personal projects. I was fortunate enough to have been able to pick up a house a few years back and my wife will tell you that I’ve spent way too much time trying to put in all sort of automated stuff. Be able to remotely answer door knocks or just other absurd things, but I want an automated house. I wanna live in the world of Star Trek where you just tell the computer what you want. I want that at my work and I want that at my house.
Andre: Nice. So your career passion extends into your personal passions as well.
Andre: So Josh, I can only imagine that developing gaming software is different than developing business software. Can you talk a little bit about that?
Josh: Absolutely. So the – I only have minimal experience doing business software. So most of my anecdotes are gonna be as a result of talking to friends, co-workers that have formerly been in business. My only limited experience was small app development, but the biggest thing that comes to mind is game developers look like software developers and they talk like software developers and they certainly sound like software developers, but they almost all end up being entertainers and that ends up being a fundamental difference is we’ll talk like a software development company, but it’s really an entertainment company. And that ends up making for some really, really subtle differences. The biggest of which that I’ll often cite when asked a question like this is, we don’t necessarily need to write quality software. And that isn’t to say that we don’t aspire to it, but where another company’s bottom line will be based on how much quality is in the software, how few bugs they have, how little maintenance they do, our sort of KPI for success is, “Was this fun? Did it sell copies?” And bugs surprisingly frequently have no impact on sales and sometimes have the opposite impact. A lot of the features that people really like in especially older games tend to be bugs that were accidentally introduced. There’s a huge online community of people playing speed runs of old games that people haven’t played in 20 years trying to see how quickly they can beat those things and they’re exploiting all sorts of bugs that are left in those games to be able to do it and these people – they have a fan base. There’s a market for this kind of stuff. So it’s really interesting versus regular software where sort of mistakes are a horrible liability to be avoided at all costs. As long as it didn’t impact the fun negatively we actually don’t really need to care.
Andre: So bugs for fun. That’s a new one. That’s awesome. So tell us a little bit about the process that you’ve been automating amongst your software development folks and how that’s progressed over the years.
Josh: Okay. So when I had started at the company, it was a little less than a decade ago now, I took a role as a build engineer, and as I said about my background all I ever wanted to do was make games. And so I had done through my graduate degree a fair amount of graphics work and audio stuff and AI stuff. I thought, “I’m gonna make sure that no matter what job they want me to do that I’m prepped for this.” And then when I started here it was like, “By the way, you’re a build engineer.” And my first question was, “Well, what the heck is that?” And they started explaining it and I’m like, “Why do you need that?” The problems just didn’t make sense to me, but I’d only ever worked on tiny projects that a person or two people could do, not the kind of projects that a team of a couple dozen needed to work on. So when I started here, that’s actually a good note, the team sizes were generally in the couple dozen developers and maybe a couple dozen people handling the art side of things. And there was kind of this intricate dance of everybody checking into the code repository and hoping that everything would work out, but the build engineer had the really terrible role of making sure – they needed to be the first person in in the morning to ensure that the nightly build had gotten to QA so that they could do their role. And they were the last person to go home at night to make sure that the process which would create that QA build for QA actually got kicked off at the right change list. So for the first year I felt that I was just completely tied to the team. It wasn’t like I had to work 80 hours a week or something like that, ‘cause you really weren’t working that much. It was you needed to be there first to make sure that it worked and last to make sure that it worked. So the amount of time in between was relatively easy, but it was more demanding than I wanted it to be as I called out before. I tend to like to be lazy about these things.
So I ended up very fortunately working for a manager who really believes in automation and had made it clear to the entire team that if we worked zero hours a week as long as the computers were doing all the work we were supposed to be doing that was fine by him. That obviously wasn’t practical, but it set the stage for that really aspirational goal of, “Work smarter, not harder.” And so when we had started there were maybe a handful of CI jobs, but they didn’t produce the QA build themselves. That was largely a manual process that was triggered by something. The dev team was really cherry picking at the point of which those builds would be produced. Actually the process was a disk would be waiting in the morning, rather an image would be waiting in the morning. I’d have to burn it to a disk and then I’d have to deliver it to the team to do a little bit of testing and then we would move that on to QA. After I put in about a year with the team and they’d seen how hard I was working and what impact that was having on me they were willing to take some suggestions and we were able to start changing the way we did CI by automating the QA build, by automating the QA build every night, by having that build go to QA actually every morning automatically and having them do the sign-off for it. And this dramatically increased their life – improved the quality of their life because now they weren’t – they were reducing the stress in which builds go to QA. They were just sort of overwhelming QA with builds, but giving QA the option to pick and choose. If they gave something a quick test and it was no good they could just use yesterday’s build or at least years ago we could use day old builds. And so my life improved as well. Since that time though the release process has changed substantially for us and the team sizes have grown. So now – back then our biggest teams, as I said, had maybe a couple dozen developers and a couple dozen artists. 
Now that same team has closer to 100 plus developers and they’re not so much artists anymore, but content creators. I couldn’t even tell you how many there are worldwide but it’s a phenomenal number and that’s the impact as well is these titles aren’t being developed by several dozen people in one location. They’re developed by hundreds of people all over the world. So now that process is – it’s not – the nightly build is a good example. You can’t just kick off a build at 11:00 because everybody’s gone home. Well, 11:00 is when the crew that’s working on this title in Asia starts up or in Europe or wherever around the world. So we have far more players involved. The dance has gotten quite a bit more complicated and there are certainly options. Anyone who’s listening that has any experience in CM will have wondered why I haven’t talked about branching yet. Why don’t people just work in multiple branches? The way that we need to test those builds, the QA process, and the speed at which we need to work doesn’t allow or isn’t nicely conducive to merging those branches in later. It actually ends up being quite a bit easier to have everybody work in one branch and ensure that that product at any given time is both highly stable and tremendously current. And so over the years the process has morphed into being able to keep up with those requirements. So where we had dozens of CI jobs, we now have hundreds. Where we had a handful of QA disk builds we now have potentially hundreds for each of the world’s regions or different configurations or platforms that we might ship these things on. So the scale has increased dramatically, but fortunately we’ve been able to continue – we’ve been able to scale up the CI growth to be able to facilitate that. And frankly, I think if we were able to talk to some of the biggest teams they’d tell you that a decade ago the CI was a nice to have that kind of helped people.
Now the dance simply wouldn’t be possible without that level of CI, that the CI has allowed the teams to scale and stay competitive.
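Editor's sketch: the nightly flow Josh describes, where QA receives every build, gives it a quick test, and falls back to yesterday's build if the new one is no good, might be modelled roughly as below. This is only an illustration under stated assumptions; the `Build` record, its fields, and `pick_qa_build` are hypothetical names, not anything from EA's tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Build:
    """Hypothetical record for one nightly QA build."""
    built_on: date       # night the build was produced
    changelist: int      # source revision the build was cut from
    smoke_passed: bool   # outcome of QA's quick sanity test


def pick_qa_build(builds: Sequence[Build]) -> Optional[Build]:
    """Return the newest build that passed the quick test, or None if none did."""
    for build in sorted(builds, key=lambda b: b.built_on, reverse=True):
        if build.smoke_passed:
            return build
    return None


if __name__ == "__main__":
    # Made-up data: today's build failed its quick test, yesterday's passed.
    nightly = [
        Build(date(2024, 1, 1), 1001, True),
        Build(date(2024, 1, 2), 1042, False),
    ]
    print(pick_qa_build(nightly))
```

The point of the design is that a broken build costs QA a quick check rather than a lost day, because an older known-good build is always available.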
Andre: That’s an amazing point. So not only has your business scaled, but the scaling of the development operation as well as increasing in complexity has really provided you with some unique challenges.
Josh: Absolutely, but as I said, the CI has really enabled that complexity to even be able to exist. I can’t imagine how we could do – how we could ship the games we do in the time frames we ship them if we didn’t – if we hadn’t really bought into sort of these CI methodologies and how much that could improve the world.
Andre: So I don’t think you can talk about continuous integration, continuous delivery without talking about testing, the automated testing, and I’m sure our audience would be really interested in understanding some of the testing challenges as a game maker that you face and how you’ve met some of those challenges.
Josh: This is a good one and it’s a particularly passionate topic for me. So one of the most interesting challenges that I’ve found is as I joined the industry and learned more about testing and talked to other companies and how they do their testing, it’s always amazed me how different it actually is for games. Some of the really, I wouldn’t call them unique problems, but things that people might overlook, UI testing is – it isn’t impossible, but it’s very, very difficult. I know a lot of people say – lots of people will use the excuse that UI testing is difficult and then not do it. I’m not suggesting that. I’m suggesting that most of our games are products that have been around for a long time. They are huge code bases. So by the time someone started asking, “Hey, we could be doing UI testing to make our lives better,” we had already inherited maybe a project – most of our code bases aren’t gonna be a decade old, but let’s use that as the extreme case. Maybe you have something that already has ten years of hundreds of developers working on it, millions of lines of code, how do you even decide where to start UI testing? So not to say that we don’t have it. We certainly do, but as far as a game product goes it wasn’t a practical way to do that. So most of our testing has focused on integration testing or functional testing where we’re largely taking the final product and then we’re testing that and even that has lots of interesting implications. Sports titles are really interesting, but if you pick up the Madden Football game you’re playing against the computer. The computer is a pretty good opponent I think. If most people play the game they’ll probably agree to that. If we wanna test a game it’s perfectly viable for us to say, “Well, let’s just make the computer play against the computer,” and because we’ve already built an AI into the game we already have these agents that can go then sort of do at least a fair amount of the coverage testing for us.
So for sports games at least that level of testing is almost built in.
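Editor's sketch: the "computer plays against the computer" idea can be illustrated with a toy simulation in which two bots play a full match while basic invariants are checked every tick. Everything here is an assumption made for illustration: `ToyMatch`, its scoring rule, and the random bots merely stand in for a real game and its built-in AI.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyMatch:
    """Hypothetical stand-in for a real game simulation."""
    ticks_left: int = 1000
    score: dict = field(default_factory=lambda: {"home": 0, "away": 0})

    def step(self, home_move: str, away_move: str) -> None:
        # A side scores when it attacks and the other side is not defending.
        if home_move == "attack" and away_move != "defend":
            self.score["home"] += 1
        if away_move == "attack" and home_move != "defend":
            self.score["away"] += 1
        self.ticks_left -= 1

    @property
    def finished(self) -> bool:
        return self.ticks_left <= 0


def run_bot_match(seed: int, ticks: int = 1000) -> dict:
    """Play one full match between two random bots, checking invariants each tick."""
    rng = random.Random(seed)  # seeded so that a failing run can be replayed
    match = ToyMatch(ticks_left=ticks)
    moves = ("attack", "defend", "idle")
    steps = 0
    while not match.finished:
        match.step(rng.choice(moves), rng.choice(moves))
        steps += 1
        assert steps <= ticks, "match ran past its clock"
        assert all(s >= 0 for s in match.score.values()), "negative score"
    return match.score


if __name__ == "__main__":
    print(run_bot_match(seed=1, ticks=100))
```

Seeding the random generator is a deliberate choice: when an invariant fails, the same seed reproduces the same match.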
Andre: It’s wild to think about.
Josh: Yeah. Right? It’s the AI already being there makes testing some things really, really easy, but where we end up having the problems then are in navigating through the front ends or in the online interactions. So a lot of where we end up really – well, in the past we’ve had struggles, but where we’ve done some really interesting things now is how do we make sure we’re navigating through the front ends effectively. So a lot of the tests we’ll make sure that they can crawl through, that the game areas stay open, that a bug doesn’t get introduced that does something crazy like make all the players the same guy or something like that or block off access to the teams. For the online testing it’s making sure that we are continually stressing that interaction. Any players that may have disagreed with my earlier assertion that the AI is not a bad opponent, well, those are probably your diehards that are playing online against their friends and the AI is not sufficient for them. They have to play against real people. At this point I think it’s safe to say, and probably not a secret, that most of our games are gonna be played online against other people. So if we work on a game all year and we have the AI tested for us that’s great to make sure that we haven’t broken the game play, but that online component, if we ship something and discovered, “Oops, we didn’t realize that this flow is broken,” our customers are going to discover that really quickly and they’re not going to be happy. So a lot of the stuff ends up being ensuring that those front-end workflows that people rely on are continuously working and never introduce any edge cases and then for the non-sports titles that don’t necessarily have the AIs built in, things get even more wild because now maybe you are trying to write an AI. Maybe you have something simple that just does button presses. You hit left 10,000 times and see what happens, hit right 10,000 times and see what happens.
But their experience is definitely much more difficult, but because as I noted before, we have been iterating on previous code bases. When we start with the brand new product we start with a working title and so the CI, we rely on the testing as part of that and the builds as part of that to maintain that high quality throughout development. I would say for most of our titles, especially if it’s an iterative title, the one that you shipped, we try never to drop below that level of quality throughout the entirety of the dev cycle. Obviously we may make some fundamental changes that break things, but we don’t build a game for eight months and then bring QA in to start doing something. We are working with them – from day one they’re playing a playable game. So that gives the opportunity to have computers do the monotonous stuff like hit left 10,000 times and have the players focus on – the QA focus on fun and authenticity right from day one. So that’s an interesting challenge, but definitely one of the fun sort of perks of testing in games.
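Editor's sketch: a front-end crawl of the kind Josh mentions, checking that no screen has been accidentally blocked off, can be thought of as a reachability check over the menu graph. The `MENU` map and `unreachable_screens` below are hypothetical; a real test would drive the actual game UI rather than a dictionary.

```python
from collections import deque

# Hypothetical menu graph: each screen lists the screens it can navigate to.
MENU = {
    "main": ["play", "online", "settings"],
    "play": ["team_select", "main"],
    "team_select": ["match", "play"],
    "online": ["lobby", "main"],
    "lobby": ["match", "online"],
    "settings": ["main"],
    "match": ["main"],
}


def unreachable_screens(menu: dict, start: str = "main") -> set:
    """Breadth-first crawl from `start`; return the screens that were never reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        screen = queue.popleft()
        for target in menu.get(screen, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    all_screens = set(menu) | {t for targets in menu.values() for t in targets}
    return all_screens - seen


if __name__ == "__main__":
    # Simulate a bug that removes the link from "play" to "team_select".
    broken = dict(MENU, play=["main"])
    print(unreachable_screens(MENU))
    print(unreachable_screens(broken))
```

Run on every build, a check like this would flag a change that cuts off access to team selection before QA or customers run into it.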
Andre: Well, it certainly sounds like you’ve moved the ball significantly forward in that area and the continuous testing aspect is clearly playing a big role in the success of Electronic Arts, that’s for sure. What – so what advice, as you look at all the automation that you’ve brought to bear on the continuous integration and delivery of games, what advice would you have for people who are just starting out in their journey of automation?
Josh: Start small and keep it easy. Trust is the most important thing that a developer needs to have in the CI system. If we’re gonna have computers take care of things people need to trust that the machines are delivering the correct results, that we’re not getting false positives or false negatives. So start small. Make something that you know works, do it manually, and then have the computer continue to do that for you. But very much don’t overdo it. As I mentioned before, it’s easy to go read papers on these crazy AI agents that will learn how to play a game. That’s an aspirational idea and if you can build that into your game from the get go you can design your game to be testable, by all means do that. But if you’ve already got something and you’re just getting into the, “Okay. Maybe I have expanded my team from two or three people to we’re gonna get serious. We’ve got ten people working or maybe it’s four people who are located around the world, we really need something other than our own discipline to help make sure that we can maintain code health or product health.” Start small. Don’t overdo it. It starts with one test, one build, and it scales easily from there.
Andre: Outstanding. So what’s the next big area of focus for you in Electronic Arts?
Josh: I’m glad you asked. This is one of the things that I’m particularly passionate about now. So as I said, I’ve been doing this role for a really long time. I would hope any of your listeners that are in a similar boat are wondering, “How the heck can you continue to do a build engineering role or be in that CI space for a decade? Haven’t you solved the problems by now?” And I’ve recently sort of realized one of the most interesting problems that we have at the moment is the cycle of change. So as we’ve already discussed we have our trust with our products. We have good code quality health. We are helping people to make better games than they would have been able to make in the past. How can we still have problems? Build engineering, at least in the game industry, at least here – I can only speak for us – isn’t particularly popular. The vast majority of people that want to join the company are all seeking the glory of, as I noted before, being able to point at something on screen and tell your friends that you did that. So being behind the scenes, it’s hard to find people that are willing to do that. And build engineering historically hasn’t been – or CI/CD, however you wanna refer to it – hasn’t been historically very sexy. So for us we have a continual stream of, I would say, junior engineers that are largely how we end up scaling the team and there’s nothing wrong with that. They come in, they’re eager. And being able to do clever things like use AI to do those – run those tests or scale up to the phenomenal sizes we’ve hit in game development do present interesting challenges that help keep those developers and those engineers interested, but for myself I’ve begun noticing the pattern of it’s almost empowerment is being a problem. We wanna bring people on to the team. We wanna give them a place that’s exciting to work. We wanna empower them to solve the problems to make their day better no different than I did when I had started.
I had that manager who gave me a long enough leash to be able to improve my world. I wanna do that for the people that come in. But now we have the countering force of we’ve hit a stability point. Some of the change that we empower people to do ends up actually being a negative impact. I would use an example, when I started we were on a CI system that was not Jenkins. A few years into that, because of that empowerment, someone came up with the idea that they should write their own CI system for this. Now fortunately we’ve got our head or our act together and most of our titles have all _____ on using Jenkins for the CI, but I am in constant fear that in three years down the road the people that are new to the team, that don’t appreciate what we had gone through in order to make the switch to Jenkins don’t necessarily appreciate all the ease that it brings to set up our utilization, that those people are going to go, “Hey, this solution doesn’t solve this problem that I have. I’m going to write my own solution to this.” Or they’re going to deviate from the standards that we have. So we’ve hit the really interesting stage of how do you balance people being able to innovate and improve versus the group benefit of everybody being the same. When all our titles here were automated differently, when all the reports were different, it made it difficult for even game developers to move from one project team to another team. It was bad enough they have to learn a new code base, but now they’re learning new software engineering work flows. It’s different reports. Nothing is the same. As my team has been able to grow and handle the automation for the majority of the teams at the company it provides a consistency and experience. And if you move from the soccer title to the football title to the basketball title to the Need for Speed title these things all look the same now. 
And so it’s just a few less components that you’re having to learn as you make that switch, but it reduces that barrier, makes things overall easier, but like I said, that benefit runs counter to at some point somebody has to pay the cost of a bad work flow. And so it’s very interesting challenges in this space. So I see my next couple years as trying to figure out not just how do we make those work flows better, but how do we get people who didn’t have to live through, let’s say the dark ages, to appreciate why the solutions are the way they are so that we don’t find ourselves repeating the mistakes of the past.
Andre: And so leverage that investment that you’ve made over these years.
Andre: Josh, thanks very much for joining us today.
Josh: Thank you.
Andre: Like what you’ve heard today? Don’t miss out on our next episode. Subscribe to DevOps Radio on iTunes or visit our website at cloudbees.com. For more updates on DevOps Radio and industry buzz follow CloudBees on Twitter, Facebook, and LinkedIn.