Episode 80: How to Scale DevOps CI/CD As Your Organization Grows

In Episode 80 of DevOps Radio, host Brian Dawson and tech executive Randy Shoup share advice on how to scale DevOps and CI/CD practices as organizations grow.

Brian Dawson: Hello. This is Brian Dawson with another episode of DevOps Radio. Today, we have the privilege of having Randy Shoup, a former VP of Engineering at WeWork and a longtime engineering leader, who has worked at some notable companies that we're all aware of, and has done some pretty notable things. We're going to have a great opportunity to talk to him about that.

I'd like to first start by saying hello, Randy. How are you doing today?

Randy Shoup: Hey, Brian. Great to be here. It's good stuff.

Brian Dawson: All right. Great to have you – great to have you here. Randy, maybe you could start off by giving the listeners an overview of your background.

Randy Shoup: Sure. So like you said, most recently I've been VP of Engineering at WeWork and before that at Stitch Fix. Then earlier in my career, I spent a bunch of time as an engineering leader at Google and at eBay. I spent some time at Oracle in the very early days of my career. I spent a bunch of time at a security software company called Tumbleweed. But yeah, lots of different things from the small to the large.

Brian Dawson: Wow. Can I share with you that I think roughly 75 percent of the places you worked, where a number of them where you've led engineering efforts, I have consumed your products and I've been a happy consumer of your product?        

Randy Shoup: Thanks. That's music to my ears.

Brian Dawson: Actually, I have a Stitch Fix box downstairs. I need to go get it to the – I need to break quarantine to go get it back into the mail.

Randy Shoup: Oh, I'm excited for you.

Brian Dawson: So thank you for your work. Thank you for your work. Gosh, we've already, in spinning up for this, had some great conversations, so I'm kind of taking a moment to decide which path we take. I think where I would like to start is you as an engineering leader, really a technology leader with a wide range of experience across – and I don't mean to date you – but like me, across a range of time, where we've seen an evolution of technology. 

You've worked in small startups to medium-sized organizations. In fact, earlier in your career, you worked in an enormous organization with Oracle. What are some of the common threads across these organizations? What is it that is the same? And arguably, you can call out what's distinctly different between these organizations and over the span of time that you've worked there. 

Randy Shoup: Yeah, great. You're being very polite. I've been in the industry for 30 years, so yeah, it's been a while, more than half my life.

It's a great question about common threads across a bunch of these places. I think there are a couple of them. I think the places that do it really – I think the things that make organizations high performing versus the things that don't are independent of the size of the company. It's maybe a little bit more difficult to get that stuff done at larger companies, but it's certainly possible.

So I think high performing companies, again, whether they're small or large, set really clear, well-defined goals. They express what's the problem that we're trying to solve. Here's the value that we're providing to customers, whether it's boxes in Stitch Fix or software that your applications can run on, something like that.

The best places I've worked give a lot of autonomy and also accountability to individual teams. That's certainly true. You can imagine as a startup, everybody fits around a conference table and we're all doing one thing together. The rules are fluid, but we're all trying to get the same thing done together.

The very best large companies, like the Googles, the Amazons, the Netflixes of the world just have lots of small teams all together, if that makes any sense. So they don't – it turns out that Google's engineering organization, I hear today, is 50,000 engineers. 

Brian Dawson: Wow. That's mind-boggling.

Randy Shoup: It was smaller when I was there. It already seemed large, but today it's 50,000 engineers. 

It doesn't behave like a 50,000-person unit. It behaves like – I'm making this up – 10,000 five-person units, 10,000 individual teams with a whole ecosystem among them. Some teams' customers are other teams. In fact, most teams' customers are other teams in there. But at the team level, everybody still fits around a conference table or at least virtually does, and they're all working on essentially one thing. They have a bunch of customers and they try to meet them.

That is, I think, the most important organizational learning I can take away from where I've seen software done well versus not done well, if that makes any sense.

Brian Dawson: Yeah, no. It does make sense. I would beg to say that when you're at a startup and you all fit around a single conference table, the joint buy-in, commitment, and resulting accountability is almost organic, right. 

Randy Shoup: Yes. It's really easy to do that. It's really easy, yeah.

Brian Dawson: And I assume and it sounds like you've – you've also probably taken the journey, the growth journey, kind of crossing the chasm from a small company to a medium-sized company. And I assume where a lot of people – and maybe you can talk a bit about this – how do you then scale that? Because I assume and what I've seen is a lot of people kind of think it just works. It just happened, when we were 40 people with a five to ten-person engineering organization, and you don't tend to identify what is necessary to scale that buy-in, commitment, and accountability as you get larger. Any thoughts on that?

Randy Shoup: Yeah, it's a great question. It's a great question. So let me explain where you want to end up, and then I'll talk a little bit about how you get there.

Again, it's easy to start. I mean it's not easy to be a startup, but from the perspective of what you said, buy-in, motivation, autonomy, accountability, that unit of the whatever, four to six people that fit around a conference table, it's very clear what we're all working on and there's no – there's very little debate about that. It's just what we all know we need to do and we just need to get it done.

Again, as I mentioned, the companies that do this well at large scale, each individual team still has that feeling of a small group of people working on one particular area. The challenge is how do you get from here to there. How do you grow that one team into several teams and then many teams? 

Brian Dawson: And retain the performance. 

Randy Shoup: And retain that. The key I think is several-fold. One is to make sure that every team has a very clear goal, and I don't mean like something is written on a piece of paper or whatever, although that's helpful. Every team should be able to be laddered up to some particular customer value, some particular business value.

You should be able to say, "This little startup," and I'm using my air quotes, "This team is responsible for this experience," whether it's to some internal customer or many customers or it's to external customers. You should be able to say, "That's a little mini company." If you can't, I think that's where things break down. If you can, I think you're good, because then you get all the benefits of really tight feedback loops and a very high bandwidth conversation within the team.                     

So the most challenging part as you grow along is that sort of – so it's easy when you're really small. Everybody fits around a conference table. It's challenging in that 20-person to 50-person range, because it's like you're not – sometimes you're not big enough to have multiple parallel teams, but you're not small enough to have one team. That, in my experience, is sort of the most challenging transition point, if that makes sense.

But then once you're – the numbers aren't super-rigid, but once you're past that 50-person mark and you can have whatever, ten teams of five kind of thing, it's usually possible for those teams to have a kind of permanent product or a permanent piece of customer value or business value that they provide. And it's getting through that – you said chasm, and I love that metaphor. It's crossing that chasm that can be challenging.

Brian Dawson: Yeah. I've seen it and I think I've been part of groups that have fallen right down that chasm.

Randy Shoup: It worked great when we were five. You can still kind of make it work as one team-ish at 20. I mean it's not great, really. I don't recommend it, but it's possible. I see a lot of teams, but, like, that just super does not work as you get to be 25, 30, 40, 50. So yeah, then you have to start thinking about – and this is where leadership comes in. Leadership is there. It's like how can I draw the boundaries around the teams in a way that they're able to be independent of each other? They have a clear objective that is measured by customer value, and they're able to get the job done.

Brian Dawson: Gosh, there's so much. I'd like to ask – I assume, right, while we're speaking to you as a technology leader or an engineering leader, as you kind of raised or pointed to in one of your answers, part of the trick is aligning around areas of business value.

To me, I think – and correct me if I'm wrong – that really as we've talked about how you need to scale and how high performing teams work, you're not just applying it to the realm of engineering things. Do these learnings kind of apply across engineering, product, whatever it takes to deliver value via software? Is that fair to say?

Randy Shoup: Yeah, 100 percent. In fact, I'd even – not to correct, but to restate the way you say it. I think the best organizations don't think of it as an engineering team and other people. It's like it's a team, which also does engineering. So within that team boundary, ideally you have people that are generating the ideas that are doing this work, so there is some product function, whether it's a separate person or if a team does it, in the same way as you can't have a startup that only has engineers and nobody else. It's not going to last for too long, right. 

You still have to have people that are thinking about understanding customer needs or thinking about what the product roadmap should be. So I prefer not to think about it as engineering teams and other people. I prefer to think about it as teams that produce value.

Brian Dawson: I love that.

Randy Shoup: We need all the skill sets in there.

Brian Dawson: Yeah. I love that. I actually recently did a presentation. Not to self-promote, but I think aligned around really continuous delivery is not just for developers anymore, but really a call to action I think for the technical people in the room, that it's important to understand that to deliver and develop the best and truly have impact, you have to work with and lean on the shared experiences of the larger group. You have to bring in other people with other domain expertise.

I love the way that you have pointed, however, and I wonder if a part of it – what I kind of hear is, yes, as we said, when we're all around the conference table, to kind of correct some of my thinking in your statement, no, we're not talking about just a team of engineers around a conference table. Part of the magic of the startup is that you have a series of domain experts with mutual respect working together, aligned around solving a shared problem. 

Randy Shoup: Totally, 100 percent.

Brian Dawson: And you need to kind of be able to scale that. Can I ask what is a VP of engineering's challenges, roles, responsibilities in helping an organization cross the chasm and maintain or create and encourage a high performing culture? What's the burden you bear in that?

Randy Shoup: Yeah, it's a lot of it. I mean in no particular order, just because we were talking about it and it's top of mind, drawing the organizational boundaries. It seems like that should be trivial or obvious or not anybody's job, but no, it's super-important to be constantly looking at – again, I'm making this up. Let's imagine I have a team of 50 or 100 people, looking at what they're doing and making sure they're able to move independently.

Part of that is making sure – one of the little metrics I set for myself, ways I measure myself is can a team do, let's say, 80 percent of the work it needs to do kind of without talking to anybody else? And it's not because I don't want people to talk and I want people to have silos. It's not that at all. It's just if I have drawn the organizational and team boundaries well, most of the team's work can be done on top of other teams' APIs or whatever, like within the context of the team. 

It's not like they don't need anybody else. It's just they don't need to talk to anybody else. That's the distinction. If that's a low percentage, if the team is only able to get 20 percent of their job done and they're constantly having to have really high bandwidth conversations with other people, then maybe we didn't draw those boundaries well.      

Just to dive a little bit deeper on this idea, when both of us started in the industry, we didn't have these full stack teams. Most places didn't do that. They do like, "Here's the frontend team. Here's the application server, middle tier team. Here's the database team," or DBA team or whatever, and every single thing we tried to do would have to cross every one of those team boundaries. It's an opportunity for misunderstandings. People don't really understand – they're further away from the customer value. The feedback loops are slower, et cetera, et cetera.

So I would rather, in those three teams, rather than having three horizontal teams, flip it the other way, so have three vertical teams that are all capable of doing all that stuff. So that's number one is how can I figure out how teams are able to be independent of each other, so they can have really tight feedback loops and keep moving. 
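As a rough illustration of the 80 percent rule of thumb described in this answer, the following Python sketch computes, per team, the share of work items finished without coordinating with another team. The work log, the team names, and the `needed` field are all hypothetical, invented here for illustration; the interview does not say how such a figure would actually be collected.

```python
from collections import Counter

# Hypothetical work log (an assumption, not from the interview): each item
# records the team that did the work and any other teams it had to
# coordinate with in order to finish it.
work_items = [
    {"team": "checkout", "needed": []},
    {"team": "checkout", "needed": []},
    {"team": "checkout", "needed": ["payments"]},
    {"team": "checkout", "needed": []},
    {"team": "checkout", "needed": []},
    {"team": "search", "needed": ["catalog", "frontend"]},
    {"team": "search", "needed": ["catalog"]},
    {"team": "search", "needed": []},
]


def independence_ratio(items):
    """Return, per team, the fraction of items done with no cross-team coordination."""
    total = Counter()
    solo = Counter()
    for item in items:
        total[item["team"]] += 1
        if not item["needed"]:
            solo[item["team"]] += 1
    return {team: solo[team] / total[team] for team in total}


def teams_below_target(items, target=0.8):
    """List teams whose independent share falls below the target (80 percent by default)."""
    ratios = independence_ratio(items)
    return sorted(team for team, ratio in ratios.items() if ratio < target)


if __name__ == "__main__":
    print(independence_ratio(work_items))
    print(teams_below_target(work_items))
```

A team that lands well below the target would, in the spirit of the answer above, be a prompt to revisit where the team boundaries were drawn rather than a score to hold against the team.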

Brian Dawson: So that's number one; hold number two. So you're responsible for defining an effective and efficient organizational structure as a VP of Engineering.

Randy Shoup: You said it better than I did, yeah, 100 percent. Again, not in the order of importance, because I think even before that, it's making sure that we set goals and we set goals that matter to customers. So I'm not going to set a goal of I want everybody to type 10,000 lines of code. That's a metric that is orthogonal to what the customers care about. 

So first, setting the expectations around, "Here are the goals that we need to achieve," always coming back to the why. I think if people – not I think – everything proves that if people don't feel that their work is connected to a real purpose that's outside of themselves – and I'll restate purpose as customer value, customer happiness, I think those are intimately linked with one another. If I can't draw a direct line between what I'm working on and value that goes to customers, then my leadership I think has failed me.

I want to be clear that that doesn't mean that every team has to be customer-facing. In large organizations like Google and Amazon and Netflix, the majority of engineering is below that level, if that makes any sense, and that doesn't make it less important. It just means that the customers are not the end consumers. They are the other teams, if that makes any sense. But you always should be able to draw a direct line, yeah.

Brian Dawson: Going back to feedback, and I don't want to slow your point, but yeah, we have this conversation a lot and I think, look, you may be on the middleware team. Shoot, you may even be working on internal, developing internal business systems, but if when you show up, you're able to make a connection as to how my servicing, my peer teams with a robust, smart middleware tier is helping service customers, I'm better empowered to do the right thing. I'm better motivated to show up every day and be committed and to solve problems. Is that –?

Randy Shoup: 100 percent, yeah. You ticked them all. It's more motivating for me. Even if nothing else were true, and lots of other things are true, it's way more motivating for me when I can draw a direct line to customer value.

Number two, and you said this and I say it all the time, if I don't have a very clear customer problem/customer value in mind, it's hard for me to make principled trade-offs about what I should do. Stating the problem clearly is one of the most important things that we can do as people and particularly as engineers. Like, what problem are we trying to solve?

I love this. It's a common quote, but not as common as it should be. Charles Kettering, who was the head of research at General Motors a million years ago used to say, "A problem well stated is a problem half solved." So, number one, it helps us to make the trade-offs. But number two, and just think about software that you build, a lot of the thinking upfront is like, "What actually is the problem here, and what's the minimal thing that I can do to solve that problem?" 

It's all that kind of problem analysis, that once you've done that and you've very clearly stated what the customer problem is, it often – I don't want to say it's trivial, but it's a heck of a lot easier to draw a line between, "Okay. Where is the code in the system today versus where I need to be?" It's so very important.

Brian Dawson: Let me make sure I get it, because I get excited and want to engage. So we were on this thread talking about your role and maintaining a high performance culture, especially in growth phases as a VP of Engineering. We hit two topics, and I'm sure there's a bunch. Are there other –?

Randy Shoup: Yeah, there's a lot more. I guess I'll start – you're going to ask me later for book recommendations, but it came to mind. Read The Manager's Path. For a lot of people, that's a great book. It's by Camille Fournier, a former engineering leader at Rent the Runway and a bunch of other places. She's a fantastic author. It's a great read, a quick read, well worth it. But I will give you an answer.

We talked about structuring the organization. We talked about making sure we state the problems clearly and specify them in terms of customer value. I would also say hiring and people. The hiring and retention aspect of it is a hugely important aspect of being an engineering leader.

I would say that for the last – I'm making this up, but it's not wrong – for the last ten years in my roles, probably 25 to 30 percent of my waking hours at work are spent on various aspects of attracting and retaining high quality talent. It's hugely important.

This is going to ladder up into the next thing I'm going to say about culture, but one part of the high quality is people that bring the skills that we need or bring the learning and growth mindsets, so that they can learn what we need to do, but the other aspect is the cultural aspect, making sure we're building a diverse team across many different dimensions of diversity, but also making sure we're building – and this is going to lead into the culture thing I want to say – making sure we're building a psychologically safe team. Psychological safety is a new term of art, an old idea, new phrasing of it.

And that's the other super-important thing. You don't want to hire – I don't actually want to use the real phrase. You don't want to hire smart, obnoxious people. Let's say it that way. There are more offensive ways of saying that little phrase, but look, it's super-important to find people that are good at their job, but also are genuinely good people.

Just to connect it up to my next point, building the culture, setting the example of what it means to build a diverse team and support psychological safety. People should be able to bring their whole selves to work without fear of negative consequences. Modeling the right behaviors.

I'll start to pause because I'm sure you want to jump in, but I feel really strongly about all these things.

Brian Dawson: No, no. You don't have to pause. I'm sitting here with a – gosh, I've never used this term. I probably shouldn't say it – a man crush. You're saying everything I want to hear. Yes, Randy, yes, I do. No, just amen. I'm really aligning. What you're saying is really resonating with me.   

Now, I was going to go move to the technical path and talk about connecting this to architecture, but I think some of the things you said, they were so powerful in regards to the importance of diversity and then creating a psychologically safe environment and having healthy interactions. I want to dig into that a little bit more because, wait, why are you talking about this stuff and you have it in your top three priorities. You're VP of Engineering. 

Randy Shoup: Yeah. Where is the engineering part? 

Brian Dawson: Right. You know I say that half jokingly, but one is I feel personally, that that is a place where maybe at a CEO level, a CXO level, oftentimes people mis-hire. They're looking for the most technically sound, the technically sharpest person, and we've weathered or excused in the industry a lot that, "You're an extremely smart person. You can distill and understand large, complex, technical problems in one breath. And hey, you don't get along with people, but that's okay. We're going to put you in charge of the organization."

Randy Shoup: Yeah. That's the brilliant jerk. I don't want any of those. Look, there's a lot of evidence, which I will happily go in – well, let's do it right now. Google, as it happens, did a bunch of studies trying to figure out – they were answering the question why were some teams better performing than other teams. You can imagine they went in with a bunch of hypotheses. I don't know, but you can imagine they thought, "Okay. It's the teams with the most tenure at Google, the teams with the most PhDs, the teams with the most experience in the industry."

Let's be honest. A couple of years ago, that would have been a lot of our hypotheses about what was most effective. 

It turns out that of the top five things, none of them were any of those things I mentioned. The number one of those top five was what's called psychological safety. That's the idea, again, as we were exploring, people are able to bring their whole selves to work, feel comfortable being themselves, without fear of negative consequences.

So when we say, "Bring your whole selves," what do we mean? There's absolutely a set of dimensions that's around I can bring my sexual orientation, my gender identity, my racial identity, my political beliefs. We don't need to be the same. In fact, it's even better if a team is diverse, like coming from different places. The key thing is that we respect one another, that we are able to all be different and all respect one another. 

The other way, this is an equivalent idea, but could be more visceral for some of the listeners. I'm a single parent and there's a bunch of stuff – before we did all the sheltering in place, I would leave to go pick my son up at school, and that's in the middle of the workday. I made clear with the people that I work with that, a) that was okay, and b) it was a deal breaker for me, like, "I can't take this job if I'm not able to be a parent." I think that's something that a lot of us can empathize with.     

Why does this matter? Why does psychological safety matter? We should be good to other people just because we should be. But why does that make the team higher performing?

Well, the insight is that everybody brings 100 percent. They're bringing their whole selves. If I have an idea that I'm not sure about or I'm more junior than everybody else, in a psychologically safe environment, I can express that, and it's okay for them to say, "You know, Randy, I see why you said that, but there's a bunch of stuff about Kubernetes you didn't know yet." Okay, cool. But that's like an okay conversation and I don't feel bad. I learned something. This idea of psychological safety is all about getting the most out of everybody, everybody being 100 percent. 

It's empowering. I mean it's all those things. It's empowering. It's freeing. The other way to think about it – and look, I'm a white male and I've had some of these experiences, but nowhere near as badly as a lot of people. When you're thinking in these unsafe environments, 25, 50 percent of your brain is like, "Hey, am I going to be made to feel bad because I don't know this thing? Am I going to be humiliated because I'm a single dad?"

You know what I mean? There's all this stuff that's going on in your brain. Imagine how freeing – well, not imagine, think about how freeing that is in the environments in your life where you don't have to do any of that. All I have to do is do my job, and I can make mistakes.

Brian Dawson: I'll share personally as we talk about diversity. I haven't usually done this in these interviews or episodes, but for those that don't know, I'm an African American male. I have a diverse background, Irish, Cuban, and African American. I have lived for a long time in a very non-diverse industry. I have been to places over a span of months where I have literally been the only brown person in the room for a period of months, or conferences with 6,000 people, where I'm the only person of African American ancestry.

When I started, being technically competent was my validation and my shield, and I'd go into kind of a technically competent – a technical zone. Look, to connect various vectors here, I didn't have to deal with the business. You know what's going to give me credibility is just be good technically. Spit out the best code.

I learned, in line with what you're sharing with us as an audience, that in my personal experience, a lack of full psychological safety caused me not to get the best of interaction with other stakeholders and other people with other domain expertise. But it also meant, as I've learned now, much later in my life, and I'll connect here, that I was never really tapping my full self because I wouldn't step here or say that or ask that, because I had so much staked in being right as kind of a protection of my identity.

I am now living in a space, and I know some people – I'll tell you, Randy, you've probably heard people say, "Yeah, bring your full authentic self." Nobody does that. It's so corny. All of that stuff doesn't matter.

But I'm living in a place here where I work at CloudBees, where – I'll put it half-jokingly – I feel that I can show up as a brown or black man, as a father, as a guy that lives in San Diego. I can bring my own dialect. I can bring my own gaps of knowledge. I'm free to suggest things without fear of being ostracized or being wrong. 

And for me – so I'll just say again, that's going to be a part that will probably be edited out anyway, but I'd say now here at CloudBees, I'm actually really experiencing being able to come, feel free, without feeling like I'm being judged, without feeling discarded. I've been able to have more fruitful relationships and interactions with my peers, that I am sure better benefit the company than if I operated from my own space.

So sorry for that detour.

Randy Shoup: Oh, no sorry, 100 percent, yeah. Yeah, almost everybody in the industry has had it tougher than me, that's for sure. I've got to tell you, and even so, I feel that when it gels, like when you're in a team that you trust everybody – I mean it's all about trust, right – where you trust everybody. Everybody is bringing 100 percent. To your point, it's okay to fail, all that stuff. It's just unbelievable. 

Brian Dawson: It's awesome to hear – all right. So I took us down this track. Let me ask you a surprise question. Software engineering, software development: art or science?

Randy Shoup: Oh, boy. Both. Our practice of it in these days – so we are not yet an engineering discipline, sadly. We're a pretty young industry. We absolutely need to have a lot more discipline I think around what we do. In the practice of it, way more an art or a craft.        

I will say though, and particularly in the DevOps area and particularly some of the things we've already talked about, there is a lot of science that tells us what works and what doesn't. Again, you may decide – we may talk about my book recommendations. The other one is Accelerate. So that's Nicole Forsgren and her collaborators. It's, again, another quick, great read, very rich, where essentially the first half of it is, "Here are all the practices that are proven by science to lead to better engineering outcomes and, therefore, better business outcomes." And the whole second half of the book is all the science about why that's true, all the survey methodology, and she's a PhD and knows how to do this stuff.

So there is a science. There is a science behind these practices, but it's hard for me to argue that software development is today a science.

Brian Dawson: Yeah. That is a very interesting answer, and I think it's a tough thing to reconcile as also businesses depend more and more immutably, undisputedly on software as a key business driver and differentiator. We also wrestle with the fact that there's still a lot of it that's nondeterministic. So I think part of our growth conversation is reconciling that.

I think some of the things that you've talked about, and I hear and look at your experiences as a VP of Engineering, and it seems like you have a grasp on how you tie the deterministic needs of the business to kind of the less deterministic needs of the practice of software engineering and development.

Randy Shoup: Yeah, thanks. Again, back to our conversation about it's okay to fail, I've done it wrong. I've done everything – everything I've learned is because I've done it wrong, and then I learned to try to do it better. 

Now, and this wasn't true, I don't think, as much 30 years ago when I joined the industry. Now, I can learn a lot of this stuff without having to fail myself by reading a bunch of these books. There is so much – one of the things that our industry has done I think well – we can always do better – but done well over these 30 years is that we've codified – I don't want to say codified – a lot of people have learned a lot of important stuff, and they have written it down in a way that we can just consume it. So there's a bunch of learnings that all of us can jump off from.  

Brian Dawson: Let me shift a bit and ask as we start to run out of time – I'm loving our conversation, but I know we're going to run out of time. You've really done a great job in articulating what the components of a high performance culture are, what some of the roles and abilities for an engineering leader to influence that are. 

I think to the extent that we said we codified some of that, that's what this discussion of DevOps practices or tenets as a culture have done. It allowed us to codify some of this, right. 

Randy Shoup: Yeah.

Brian Dawson: I think the fact that DORA is termed the way it's termed, as more tied with I think some of the introduction of psychological safety as a standard part of engineering organization lexicon, they've tied it to DevOps. Now, at the end of the day, with this high performing team, following some DevOps tenets or operating within the DevOps culture, applying practices, at the end of the day, you're delivering a piece of software, a functional system.

Is there any connection between these organizational structures that you talked about a VP of Engineering needs to define that tie to customer value, and the architecture choices of the system that you ultimately deliver?

Randy Shoup: Totally, yeah. Boy, that's another whole hour we could chat. It's a great question.

Brian Dawson: We're going to have a whole month-long series of Brian and Randy. We'll just start out – 

Randy Shoup: There you go. That would be fun. Yeah, people are probably familiar with Conway's Law. Mel Conway in 1968 observed that the architecture that was produced by an organization would reflect the communication paths within the organization. So essentially, you ship your org chart. 

So what is the connection between the organization that we talked about composed of small, autonomous teams in the sort of graph or network and the architecture? It's actually very direct. It's exactly those companies that are able to be innovative, even at very large scale. Again, I'm thinking Amazon, Google, Netflix, et cetera.

All of those companies have essentially a very similar structure. Again, small, autonomous teams that each produce one service or one application, or maybe a set of related services and applications, and they work together. My service produces a set of APIs that are consumed then by other teams within the company, and then laddering up to applications that customers actually use. So there's a very direct relationship between this graph, again, of autonomous teams and what we might call a microservices architecture. 

What I am not saying is every company needs to have a microservices architecture. That's for sure not true. But at large scale, the places that are still able to be innovative and move really quickly and respond quickly, all tend to have evolved to that same thing totally independent of each other.

So at smaller scale, when we're a small team, let's have a monolith. Even when we're a couple teams, let's have teams producing components within a monolith that we deploy. So there's a very direct relationship between the phases and size of company and what architecture is appropriate at each level.    

Brian Dawson: Okay. I think it's interesting that you took us there. It kind of goes back to a theme of this conversation. You really can't separate the people from the organization from the technical product. You have to have a holistic and kind of shared view of them all.

Randy Shoup: Yeah. The Accelerate book does a very clear – has some wonderful diagrams. They say it starts with culture and then it goes to organization and practices and, ultimately, outcomes. 

The other way to think about it is in a successful company, all of these things have to work. You can't have everything works except, oh no, the people hate each other and don't – you know. You've got to have the people. You've got to have the organization. You've got to have the culture. Then you've got to have the technology, too. It's got to hit all of them. 

Brian Dawson: Great. Would you say – right before I shift topics, I have to ask. Some of the success, the outward success of some of the past companies you've worked at have been a result of a recognition of these multiple facets of a successful company. We mentioned – yeah, go ahead.

Randy Shoup: I was just going to say 100 percent, yeah. Again, not to keep plugging it, but it's very clearly proven in the Accelerate book, which is based on the DevOps report and research. All that stuff. It's not just – yeah, it seems like these practices should matter. It's like no, the science says they matter. When you have them, you do better, and when you don't have them, you do worse. You know, two and a half times more likely these companies to exceed their goals on productivity and market share and profitability. The line is just so direct and that's the science.

Brian Dawson: So trust, right. If that's how you're oriented, as a lot of us are as engineers, trust the science. 

Randy Shoup: That's right.

Brian Dawson: Great. Thanks for those insights. Kind of segueing from here, we've talked about culture. We've talked about organizations. As we mentioned, you were recently at WeWork. And all of these have a tie-in to where we are today as a society.

Within the midst of a pandemic, we're seeing a change to the way we interact, to the way we work, and we're trying to figure out what it means in the long run. Do you have an idea of how the workplace and workforce will be impacted by current times, and how it will maybe impact some of these vectors that we've talked about?

Randy Shoup: Yeah. It's hard. It's hard to figure out where we're going to be. I think we'll know in a couple of months, in six months or 12 months, where we're going.

I think, in a lot of ways, the pandemic is going to accelerate a bunch of the things we already were doing. So we were already seeing slowly a growth of support for remote work and remote-friendly, all the way to remote first. In a bunch of places that I've worked, Stitch Fix in particular, we did that and it was hugely powerful. So I think that is an unstoppable trend.

I'll do a little asterisk here, where a lot of people whose first and only experience with remote work is in the context of a global pandemic, there's going to be a lot of people who don't like it and have bad experiences. It's unclear whether that's remote work or that's remote work within the context of I fear for my personal safety, et cetera. But I think it's unarguable that there is a trajectory, and I think it's only – this stuff is only going to accelerate that trajectory.      

Then related to that, in the same way as organization and architecture mirror one another, we're going to be forming small teams by maybe people sheltering in place for a while. So all these DevOps concepts that are around, how do we structure an organization and make sure teams have well defined areas of responsibility, well defined goals? That's even more important in this model.

Why? Because you don't get the – if we're not keeping everybody aligned in various ways with intentional mechanisms, there's no unintentional osmosis, management by walking around strategy in these kinds of situations, if that makes any sense. Like, you can't stay aligned because you overhear people in the next office chatting about what they're working on. We need to be way more intentional about that stuff.

Brian Dawson: Or you guys all walk to Starbucks or Burrito Real talking about it.

Randy Shoup: Exactly, right, yeah. Then the third thing is, again, the importance of culture. In the old world, where some of us, maybe most of us were all in physical offices, there was this management by walking around concept. So you could see what people were doing and what they weren't, maybe.

We're in this environment where it needs to be, good, way more high trust. Nobody is walking into my house and seeing whether I'm working or not. So I think it's important that we develop these more trust-oriented, if you like, more psychologically safe environments, where, like, "Look, I trust you to get your job done," situation. I think that's going to become more important.

Brian Dawson: Once again, I want to be respectful of your time, but I've got to ask specifically. Have you seen – and I know you're working independently now, but truly you're still working with others and collaborating. How have you seen your role as an engineering leader and previously, more specifically, VP of Engineering change in the past couple of months?

Randy Shoup: Again, with everybody by mandate and by correct health situation by being separated, I find that I need to be way more intentional. So stuff that was already a good idea is now a requirement, so checking in on people, how people are doing. We always should be empathetic about people's situations, but I think that's even more important.

Again, the trust situation is way more important. All those things that, again, people who have already been working remotely will already have had to figure out. Everybody is having to figure it out.

Brian Dawson: I think you've already spoken to kind of how DevOps tenets and practices apply to the remote world. How does software delivery automation, kind of the component practices of DevOps, continuous integration/continuous delivery, do those have relevancy in any way, in your mind?

Randy Shoup: Hugely important. We could have spent this whole time talking about those type of things, which would be great. We chose to talk about different things and I think we talked about very important things. But yeah, it's hugely important. 

Within the context of all the things that we talked about organizationally and culturally, yeah, we have the tools to do the continuous delivery, and we have the development practices to make sure that when I produce my feature, I'm also producing the tests to make sure the feature works. Then I have this mechanism, whichever mechanism we want to use, to make sure those tests get run, to make sure things get deployed properly, check them in production. All that stuff is hugely important. 

Again, you're not suggesting otherwise. Just to be clear, all this stuff has got to work. You have to have all those things. I think anybody who says culture is – if I have culture, everything else is easy. I'm not sure I believe that, and for sure I don't believe if I have tools everything else is easy. It's all got to work.

Brian Dawson: I think they're all kind of inextricably intertwined.  

Randy Shoup: Exactly. We're both doing this with our hands. They are intertwined. They are interrelated in a very deep way. Again, the science as expressed in Accelerate and the State of DevOps report, it shows all these things are very integrally linked and they reinforce each other.

Brian Dawson: I would suppose or propose that in a remote world, one of the things for people that are just moving to it, that when you don't have it, that you don't get a stumble on. You don't have those serendipitous discoveries at lunch over a burrito. 

By the way, I keep referencing burritos because Randy and I both are from Palo Alto, where in the area you get some of the best burritos, for those that don't have the privilege of being there – or coffee.

Oftentimes, people then swing towards overcorrecting and it's meetings, meetings, meetings, meetings, and you lose flow.

Randy Shoup: Yeah, oh yeah.

Brian Dawson: So as I'm thinking through this, along with everybody else, and I think you've led to it or said it indirectly if not directly, some of the cultural practices of DevOps and the technical practices of continuous integration and continuous delivery allow a layer of kind of connectivity, but then still be able to go work remotely and independently or in small teams, and know that at the end of the day that there's a layer of trust, and I beg to use the word, but shared oversight. 

Randy Shoup: Yeah. Look, again, like we say, the technical practices reinforce the organizational and cultural set up in a very synergistic way. 

They are two sides of the same coin. I say that because I 100 percent believe it. I'm not just being nice to you. I 100 percent agree that these things are self-reinforcing.

Brian Dawson: Usually people aren't nice to me, so don't worry about it, Randy. To get ready to get you out of here, I have to ask a couple more things before we get out. One is we have this concept on DevOps Radio called DevOoops, D-E-V-O-O-O-P-S, like oops, I've done it again. Can you share with me a DevOoops moment that you've experienced in software development and delivery, team management, where something has gone wrong, but you've learned from it?

Randy Shoup: Oh, man, so many. You can't see, but I used to – well, I'm mostly bald. I used to have hair when I started the industry. That's actually true. So yeah, every one of the DevOoops is losing a bunch of hair. I mean so many in 30 years.

The one I talk about a lot, because it was really stark and lots of great learnings, lots of dimensions was when I was working in Google Cloud I was running engineering for Google App Engine, so that's Google's platform-as-a-service. We had an eight-hour global outage. All of App Engine was down globally in November of 2012, and that was a pretty bad day.

Actually, you can still go find the public postmortem that the team wrote up. The details aren't super-important. I'm happy to discuss it if we need to, but it was, as with all outages, it was the accumulation of lots of different things all went wrong at the same time, if that makes sense, and we learned a ton.

It had been the accumulation of a bunch of technical debt, reliability related things that we hadn't ended up prioritizing. So after we got out of the woods and got everything restarted, and Snapchat was back online and Pokémon Go could work, once we got everything going again, we gave everybody a pause, said, "Get some sleep."

Then we had a retrospective. A blameless postmortem is a great cultural thing that lots of companies practice, including Google. So we had a blameless postmortem. We went through everything that went right and went wrong, et cetera.

Then, since this was a big one and there were a bunch of contributing factors, we got everybody in a room, like all the stakeholders, all the people that were involved in various ways, all the teams that were involved, and just peppered, for several hours actually, peppered whiteboards with all the different – with brainstorming of, "Think of all the different things that contributed to this incident. What are other things that can contribute to this? What are things that you think are important that would maybe impact our customers, et cetera?"

We had a long time of not prioritizing these things, so there was a lot of stuff that people wrote. Then we sort of bucketed them into themes, "This bunch of things are all related to one another. These other things are related to each other."

Then we had people who volunteered to think more about those areas, went away, timeboxed for a week, come back with essentially the same set of people with a set of proposals, not a huge set of design documents, but a bunch of one-liner, "Okay. I think we need to do x, y, z, and I think it's going to take us one day, one week, one month, one quarter," like super-simple, super-lightweight for them, suggestions of things that would make a difference, and then an order of magnitude estimate for how much engineering effort they thought it would take.

Then we prioritized it all together, out of the whatever, 1,000 suggestions. We started in on a small number of them that were going to have the highest impact. Then we got started.

That was one of the most cathartic and one of the most camaraderie building, if I can coin a term, things I've ever experienced in my whole professional life. Everybody was so excited. I mean nobody liked that we had this outage, but everybody was like, "We really want to make this thing work a lot better," and we had all these ideas queued up and everybody felt great about getting them all out there, and then that we all had our say about what was the right priority for various things to do. Then we got working.

The result of that in maybe six short months was we had a 10x reduction in reliability issues. I could have maybe guessed this, but I would not have guessed how powerful it was. Again, this great camaraderie, this great shared ownership of performance, of reliability, of the underlying strength of the foundations of the system, if that makes any sense. Everybody felt this renewed sense of ownership of it and it brought everybody together. It was already a great team, but it was like we got even tighter.  

Brian Dawson: So you came out of it better. 

Randy Shoup: Yeah. People that were in that – I still catch up with people that were on my team then. It's now eight years ago. Everybody remembers that moment and remembers that – we called it the Reliability Fix It. Everybody remembers that brainstorming, that set of efforts that we did around there. It was this galvanizing moment for all of us.

Brian Dawson: And it ties back to not only so much of what we just talked about, but there was also a correlation to what we're dealing with now as a society. 

Randy Shoup: Yeah, right.

Brian Dawson: That we actually could embrace, work together, and come out better. Do you recommend –? You know what my next question is going to be. I'm going to hedge to it a little bit and say can you tell listeners how to find that postmortem. It sounds like at least reviewing that public postmortem may help our listeners learn a bit from the DevOoops that you just spoke about. 

Randy Shoup: Yeah. It's pretty general. It's out there. If we do share notes somewhere, I'm happy to share with you the link.

Brian Dawson: Yeah, let's do that. For those listeners that haven't seen, we do a blog post, a write-up on these interviews. Randy, if you can get us a reference, then I'll look to work to get that put into the blog post, which is effectively the show notes. We can share that.

All right. The one that you already stole my thunder on, but I'm still going to press you, Randy, is what is a – we've got like three of them already, but a book, a blog, a podcast, a resource for learning that you absolutely feel that our listeners would benefit from indulging in.

Randy Shoup: Great question and we queued it up, so I already had it in my head. Like I say, the Accelerate book. If we weren't at the end, I'd – well, I'm still going to say. If you haven't read Accelerate, pause this thing, go buy Accelerate, go read it and then come back. It's so powerful and so concise in the right way, just very clear. It's not a 1,000-page tome. It's less than 200 pages I think. So Accelerate is a must read.

Again, as I mentioned, for anybody who wants any – wants to be or is any flavor of leader in engineering, The Manager's Path by Camille Fournier is a wonderful, again, quick read, very concise and powerful. She goes through ten steps of starting from being managed, like you're an individual contributor and you don't have management responsibility, but you're being managed, to tech lead, to first level manager, dot, dot, dot, all the way to CTO. It's super-clear and it comes from a lot of her great personal experience in that.

The other one that probably people wouldn't necessarily have heard of, but is a great book is called Making Work Visible. It's by Dominica DeGrandis. Again, super-short – I guess that's a theme here for me – and very well written. It's all about looking at ways that we are not using our time, either personally or as a team productively. It's a great way of thinking about how we can be more efficient. 

It's helped me personally. It's helped me as a leader of teams to think about and has helped to characterize when we're doing too much rework, when we're under-investing in things that would help us be more productive or more reliable. Really great. She's a great author. So yeah, all three of those I strongly recommend. 

Brian Dawson: Thank you. Fantastic references. The ones I haven't read, I'm going to read. The irony, in full transparency and being fallible, I've been waking up every morning reading Making Work Visible, but because I haven't managed to control my WIP and my rework, I haven't managed to find time to actually get through the book. So here's one of the best ways that I can help people. Don't do what I'm doing.

No. I'm really actually excited about taking some time to get through that book. So thank you for sharing that and actually reminding me and encouraging me to take it in.

Randy, it has been honestly and sincerely an enjoyable discussion. Before we wrap up, I'd like to know do you have any final thoughts or comments for our listeners?

Randy Shoup: I don't. Like I say, genuine, this is a ton of fun, Brian, lots of mind-melding I think going on, coming a lot from the same place. Yeah, as we said, I said it a couple of times, but I think successful organizations have to get all these things right. It's not impossible. Companies do it, but it requires a lot of work. There's a lot of intentionality and a lot of thinking about what matters that goes into these things.

You've got to have culture. You've got to have organization. You've got to have the right people, and you've got to have the technology and the practices. All those things ladder up and make us great. 

Brian Dawson: Awesome. Thank you, Randy Shoup. This was a great interview. I appreciate you taking an extended amount of time with us. I look forward to hopefully having conversations with you in the future. 

Randy Shoup: With luck, maybe even in person.

Brian Dawson: Yes, six feet away though. All right. Have a great one, Randy. Take care. 

Randy Shoup: Take care, Brian. It was a lot of fun.                 



Brian Dawson

Brian is a DevOps evangelist and practitioner with a focus on agile, continuous integration (CI), continuous delivery (CD) and DevOps practices. He has over 25 years as a software professional in multiple domains including quality assurance, engineering and management, with a focus on optimization of software development. Brian has led an agile transformation consulting practice and helped many organizations implement CI, CD and DevOps.
