Dave Ezrakhovich of Taboola on Making DevOps Happen With Data
Taboola Senior Engineer Dave Ezrakhovich joins host Brian Dawson at DevOps World | Jenkins World to discuss how data has become an integral part of the DevOps process at his organization.
Brian Dawson: Hello. This is Brian Dawson. I'm here with Dave from Taboola. Dave, how you doing today?
Dave Ezrakhovich: I am great. Thank you. It's amazing so far.
Brian: Awesome. So you've been enjoying the conference. Have you been to Jenkins World before?
Dave: No. This is the first time.
Brian: This is the first time. Did it meet your expectations so far?
Dave: It's been bypassing them.
Brian:Awesome. That's great to hear. We talked a bit before we started, kind of, about your background, the company, organization you work for, and where some of your passions are. Let's share some of that with some of the listeners. So, to start, I'm hoping that you can give our listeners a brief overview of your background and, again, where your passion and focus is.
Dave: Okay. Hi, guys. My name is Dave. I work in Taboola. My journey in tech and computers was real interesting because I didn't learn computers at all. I learned agriculture.
Dave: And one day when I was drafted to the Army, they decided that computers should be the main thing that I should do. So I was drafted. I joined my team, and one of the guys told me, "Listen, tomorrow I'm ending my Army duty, and I need to pass everything, all my knowledge to you."
Dave: So we sat down for five hours something like that, and he just was blasting me with knowledge and things that they do and how to do those. I didn't remember anything the day after that. The only thing I remembered was the fact that they had a database and that was ...
Brian: And something like SQL or something.
Dave: Right, right. So I started just running queries, all day for, I don't know, a week or two, until I found out the relationship between tables and data was all around. So I started to understand the system from the bottom up, only using data. So after a week or two, I found out that it's got a UI. It was actually SharePoint.
Brian: Oh, interesting. That's funny. Okay.
Dave: That's funny. Since then, I'm really enthusiastic about data, and I got to the DevOps world and I was really looking on how we can combine those two, data and DevOps and Jenkins.
Brian: That's interesting, yeah.
Dave: That's my passion. That's what I'm doing for the past two years, I think, in my company. It's really fun, especially because that's what my company does. We are a content recommendation company called Taboola. Probably you've met us. You browse through the Internet to sites and you were suggested by... I want to call it ... well, it's articles that might interest you. So that's us.
Brian: All right. And I will absolutely say Taboola has directed me to new content quite frequently, probably multiple times a day. And for those that can't actually see, I'll have you know that Dave is here in a Data Nerd shirt.
Brian: Thanks for sharing that story. That's a great story. And I will confess that I am a bit jealous or envious of you. I secretly want to be a data scientist. I secretly have a passion for data. I'm the guy with the crazy spreadsheets and running an SQL database on my laptop because I want to store personal information. So I can fully appreciate it. I'm gonna want to go back to this whole concept, though, of how do we use data and data analytics to help enhance our CI/CD pipeline and delivery process. But I'll take you there in just a minute. I first want to ask, I don't know if you know. The theme of this year's Jenkins World is transform. My understanding is that you've implemented DevOps yourself.
Dave: No. Actually, I joined a team of 12 people.
Dave: We were always in constant transformation on making things easier, to make the user experience better, and making, I guess, the deployments and the builds CI/CDs, the CI/CD mechanism much better and much faster and efficient.
Brian: Okay. So it sounds like you're saying it's not like you guys went in and you sat down and you said, "We're gonna plan this big DevOps transformation," but rather you joined a culture where you guys were continuously improving or continuously transforming.
Dave: Well, I can say that in the past year or so, we kind of changed our mindset. We did make a big transformation with understanding that we are in charge and we need to understand what's going on in our company, in our DevOps world. So we started realizing that maybe we should track our processes. Maybe we should get PagerDuty, as other developers in our company, when they write a new feature, they will monitor it, they will understand what's going on. So we decided that we should join forces and be part of that culture and world. That was our biggest transformation in the past year.
Brian: Okay. So that actually leads me right to something else I wanted to ask that will get us to the discussion about analytics in DevOps. Once you guys decided that you were gonna make this shift, how did you know that you were succeeding? What were key indicators? What were key outcomes that told you that you guys were making the right decisions?
Dave: Well, first of all, it was speed.
Brian: Okay, interesting.
Dave: The fact that we wanted to get to ... we wanted to understand how much time it takes from a feature request being opened to the fact that it got to production because it's crucial when you are in a company that wants to be agile and fast with their solutions. We can write a feature in a couple of days, and we can find out on the same day that it got to production because, you know, sometimes we miss things. So the concept of fixing the bug is also development. Sometimes it's turnkey like feature flags and stuff like that, but sometimes you need to develop something really fast and solve that.
Dave: So the time to move to delivery and time to production is really crucial. So that was our key, let's call it the best KPIs we should measure.
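The lead-time KPI Dave describes (time from a feature request being opened to the change reaching production) can be sketched in a few lines of Python. This is a minimal illustration, not Taboola's tooling: the feature IDs, timestamps, and the `lead_time_hours` helper are hypothetical and stand in for whatever an issue tracker and deployment log would actually provide.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: (feature id, request opened, reached production).
# In practice these would come from an issue tracker and a deployment log.
features = [
    ("FEAT-1", "2018-09-03T09:00", "2018-09-05T15:00"),
    ("FEAT-2", "2018-09-04T10:00", "2018-09-04T18:00"),
    ("FEAT-3", "2018-09-05T08:00", "2018-09-10T08:00"),
]


def lead_time_hours(opened: str, deployed: str) -> float:
    """Hours elapsed between the request being opened and the production release."""
    delta = datetime.fromisoformat(deployed) - datetime.fromisoformat(opened)
    return delta.total_seconds() / 3600


# Lead time per feature, plus a single summary number to track over time.
lead_times = {fid: lead_time_hours(opened, deployed) for fid, opened, deployed in features}
median_lead_time = median(lead_times.values())

for fid, hours in lead_times.items():
    print(f"{fid}: {hours:.1f} h")
print(f"median lead time: {median_lead_time:.1f} h")
```

The median is used here rather than the mean so that one unusually slow feature does not dominate the number; that is a design choice for the sketch, not something stated in the interview.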
Brian: Okay. Are you familiar with the concept of value streams?
Dave: Yeah, of course.
Brian: Do you guys manage or do you guys view that sort of measurement of end-to-end as a value stream?
Dave: Yeah, totally I think. I think it's crucial that everybody use that, and everybody will understand that eventually. That's the bread and butter of our work.
Dave: Yeah, slow and understanding that the proper user experience, and eventually it's measuring the values. I can't phrase it even better. Value stream is the best way to call it.
Brian: Right, okay. Actually, I'm curious on your thoughts. There's a lot of people if you look at the traditional way, we'll call it, I sometimes say legacy, the old way of developing software. I'd say that we had tended to move slow, out of fear of making mistakes. We needed to identify and design 100 percent correct requirement. We needed to do complete regression to validate. Of course that is our, on the high side or short side, a three-month development cycle, but oftentimes it's a six-month release or a 12-month release. Based on you saying that speed is your key KPI, I'm wondering what are your thoughts on the response to the slow and careful way of doing things, saying that if we move faster, we're actually safer because we can identify and fix things way faster than we would have if we followed a traditional waterfall cycle. Any sort of comments on that position or thoughts on that position?
Dave: That's actually an amazing question, because I don't know if we are a small company, but we have something like 3,000 servers.
Brian: Oh, that's not small.
Dave: And something like 250 builds per day. The deployment takes … every day we are releasing to production. All of those servers and new version, it's something like 45 releases per day.
Brian: That's impressive. That's not nothing.
Dave: Yeah, that's nothing. It's not all of the services, obviously, but that's the front end. That's what the user experiences. I think that quick development is really important, especially when it comes to, hopefully what I prefer to be real good DevOps engineers. I don't look at myself as an engineer, but more as a developer as well.
Dave: So all of that really combines really good because when you have data of your production, for example, you develop a new feature, so you are starting to collect direct metrics on how that feature is performing and, obviously, you have the entire ecosystem being monitored as well, whether it's Graphite or MetricTank and so on. You know as the build cycle what's going on with your performance as well. For example what we are doing is on a daily basis, when we are testing our features and even in production, we always take a snapshot of what we are having. What's our baseline and what is it gonna be? So we can see the progression. We can see events and block if something went wrong. So we have live and true data of what's going on in our system, and we know if we can release it. It's automatic usually. And if something is fishy, someone will actually go and take a look.
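The baseline-snapshot idea in Dave's answer can be illustrated with a small comparison step. The sketch below is an assumption about how such a gate might look, not a description of Taboola's system: the metric names, the 10 percent tolerance, and the `regressions` function are all hypothetical, and every metric is assumed to be "higher is worse".

```python
def regressions(baseline, candidate, tolerance=0.10):
    """Return metrics that got worse than the baseline by more than `tolerance`.

    Assumes every metric is "higher is worse" (latency, error rate, CPU).
    Metrics missing from the candidate, or with a zero baseline, are skipped.
    """
    flagged = {}
    for name, base in baseline.items():
        new = candidate.get(name)
        if new is None or base == 0:
            continue
        change = (new - base) / base  # relative change against the snapshot
        if change > tolerance:
            flagged[name] = round(change, 3)
    return flagged


# Hypothetical snapshots: yesterday's baseline and today's candidate build.
baseline = {"p95_latency_ms": 200.0, "error_rate": 0.010, "cpu_util": 0.50}
candidate = {"p95_latency_ms": 260.0, "error_rate": 0.010, "cpu_util": 0.52}

flagged = regressions(baseline, candidate)
release_ok = not flagged  # release automatically only when nothing regressed

print("release" if release_ok else f"hold for review: {flagged}")
```

When `flagged` is non-empty the release is held, which corresponds to the "someone will actually go and take a look" step; otherwise the pipeline would proceed on its own.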
Brian: Okay. And kind of what I hear there is the process of establishing a rapid delivery cycle has dictated that you automate and capture metrics, and that means that in a rapid delivery cycle you have more insights than you would have ever possibly had with a traditional waterfall development lifecycle.
Dave: Yeah, yeah.
Brian: Is that fair to say, as an artifact of the automation? So data. So let's talk a little more about data. It seems like we're hitting a point in the DevOps, CD and DevOps adoption curve, where we're starting to move beyond tackling the core concept of connecting things and automating, and we're starting to reach a little further, and we're starting to talk more about observability solutions. We're starting to talk about value stream management solutions, analytics, machine learning applied to CD. Now all of this of course has the common theme of data, right, and big data. I'd really like to hear your thoughts on why data is becoming so important and what the future of data and DevOps looks like, in your opinion.
Dave: Okay. I think when you're looking at development, everything is data. Everything is possible because it's data. So if you collect a lot of it, you can understand relationships between objects and metrics, for example, and you know what's going on in your system.
Dave: So when we first started using Jenkins, for example, everything was up there in the UI. If you wanted to know the state of your pipeline, you could go to the page of the pipeline and see if it's green, if it's red, if it's yellow, if something is ...
Brian: Get an automatic binary indication where it stood, right?
Dave: But sometimes that's not enough. Sometimes you need to see the entire ecosystem to find out if something is misbehaving. For example, if you have 50 build systems, well, 50 slaves. I don't know if it's good to call them like that.
Brian: We call them agents. It's 50 agents, yes.
Dave: Agents. You have 50 agents and your build is good. You have good speed until you release, but one of those is misbehaving. It will get lost in a way, because you won't see it's misbehaving because ...
Brian: Because there's so much other stuff.
Dave: Yeah, because there are so many other build agents. So it's easy to miss things and it can slow you down. It can cause problems, and it's really hard to pinpoint where is the problem, where is the bottleneck in all of that system. You know what? Even if you don't have that, you can always improve if you collect data. So that's our goal eventually.
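The scenario of one misbehaving agent hiding among 50 healthy ones is a case where per-agent data makes the outlier easy to spot. A minimal sketch follows, assuming build durations have already been collected per agent; the agent names, durations, the `slow_agents` function, and the 1.5x threshold are hypothetical.

```python
from statistics import median


def slow_agents(durations_by_agent, factor=1.5):
    """Return agents whose median build time exceeds `factor` times the fleet median."""
    # Median build duration per agent, ignoring agents with no recorded builds.
    per_agent = {agent: median(d) for agent, d in durations_by_agent.items() if d}
    fleet_median = median(per_agent.values())
    return sorted(a for a, m in per_agent.items() if m > factor * fleet_median)


# Hypothetical build durations in seconds, grouped by the agent that ran them.
builds = {
    "agent-01": [300, 310, 295],
    "agent-02": [305, 298, 312],
    "agent-03": [290, 301, 307],
    "agent-04": [620, 640, 610],  # roughly twice as slow as the rest
}

outliers = slow_agents(builds)
print(outliers)
```

Comparing each agent against the fleet median, rather than looking only at overall build speed, is what keeps a single slow machine from being averaged away by the healthy ones.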
Brian: Yeah, and it sounds ...
Brian: Sorry. Go ahead.
Dave: No, no, no. That's good.
Brian: I guess there's a couple of thoughts there, two thoughts I have. One is it sounds like we can sum it up by, at the end of the day CD and DevOps is about optimizing our development and delivery process.
Brian: And as people say, as my colleague was just reminding me the other day, you can't optimize what you can't measure.
Dave: That's right. That's totally right.
Brian: So it sounds like what you're doing and your view on data personally, and at Taboola, is that that's a perfect marriage, right?
Brian: We're focused on optimization. We need to have an intelligent way to grab, to capture and analyze data. The other question that I have and this we didn't discuss, is it would seem like big data takes on an even more important role, or data collection and really big data takes on a more important role as we move to cloud native app development and microservices. Is the idea of managing multiple, you know, tens or dozens of services something that you feel or have thought about the data space addressing?
Dave: I think so because that's part of our daily job in a way. We do have microservices, obviously. Well, not obviously, but like other companies do, and monitoring it and understanding where the pipeline is being at, it's really easy this way. I think that, first of all, we need to ... well, there's a big problem in understanding what big data is.
Brian: Right, yes, yes.
Dave: And I have the perfect definition, at least ...
Brian: I would love to hear it because I don't have one.
Dave: Big data can be one file with a lot of text in it. What makes it big data is querying and getting the results really fast.
Brian: Ah, okay. It's not about the collection of the data. It's about enabling fast access to that data.
Dave: Yeah. It can be a single file of, I don't know, one terabyte of data, but if you can't find quick results, I won't call it big data. MySQL can be a big data solution and, obviously, you have solutions like Athena Project and BigQuery. So everything is big data, as long as you can get results by querying really fast.
Brian: I'm gonna take that one away. I actually had not identified a standard definition of big data, so I appreciate you sharing that. So another question I have for you is you are actually speaking here at DevOps World | Jenkins World, right?
Brian: What's the name of your talk?
Dave: It was a panel about quantifying and measuring DevOps.
Brian: Were there any key takeaways coming out of that panel or any key observations, discussions or questions that you could share with us?
Dave: I think the most important stuff that we talked about was collect as much as you can.
Brian: Interesting. Okay.
Dave: Collect as much as you can because you won't know what your need will be. Today, you will make aggregations on the processes and the state of the stage, if it was green, if it was yellow. But tomorrow you will need to know, for example, how much time did it take to do the actual build or run a single test.
Dave: And upload it, and how much time did it take to upload it to Artifactory or whatever you would use, if it's JFrog or Nexus.
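The advice to collect fine-grained timings (the build, a single test, the upload to an artifact repository) can be sketched as a small timing wrapper that emits one record per stage. This is an illustrative assumption, not a Jenkins API: the `timed` helper, the label names, and the in-memory `records` list are hypothetical, and a real pipeline would ship the records to a metrics store instead of keeping them in a list.

```python
import json
import time
from contextlib import contextmanager

records = []  # stand-in for a metrics store or log shipper


@contextmanager
def timed(stage, **labels):
    """Record how long a pipeline stage took, along with its outcome and labels."""
    start = time.perf_counter()
    status = "ok"
    try:
        yield
    except Exception:
        status = "failed"
        raise
    finally:
        records.append(
            {
                "stage": stage,
                "status": status,
                "seconds": time.perf_counter() - start,
                **labels,
            }
        )


# Hypothetical stages; the sleeps stand in for the real build and upload work.
with timed("build", job="frontend"):
    time.sleep(0.01)
with timed("upload", job="frontend", repository="artifact-repo"):
    time.sleep(0.01)

# One JSON line per stage is easy to store now and aggregate later.
lines = [json.dumps(r) for r in records]
print("\n".join(lines))
```

Storing the raw per-stage records, rather than only a green or red status, is what leaves room for questions nobody has thought to ask yet.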
Brian: Right. So it's sort of a plan for the future.
Brian: Assume you're gonna need more than you're taking out. I assume that you've probably lived the experience. It's easier to ignore or prune a dataset than it is to try to make up for data that you never captured.
Dave: Exactly. You don't want to find yourself with data loss and trying to fill the gap because that's really hard to do and you need to do it in a really efficient way. So collect as much as you can and make it as robust as you can, and you will benefit.
Brian: Awesome. Thank you. I'm gonna prime. I mean to ask you what are your final thoughts, but I'm gonna ask you in answering that question, to also address what are any final thoughts, but also, what are comments on what the future of data and DevOps looks like. What will we see that we're not seeing today?
Dave: I think we will see constant improvement also in our work because it can track problems. It can track KPIs for your managers, for the VP of R&D, even for the product, because they will know what they can assume to be the release time of new features. It's a crucial part of your product and your clients will be impacted. So why not collect it? Why not understand what's going on under the hood? So you can release faster. You can do better work. That's our future, collecting and understanding, and solutions such as DevOptics will be the future of our user experiences as engineers.
Brian: Right. Yeah, I agree fully. I forgot the word I was gonna say. So it sounds, and I'm gonna try to summarize what you said, the future of data and DevOps is us embracing data and using data to continuously improve or using data to make ourselves better.
Dave: Yeah. It's funny in a way because you see developers doing that. You see us releasing products that do that. And we were kind of missing that idea in usage.
Brian: Internally you mean?
Dave: Yeah, internally. Why are you not using AI when it comes to DevOps as well? We will get there.
Brian: I'm really excited about that.
Dave: It's a matter of time.
Brian: Yeah. I'm excited about that concept where we're able to codify lessons that we learn, and learn lessons through machine learning and AI so that we are actually constantly scraping and interpreting the data that we're collecting, and automatically getting feedback as we move along. Any other final words for our listeners?
Dave: I think you should just enjoy and embrace the DevOps culture, and try to understand what's under the hood. It's actually really enjoyable to understand that you can do so much with it. You can make your, I don't know, Volkswagen Beetle from the '70s to be a Tesla machine, only because you understand and you can fix. You can even make the ... much better because you know what you need to improve.
Brian: Right. Awesome. Thank you. So there's Dave telling us to embrace the power of software development in DevOps. I appreciate your time and your insights into big data. Great talking with you.
Dave: Thank you very much.
Brian: Thank you.
Announcer: Like what you’ve heard today? Don’t miss out on our next episode. Subscribe to DevOps Radio on iTunes or visit our website at CloudBees.com. For more updates on DevOps Radio and industry buzz, follow CloudBees on Twitter, Facebook, and LinkedIn.