Honeycomb’s Charity Majors has an evangelical zeal for testing in production and throwing off old ways of thinking about observability.
“Systems used to be small enough that we could fit them in our puny human brains and reason about them,” Charity said during Episode 52 of DevOps Radio. “So we’ve been able to avoid changing the way we think about monitoring and metrics.”
Increasingly, with the adoption of DevOps, microservices and cloud computing, the way we think about observability is changing. The days of pushing a product into production and waiting for users to let you know it’s broken are long gone. Or should be.
“You know how many things can break before users notice?” Charity said. “A third of Amazon can go down, and you should never notice if you’ve done your job correctly and balanced across availability zones. Your job is to make it so that lots of things can break, humans can notice at their leisure, and remediate them before it ever gets to the point that users notice.”
Charity thinks “everyone” already tests in production, though not always intentionally. She has set her sights on a different goal: getting everyone to understand that all testing is done in production, and to plan for that.
The complexity of today’s world doesn’t allow for anyone to predict all the problems developers will encounter during the software development lifecycle. “This is just a fundamentally different world than the 20, 30-year-old system where the whole study of monitoring and metrics were where it comes from,” Charity noted.
Observability was originally defined by control theory as the ability to understand the inner workings of a system by asking questions from the outside, Charity explained during the podcast.
Because you need to be able to dynamically ask ad hoc questions of your data, you need to capture it at a level of granularity where nothing has been pre-aggregated, or even indexed, she said. Once you get that granularity, you can start getting real observability, which is critical in the world of stacks built with dozens or hundreds of microservices.
To support this, Honeycomb wrote its own storage engine and added compression to speed up queries. The company also shifted its perspective to looking at events instead of metrics.
Metrics are good for describing the health of the system or a component as a whole. But when software engineers are debugging the intersection of their code with that system, she said, the only thing they care about is: Can each request execute from start to finish, successfully?
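To make the difference between metrics and events concrete, here is a minimal Python sketch. It is not Honeycomb's storage engine or query API; the event fields (`endpoint`, `user_id`, `build`, `status`, `duration_ms`), the sample data and the function names are all hypothetical. The point is only that raw, un-aggregated events can be sliced by any field after the fact, while a single pre-aggregated number cannot.

```python
from collections import defaultdict

# Hypothetical raw request events: one wide record per request, nothing pre-aggregated.
EVENTS = [
    {"endpoint": "/cart", "user_id": "u1", "build": "v41", "status": 200, "duration_ms": 35},
    {"endpoint": "/cart", "user_id": "u2", "build": "v42", "status": 500, "duration_ms": 900},
    {"endpoint": "/checkout", "user_id": "u1", "build": "v41", "status": 200, "duration_ms": 50},
    {"endpoint": "/checkout", "user_id": "u3", "build": "v42", "status": 200, "duration_ms": 60},
    {"endpoint": "/cart", "user_id": "u3", "build": "v42", "status": 500, "duration_ms": 870},
    {"endpoint": "/search", "user_id": "u2", "build": "v41", "status": 200, "duration_ms": 20},
]


def overall_error_rate(events):
    """A metric-style view: one number for the whole system."""
    failed = sum(1 for e in events if e["status"] >= 500)
    return failed / len(events)


def error_rate_by(events, field, where=None):
    """An event-style view: group raw events by any field, chosen at query time.

    `where` is an optional predicate used to filter events before grouping.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for e in events:
        if where is not None and not where(e):
            continue
        totals[e[field]] += 1
        if e["status"] >= 500:
            failures[e[field]] += 1
    return {key: failures[key] / totals[key] for key in totals}


if __name__ == "__main__":
    print(overall_error_rate(EVENTS))
    print(error_rate_by(EVENTS, "build"))
    print(error_rate_by(EVENTS, "endpoint", where=lambda e: e["build"] == "v42"))
```

With this sample data, the overall error rate only says that something is failing. Grouping by `build`, and then by `endpoint` within the failing build, narrows the failures to `/cart` requests on `v42`, which is the kind of question that has to be asked of individual requests rather than of a system-wide average.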
With continuous delivery (CD) becoming more and more integrated into organizations trying to remain competitive, the lines are blurring around when testing occurs. The best way to test in production is with feature flags: developers can selectively test features with just their team or with some beta groups.
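As an illustration of how a feature flag can limit a new feature to a team or a beta group, here is a small Python sketch. It does not represent any particular feature-flag product; the flag layout (`allow_users`, `allow_groups`, `rollout_percent`), the flag name and the user IDs are assumptions made for the example. It uses a hash of the user ID so that each user gets a stable answer as the rollout percentage grows.

```python
import hashlib


def bucket(flag_name, user_id):
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag, user_id, groups=()):
    """Decide whether a flagged feature is on for this user.

    Order of checks (all hypothetical policy for this sketch):
    1. users listed explicitly, e.g. the developer's own team
    2. members of an allowed group, e.g. a beta group
    3. everyone else, by percentage rollout
    """
    if user_id in flag["allow_users"]:
        return True
    if any(group in flag["allow_groups"] for group in groups):
        return True
    return bucket(flag["name"], user_id) < flag["rollout_percent"]


# Hypothetical flag: on for one team member and the beta group, off for everyone else.
NEW_CHECKOUT = {
    "name": "new-checkout",
    "allow_users": {"dev-alice"},
    "allow_groups": {"beta"},
    "rollout_percent": 0,
}

if __name__ == "__main__":
    print(is_enabled(NEW_CHECKOUT, "dev-alice"))
    print(is_enabled(NEW_CHECKOUT, "user-17", groups=("beta",)))
    print(is_enabled(NEW_CHECKOUT, "user-99"))
```

Raising `rollout_percent` step by step exposes the feature to more users without a new deploy, and setting it back to 0 turns it off for everyone outside the allow lists.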
“I feel like the only way that you can know that you can gain confidence in what you’re doing is through observability,” she said. “This is key to continuous delivery, because if you don’t have this concept of partial baking and gradually gaining confidence, you’re in for a lot of bad surprises and rollbacks that just kind of grind the whole pipeline to a halt, right?”