How to Build the Process and Culture Behind Using Feature Flags at Scale

Written by: Kara Phelps

Feature flags are a great way to release features quickly with very low risk—they allow software teams to make changes without re-deploying code. They have the power to make an organization’s DevOps practices more efficient, enabling testing in production. They can help developers, operations, QA, product, customer success, sales and marketing teams deliver higher-quality features, faster.

But like many powerful tools, feature flags need to be used with care. When an organization adopts feature flags, it needs to simultaneously adopt a set of best practices for using them effectively and safely. This article goes beyond a technical “how-to” guide for implementing feature flags, and into the realm of process and culture. Sure, you can start using them today as an individual or a small team, but to truly realize the benefits of feature flags, the entire organization needs to embrace them—and without the necessary process and cultural shifts, you can accumulate a very large load of technical debt very quickly. At best, you’ll end up with bloated code, and at worst, the bloated code can lead to catastrophic events.

So let’s dive into the “soft side” of feature flags. How can you help your organization understand and embrace feature flags, and then “rinse and repeat”? We’ve gathered some ideas that might help you ease the transition to feature flagging on a process and cultural level.

1. Establish Naming Conventions for Your Flags

You should already have a style guide for writing code. Decide on a system for naming feature flags and update your style guide with these conventions. It’s helpful to be able to point to this guide as you standardize the new practices. It will also help familiarize new employees with your practices as they go through your onboarding process.

We recommend using a namespace, a label and a description to identify your flags. The namespace should briefly describe how a flag is being used. (This is the largest bucket by which to group flags—you can filter down further from there.) The label associates a flag with one or more teams who are using it. The description includes other important details like the flag creation date and any dependencies.
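As a rough illustration, the namespace, label, and description scheme could be captured in a small record with a validation step. This is only a sketch: the class name, the field names, the lowercase snake_case rule, and the "namespace.name" key format are assumptions made for the example, not a convention from any particular tool.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

# Hypothetical rule: lowercase words separated by underscores.
_NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass(frozen=True)
class FlagDefinition:
    namespace: str                      # how the flag is used, e.g. "release"
    name: str                           # short identifier within the namespace
    labels: Tuple[str, ...]             # teams that use the flag
    description: str                    # free-text details about the flag
    created: date                       # flag creation date
    depends_on: Tuple[str, ...] = ()    # keys of flags this one depends on

    @property
    def key(self) -> str:
        """Full flag key, assumed here to be 'namespace.name'."""
        return f"{self.namespace}.{self.name}"


def validate(flag: FlagDefinition) -> List[str]:
    """Return a list of convention violations (empty if the flag is fine)."""
    problems = []
    if not _NAME_PATTERN.match(flag.namespace):
        problems.append("namespace must be lowercase snake_case")
    if not _NAME_PATTERN.match(flag.name):
        problems.append("name must be lowercase snake_case")
    if not flag.labels:
        problems.append("at least one team label is required")
    if not flag.description.strip():
        problems.append("description is required")
    return problems
```

A check like this could run in code review or CI so that flags which break the style guide are caught before they are merged.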

Deciding on a naming convention early and adopting it across teams will help you avoid the need to manage built-up technical debt in the future. (The UI in tools like CloudBees Feature Management enables the creation of namespaces, multiple labels, and descriptions.)

2. Review Flags Early and Often

Set up a process to review flags on a regular basis, as well as a process to clean up flags once it’s determined they need to be removed. Integration with tools like Jira can help monitor the lifespan of each flag by associating a ticket with it. When you’re notified that a flag has reached the end of its usefulness and needs to be retired, you can refer back to the flag’s namespace, labels, and description to establish the correct removal process. This is why naming conventions are so important: without them, you’d have a more difficult time establishing dependencies and delegating the flag’s removal to the appropriate teams. The flag would most likely just sit there indefinitely, fully rolled out to 100% of users and outliving its intended purpose. If you clean up flags as soon as it’s clear they can be retired, however, you won’t have as much code to test in the future.
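A regular review can be partly automated. The sketch below lists flags that have sat fully on or fully off for a while, which are the usual candidates for removal. The record fields, the 30-day threshold, and the ticket field are all assumptions for illustration; a real review would pull this data from your flag management tool and issue tracker.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class FlagStatus:
    key: str
    rollout_percent: int            # 0 to 100
    last_changed: date              # last time the flag's state was modified
    ticket: Optional[str] = None    # hypothetical issue-tracker ID


def find_cleanup_candidates(
    flags: List[FlagStatus], today: date, max_idle_days: int = 30
) -> List[FlagStatus]:
    """Return flags that are fully on or fully off and unchanged for a while."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        flag
        for flag in flags
        if flag.rollout_percent in (0, 100) and flag.last_changed <= cutoff
    ]
```

Each candidate still needs a human decision, since a flag left at 0% or 100% may be an intentional long-lived kill switch rather than a forgotten release flag.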

It’s also important to establish feedback and revision processes for features being tested in production. Technical and non-technical teams alike should have visibility into these processes. You can keep teams in sync with a centralized dashboard where all feature flags can be seen and changed independently from the code base, although permissions should be carefully controlled. 

Finally, determine who is ultimately responsible for releasing features—is it release managers? Developers? Engineering managers? Product managers? Setting up a chain of command helps clarify what different roles need to do in different scenarios.

3. Build a Culture of Feature Experimentation

At the organizational level, your focus needs to shift from delivering releases to delivering features. The distinction may seem subtle, but it is important. It’s ultimately about prioritizing customer experience. Rather than optimizing for target environments, you start optimizing for users. This opens you up to new possibilities for feature experimentation and feature value measurement. 

With feature flags, features can be rolled out or rolled back independently. Non-technical stakeholders like product or customer success teams can make those decisions, and are empowered to turn features “on” or “off” themselves. Engineering teams can build features in advance and hand off to other teams to roll out whenever they choose. Everyone can be more flexible with experimentation—whether it’s product managers measuring the value of features, engineers monitoring integrations, or any other team experimenting with any aspects of the feature they own. Embracing experimentation is a nice bonus that feature flags make possible.

On the business side, feature flags also impact value stream management (VSM), or the management of an organization’s flow of value from the idea stage all the way through its arrival in customers’ hands. VSM breaks down silos to consider the business as a whole, and urges teams to consider themselves part of a larger company mission. Feature flags help with this—they unite cross-functional teams to deliver meaningful value (i.e., new features) to customers more efficiently.

4. Get Comfortable with Releasing “Incomplete” Code to Production

Feature flags allow you to roll out features to a small, pre-defined audience first (such as internal QA teams). After a feature's successful and safe delivery to a controlled group, teams can incrementally release to larger audiences and roll back at any time. This is not a standard canary release with CD tooling: the code is already deployed, and you gradually “turn on” new features rather than doing staged deployments. You also don’t need feature branches when you use feature flags. (You can also use feature toggles to conduct an A/B test and learn which version users prefer.)
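One common way to implement this kind of incremental rollout is to hash each user into a stable bucket and compare the bucket to the current rollout percentage. The following is a minimal sketch of that idea, not the mechanism of any specific product; the function names and the allowlist parameter are hypothetical.

```python
import hashlib
from typing import FrozenSet


def bucket(flag_key: str, user_id: str) -> int:
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(
    flag_key: str,
    user_id: str,
    rollout_percent: int,
    allowlist: FrozenSet[str] = frozenset(),
) -> bool:
    """Enable the flag for allowlisted users, then for a percentage of the rest."""
    if user_id in allowlist:  # e.g. internal QA accounts
        return True
    return bucket(flag_key, user_id) < rollout_percent
```

Because a user's bucket never changes, raising the percentage only adds users, and lowering it back to 0 rolls the feature back for everyone outside the allowlist without a new deployment.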

This means that “merge hell” can become a thing of the past. The conflicts that build up in long-lived branches have far less chance to arise because you’re merging and releasing constantly. This also means, however, that you and your teams will need to get comfortable releasing code to production that you’d previously have considered “incomplete.” If you’re releasing constantly, at some point you will probably need to merge code containing incomplete work. You can wrap that incomplete work in a feature flag that is turned off by default, then merge normally. (Safeguards such as permissions and approval flows can ensure that users can’t access incomplete code until you want them to.) Feature flags can help you switch context more efficiently without risking your codebase.
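Wrapping incomplete work in a flag that defaults to off might look like the sketch below. The flag key, the pricing functions, and the plain dictionary used as a flag store are all made up for the example; the point is only that a missing or unset flag falls through to the existing behavior.

```python
from typing import List, Mapping


def flag_is_on(flags: Mapping[str, bool], key: str) -> bool:
    """Treat any flag that is missing from the store as off."""
    return flags.get(key, False)


def legacy_total(prices_in_cents: List[int]) -> int:
    """Existing, fully tested behavior."""
    return sum(prices_in_cents)


def new_total(prices_in_cents: List[int]) -> int:
    """Work in progress: applies a flat 10% discount for now."""
    return sum(prices_in_cents) * 90 // 100


def checkout_total(prices_in_cents: List[int], flags: Mapping[str, bool]) -> int:
    """Use the new pricing path only when the flag is explicitly on."""
    if flag_is_on(flags, "release.new_pricing"):
        return new_total(prices_in_cents)
    return legacy_total(prices_in_cents)
```

With an empty flag store, checkout_total behaves exactly as it did before the merge, so the unfinished code can ship without being reachable.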

A Note About Software Testing

Feature flags do create multiple permutations of code, making testing more complex than it would have been in a feature branch. You can’t know whether a flag will be turned on or off in production, for example, so it makes sense to test both possible states. Introducing a single feature flag theoretically doubles the amount of testing that needs to be done—and if you had to test each state on every single feature flag you use, you’d quickly end up with an astronomical amount of work.

Fortunately, the situation isn’t all that bad. You don’t need to test every possible variation of code. Many feature flags will never interact with each other, and most releases will only involve changing the properties of a single feature flag. Your first priority is to test the configuration that you expect to go live in production, followed by your fallback configuration. 
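To make the prioritization concrete: with n independent on/off flags there are 2^n possible combinations, but the two that matter most are the configuration you expect to go live and the fallback. The sketch below checks only those two. The flag keys and the page_sections function are hypothetical stand-ins for real application code.

```python
from typing import Dict, List

# Configuration expected to go live in production (assumed for this example).
EXPECTED_LIVE: Dict[str, bool] = {
    "release.new_search": True,
    "release.new_checkout": False,
}

# Fallback configuration: every flag off.
FALLBACK: Dict[str, bool] = {key: False for key in EXPECTED_LIVE}


def page_sections(flags: Dict[str, bool]) -> List[str]:
    """Choose which version of each page section to render."""
    search = "search_v2" if flags.get("release.new_search", False) else "search_v1"
    checkout = (
        "checkout_v2" if flags.get("release.new_checkout", False) else "checkout_v1"
    )
    return [search, checkout]


def check_priority_configurations() -> int:
    """Check the live and fallback configurations; return how many were checked."""
    cases = [
        (EXPECTED_LIVE, ["search_v2", "checkout_v1"]),
        (FALLBACK, ["search_v1", "checkout_v1"]),
    ]
    for flags, expected in cases:
        assert page_sections(flags) == expected
    return len(cases)
```

Here two flags give four possible combinations, and the check covers the two that are expected to reach users. Flags known to interact with each other would deserve additional cases.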

Feature flags also make testing in production much safer. “Testing in production” used to be a joke about the compressed testing schedule that seemed inevitable during the reign of waterfall development, but it is no longer a ridiculous idea. Production environments are increasingly difficult to replicate in testing, and it’s impossible to anticipate every edge case with test data. Testing in production can give you an advantage when it comes to catching bugs early and maintaining your customers’ trust. It can help you get more accurate data in performance and load testing, and it allows you to A/B test to learn more about user behavior and preferences.

Of course, QA teams still need to start testing as early in the software development lifecycle as possible. You may be technically “testing in production,” but only with tightly controlled groups of users, safely gated by feature flags. “Testing in production” can never completely replace other testing environments, either. Consider it another tool to improve your QA strategy even further. If you have the ability to throw a kill switch and easily reintroduce a previous version at any moment (as feature flags allow you to do), it’s okay to release imperfect code to production—your goal is to perfect it with the help of real data. 

5. Understand that Production Is No Longer a Single Source of Truth

Along similar lines, production can no longer be considered a single source of truth. If your piece of code is in production, you can’t assume that it’s been fully vetted. In fact, if it’s behind a feature flag and your organization is practicing good flag hygiene, your code is probably still being tested in some way. This is a good place to re-emphasize the importance of setting up and maintaining strong, clear naming conventions and review processes—you will need to rely on them. If you’re looking at code wrapped in a feature flag, the current state of the flag will tell you which version of code to trust.

Feature flags create more code paths and increase code complexity, which is why it’s essential to manage them properly. Feature flag management tools can add guardrails and take over the enforcement of some key processes, which may prove helpful as you scale your usage of feature flags.

Conclusion

Shifts in process and culture lay the groundwork for accessing the potential that feature flags offer. Technical and non-technical teams need visibility into how feature flags are used across your organization, and these adjustments help you create the right conditions. Feature flags have the power to improve the speed and quality of your software delivery, and they deserve a strong foundation.
