A/B Testing with Feature Flags

Written by: Kiley Nichols

The following is a guest blog post by Swaathi Kakarla.

I test in production.

You're probably wondering why I just said that; after all, no sane developer would test in production (or deploy on Fridays!). But did you know that a lot of organizations test in production? You’ve seen it, actually -- such as when a product gets updated to a dramatically new version and you’re invited to try out the new “beta.”

If you’re a Reddit user, you may have noticed this recently when it moved to its new UI and gave you the option of continuing on the old design or trying out the new one. Over time, the new design will become stable, and it will be the only way to access the UI (unless you dig deep into the settings panel).

Tons of products do this! Only a subset of users, however, get to experience the new feature. This subset is usually selected based on things like usage time, account type, or geography, and sometimes users are allowed to sign up for beta testing voluntarily. Letting users try out a new UI design or a new “beta” is actually a way for organizations to test in production using feature flags.

What are feature flags?

Feature flags allow you to act as a puppeteer for your application. You get to choose what features to turn on and off - just like a flag that signals racers to start. Feature flags are simple to create and yet very useful in testing.

To create a feature flag, you just take a conditional and wrap it around a feature; then you can toggle visibility either at runtime or in response to a user attribute. This is pretty straightforward when you’ve got one or two feature flags, but it can get overwhelming as your product matures.
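Here is a minimal sketch of that idea in Python. Everything in it (the FeatureFlag class, the new_ui flag, the account_type attribute, and render_home) is a hypothetical name chosen for illustration, not part of any particular product's API. It shows a conditional wrapped around a feature, with a runtime switch and a user-attribute rule.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical in-memory flag: a runtime switch plus an optional attribute rule."""
    name: str
    enabled: bool = False  # runtime on/off switch
    # Account types allowed to see the feature; an empty set means no restriction.
    allowed_account_types: set = field(default_factory=set)

    def is_enabled_for(self, user: dict) -> bool:
        if not self.enabled:
            return False
        if not self.allowed_account_types:
            return True
        return user.get("account_type") in self.allowed_account_types


# Assumed example flag: the new UI is on, but only for "beta" accounts.
new_ui = FeatureFlag("new_ui", enabled=True, allowed_account_types={"beta"})


def render_home(user: dict) -> str:
    # The conditional wrapped around the feature.
    if new_ui.is_enabled_for(user):
        return "new design"
    return "old design"
```

Setting new_ui.enabled to False at runtime hides the feature from everyone without touching render_home. A hand-rolled class like this works for one or two flags, which is exactly where it stops scaling.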

Feature flags are also very useful for sales, marketing, and design teams, and for product managers; for example, they can help them study how users respond to a feature in isolation. In order to toggle features, however, they will need to reach out to the dev team -- and that's never fun. Worse still, a developer has to manually go to the relevant section of the codebase and toggle the flag, which is neither safe nor scalable.

As feature flags pile up, you will need to manage their lifecycles. It will become increasingly important to properly retire flags, and you’ll also need to ensure that toggling one flag doesn’t affect another.

As your app and feature flags grow, it will be crucial for your business to adopt a more scalable solution for feature flag management. With CloudBees Feature Flags, organizations can manage feature flags with more efficiency and:

Set custom targeting rules
Gradually roll out and roll back
Perform multivariate testing & experiments
Extract audit logs
Drill down on feature analytics

To learn more about implementing feature flags and managing them in your product, read the 5 best practices for feature flags.

Testing with feature flags

In most organizations, testing happens in closed staging environments with synthesized data. This only allows you to see whether or not your feature is working; it really doesn't allow you to understand how users actually use your product.

A/B testing is a great way to figure out which features will work for your users, perform a staged rollout of big features, and iterate a feature quickly based on user feedback.

Performing an A/B test

So you want to do an A/B test with feature flags? Alright! Let's get to it.

Once you've decided which feature variations you want to test and set the appropriate flags for them, we can start measuring impact.

Step 1: Define user segments

First, you will need to decide which segment of your users will see the experimental features. Typically, you would select these users based on attributes such as longevity, geography, and account type. One popular division is the 90/10 split, where only 10% of your users see the beta features. If you want to hit the ground running and gather results more quickly, try the 50/50 split.
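One common way to implement a percentage split is to hash each user ID into a stable bucket, so the same user always lands in the same group. The sketch below assumes that approach; the function names, the experiment name, and the use of SHA-256 are illustrative choices, not something a specific tool prescribes.

```python
import hashlib


def bucket(user_id: str, experiment: str) -> int:
    """Map a user to a stable bucket in the range 0-99."""
    # Including the experiment name keeps buckets independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def variant(user_id: str, experiment: str, beta_percent: int = 10) -> str:
    """Return 'beta' for roughly beta_percent of users and 'control' for the rest."""
    return "beta" if bucket(user_id, experiment) < beta_percent else "control"


# A 90/10 split is the default; pass beta_percent=50 for a 50/50 split.
group = variant("user-42", "new_ui")
```

Because the assignment depends only on the user ID and the experiment name, no assignment table needs to be stored, and raising beta_percent later keeps existing beta users in the beta group.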

Step 2: Create goals

Creating goals gives you a framework for measuring the impact of the new feature. A goal could be something as simple as seeing if users spend more time on the product, or as complex as calculating whether that time has increased by a certain percentage. These questions can be answered by measuring standard metrics such as:

Number of page views
Duration of session
Series of buttons/workflows navigated
Bounce rates
Exit rates
Device types
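To make a goal such as "session duration increases by a certain percentage" concrete, here is a small sketch that compares one metric between the control group and the beta group. The function names and the sample durations are made up for illustration.

```python
from statistics import mean


def percent_change(control: list[float], beta: list[float]) -> float:
    """Percentage change in the mean of a metric, beta relative to control."""
    baseline = mean(control)
    return (mean(beta) - baseline) / baseline * 100


def goal_met(control: list[float], beta: list[float], target_percent: float) -> bool:
    """True if the beta group's mean improved by at least target_percent."""
    return percent_change(control, beta) >= target_percent


# Hypothetical session durations in seconds for each group.
control_sessions = [100, 200, 300]
beta_sessions = [220, 220, 220]
reached = goal_met(control_sessions, beta_sessions, target_percent=5)
```

The same comparison works for any numeric metric in the list above, such as page views per session. For rates where lower is better, like bounce rate, the goal would be a negative change instead.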

Step 3: Track goals

With CloudBees Feature Flags, you can view metrics on a timeline, measure progress against previous performance, and more! CloudBees Feature Flags also supports all three major platforms: browsers, servers, and mobile phones -- with API support in a variety of languages. It excels on mobile, where you can hot-swap code without having to jump through app store hurdles.

Step 4: Engage users

After deploying the feature flag and testing it with users, it's important to view the results of the experiment and act on them accordingly. Quality assurance teams (who are generally responsible for A/B testing) should share the results with customer service managers, solutions architects, and business users.

These groups have the most knowledge of how your customers use your product. They also have access to more user account data than QA engineers, which allows organizations to provide better support and build better features.

Step 5: Make changes

The goal of A/B testing is to provide insight into what works for your users with minimal risk. The data from these tests should be given to feature owners and developers so that they can make necessary changes. Once you've picked your winner, the feature flags used in the experiment should be safely removed. This ensures that stale flags do not affect future tests or performance.
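Before picking a winner, it helps to check that the difference between variants is unlikely to be noise. The article does not prescribe a method; the sketch below assumes a standard two-proportion z-test on conversion counts, and the numbers in the example are hypothetical.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No variation at all (everyone or no one converted): nothing to distinguish.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 100 of 1000 control users converted, 150 of 1000 beta users.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
is_significant = p < 0.05
```

A small p-value suggests the beta variant's higher conversion rate is not just chance. This test assumes reasonably large samples, so treat results from very small groups with caution.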

Benefits of A/B testing

A/B testing is a low-risk, high-reward construct for production testing. When implemented correctly, you can extract maximum value. Some benefits include:

Reduced bounce rates
Increased conversion rates
Higher value proposition
Reduced abandonment rates
Increased sales

Conclusion

A/B testing and feature flags go hand in hand. You will be able to gather user preferences and react to user feedback quickly while still delivering value. You can perform this test at any scale of the product; it doesn't require much data, but it is extremely useful. However, you should make sure to use a support tool that helps you monitor and manage feature flags, as they can quickly get out of hand.

If you want to know more about feature flagging, check the CloudBees Feature Flags website to sign up for a free trial of the product.

Swaathi Kakarla is the co-founder and CTO at Skcript. She enjoys talking and writing about code efficiency, performance and startups. In her free time, she finds solace in yoga, bicycling and contributing to open source.
