How to use A/B testing for better product design at andrewchen

There’s more than one way to use this tool
A/B testing is useful for more than evaluating landing pages: it can also help you develop better product designs.

In a classic A/B test, you’re metrics-driven: you pick whichever variant ends up with the higher numbers. That approach is useful, but it only applies to scenarios like signup flows, where the conversion is obvious. This post describes some different tactics that are metrics-informed, and that aid your product design process rather than drive it.

The tactics I’ll describe are for:

Updating your product without negatively impacting numbers
Streamlining your product by measuring and removing unused features
Designing for the right level of prominence

Let’s get started…

Updating your product without negatively impacting numbers
Product teams are constantly pushing small updates to their products in response to customers and to what’s happening in the market. When an update affects a key part of the product, particularly the main signup flow or the core viral loop, it’s often important to ensure that it doesn’t hurt the numbers.

For example, let’s say you’re building a new social site and you have a Facebook-integrated “friend finder” option that you want to add. If you build this and test it, you’ll likely find that since it’s unoptimized, it’ll have worse initial numbers. A classic A/B test will often eliminate the new design because it performs worse. But instead of killing it prematurely, you can use an A/B test to iteratively “bake” the new design with a small % of users until it’s ready to replace the old one.

If you know that it’s important to have this type of Facebook integration in your product design, what you do is leave it in, but only expose 10% of your users to it. Then keep making small updates to the design, working on the copy, call to action, and other aspects, until the new design performs as well as the original.
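
One way to hold exposure at a steady 10% is to bucket users by a hash of their ID, so the same person always sees the same design across visits. Here is a minimal Python sketch of that idea. The function names, the experiment key, and the design labels are all made up for illustration and don’t come from any particular experimentation framework.

```python
import hashlib


def in_rollout(user_id: str, experiment: str, percent: float) -> bool:
    """Return True if this user falls inside the exposed slice of an experiment."""
    # Hash the experiment name together with the user ID so that each
    # experiment gets its own independent split of the user base.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits (32 bits) to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < percent / 100.0


def pick_design(user_id: str) -> str:
    """Choose which design to render (hypothetical experiment key and labels)."""
    if in_rollout(user_id, "fb_friend_finder", percent=10):
        return "friend_finder"
    return "original"


if __name__ == "__main__":
    exposed = sum(pick_design(f"user-{i}") == "friend_finder" for i in range(10_000))
    print(f"share exposed to the new design: {exposed / 10_000:.3f}")
```

Because the assignment depends only on the ID and the experiment key, you can keep iterating on the copy and call to action without reshuffling who is in the 10%.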

In this way, you can update your product without impacting the numbers negatively. Unlike a classic A/B test, where you aim to just pick a winner, here you use the test to incrementally benchmark a new design until it’s ready to replace the existing one. You are design-led because you know you want to execute this product in a particular way, but you use the A/B test as a safety net to make sure you don’t push out something that’s not ready.
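
“Performs as well as the original” needs some definition before you can act on it. One possible reading, which is my assumption rather than something prescribed here, is a non-inferiority check: the new design is ready once you’re reasonably confident its conversion rate is not worse than the original’s by more than a small margin. The sketch below uses a normal approximation for the difference of two conversion rates. The 1-point margin and the one-sided 95% critical value are illustrative defaults, and both sample sizes are assumed to be positive.

```python
import math


def lower_bound_of_lift(control_conv: int, control_n: int,
                        variant_conv: int, variant_n: int,
                        z_crit: float = 1.645) -> float:
    """One-sided lower confidence bound on (variant rate - control rate)."""
    p_c = control_conv / control_n
    p_v = variant_conv / variant_n
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_c * (1 - p_c) / control_n + p_v * (1 - p_v) / variant_n)
    return (p_v - p_c) - z_crit * se


def is_ready_to_replace(control_conv: int, control_n: int,
                        variant_conv: int, variant_n: int,
                        margin: float = 0.01) -> bool:
    """True when the variant is plausibly no more than `margin` worse."""
    lower = lower_bound_of_lift(control_conv, control_n, variant_conv, variant_n)
    return lower > -margin


if __name__ == "__main__":
    # Hypothetical counts: 90% of traffic on the original, 10% on the new design.
    print(is_ready_to_replace(9000, 90000, 1000, 10000))
```

Note that the 10% slice is the smaller sample, so it dominates the uncertainty. A new design can match the original on paper and still fail this check until enough users have passed through it.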

Streamlining your product by measuring feature usage
There’s an important design principle that says, “Do less, but better.” I’ll elaborate on my POV on this philosophy in a future post. In the meantime, many product teams struggle to remove features, or even to quantify which features go unused.

For example, you might have a legacy feature that suggests people to follow on your social site, which you’d like to replace with a Facebook-based “friend finder” screen. It can be difficult to remove the navigation to something like this because it’s not clear how many people really use it or how it affects their overall behavior, especially for new users.

A nifty way to handle this is to run an A/B test that removes the feature for one group, which gets you the following information back:

How many people actually get exposed to this feature? (Based on what % of people get added into the experiment versus your active users during the test’s time period)
Which metrics change for people who have this feature removed? (As long as the metrics are neutral to positive, you can remove it safely)
If some metrics are bad, can you counteract it by adding something else to the new design?

As with updating your product, the important notion here is that you have a particular action you want to take at the design level (simplify the UX), and you use the A/B test as a tool to aid that design goal. In this case, rather than going with whatever has better metrics, the goal is to go with the better design as long as it’s neutral or better on the numbers.
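
The readout from a removal test like this can be summarized in a few lines. The Python sketch below is only an illustration: the metric names and numbers are invented, it assumes higher is better for every metric, and it compares raw averages without checking whether a difference is just noise.

```python
from typing import Dict


def exposure_rate(users_in_experiment: int, active_users: int) -> float:
    """Share of active users who reached the feature during the test period."""
    return users_in_experiment / active_users


def metric_deltas(with_feature: Dict[str, float],
                  without_feature: Dict[str, float]) -> Dict[str, float]:
    """Per-metric change for the group that had the feature removed."""
    return {name: without_feature[name] - with_feature[name] for name in with_feature}


def safe_to_remove(deltas: Dict[str, float], tolerance: float = 0.0) -> bool:
    """True when every metric is neutral to positive (within `tolerance`)."""
    return all(delta >= -tolerance for delta in deltas.values())


if __name__ == "__main__":
    # Invented numbers, purely for illustration.
    print(exposure_rate(users_in_experiment=1200, active_users=20000))
    deltas = metric_deltas(
        with_feature={"d7_retention": 0.30, "follows_per_user": 2.0},
        without_feature={"d7_retention": 0.31, "follows_per_user": 1.5},
    )
    print(deltas, safe_to_remove(deltas))
```

When `safe_to_remove` comes back false, the negative entries in `deltas` tell you which metrics the replacement design has to win back.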

Designing for the right level of prominence
As you model out the key metrics for your product, there are often important assumptions to make about things like what % of your users invite their friends, or how many friends they invite. Oftentimes, entire product strategies hinge on hitting certain metrics: it could mean the difference between being a viral eyeballs business and one based on lifetime value and ad spend.

From a product standpoint, this manifests itself as trying to figure out how prominent to make things like “Invite friends” or “Import your addressbook” or “Subscribe to the Pro version.” To build a great UX, you often want to make something as low-prominence as possible while still making sure it’s easy and accessible for users.

A/B testing can help a lot here, since you can test multiple levels of prominence and see where each one takes you. If you want to prove that a model is even possible (for example, in the very best case, could we get 20% of our users to invite their friends?), you can make a popup that constantly asks for friend invites and see if you even get close. The point isn’t that you would ever actually end the experiment by shipping the obnoxious popup. Rather, it helps you do a sensitivity analysis of what might even be possible, to see what the realistic values are within your model.
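
As a rough sketch of that sensitivity analysis, you can line up the invite rate for each prominence level and compare the best case against what your model requires. The variant names and counts below are invented for illustration; only the 20% target comes from the example above.

```python
from typing import Dict, Tuple


def invite_rates(results: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map each variant to its invite rate, given (inviters, users) counts."""
    return {variant: inviters / users for variant, (inviters, users) in results.items()}


def best_case_rate(rates: Dict[str, float]) -> float:
    """The ceiling observed across all prominence levels."""
    return max(rates.values())


def model_is_plausible(rates: Dict[str, float], required_rate: float) -> bool:
    """True if even one variant reaches the rate the model depends on."""
    return best_case_rate(rates) >= required_rate


if __name__ == "__main__":
    # Invented counts for three hypothetical prominence levels.
    rates = invite_rates({
        "footer_link": (30, 1000),
        "sidebar_module": (80, 1000),
        "constant_popup": (150, 1000),
    })
    for variant, rate in rates.items():
        print(f"{variant}: {rate:.1%}")
    print("20% invite rate reachable:", model_is_plausible(rates, required_rate=0.20))
```

If the most aggressive variant still falls short of the target, the assumption in the model is what needs to change, not the design.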

You can use this technique hand-in-hand with the other ones listed above: start with a high-prominence version of a feature and iterate until it’s acceptable to show to 100% of your users.

Final thoughts
The thing that all of these ideas share is that you are using A/B testing as a tool to aid a broader and stronger design POV, rather than slavishly following whatever has the better metrics outcome. As others have discussed before, it’s the difference between being data-informed and being data-driven. Many features you’ll want to build into your product have lots of qualitative value, even if the short-term quantitative benefits are difficult to measure or not there at all. Using these advanced tactics lets you continue to push out dramatic new designs without hurting the metrics your business depends on.
