5 warning signs: Does A/B testing lead to crappy products?


Above: Hollywood sequels follow from risk-averse design decisions – like the widely panned The Godfather Part III

The dangers of the metrics-driven design process
Many readers of this blog are expert practitioners of metrics-driven product development, and with this audience in mind, my post today is on the dangers of going overboard with analytics.

I think that this is an important topic because the metrics-driven philosophy has come to dominate the Facebook/OpenSocial ecosystem, with negative consequences. App developers have pursued short-term goals and easy money – leading to many copycat and uninspired products.

At the same time, it’s clear that A/B testing and metrics culture serve only to generate more data points – what you do with that data is up to you. Entrepreneurs still have to make smart decisions with that data to reach successful outcomes. (Thus, my answer to the title question is no: A/B testing does NOT lead to crappy products, but poor decision-making around the data absolutely can.)

So let’s talk about the dangers of being overly metrics-driven – here are five of the key issues that can come up:

  1. Risk-averse design
  2. Lack of cohesion
  3. Quitting too early
  4. Customer hitchhiking
  5. Metrics don’t replace strategy 

Let’s dive in deeper…

#1 Risk-averse design
The first big issue is that when you design for metrics, it’s easy to become risk-averse. Why try to create an innovative interaction when something proven like status/blogging/friends/profiles/forums/mafia/etc. already exists? By copying something, you’re more likely to converge quickly on a mediocre outcome, rather than spending a ton of effort potentially creating something bad – but of course this also rules out the amazing, ecstatic design outcomes. 

This risk-averse product design can lead to watered-down experiences that combine a mish-mash of features your audience has already seen elsewhere – and seen done better, too. So while it’s an efficient use of effort, it’s unlikely that your experience will ever be a great one. It’s a recipe for mediocrity. 

Risk aversion is responsible for a whole bunch of bad product decisions outside of the Internet industry as well: Why do Hollywood sequels get made, even though they are usually much worse than the original? Why do companies continually do “brand extensions” that dilute the value of their brand position? The reason is that it’s an efficient thing to do, and it’s pretty easy to make some money even if the end product is not that great. But it’ll hurt in the long run, since these products will inherently be mediocre rather than great.

In my opinion, the only way to avoid this is to never get lazy about design, and to always take the time to create innovative product experiences. Of course you’ll always have parts of your product which will borrow from the tried-and-true, yet I think it’s always important that the core of the experience is differentiated and compelling.

#2 Lack of cohesion
As hinted above, the next issue is that A/B-tested designs often create severe inconsistency within an experience. The bottoms-up design process that results from lots of split testing tends to produce many local optimizations, and those local optimizations can override global design principles.

Here’s a thought experiment to demonstrate this: Let’s say you tested every form input on your website, with different labels, fonts, sizes, buttons, etc. If you picked the best-performing candidate for each one, you’d likely end up with wildly different-looking forms across the site. Each form may perform better on its own, but together they make the experience inconsistent and confusing.
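To make the thought experiment concrete, here’s a minimal sketch of per-element bucketing (all the experiment names and variants are hypothetical, invented for illustration): because each element is assigned its variant independently, the “winning” combination differs from form to form, and nothing enforces a consistent look.

```python
import hashlib

# Hypothetical per-element experiments, each optimized in isolation.
EXPERIMENTS = {
    "signup_label":  ["Email address", "Your email", "E-mail"],
    "signup_font":   ["Helvetica", "Georgia", "Courier"],
    "signup_button": ["Submit", "Go!", "Sign me up"],
    "search_label":  ["Search", "Find", "Look up"],
    "search_button": ["Search", ">>", "Go"],
}

def variant(user_id: str, experiment: str) -> str:
    """Deterministically bucket a user into one variant of one experiment."""
    options = EXPERIMENTS[experiment]
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    return options[int(digest, 16) % len(options)]

# Because every element is tested independently, the signup form and the
# search form can ship with clashing labels, fonts, and button copy --
# each locally "best", globally incoherent.
for experiment in EXPERIMENTS:
    print(experiment, "->", variant("user-42", experiment))
```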

Ultimately, I think resolving this has to do with striking a balance between global design principles and local effects. One great way to do this is to split out the extremely critical parts of your product funnel to be locally optimized, and keep the rest of the experience the same. For a social gaming site, the viral loop and the transaction funnel should be optimized separately, whereas the core of the game experience should be very internally consistent.
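As a rough sketch of that split (the surface names here are made up for illustration), you could maintain an explicit allowlist of funnel steps that are permitted to run local experiments, while every other surface falls back to the global design system:

```python
from typing import Optional

# Hypothetical allowlist: only the critical funnel surfaces may be
# locally optimized; everything else uses the global design system.
OPTIMIZABLE_SURFACES = {"invite_friends", "checkout", "payment_confirm"}

GLOBAL_DESIGN = {"font": "Helvetica", "button_copy": "Continue"}

def design_for(surface: str, experiment_override: Optional[dict] = None) -> dict:
    """Return the design config for a surface. Experiment overrides are
    honored only on allowlisted funnel steps, so the core experience
    stays internally consistent."""
    if surface in OPTIMIZABLE_SURFACES and experiment_override:
        return {**GLOBAL_DESIGN, **experiment_override}
    return dict(GLOBAL_DESIGN)

# The viral loop is free to test aggressive variations...
print(design_for("invite_friends", {"button_copy": "Invite 5 friends!"}))
# ...but core game screens ignore experiment overrides entirely.
print(design_for("game_board", {"button_copy": "Click me!!"}))
```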

#3 Quitting too early
Another way to end up with uninspired products is to quit too early while iterating on an experience, because of early test data. When metrics are easy to collect on a new product feature, it’s often very tempting to launch a rough version and use the initial metrics to judge the success of the overall project. And unfortunately, when the numbers are negative, there can be a huge urge to quit early – a very human reaction to not wanting to waste time on something that’s perceived to be failing.

Sometimes a product requires features A, B, and C to work right, and if you’ve only built A, it’s hard to know how the entire experience will turn out. Maybe the overall data really is negative? Or maybe the rough version inspires dynamics that will go away once all the features are bundled together? Interim data is often just that – interim. People are great at extrapolating from data, but sometimes the right approach is to play out your hand, see where things go, and evaluate once the entire design process is complete.
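Here’s a toy simulation of why interim numbers mislead (pure illustration – the conversion rates are invented): a feature with a genuinely positive lift of one percentage point still looks negative in a large fraction of early peeks, and only stops looking negative once the sample gets big.

```python
import random

random.seed(7)

def observed_lift(n: int, p_control: float = 0.10, p_treatment: float = 0.11) -> float:
    """Simulated difference in conversion rate after n users per arm."""
    control = sum(random.random() < p_control for _ in range(n)) / n
    treatment = sum(random.random() < p_treatment for _ in range(n)) / n
    return treatment - control

# The feature is truly better (+1 point of conversion), yet small
# samples frequently show it losing.
TRIALS = 300
for n in (100, 1000, 10000):
    looks_negative = sum(observed_lift(n) < 0 for _ in range(TRIALS)) / TRIALS
    print(f"n={n:>6} users/arm: feature looks negative in {looks_negative:.0%} of peeks")
```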

#4 Customer hitchhiking
A colleague of mine once used the term “customer hitchhiking” to describe how easy it is to follow the customer wherever they want to go, rather than having an internal vision of where YOU want to go. This can happen whenever the data overrules internal discussion and resists interpretation, because it seems so uncompromising as hard evidence. The important thing to remember, of course, is that the analysis is only as good as the analyst, and it’s up to the entrepreneur to put the data into the context of strategy, design, team, and all the other perspectives within a business.

Today, I think you see a lot of this customer hitchhiking whenever companies string together a bunch of unrelated features just to please a target audience. It reminds me of what was often called the “portal strategy” of the late ’90s: just combine a bunch of stuff in one place, and off you go. The danger, of course, is that it leads to an incoherent user experience, a muddled company direction, and numerous other sources of confusion.

In the Facebook/OpenSocial ecosystem, of course, this manifests itself as companies that have many unrelated apps. You can dress this up as a “portfolio” or a “platform” but at the same time, it can be a recipe for crappy product experiences.

#5 Metrics don’t replace strategy
What do you think it would be like to write a novel, one sentence at a time, without thinking about the broader plot? I’m sure it’d be a terrible novel, and similarly, I bet that testing one feature at a time is likely to lead to a crappy product.

Ultimately, every startup needs to decide what they want to do when they grow up – this is a combination of entrepreneurial judgement, instinct, and strategy.

Every startup has to figure out how big its market is, deliver a compelling product, and build a powerful marketing strategy to get its services in front of millions of people. Without a long-term vision of how these things will happen, an excessive amount of A/B testing will surely lead to a tiny business.

To use a mountaineering analogy: metrics can be very helpful in getting you up the mountain once you’ve picked the right one – but how do you figure out whether you’re scaling the right peak in the first place? Analytics are unlikely to help you there.
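The analogy maps directly onto local optimization. As a hedged sketch (a toy landscape, not real data): greedy hill-climbing, which is roughly what incremental A/B testing amounts to, reliably tops out on whichever peak you start near, even when a far taller one exists.

```python
def height(x: float) -> float:
    """Toy landscape: a small peak near x=1 and a much taller one near x=6."""
    return max(0.0, 2 - (x - 1) ** 2) + max(0.0, 10 - (x - 6) ** 2)

def hill_climb(x: float, step: float = 0.1) -> float:
    """Greedy local search -- the A/B-testing analogue: keep any change
    that improves the metric, stop when nothing nearby does."""
    while True:
        best = max((x - step, x, x + step), key=height)
        if best == x:
            return x
        x = best

# Starting near the small peak, incremental optimization polishes the
# wrong mountain; strategy is what picks the starting point.
for start in (0.5, 5.0):
    top = hill_climb(start)
    print(f"start={start}: climbed to x={top:.1f}, height={height(top):.2f}")
```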

Conclusions
My point in all this: nothing is ever a silver bullet, and as much as I am an evangelist for metrics-driven approaches to startup building, I’m also very aware of their shortcomings. In general, these tools are great for optimizing specific, local outcomes, but they need to be combined with a larger framework to reach big successes.

Ultimately, quantitative metrics are just another piece of data that can be used to guide decision-making for product design – you have to combine this with all the other bits of information to get it right.

Agree or disagree? Have more examples? Leave me a comment! 

Want more?
If you liked this post, please subscribe or follow me on Twitter.

