Andrew Chen Archives


5 warning signs: Does A/B testing lead to crappy products?

Above: Hollywood sequels follow from risk-averse design decisions, like the widely panned Godfather Part 3

The dangers of the metrics-driven design process
Many readers of this blog are expert practitioners of metrics-driven product development, and with this audience in mind, my post today is on the dangers of going overboard with analytics.

I think that this is an important topic because the metrics-driven philosophy has come to dominate the Facebook/OpenSocial ecosystem, with negative consequences. App developers have pursued short-term goals and easy money – leading to many copycat and uninspired products.

At the same time, it’s clear that A/B testing and metrics culture serve only to generate more data points, and what you do with that data is up to you. Entrepreneurs must still make smart decisions to reach successful outcomes. (Thus, my answer to the title question is that no, A/B testing does NOT lead to crappy products, but poor decision-making around data absolutely can.)

So let’s talk about the dangers of being overly metrics-driven – here are five of the key issues that can come up:

  1. Risk-averse design
  2. Lack of cohesion
  3. Quitting too early
  4. Customer hitchhiking
  5. Metrics don’t replace strategy

Let’s dive in deeper…

#1 Risk-averse design
The first big issue is that when you design for metrics, it’s easy to become risk-averse. Why try to create an innovative interaction when something proven like status/blogging/friends/profiles/forums/mafia/etc already exists? By copying something, you’re more likely to converge quickly on a mediocre outcome, rather than spending a ton of effort and potentially creating something bad – but of course this also eliminates amazing, ecstatic design outcomes.

This risk-averse product design can lead to watered down experiences that combine a mish-mash of features that your audience has already seen elsewhere, and done better too. So while it’s an efficient use of effort, it’s unlikely that your experience will ever be a great one. It’s a recipe for mediocrity. 

Risk aversion is responsible for a whole bunch of bad product decisions outside of the Internet industry as well: Why do Hollywood sequels get made, even though they are usually much worse than the original? Why do companies continually do “brand extensions” that dilute the value of their brand position? The reason is that it’s an efficient thing to do, and it’s pretty easy to make some money even if the end product is not that great. But it’ll hurt in the long run, since these products will inherently be mediocre rather than great.

In my opinion, the only way to avoid this is to never get lazy about design, and to always take the time to create innovative product experiences. Of course you’ll always have parts of your product which will borrow from the tried-and-true, yet I think it’s always important that the core of the experience is differentiated and compelling.

#2 Lack of cohesion
As hinted above, the next issue is that A/B tested designs often create severe inconsistency within an experience. The bottom-up design process that results from lots of split testing is likely to come up with many local effects, which may overrule global design principles.

Here’s a thought experiment to demonstrate this: Let’s say you tested every form input on your website, with different labels, fonts, sizes, buttons, etc. You’re likely, if you picked the best-performing candidate for each one, to have wildly different-looking forms across the site. While each form may perform better on its own, the result also makes the experience inconsistent and confusing.
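To make the thought experiment concrete, here is a minimal sketch in Python. The form names, style names, and conversion rates are all made up for illustration, and both helper functions are hypothetical. It picks the winner for each form independently, then compares that with choosing the single style that does best across the whole site.

```python
# Hypothetical conversion rates per form and style variant (made-up numbers).
rates = {
    "signup":   {"style_a": 0.110, "style_b": 0.120, "style_c": 0.100},
    "checkout": {"style_a": 0.050, "style_b": 0.045, "style_c": 0.055},
    "invite":   {"style_a": 0.210, "style_b": 0.200, "style_c": 0.205},
}


def local_winners(rates):
    """Pick the best-performing style for each form, ignoring all other forms."""
    return {form: max(variants, key=variants.get) for form, variants in rates.items()}


def best_single_style(rates):
    """Pick the one style with the highest total conversion rate across all forms."""
    styles = next(iter(rates.values())).keys()
    return max(styles, key=lambda style: sum(form[style] for form in rates.values()))


winners = local_winners(rates)
distinct_styles = len(set(winners.values()))

print(winners)                   # each form gets its own winning style
print(distinct_styles)           # how many different styles the site ends up with
print(best_single_style(rates))  # the one consistent style that does best overall
```

With these invented numbers, every form ends up with a different style, even though a single consistent style gives up very little on any one form. Whether that trade is worth it is a design judgment; the test data won't make it for you.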

Ultimately, I think resolving this has to do with striking a balance between global design principles and local effects. One great way to do this is to split out the extremely critical parts of your product funnel to be locally optimized, and keep the rest of the experience the same. For a social gaming site, the viral loop and the transaction funnel should be optimized separately, whereas the core of the game experience should be very internally consistent.

#3 Quitting too early
Another way to get to uninspired products is to quit too early while iterating an experience because of early test data. When metrics are easy to collect on a new product feature, it’s often very tempting to launch a very rough feature and use the initial metrics to judge the success of the overall project. And unfortunately, when the numbers are negative, there can be a huge urge to quit early – this is a very human reaction to wanting to not waste a bunch of time on something that’s perceived to fail.

Sometimes a product requires features A, B, and C to work right, and if you’ve only done A, it’s hard to figure out how the entire experience will work out. Maybe the overall data is negative? Or maybe it inspires dynamics that go away once all the features are bundled? Interim data is often just that – interim. People are great at extrapolating from data, but sometimes the right approach is just to play out your hand, see where things go, and evaluate once the entire design process has completed.

#4 Customer hitchhiking
A colleague of mine once used the term “customer hitchhiking” to describe how it’s easy to follow the customer on whatever they want to do, rather than having an internal vision of where YOU want to go. This can happen whenever the data overrules internal discussion and resists interpretation, because it’s so uncompromising as hard evidence. The important thing to remember, of course, is that the analysis is only as good as the analyst, and it’s up to the entrepreneur to put the data into the context of strategy, design, team, and all the other perspectives that occur within a business.

Today, I think you see a lot of this customer hitchhiking whenever companies string together a bunch of unrelated features just to please a target audience. This reminds me of what is often called the “portal strategy” of the late 90s. Just combine a bunch of stuff in one place, and off you go. The danger of that, of course, is that it leads to an incoherent user experience, a muddled company direction, and numerous other sources of confusion.

In the Facebook/OpenSocial ecosystem, of course, this manifests itself as companies that have many unrelated apps. You can dress this up as a “portfolio” or a “platform” but at the same time, it can be a recipe for crappy product experiences.

#5 Metrics don’t replace strategy
What do you think it would be like to write a novel, one sentence at a time, without thinking about the broader plot? I’m sure it’d be a terrible novel, and similarly, I bet that testing one feature at a time is likely to lead to a crappy product.

Ultimately, every startup needs to decide what they want to do when they grow up – this is a combination of entrepreneurial judgement, instinct, and strategy.

Every startup has to figure out how big the market is, deliver a compelling product, and build a powerful marketing strategy to get its services in front of millions of people. Without a long-term vision of how these things will happen, an excessive amount of A/B testing will surely lead to a tiny business.

To use a mountaineering analogy: Metrics can be very helpful in helping you scale the mountain once you’re on the right one – but how do you figure out whether you’re scaling the right peak? Analytics are unlikely to help you there.
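The analogy can be sketched as a toy hill-climbing search in Python. This is only an illustration: the `height` landscape and the `hill_climb` function are hypothetical, and real product metrics are far messier than a curve with two peaks. The climber does what a purely incremental test-and-keep-the-winner process does: it takes whichever neighboring step scores higher and stops when neither does.

```python
def height(x):
    """A made-up landscape: a small peak at x=2 (height 4), a tall one at x=10 (height 25)."""
    return max(4 - (x - 2) ** 2, 25 - (x - 10) ** 2)


def hill_climb(height, start, step=1):
    """Greedy local search: move to a strictly higher neighbor until none exists."""
    x = start
    while True:
        # Listing x first means ties keep the current position.
        best = max((x, x - step, x + step), key=height)
        if best == x:
            return x
        x = best


print(hill_climb(height, start=0))  # starts near the small peak
print(hill_climb(height, start=7))  # starts on the slope of the tall peak
```

Starting at 0, the climber stops on the small peak at x=2 and never finds the taller one at x=10. Starting at 7, it reaches x=10. The step-by-step measurements are identical in both runs; the only thing that differs is which mountain you picked to begin with.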

My point on this: nothing is ever a silver bullet, and as much as I am an evangelist for metrics-driven approaches to startup building, I’m also very aware of the shortcomings. In general, these tools are great for optimizing specific, local outcomes, but they need to be combined with a larger framework to reach big successes.

Ultimately, quantitative metrics are just another piece of data that can be used to guide decision-making for product design – you have to combine this with all the other bits of information to get it right.

Agree or disagree? Have more examples? Leave me a comment! 

