Metrics, Cowardice, and Mistrust

why we need better ideologies than "data driven decision making"

Oct 17, 2023

A puzzle about the Tyranny of the Marginal User is that companies optimizing for simple engagement metrics aren’t even being economically rational. They could make more money if they optimized for something more complex that accounts for some users being more valuable than others, for goodwill, for ecosystem effects, etcetera. So why don’t they?

You might think it’s because these more complex things are hard to measure. I don’t buy it. Measuring daily active users takes a single database query that costs pennies to run. You’re using 0.0001% of your compute to define a metric that you use the remaining 99.9999% to brutally optimize. There’s no way this is the right trade-off of measurement and optimization.
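To make the cost comparison concrete, here is a minimal sketch of both kinds of metric. Everything in it is an assumption for illustration: the event log, the per-user value weights, and the function names are hypothetical, not any real company's pipeline.

```python
from datetime import date

# Hypothetical event log: (user_id, day, estimated_value_weight).
# The weights are invented for illustration; choosing them is the judgment call.
events = [
    ("ana", date(2023, 10, 17), 5.0),
    ("ana", date(2023, 10, 17), 5.0),
    ("bo", date(2023, 10, 17), 1.0),
    ("cy", date(2023, 10, 16), 2.0),
]


def daily_active_users(events, day):
    """Count distinct users with at least one event on the given day."""
    return len({user for user, d, _ in events if d == day})


def value_weighted_actives(events, day):
    """Sum one weight per distinct active user instead of counting heads."""
    weights = {user: w for user, d, w in events if d == day}
    return sum(weights.values())


day = date(2023, 10, 17)
print(daily_active_users(events, day))
print(value_weighted_actives(events, day))
```

Both functions are a few lines long and about equally cheap to run. What differs is that the second one forces somebody to defend the weights.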

So what’s the real reason simple dumb metrics win? Let’s first look at why we use metrics at all. On the surface, the idea of metrics seems kind of deranged. You take a product like TikTok, which shows millions of videos of every conceivable variety to billions of people every day, causing everything from moments of ecstasy to psychotic breakdowns every minute, and you’re just going to, what, summarize its effects on the world with one number? Imagine doing this in other parts of your life. “How were the Canadian Rockies?” “Even better than I expected, at least 96/100.” “Tell me about your childhood.” “8/10 childhood, no complaints.” Come on!

One reason we use metrics is to further our understanding of the underlying phenomena, the way a chemist might measure the maximum temperature of a reaction. Typically this use of metrics is very fluid and iterative - because you’re going for understanding, you’re constantly iterating on the measurement process itself. A large proportion of a scientist’s work is inventing and building new ways to measure reality.

But the other reason we use metrics, sadly much more common, is due to cowardice (sorry, risk-aversion) and mistrust.

Cowardice because nobody wants to be responsible for making a decision. Actually trying to understand the impact of a new feature on your users and then making a call is an inherently subjective process that involves judgment, i.e. responsibility, i.e. you could be fired if you fuck it up. Whereas if you just pick whichever side of the A/B test has higher metrics, you’ve successfully outsourced your agency to an impartial process, so you’re safe.

Mistrust because not only does nobody want to make the decision themselves, nobody even wants to delegate it! Delegating the decision to a specific person also involves a judgment call about that person. If they make a bad decision, that reflects badly on you for trusting them! So instead you insist that “our company makes data driven decisions” which is a euphemism for “TRUST NO ONE”. This works all the way up the hierarchy - the CEO doesn’t trust the Head of Product, the board members don’t trust the CEO, everyone insists on seeing metrics and so metrics rule.

Adversarially Robust Metrics

Coming back to our original question: why can’t we have good metrics that at least try to capture the complexity of what users want? Again, cowardice and mistrust. There’s a vast space of possible metrics, and choosing any specific one is a matter of judgment. But we don’t trust ourselves or anyone else enough to make that judgment call, so we stick with the simple dumb metrics.

This isn’t always a bad thing! Police departments are often evaluated by their homicide clearance rate because murders are loud and obvious and their numbers are very hard to fudge. If we instead evaluated them by some complicated CRIME+ index that a committee came up with, I’d expect worse outcomes across the board.

Nobody thinks “number of murders cleared” is the best metric of police performance, any more than DAUs are the best metric of product quality, or GDP is the best metric of human well-being. However, they are the best in the sense of being hardest to fudge, i.e. robust to an adversary trying to fool you. As trust declines, we end up leaning more and more on these adversarially robust metrics, and we end up in a gray Brezhnev world where the numbers are going up, everyone knows something is wrong, but the problems get harder and harder to articulate.

Solutions

How can we counteract this tendency to use adversarially robust but terrible metrics?

A popular attempt at a solution is monarchy. If your company or product has a single decision maker with real power and legitimacy, you sidestep the problem of mistrust and can make decisions based on intuition, not metrics. This is the way Steve Jobs’ Apple developed products. This is how Mark Zuckerberg was able to remove news from the Facebook News Feed at a substantial cost to engagement and advertising revenue.

The big problem with monarchy is that it doesn’t scale. The monarch or founder has extremely limited attention, and as soon as they delegate, cowardice and mistrust start to creep back in. Though you can keep these forces at bay for a while by cultivating trustworthy and courageous lieutenants - Catherine the Great, Napoleon, and Jeff Bezos come to mind as exemplars of exceptionally skillful lieutenant-cultivation.

A more decentralized and scalable solution is developing an ideology: a self-reinforcing set of ideas nominally held by everyone in your organization. Having a shared ideology increases trust, and ideologues are able to make decisions against the grain of natural human incentives. This is why companies talk so much about “mission alignment”, though very few organizations can actually pull off having an ideology: when real sacrifices need to be made, either your employees or your investors will rebel.

You can think of metrics as degenerate ideologies whose content is simply “more users better”. We desperately need richer and more humane ideologies! A few ideological tenets I’m most excited about are

Cyborgism: technology should augment human imagination, not replace our agency.
Machine Love: machines should provide support for humans to autonomously pursue their own growth and development.
Institute for Meaning Alignment: the ground of human values is not revealed preference, but attentional policies resulting from constitutive judgment.

but the field is still wide open. If you think you have a new ideology for product design, or want to help create one, please get in touch.

Thanks to Taylor Rogalski, Ernie French and Joe Edelman for helpful discussions on ideas in this essay.

Caravaggio, *The Calling of St Matthew*

Ben Mathes

Oct 24, 2023 (edited)

This starts with a great point, but I think you're accidentally straw-manning a bit. While it's true that there are countless ways that using only data as your epistemology can go awry (Goodhart's law, the McNamara fallacy, etc.), so too are there countless ways that using only qualitative expertise can go awry.

Reject both extremes! You can't make great decisions without data. You can't make great decisions without qualitative expertise.

In short: "data driven" shouldn't even be an ideology. It's a *method*, when used well, in pursuit of a set of values (an ideology, for example, though candidly I think anything that's an ideology is a bug-ridden, half-implemented value system).

Abel

Jul 24, 2024

Dan Davies' notion of Accountability Sinks seems useful for this discussion

Nothing Human
