
Chapter 16: Causes Trump Statistics

Core idea

In one of Kahneman and Tversky’s classic experiments, participants were told that a cab was involved in a hit-and-run accident. In a city where 85% of cabs are Green and 15% are Blue, a witness identified the cab as Blue. The witness’s identification accuracy under the testing conditions was 80%. What is the probability the cab was Blue?

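The mathematical answer, via Bayes’ theorem, is about 41%. A Blue cab correctly identified as Blue accounts for 15% × 80% = 12% of cases, while a Green cab mistaken for Blue accounts for 85% × 20% = 17%, so the probability that a cab reported as Blue really is Blue is 12 / (12 + 17) ≈ 41%. Most participants answer around 80%, the witness reliability figure, essentially ignoring the base rate.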
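The Bayes’ theorem calculation for the cab problem can be sketched in a few lines of Python. The function name `posterior_blue` and its parameters are illustrative, and the sketch assumes the witness is equally accurate for both colors.

```python
def posterior_blue(prior_blue: float, witness_accuracy: float) -> float:
    """P(cab is Blue | witness says Blue), by Bayes' theorem.

    Assumes the witness has the same accuracy for Blue and Green cabs.
    """
    # Witness says "Blue" and is right: the cab really is Blue.
    true_blue = prior_blue * witness_accuracy
    # Witness says "Blue" and is wrong: the cab is actually Green.
    false_blue = (1 - prior_blue) * (1 - witness_accuracy)
    return true_blue / (true_blue + false_blue)


# The cab problem as stated: 15% of cabs are Blue, witness is 80% accurate.
print(f"{posterior_blue(0.15, 0.80):.1%}")  # 41.4%

# With equal numbers of Blue and Green cabs, the answer equals the
# witness accuracy.
print(f"{posterior_blue(0.50, 0.80):.1%}")  # 80.0%
```

The second call shows what the intuitive answer of 80% amounts to: it would be right only if Blue and Green cabs were equally common, which is another way of saying the base rate has been dropped.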

Now change the problem: instead of stating the base rate statistically (“85% of cabs are Green”), say that the two companies operate the same number of cabs but “Green cabs are involved in 85% of accidents.” The two versions are mathematically equivalent, yet participants now give the base rate considerable weight, and their answers land much closer to the Bayesian figure. The base rate is now causal: it says something about why accidents happen, implicating the Green company’s negligence or driver quality. Statistical base rates tend to be underweighted or ignored. Causal base rates are used.

This is the core of this chapter: causes trump statistics. When you have both a causal narrative and a statistical fact, the narrative wins. The statistical fact is psychologically inert unless it can be connected to a causal explanation.

Why it matters

Stereotyping as causal base rate

Stereotypes are causal base rates. They say something about a mechanism — about why members of a group behave as they do. When someone says “engineers tend to be introverted,” this feels like a causal claim about engineers as a type, not a mere statistical summary. The causal framing activates representativeness: we judge individuals by whether they resemble the stereotype.

Kahneman distinguishes two types of base rates:

  • Statistical base rates: pure frequency information with no causal interpretation (“15% of cabs are Blue”).
  • Causal base rates: frequencies with a causal explanation attached (“Green cabs cause more accidents because the company has lower safety standards”).

Statistical base rates are largely ignored when a specific case description is available. Causal base rates are incorporated naturally because they speak to mechanism, not just frequency.

The resistance of training

In medicine and clinical psychology, training in statistical base rates often fails to produce the intended behavior change. Clinicians who know the actuarial base rates of a diagnosis often ignore them when faced with a patient who fits the prototype of the condition. The vivid case presentation overrides the statistical prior.

This creates a systematic diagnostic error: rare conditions that match a vivid prototype get over-diagnosed; common conditions that don’t fit a vivid prototype get under-diagnosed. The base rates would correct this — but they are not causal, so they are not used.
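To see how a base rate would correct this kind of error, here is a rough sketch in Python. Every number is hypothetical and chosen only for illustration, the diagnosis labels are made up, and the candidates are assumed to be mutually exclusive and to cover all possibilities.

```python
# Hypothetical numbers for illustration only; none come from real data.
# Each entry: (base rate among comparable patients,
#              probability of seeing this presentation given the condition)
candidates = {
    "rare condition, vivid prototype match": (0.01, 0.90),
    "common condition, poor prototype fit": (0.20, 0.30),
    "anything else": (0.79, 0.02),
}


def posteriors(candidates):
    """Weight each candidate by prior * likelihood, then normalize."""
    weights = {name: prior * lik for name, (prior, lik) in candidates.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


result = posteriors(candidates)
for name, p in sorted(result.items(), key=lambda item: -item[1]):
    print(f"{name}: {p:.0%}")
```

Under these assumed numbers the common condition comes out well ahead even though the patient resembles the rare condition far more closely. Judging by resemblance alone amounts to comparing only the second number in each pair.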

A partial solution: make the base rate causal

One way to increase the weight of statistical base rates is to give them a causal interpretation. Instead of presenting the base rate as pure frequency information, connect it to a mechanism. “Why do most of these projects run over budget? Because initial estimates systematically ignore implementation complexity — a pattern documented across the industry.” A base rate attached to a mechanism functions like a causal story and gets used.

Key takeaways

  • Causes trump statistics: causal base rates (frequencies with a mechanism) are incorporated into judgments; statistical base rates (pure frequency) are largely ignored when a case description is available.
  • The cab problem: telling participants that 85% of accidents involve Green cabs (causal) activates base rate reasoning; telling them that 85% of cabs are Green (statistical) does not.
  • Stereotypes are causal base rates — they imply a mechanism, not just a frequency, which is why they have psychological force even when participants know the statistics.
  • Clinical base rate neglect: trained professionals often ignore actuarial priors when a vivid case presentation matches a prototype — a systematic source of diagnostic error.
  • The inertness of statistics: statistical facts without a causal interpretation do not update judgments — they are known but psychologically non-functional.
  • Fix: give statistical base rates a causal explanation. 'Most X projects fail because of Y' is used; 'X projects fail at rate Z%' is largely ignored.

Mental model

When you have a specific case description, statistical base rates are psychologically overridden. Causal base rates, the ones that explain why the frequency is what it is, get incorporated naturally. The difference is not logical; it is psychological. Mechanism makes information stick; frequency alone does not.

Practical application

Application domains:

  • Medical diagnosis: a patient who matches the vivid presentation of a rare condition may get that diagnosis over a more common condition with overlapping symptoms. The correct move is to explicitly invoke the base rate before evaluating the specific case.
  • Legal judgment: jurors tend to neglect base rates about defendant populations when presented with a compelling case narrative. Statistical evidence about group base rates carries little weight in verdicts compared with evidence about the specific case.
  • Performance evaluation: when a new hire performs impressively in their first month, the causal explanation (“she’s excellent”) overrides the statistical base rate (“most strong early performers regress toward the group average”). The result is promotions based on early results that partly reflect luck and are likely to fade.
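The regression effect in the performance-evaluation case can be illustrated with a small simulation. This is a sketch under simple assumptions that are not from the source: each month's observed performance is a stable skill plus independent luck, both drawn from a standard normal distribution. The function name `simulate` and its parameters are illustrative.

```python
import random


def simulate(n=10_000, top_fraction=0.10, seed=0):
    """Compare top first-month performers with their own second month.

    Assumed model: observed performance = stable skill + independent luck.
    """
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]
    month1 = [s + rng.gauss(0, 1) for s in skill]
    month2 = [s + rng.gauss(0, 1) for s in skill]

    # Pick the people who looked best in month one.
    k = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: month1[i], reverse=True)[:k]

    def average(values):
        return sum(values) / len(values)

    return average([month1[i] for i in top]), average([month2[i] for i in top])


top_month1, top_month2 = simulate()
print(f"top group, month 1 average: {top_month1:.2f}")
print(f"same group, month 2 average: {top_month2:.2f}")
```

In this model the top group stays above the overall average of zero in month two but falls back noticeably from its month-one score, because part of what put those people at the top was luck that does not repeat.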

Example

A startup accelerator evaluates applications. It knows from ten years of data that 85% of accepted companies with no revenue at the application stage fail within three years, a statistical base rate. But when a specific founding team presents a compelling vision, demonstrates technical depth, and shows a charismatic dynamic, the panel judges the team “exceptional” and overrides the base rate.

The way to use the statistical base rate is to convert it: “Companies at this stage fail because they haven’t validated market demand — the exceptions that survive are those who find a paying customer in the first 90 days.” Now the panel can ask a causal question about the specific team: have they validated demand? That causal question uses the base rate and evaluates the specific case.
