Chapter 14: Tom W's Specialty

Core idea

Tom W is described as a graduate student who is intelligent but lacks creativity; orderly and meticulous, with a need for neatness and a passion for science fiction and computer systems. Now: which of the nine graduate specialties — including humanities, education, social science, business, law, medicine, engineering, library science, and computer science — is Tom most likely studying?

Most people answer: computer science. The description fits the prototype. But there are roughly 20 times as many students in the humanities and social sciences combined as in computer science. If Tom is drawn at random from the graduate population, the base rate probability that he is in computer science is quite small — and the description, however evocative, is not sufficiently diagnostic to overcome that prior.

The representativeness heuristic: when estimating the probability that something belongs to a category, people replace the question “How probable is this?” with the easier question “How similar is this to the typical member of the category?” Similarity and probability are not the same thing. When they diverge, the representativeness estimate is wrong.

Why it matters

Base rate neglect

The most important consequence of representativeness is base rate neglect: when a description is available, people largely ignore the prior probability (base rate) of the category. The base rate, here the proportion of graduate students in each specialty, is the statistically correct starting point for any probability estimate. A good Bayesian reasoner would start with the base rate and update based on how diagnostic the description is. System 1 skips the base rate entirely, jumping straight to the similarity judgment.

This is not a failure of information — participants in Kahneman and Tversky’s studies were told about the base rates. They just did not use them. The vivid description dominated the statistical fact.
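The Bayesian procedure described in this section (start from the base rate, then update by how diagnostic the description is) can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of the original studies: the function name posterior_probability and the numbers used (a 3% base rate for computer science and a description that is 4 times as likely for a computer science student as for anyone else) are assumptions chosen only to show the arithmetic.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.

    prior            -- base rate of the category, strictly between 0 and 1
    likelihood_ratio -- P(description | in category) / P(description | not in category)
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers, for illustration only.
cs_base_rate = 0.03          # assumed share of graduate students in computer science
description_strength = 4.0   # assumed likelihood ratio of the Tom W description

p = posterior_probability(cs_base_rate, description_strength)
print(f"{p:.3f}")  # 0.110
```

Under these assumed numbers, a description that feels like a near-certain match still leaves computer science at about 11%. Ignoring the base rate amounts to treating the prior as if every specialty were equally likely.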

Representativeness and the inside view

The representativeness heuristic naturally produces what Kahneman calls the inside view: evaluating a case based on its specific features rather than its membership in a reference class. “Does Tom look like a computer scientist?” is an inside-view question. “What fraction of graduate students are computer scientists?” is an outside-view question. System 1 favors the inside view automatically; System 2 is required to import the outside view.

When representativeness works and when it fails

Representativeness is a legitimate input to probability estimation. Similarity to a prototype is often correlated with membership — things that look like ducks often are ducks. The heuristic fails when:

  1. The base rate of the category is very low, so that even a strongly matching description should leave the probability low, yet the description is allowed to dominate the prior.
  2. The description is deliberately constructed (or coincidentally vivid) in a way that triggers prototype matching while carrying little actual information about the category.
  3. The stereotype of the category differs from the actual distribution of its members, so similarity to the prototype is a poor guide to membership.
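The first failure condition can be made concrete by asking how diagnostic a description would have to be before the category becomes more likely than not. The sketch below is illustrative only; the helper name likelihood_ratio_needed and the base rates in the loop are hypothetical.

```python
def likelihood_ratio_needed(prior: float, target: float = 0.5) -> float:
    """Likelihood ratio a description must carry to move `prior` up to `target`.

    Derived from Bayes' rule in odds form:
    target odds = prior odds * likelihood ratio.
    """
    if not 0.0 < prior < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("prior and target must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    target_odds = target / (1.0 - target)
    return target_odds / prior_odds


# Hypothetical base rates: the rarer the category, the more diagnostic
# the description has to be before the category is the better bet.
for base_rate in (0.5, 0.2, 0.03, 0.001):
    needed = likelihood_ratio_needed(base_rate)
    print(f"base rate {base_rate}: likelihood ratio needed = {needed:.1f}")
```

With equal base rates, any description that favors the category at all is enough. At an assumed 3% base rate, the description would have to be more than 30 times as likely for members as for non-members, a standard that a personality sketch rarely meets.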

Key takeaways

  • Representativeness heuristic: the probability that something belongs to a category is judged by how similar it is to the typical member of that category — not by base rate probability.
  • Base rate neglect: when a description is available, the prior probability of the category is largely ignored, even when participants are explicitly told the base rates.
  • The Tom W problem: a vivid description of someone matching the computer science prototype makes computer science seem more likely, even when base rates make it improbable.
  • Inside view vs. outside view: representativeness naturally produces inside-view judgments (case features) and suppresses outside-view reasoning (base rates and reference classes).
  • Representativeness is useful when the description is genuinely diagnostic and base rates are roughly equal; it fails when base rates are extreme or descriptions are misleadingly vivid.
  • Confidence in representativeness-based judgments is unaffected by the statistical invalidity of the judgment — people feel certain even when they are wrong.

Mental model

Correct probability reasoning starts with the base rate and updates based on how diagnostic the description is. System 1 skips the base rate and jumps to prototype matching: how similar is this person to a typical member of the category? The similarity judgment is confident and vivid; the statistical reasoning is effortful and suppressed. Base rate neglect is the result.

Practical application

Common failure modes:

  • Hiring based on “culture fit”: judging a candidate by how closely they resemble the prototypical team member crowds out base rate reasoning about which skills actually predict performance.
  • Clinical stereotyping: a patient whose symptoms fit the prototype of a rare disease gets the rare disease diagnosis. Base rate neglect leads clinicians to systematically overestimate rare conditions when symptoms match their vivid prototypes.
  • Investment in “classic successes”: a company with the characteristics of a historical success story feels like it will succeed — ignoring the base rate of companies with those characteristics that failed.
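The clinical case lends itself to a short worked calculation. The sketch below uses made-up numbers (a disease affecting 1 patient in 1,000, a symptom pattern present in 90% of patients with the disease and in 5% of patients without it); none of these figures come from the source, and the function name is hypothetical.

```python
def probability_given_match(base_rate: float,
                            match_rate_if_member: float,
                            match_rate_if_not: float) -> float:
    """P(member | match) by Bayes' rule, from the base rate and two match rates."""
    true_matches = base_rate * match_rate_if_member
    false_matches = (1.0 - base_rate) * match_rate_if_not
    return true_matches / (true_matches + false_matches)


# Hypothetical numbers, for illustration only.
p = probability_given_match(
    base_rate=0.001,            # assumed prevalence of the rare disease
    match_rate_if_member=0.90,  # assumed share of patients with the disease who fit the prototype
    match_rate_if_not=0.05,     # assumed share of other patients who also fit it
)
print(f"{p:.3f}")  # 0.018
```

Even with a symptom pattern this characteristic, the assumed prevalence keeps the probability of the rare disease under 2%, because the much larger group of patients without the disease produces most of the matches.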

Example

A venture capital firm is evaluating two pitches. Pitch A: a founder who is a Stanford CS dropout, speaks in the cadence of famous tech entrepreneurs, and is building software for a large market. Pitch B: a mid-career professional with domain expertise, a working prototype, paying early customers, and modest projections. The firm funds Pitch A.

The representativeness heuristic drove the decision: Pitch A matches the prototype of the successful startup founder more strongly. But the success rate of companies whose founders fit that vivid prototype is not better than the success rate of companies with Pitch B’s characteristics, and Pitch B has more concrete evidence of traction. The VC funded the more prototype-consistent story, not the more statistically promising company.
