Texas Sharpshooter Fallacy¶

Prime #: 1233
Origin domain: Reasoning Rhetoric And Fallacies
Subdomain: pattern inference under multiplicity → Reasoning Rhetoric And Fallacies

Core Idea¶

Constructing a hypothesis after inspecting the data — drawing the explanatory boundary around an observed cluster — and then evaluating it as though it had been specified in advance, so the apparent evidence is inflated by unaccounted boundary-choice freedom.

How would you explain it like I'm…

Paint The Target After

Imagine someone shoots lots of arrows at a wall all over the place, then walks up and draws a bullseye around the spot where a few arrows happened to land close together. Now they brag, 'Look, bullseye!' But that's cheating — they drew the target *after* shooting. You only really hit a target if you pick it *first*.

Bullseye After Shooting

The Texas Sharpshooter Fallacy is when you decide what counts as a 'hit' only *after* you've looked at the results, then act like you called it in advance. The name comes from a shooter who fires a bunch of bullets at a barn wall, then paints a target around wherever the most holes cluster together, and claims to be a great marksman. The trick is that with enough random shots, *some* cluster always shows up just by luck — so a target drawn around it proves nothing. The fairness rule is simple: pick your target before you shoot, or honestly account for how many different targets you *could* have drawn.

Post-Hoc Target Drawing

The Texas Sharpshooter Fallacy is the mistake of building a hypothesis *after* inspecting the data — drawing the explanatory boundary around a cluster you already observed — and then judging it as if it had been specified in advance. The image: a shooter fires at a barn wall, then paints a target around the densest cluster of holes, claiming marksmanship never actually shown. The apparent strength of the evidence comes entirely from the freedom to choose *where* to draw the boundary, exercised silently after seeing the data. It needs three pieces: a noisy source that throws off many possible patterns by chance, so *some* cluster is bound to appear; a post-hoc boundary drawn around an observed cluster to define the claim; and an evidence calculation that tests it as if pre-specified, counting only the data inside the chosen boundary while ignoring all the boundaries that could have been drawn. The fix is to demand either a hypothesis fixed in advance and then tested on fresh data, or a calculation that prices in the boundary-choice freedom.

The Texas Sharpshooter Fallacy is the inferential failure of constructing a hypothesis *after* inspecting the data — by drawing the explanatory boundary around an observed cluster — and then evaluating it as though it had been specified in advance. The image is a shooter who fires at a barn wall and then paints a target around the densest cluster of holes, claiming a marksmanship he never exhibited. The apparent strength of the evidence comes entirely from freedoms of cluster-selection that the analyst exercised after seeing the data and failed to account for in the evidence calculation. The pattern has three structural elements. First, a *noisy substrate* generating many possible patterns by chance, with enough dimensionality or independent samples that *some* cluster of some shape will arise. Second, a *post-hoc boundary* drawn around an observed cluster to define the hypothesis — the disease is caused by the chemical in the area with the cluster; the trader has a strategy that worked in these months; the prophet predicted these events. Third, an *evidence calculation* that tests the hypothesis as if pre-specified, crediting only the data inside the chosen boundary while ignoring the multiplicity of boundaries that could have been drawn. The claim is sharper than 'noisy data fooled someone' or 'people see patterns in randomness': it is that post-hoc cluster-fitting produces apparent evidence disproportionate to its true informativeness, because the freedom of boundary-choice is spent silently before the test. Recognizing it means demanding one of two things — a hypothesis specified in advance and tested on data the analyst had no access to during specification, or a multiplicity-aware calculation that prices in the boundary-choice freedom. Either removes the unaccounted degree of freedom the fallacy converts into illusory support.

Broad Use¶

Epidemiology: a town's cancer cluster blamed on a nearby plant, when the town, years, and cancer type were chosen after the cluster appeared.
Finance: a fund's winning track record is the survivor around which the boundary was drawn.
Prophecy / conspiracy: a "prediction" boundary drawn around verses that, after the event, can be read as matching.
Empirical science: HARKing, p-hacking, and the garden of forking paths — the replication crisis as aggregate symptom.
Machine learning: choosing a feature set by repeated evaluation on a dev set, then reporting it as pre-specified.
Forensic statistics: a DNA match found by searching millions carries less weight than the same match on a pre-identified suspect.

Clarity¶

Separates a prediction's strength from a post-hoc match's strength: the same observation is informative if specified in advance and far less so if identified after the fact.

Manages Complexity¶

Collapses HARKing, p-hacking, survivorship bias, the file-drawer problem, and the look-elsewhere effect into one mechanism with a multiplicity factor — the size of the implicit search — as the load-bearing quantity.

Abstract Reasoning¶

Licenses the pre-vs-post specification test and substrate scaling: the richer the search space, the larger the post-hoc inflation, so vigilance concentrates on high-dimensional, exploratory pipelines.

Knowledge Transfer¶

Statistics: multiple-comparisons correction and false-discovery control price the freedom in.
High-energy physics: the same mechanism is the "look-elsewhere effect."
Meta-science: registered reports remove the freedom by pre-specifying.

Example¶

A researcher tests 1000 genes at p < 0.05; with no true effect, ~50 cross the line by chance, and reporting "these 7 are associated" credits each as a sole pre-specified test — ignoring the 1000-fold implicit search.

Relationships to Other Primes¶

Foundational — no parent edges in the catalog.

Children (1) — more specific cases that build on this

Multiple Comparisons Correction is a kind of, typical Texas Sharpshooter Fallacy — The file frames them as disease and cure: multiple_comparisons_correction is precisely the 'price it in' branch of the fallacy's OWN corrective. TENTATIVE relation only — recorded as a remedy-of link, NOT a true subsumption; the fallacy is broader (covers the uncountable informal forking-paths the enumerated correction misses). Owner: treat as a related/remedy edge, not parent-child if that distorts the DAG.

Not to Be Confused With¶

Texas Sharpshooter Fallacy is not Multiple-Comparisons Correction because the fallacy is the error (post-hoc boundary credited as pre-specified) whereas the correction is the remedy — and the fallacy is broader, covering uncountable informal forking paths a Bonferroni factor never enumerates.
Texas Sharpshooter Fallacy is not Selection Bias because selection bias distorts which units enter the sample whereas the fallacy distorts which hypothesis is drawn around an already-collected sample.
Texas Sharpshooter Fallacy is not Confirmation Bias because confirmation bias is motivated weighting toward a prior belief whereas the fallacy inflates evidence structurally, even for an honest analyst with no prior.