Cromwell's Rule¶
Core Idea¶
Never assign a prior probability of exactly 0 or 1 to a contingent proposition. Bayesian updating multiplies the prior by the likelihood, and zero is absorbing under multiplication: a prior of 0 stays 0, and a prior of 1 puts 0 on the negation, so it stays 1. A credence pinned at either boundary is evidence-sterile, unmovable by any observation.
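The absorbing behaviour can be checked numerically. The following is a minimal sketch for a binary hypothesis; the function names and the likelihood values (0.99 and 0.01) are illustrative assumptions, not part of the rule itself.

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior P(H | E) for a binary hypothesis H, via Bayes' rule."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence


def after_observations(prior: float, n: int,
                       p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Apply the same piece of evidence n times in a row."""
    credence = prior
    for _ in range(n):
        credence = bayes_update(credence, p_e_given_h, p_e_given_not_h)
    return credence


# Ten observations, each 99 times more likely under H than under not-H.
pinned_at_zero = after_observations(0.0, 10, 0.99, 0.01)   # prior of exactly 0
tiny_but_open = after_observations(1e-6, 10, 0.99, 0.01)   # prior of one in a million

# Ten observations, each 99 times more likely under not-H than under H.
pinned_at_one = after_observations(1.0, 10, 0.01, 0.99)    # prior of exactly 1

print(pinned_at_zero, tiny_but_open, pinned_at_one)
```

Under these assumed numbers the prior of 0 and the prior of 1 do not move at all, while the one-in-a-million prior ends up close to 1 after the same ten favourable observations.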
How would you explain it like I'm…
You Might Be Wrong
Keep A Tiny Maybe
Never Zero, Never Certain
Broad Use¶
- Bayesian statistics: a prior with zero mass on a parameter value can never be updated to nonzero mass there, so a well-chosen prior puts positive mass (or density) on every value that is not ruled out logically.
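- Machine learning / NLP: an unseen (zero-count) event gets a maximum-likelihood probability of exactly 0, which makes its log-probability negative infinity and zeroes any product it enters, so smoothing (Laplace, Good–Turing, Kneser–Ney) keeps unseen events at a small but nonzero probability.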
- Law: the presumption of innocence sets the prior on guilt deliberately low but not at zero, so evidence can still move it, and appeals exist because no finding should be beyond revision.
- Intelligence analysis: confidence scales rather than "100% certain" assessments implement the rule institutionally.
- Reinforcement learning: epsilon-greedy and Thompson sampling keep nonzero exploration on apparently dominated options.
- Ideology: dogma, conspiracy thinking, and fundamentalism pin a proposition at 1, with predictable evidence-sterility.
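The reinforcement-learning item in the list above can be made concrete by writing out the selection probabilities that epsilon-greedy implies. This is a minimal sketch: the function name `epsilon_greedy_probs` and the value estimates are hypothetical, and ties are broken in favour of the first best arm.

```python
def epsilon_greedy_probs(estimated_values: list[float], epsilon: float) -> list[float]:
    """Probability of selecting each arm under epsilon-greedy."""
    n = len(estimated_values)
    best = max(range(n), key=lambda i: estimated_values[i])
    probs = [epsilon / n] * n       # exploration share, spread uniformly over all arms
    probs[best] += 1.0 - epsilon    # exploitation share goes to the current best arm
    return probs


values = [0.9, 0.2, 0.1]  # hypothetical value estimates for three arms

greedy = epsilon_greedy_probs(values, epsilon=0.0)     # pure exploitation
exploring = epsilon_greedy_probs(values, epsilon=0.1)  # small exploration floor

print(greedy)
print(exploring)
```

With `epsilon=0.0` the two apparently dominated arms are selected with probability exactly 0, so their estimates can never be corrected. Any positive `epsilon` keeps every arm reachable.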
Clarity¶
Sharpens the categorical gap between low probability and zero, and between high probability and certainty: one permits learning, the other forbids it.
Manages Complexity¶
Collapses a long catalogue of "why won't this system learn?" failures into one diagnosis (a boundary commitment) with one family of fixes: floor, smooth, institutionalize revisability.
Abstract Reasoning¶
Treats any failure-to-update as a question about where credence sits rather than about the quality of the evidence; the endpoints 0 and 1 are absorbing states, not extreme confidences.
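The absorbing-state reading can be made explicit with the odds form of Bayes' rule. The symbols $H$ (hypothesis) and $E$ (evidence) are notation introduced here for illustration.

```latex
\[
\underbrace{\frac{P(H \mid E)}{P(\lnot H \mid E)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(E \mid H)}{P(E \mid \lnot H)}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{P(H)}{P(\lnot H)}}_{\text{prior odds}}
\]
```

If $P(H) = 0$, the prior odds are 0 and the posterior odds are 0 for any finite likelihood ratio. If $P(H) = 1$, the prior odds are infinite and stay infinite for any nonzero likelihood ratio. In log-odds terms the endpoints sit at $\pm\infty$, and no finite amount of evidence shifts a credence onto them or off them.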
Knowledge Transfer¶
- Statistics → law: a positive prior and revisable verdicts (appeals as posterior updates).
- Language modeling → policy: smoothing of unseen events becomes contingency planning for unseen scenarios.
- RL → strategy: exploration noise becomes funding long-shot projects at small nonzero levels.
Example¶
A naive Bayes spam classifier that scores a word never seen in spam at probability 0 multiplies the whole document score to zero — Laplace smoothing floors every word at small positive probability so unseen words can no longer sterilize all other evidence.
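The spam example can be sketched in a few lines. This is an illustration under stated assumptions: the toy documents, vocabulary, and function names are hypothetical, only the spam-class likelihood is computed (the class prior and the non-spam class are omitted), and scores are kept in log space so a zero probability shows up as negative infinity.

```python
import math
from collections import Counter


def word_log_probs(docs, vocab, alpha):
    """Per-word log P(word | class) with add-alpha (Laplace) smoothing.

    alpha = 0 gives the unsmoothed maximum-likelihood estimate.
    """
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values()) + alpha * len(vocab)
    log_probs = {}
    for word in vocab:
        mass = counts[word] + alpha
        # An unseen word with alpha = 0 has probability 0, i.e. log-probability -inf.
        log_probs[word] = math.log(mass / total) if mass > 0 else float("-inf")
    return log_probs


def class_score(message, log_probs):
    """Log-likelihood of the message under the class (sum of word log-probabilities)."""
    return sum(log_probs[word] for word in message)


# Hypothetical toy training data for the spam class.
spam_docs = [["win", "cash", "now"], ["cash", "prize", "now"]]
vocab = {"win", "cash", "now", "prize", "meeting"}
message = ["win", "cash", "meeting"]  # "meeting" never appears in spam_docs

unsmoothed = class_score(message, word_log_probs(spam_docs, vocab, alpha=0.0))
smoothed = class_score(message, word_log_probs(spam_docs, vocab, alpha=1.0))

print(unsmoothed)  # one unseen word drives the whole score to -inf
print(smoothed)    # finite: "win" and "cash" still count as evidence
```

Without smoothing, the single unseen word "meeting" overrides the two words that point toward spam. With `alpha=1.0`, every word keeps a small positive probability and the remaining evidence still contributes to the score.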
Relationships to Other Primes¶
Parents (1) — more general patterns this builds on
- Cromwell's Rule is a kind of Bayesian Updating: it is a specific boundary constraint on Bayesian updating. Because zero is absorbing under multiplication, contingent priors must stay off 0 and 1 or the update is inert. It is a corollary of the algebra and a specialization of the general updating mechanism.
Path to root: Cromwell's Rule → Bayesian Updating → Inductive Reasoning
Not to Be Confused With¶
- Cromwell's Rule is not Bayesian Updating because the rule is a specific boundary constraint (keep priors off 0 and 1), whereas Bayesian updating is the general mechanism of revising prior by likelihood.
- Cromwell's Rule is not Falsifiability because the rule is a property of an agent's credence (pinned at the multiplicative boundary), whereas falsifiability is a property of a theory's content; a falsifiable theory held at probability 1 still violates Cromwell.
- Cromwell's Rule is not Epistemic Humility because the rule is a mechanical fact (boundary credence is inert regardless of sincerity), whereas humility is an attitude; the defect lives in the prior, not in temperament.