Researcher Degrees of Freedom
Core Idea
Researcher degrees of freedom are the unfixed analytic choices between a question and a reported result: exclusions, transformations, covariates, tests, stopping rules, subgroups. Together they let a silently explored decision tree collapse into a single declared comparison. Because the multiplicity is invisible, standard corrections cannot be applied, and confidence is inflated even when no individual choice was made in bad faith.
Broad Use
- Statistics and science: combining a few plausible analytic flexibilities can push a nominal 5% false-positive rate above 60% in simulation; this is the "garden of forking paths."
- Machine-learning evaluation: test-set tuning, architecture sweeps, benchmark and prompt selection summarised as one number.
- Financial backtesting: hundreds of strategy variants are run on the same history and only the best is reported ("backtest overfitting").
- Policy evaluation: choice of outcome window, comparison group, and treatment definition surviving into the published estimate.
- Audit and accounting: inventory method, depreciation schedule, and accrual timing forking the picture of one business.
- Journalism and intelligence: choices of framing, source weighting, and comparison timeframe reproduce the same structure.
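The backtesting case above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a method from the source: the function name `best_sharpe_like`, the 200 variants, the 250 periods, and the Gaussian noise model are all hypothetical choices made for illustration. Every simulated strategy has zero true edge, so any apparent performance comes from selection alone.

```python
import random
import statistics


def best_sharpe_like(n_variants, n_periods, seed):
    """Best mean/stdev ratio among n_variants strategies whose returns are pure noise."""
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(n_variants):
        # Hypothetical noise model: zero-mean returns, so no strategy has a real edge.
        returns = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
        ratio = statistics.mean(returns) / statistics.stdev(returns)
        best = max(best, ratio)
    return best


# Same seed in both calls, so the first variant is identical in each.
one_declared = best_sharpe_like(n_variants=1, n_periods=250, seed=0)
best_of_many = best_sharpe_like(n_variants=200, n_periods=250, seed=0)
print(f"single pre-declared strategy: {one_declared:+.3f}")
print(f"best of 200 noise strategies: {best_of_many:+.3f}")
```

Because both calls share a seed, the best-of-200 figure can never be lower than the single pre-declared one. A reader who sees only the second number has no way to tell that it was selected from 200 silent attempts.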
Clarity
Explains a field-wide overstatement of confidence that failures of individual studies cannot account for, by separating "was each choice defensible?" (usually yes) from "does the silent comparison budget warrant the confidence?" (usually no).
Manages Complexity
Compresses a sprawling list of micro-choices into one quantity: how many de facto comparisons did the report collapse into one? Framed this way, pre-registration, holdout separation, and multiverse reporting become commensurable as versions of the same move.
Abstract Reasoning
Models the analysis as a tree whose branches are the unfixed choices, with warrant discounted by the size of the tree that could have been selectively reported, regardless of any branch's good faith.
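One way to make the discount concrete, offered as an idealised sketch rather than a formula from the source: suppose the tree has $d$ unfixed choices with $k_i$ options each (symbols introduced here for illustration) and every leaf is tested at level $\alpha$. If the leaves are treated as independent tests, which real branches sharing the same data are not, then

$$
m = \prod_{i=1}^{d} k_i, \qquad P(\text{at least one significant leaf} \mid \text{no true effect}) = 1 - (1 - \alpha)^{m}.
$$

Because real branches are correlated, the actual inflation is typically smaller than this expression suggests. The direction still holds: warrant falls as $m$ grows, whatever the good faith behind any single branch.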
Knowledge Transfer
- Statistics → finance: pre-registration becomes a held-out test period plus deflated-Sharpe correction.
- Empirical science → ML: declaring the analytic plan becomes physically separating exploration data from evaluation data.
- Research methodology → policy/audit: multiverse and specification-curve reporting reveal the distribution across all defensible branches.
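The multiverse or specification-curve idea in the last bullet can be sketched in a few lines. This is an illustrative toy, not a reference implementation: the data are simulated with no true group difference, and the exclusion cutoffs, the two transforms, and the helper `group_difference` are hypothetical. The point is only that every branch's estimate is reported, rather than one selected leaf.

```python
import itertools
import math
import random
import statistics

rng = random.Random(42)
# Hypothetical null data: two groups drawn from the same distribution.
data = [(group, rng.lognormvariate(0.0, 0.5)) for group in (0, 1) for _ in range(100)]

# Hypothetical defensible choices; every combination is one branch of the tree.
cutoffs = {"keep_all": math.inf, "drop_above_3.0": 3.0, "drop_above_2.5": 2.5}
transforms = {"raw": lambda x: x, "log": math.log}


def group_difference(cutoff, transform):
    """Difference in group means after one exclusion rule and one transform."""
    kept = [(g, transform(v)) for g, v in data if v <= cutoff]
    group0 = [v for g, v in kept if g == 0]
    group1 = [v for g, v in kept if g == 1]
    return statistics.mean(group1) - statistics.mean(group0)


# The specification curve: every branch's estimate, sorted, none hidden.
curve = sorted(
    (
        (group_difference(cutoff, fn), cut_name, tf_name)
        for (cut_name, cutoff), (tf_name, fn) in itertools.product(
            cutoffs.items(), transforms.items()
        )
    ),
    key=lambda row: row[0],
)
for est, cut_name, tf_name in curve:
    print(f"{est:+.3f}  {cut_name:>15}  {tf_name}")
print("median estimate:", round(statistics.median([row[0] for row in curve]), 3))
```

Raw and log estimates are on different scales, so a real specification curve would standardise effect sizes before comparing branches; the sketch skips that step for brevity.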
Example
A psychology team explores two outcomes, three exclusion rules, a transform, a covariate choice, and two subgroups. That is roughly 72 paths if the transform and the covariate are each on or off and the subgroups are analysed alongside the full sample (2 × 3 × 2 × 2 × 3). The team finds the one significant leaf and writes it up as a single declared comparison; the reader sees one p-value and cannot audit the tree it was selected from.
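The path count in this example can be reproduced by enumerating the tree. The sketch below rests on assumptions not spelled out in the source: the transform and the covariate are each treated as a two-way choice, the two subgroups are counted alongside the full sample, and all option labels are hypothetical. The family-wise figure further assumes the 72 leaves are independent tests, which overstates the inflation for real, correlated analyses.

```python
import itertools

# Hypothetical option labels; only the counts per choice matter here.
choices = {
    "outcome": ["outcome_a", "outcome_b"],
    "exclusion_rule": ["rule_1", "rule_2", "rule_3"],
    "transform": ["raw", "transformed"],            # assumed on/off
    "covariate": ["without", "with"],               # assumed on/off
    "sample": ["full", "subgroup_1", "subgroup_2"],  # assumed to include full sample
}

# Each element of `paths` is one root-to-leaf route through the analysis tree.
paths = list(itertools.product(*choices.values()))
n_paths = len(paths)  # 2 * 3 * 2 * 2 * 3

alpha = 0.05
# Chance of at least one nominally significant leaf under the null,
# IF the leaves were independent tests (an idealisation).
fwer_if_independent = 1 - (1 - alpha) ** n_paths
# Sidak-style per-leaf threshold that would hold the family-wise rate at alpha
# under the same independence assumption.
sidak_threshold = 1 - (1 - alpha) ** (1 / n_paths)

print(n_paths)
print(round(fwer_if_independent, 3))
print(f"{sidak_threshold:.5f}")
```

Under these assumptions the single reported p-value would need to clear a threshold far below 0.05 to carry the confidence its write-up implies, and that correction is only possible if the tree is disclosed.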
Relationships to Other Primes
Parents (1) — more general patterns this builds on
- Researcher Degrees of Freedom is a kind of Bias: a systematic, directional inferential error (false-positive inflation) produced by an un-audited comparison budget. It is a specialized inferential bias arising at the analysis and reporting stage, distinct from random noise.
Path to root: Researcher Degrees of Freedom → Bias
Not to Be Confused With
- Researcher Degrees of Freedom is not Multiple Comparisons Correction because here the comparisons are silent analytic choices no one can count, whereas multiple-comparisons handling corrects declared tests.
- Researcher Degrees of Freedom is not Overfitting because it is a reporting pathology (one leaf declared as if pre-planned), whereas overfitting is a model fitting noise in training data.
- Researcher Degrees of Freedom is not Regret because it is an inferential-warrant error from un-auditable multiplicity, whereas regret is a backward-looking valuation of a forgone outcome.