Confidence Intervals

Prime #: 436
Origin domain: Statistics & Experimental Design
Aliases: Interval Estimate, Margin of Error, CI, Credible Interval Frequentist
Related primes: Statistical Significance (p-Value), Hypothesis Testing (Null vs. Alternative), Effect Size, Statistical Power, Sampling (Representativeness), Bayesian Updating, Reproducibility & Replicability, Randomization

Core Idea

Confidence Intervals provide a range of plausible values for a parameter (like a mean, difference in means, or proportion), constructed so that if the experiment were repeated many times, a specified percentage (often 95%) of those intervals would contain the true parameter.

How would you explain it like I'm…

No faithful explanation at this level. All three generators marked NA with the same reasoning: kindergarten vocabulary cannot represent the long-run frequency coverage property of the construction procedure without collapsing into the Bayesian-flavored misinterpretation ('we're 95% sure the true value is in this range') that the catalog explicitly warns against.

A range of likely values

A confidence interval is a range of values computed from your data that is meant to bracket some unknown true number, like the true average height of all 10-year-olds. The trick is in how you build the range: you use a recipe that, if you repeated your whole study many times with fresh samples, would catch the true number a known fraction of the time — usually 95 times out of 100. So the guarantee is about the recipe over many studies, not about any one particular interval you happen to get.
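The "recipe over many studies" idea can be checked by simulation. The sketch below is illustrative only: it assumes normally distributed heights with a known standard deviation, and the true mean of 140 cm, the spread of 7 cm, and the sample size of 30 are made-up values, not figures from this entry.

```python
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma, level=0.95):
    """Known-sigma interval for a mean: xbar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    xbar = fmean(sample)
    half_width = z * sigma / len(sample) ** 0.5
    return xbar - half_width, xbar + half_width

def simulate_coverage(true_mean=140.0, sigma=7.0, n=30, studies=4000, seed=0):
    """Repeat the whole study many times and count how often the interval
    contains the (normally unknown) true mean. All defaults are hypothetical."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lo, hi = z_interval(sample, sigma)
        hits += lo <= true_mean <= hi
    return hits / studies

coverage = simulate_coverage()
print(f"fraction of intervals covering the true mean: {coverage:.3f}")
```

The printed fraction should land close to 0.95. Any single interval either contains 140 or it does not; the 95% describes the procedure across the repeated studies.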

Range estimate with coverage guarantee

A confidence interval is an interval [L, U] computed from sample data that is designed to cover the true unknown parameter a pre-specified fraction of the time (typically 95%) under repeated sampling from the same data-generating process. The coverage guarantee is a property of the construction procedure across hypothetical repetitions, not a probability statement about any specific realized interval. At a fixed confidence level, a wider interval means a less precise estimate and a narrower one a more precise estimate. Confidence intervals communicate both a point estimate and its uncertainty at once, which is more informative than a yes/no significance test. The persistent misreading is to say 'there is a 95% probability the true value lies in this interval,' a Bayesian-flavored statement that frequentist confidence intervals do not license.

A confidence interval is an interval [L(X), U(X)] computed from sample data X such that, under repeated sampling from the same data-generating process, the procedure covers the true parameter θ with a pre-specified long-run frequency of at least 1−α (typically 95%): P(L(X) ≤ θ ≤ U(X)) ≥ 1−α under the assumed model. Coverage is a property of the procedure, not of any realized interval after data are observed. This is the central frequentist subtlety that distinguishes confidence intervals from Bayesian credible intervals, which do support direct probability statements about θ given the data and a prior. Neyman's 1937 construction obtains the interval by inverting a family of hypothesis tests: the CI is the set of parameter values that would not be rejected at level α, which is why a 95% CI excludes a null value if and only if the corresponding two-sided test rejects at α = 0.05. CIs come in many flavors, including Wald, score (Wilson), likelihood-ratio, exact (Clopper-Pearson), bootstrap, and simultaneous (Bonferroni, Tukey, Scheffé) intervals, each suited to different sample-size and modeling regimes. CIs combine point estimate and uncertainty in one summary, which is why the New Statistics movement and major reform statements (ASA 2016, 2019) advocate CI-centric reporting over p-value dichotomies.
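As a rough illustration of the test-inversion idea, the sketch below computes Wald and Wilson (score) intervals for a binomial proportion and checks that the Wilson interval is the set of null values p0 that the two-sided score test does not reject. The counts (8 successes in 20 trials) and the function names are hypothetical, chosen only for the example.

```python
from statistics import NormalDist

def wald_interval(k, n, level=0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = k / n
    half = z * (p * (1 - p) / n) ** 0.5
    return p - half, p + half

def wilson_interval(k, n, level=0.95):
    """Wilson interval, obtained by inverting the score test for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * (p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5 / denom
    return center - half, center + half

def score_test_rejects(k, n, p0, alpha=0.05):
    """Two-sided score test of H0: p = p0, with the standard error evaluated at p0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    z_stat = (k / n - p0) / (p0 * (1 - p0) / n) ** 0.5
    return abs(z_stat) > z_crit

k, n = 8, 20  # hypothetical data: 8 successes in 20 trials
lo, hi = wilson_interval(k, n)
print("Wald  :", wald_interval(k, n))
print("Wilson:", (lo, hi))
for p0 in (lo - 0.01, lo + 0.01, hi - 0.01, hi + 0.01):
    print(f"p0={p0:.3f} rejected={score_test_rejects(k, n, p0)}")
```

Null values just inside the Wilson limits are retained and values just outside are rejected, which is the duality described above. The Wald interval differs because it plugs the estimate, not p0, into the standard error.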

Broad Use

Epidemiology: An interval for infection rate in a population, e.g., "15% to 20% at 95% confidence," acknowledging sampling variability.
Finance: Interval estimates of expected return or risk, showing the uncertainty around a point estimate.
Manufacturing: Confidence intervals for defect rates after sampling batches, used to gauge quality control.
Psychology Experiments: Confidence intervals around group means or effect sizes (Cohen's d) illustrating probable ranges.
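When no convenient formula is available, for instance for a mean return computed from a short and possibly non-normal series, a percentile bootstrap is one common way to get an interval. The following is a minimal sketch under stated assumptions: the monthly returns are invented numbers, the observations are treated as independent, and the percentile method is only one of several bootstrap variants.

```python
import random
from statistics import fmean

def bootstrap_mean_ci(data, level=0.95, resamples=5000, seed=0):
    """Percentile bootstrap interval for the mean.

    Resample the data with replacement, recompute the mean each time,
    and take the empirical tail quantiles of those resampled means."""
    rng = random.Random(seed)
    n = len(data)
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(resamples))
    tail = (1 - level) / 2
    lo = means[int(tail * resamples)]            # approximate lower quantile
    hi = means[int((1 - tail) * resamples) - 1]  # approximate upper quantile
    return lo, hi

# Hypothetical monthly returns in percent (illustrative values only).
returns = [1.2, -0.8, 2.5, 0.3, -1.9, 3.1, 0.7, -0.4, 1.8, 2.2, -2.6, 0.9]
lo, hi = bootstrap_mean_ci(returns)
print(f"mean = {fmean(returns):.2f}%, 95% bootstrap CI = ({lo:.2f}%, {hi:.2f}%)")
```

With only twelve observations the interval is wide, which is the uncertainty around the point estimate that the interval is meant to show.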

Clarity

Goes beyond a single estimate—like "the average is 50"—to convey the margin of error and thus the reliability or precision of the measurement.

Manages Complexity

Rather than presenting a false sense of exactness, intervals help incorporate inherent sampling variability, giving a more nuanced picture.

Abstract Reasoning

Underscores that a data-based estimate is one draw from a sampling distribution rather than an exact value, so any conclusion about a parameter is a statement made under quantified uncertainty.

Knowledge Transfer

Climate Science: Confidence intervals for temperature anomalies or greenhouse gas effects illustrate uncertainties.
Polling: Political poll results are more transparent when including "±3% margin of error."
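The familiar "±3%" figure in polling can be reproduced with the normal-approximation formula for a proportion. This sketch assumes a simple random sample and the worst case p = 0.5; real polls use weighting and design effects that it ignores, and the function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

def margin_of_error(n, p=0.5, level=0.95):
    """Half-width of the normal-approximation interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sqrt(p * (1 - p) / n)

def sample_size_for_margin(margin, p=0.5, level=0.95):
    """Smallest n whose margin of error does not exceed `margin`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ceil(z * z * p * (1 - p) / margin ** 2)

print(f"n=1000 -> margin = {margin_of_error(1000):.3f}")
print(f"margin 0.03 -> n = {sample_size_for_margin(0.03)}")
```

Under these assumptions a poll of roughly a thousand respondents gives a margin of about three percentage points, which is why that sample size is so common.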

Example

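A drug efficacy trial might estimate a 10% improvement in recovery rates, with a 95% confidence interval of (3%, 17%). This tells clinicians that benefits anywhere from 3% to 17% are compatible with the data and, because the interval excludes 0, that 'no benefit' would be rejected by the corresponding two-sided test at α = 0.05.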
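One way an interval like this could arise is as a Wald interval for a difference in recovery proportions. The sketch below reads the 10% as a difference of ten percentage points, which is an assumption, and the trial counts (228 of 380 recovering on treatment, 190 of 380 on control) are hypothetical numbers picked so the result lands near (3%, 17%); they do not come from any real study.

```python
from statistics import NormalDist

def risk_difference_ci(events_t, n_t, events_c, n_c, level=0.95):
    """Wald interval for the difference of two independent proportions."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_t, p_c = events_t / n_t, events_c / n_c
    diff = p_t - p_c
    # Standard error of the difference: the two arms' variances add.
    se = (p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c) ** 0.5
    return diff, diff - z * se, diff + z * se

# Hypothetical counts: 228/380 recover on treatment, 190/380 on control.
diff, lo, hi = risk_difference_ci(228, 380, 190, 380)
print(f"difference = {diff:.1%}, 95% CI = ({lo:.1%}, {hi:.1%})")
```

With smaller arms the same 10-point difference would produce a wider interval that might include 0, so the point estimate alone does not tell clinicians how far to trust it.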

Relationships to Other Abstractions

Parents (2) — more general patterns this builds on

Confidence Intervals is a kind of Uncertainty Prime

Confidence intervals are a specific kind of uncertainty quantification, supplying interval estimates with calibrated long-run coverage.

Confidence Intervals presupposes Statistical Inference Prime

Confidence intervals presuppose statistical inference because they are an interval-estimate procedure whose calibrated coverage is defined within the inferential framework.

Hierarchy paths (5) — routes to 4 parentless roots

Confidence Intervals → Uncertainty

Not to Be Confused With

Confidence Intervals is not Probability because Probability is the mathematical measure of likelihood for single events or propositions, while Confidence Intervals use probability theory to construct ranges of values that capture population parameters with specified long-run coverage frequency.
Confidence Intervals is not Statistical Inference because Statistical Inference is the broader process of drawing conclusions about populations from samples (including estimation, hypothesis testing, prediction), while Confidence Intervals are a specific tool for interval estimation of parameters.
Confidence Intervals is not Statistical Significance (p-Value) because Statistical Significance tests whether an effect differs from a null value through p-values, while Confidence Intervals estimate the range of plausible parameter values with known coverage properties.
Confidence Intervals is not Hypothesis Testing (Null vs. Alternative) because Hypothesis Testing makes binary decisions comparing observed data to null predictions, while Confidence Intervals provide interval estimates capturing the range of compatible parameter values.
Confidence Intervals is not Calibration because Calibration is aligning subjective probability judgments with observed frequencies, while Confidence Intervals construct intervals with guaranteed coverage frequency under repeated sampling.