Regression To Mean Guardrail

Prevent ordinary reversion after extreme observations from being credited to an intervention, person, punishment, reward, or event without a credible counterfactual.

Essence

The Regression-to-the-Mean Guardrail prevents ordinary reversion after extreme observations from being mistaken for a causal effect, stable personal change, managerial success, punishment effectiveness, or treatment failure. It is a design and decision discipline, not a slogan applied after results are known.

When cases are selected because a noisy measurement is unusually high or low, that observation usually contains both persistent signal and transient deviation. On remeasurement, the transient part is unlikely to recur with the same size and in the same direction. The selected group therefore tends to look less extreme on average even if nothing causal happens.
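
The mechanism can be illustrated with a small simulation. This is a minimal sketch, not a prescribed method: it assumes each observation is a stable signal plus independent normal noise, and the function name `simulate_reversion` and its parameters (`reliability`, `top_fraction`) are hypothetical choices made for this example. No intervention is applied between the two measurements.

```python
import random
import statistics


def simulate_reversion(n=20000, reliability=0.6, top_fraction=0.1, seed=0):
    """Simulate two noisy measurements of a stable trait with no intervention.

    Each observation = persistent signal + independent transient noise.
    Variances are scaled so the total variance is 1 and the share due to
    the persistent signal equals `reliability` (an illustrative assumption).
    """
    rng = random.Random(seed)
    signal_sd = reliability ** 0.5
    noise_sd = (1.0 - reliability) ** 0.5
    cases = []
    for _ in range(n):
        signal = rng.gauss(0.0, signal_sd)
        first = signal + rng.gauss(0.0, noise_sd)   # selection measurement
        second = signal + rng.gauss(0.0, noise_sd)  # follow-up measurement
        cases.append((first, second))
    # Select cases only because their FIRST measurement was extreme.
    cases.sort(key=lambda c: c[0], reverse=True)
    selected = cases[: int(n * top_fraction)]
    mean_first = statistics.fmean(c[0] for c in selected)
    mean_second = statistics.fmean(c[1] for c in selected)
    return mean_first, mean_second


mean_first, mean_second = simulate_reversion()
print(f"selected group mean at selection: {mean_first:.2f}")
print(f"selected group mean at follow-up: {mean_second:.2f}")
```

Under these assumptions the follow-up mean of the selected group sits closer to the population mean of zero than its selection mean does, even though nothing was done to any case.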

The guardrail does not assume every change is spurious. It estimates expected no-effect movement, strengthens temporal and concurrent comparisons, preserves measurement integrity, reports controlled effect and uncertainty, and constrains consequential attribution until evidence exceeds plausible reversion.

Compression Statement

When cases are selected because an imperfect measurement is unusually high or low, estimate their expected no-effect movement, strengthen baseline and counterfactual evidence, decompose observed change, calibrate causal language, and constrain decisions until improvement or deterioration exceeds plausible regression to the mean.

Canonical formula: observed change = expected reversion + temporal process + measurement change + concurrent causes + intervention effect + unresolved uncertainty
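
As a hedged illustration of the first term only, expected reversion has a closed form under textbook assumptions that this page does not itself state: a stable process with mean \mu, equal variance at both measurements, a linear test-retest relationship with correlation \rho, and no intervention. The symbols x_1 (selection measurement), x_2 (follow-up), \mu, and \rho are introduced here for illustration.

```latex
\begin{aligned}
\mathbb{E}[x_2 \mid x_1] &= \mu + \rho\,(x_1 - \mu), \\
\text{expected reversion} &= \mathbb{E}[x_2 \mid x_1] - x_1 = -(1 - \rho)\,(x_1 - \mu).
\end{aligned}
```

For example, with \mu = 100, x_1 = 130, and \rho = 0.7, the expected follow-up is 121, so a 9-point drop would be expected with no intervention at all. The remaining terms of the canonical formula have no comparably simple form and need design evidence.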

When to Use This Archetype

Use it whenever treatment, inspection, coaching, reward, sanction, funding, investigation, or follow-up begins because an outcome crossed a threshold, entered the top or bottom of a ranking, reached crisis severity, or otherwise looked exceptional. Use it when the next observation will be compared with that extreme baseline.

The risk grows as selection becomes more extreme, measurement becomes less reliable, and sample size becomes smaller. It is also important when the reference population changes, outcomes trend or cycle, follow-up is selective, or the decision affects rights, safety, access, reputation, or livelihood.
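
The first two drivers can be made concrete with a short calculation. The sketch below assumes a standard-normal measure, a stable process, and a test-retest correlation equal to the reliability; the function name `expected_no_effect_change` is hypothetical. It does not model sample size, trends, or a shifting reference population.

```python
from statistics import NormalDist


def expected_no_effect_change(threshold_z, reliability):
    """Expected follow-up change (in SD units) for cases selected above threshold_z.

    Assumes a standard-normal measure, a stable process, and a test-retest
    correlation equal to `reliability`; all three are illustrative assumptions.
    """
    std_normal = NormalDist()
    tail_prob = 1.0 - std_normal.cdf(threshold_z)
    # Mean of a standard normal truncated below at threshold_z.
    selected_mean = std_normal.pdf(threshold_z) / tail_prob
    # The follow-up mean shrinks toward 0 by the factor `reliability`.
    return -(1.0 - reliability) * selected_mean


for z in (1.0, 2.0):
    for rho in (0.9, 0.5):
        change = expected_no_effect_change(z, rho)
        print(f"threshold z={z}, reliability={rho}: expected change {change:+.2f} SD")
```

Under these assumptions the expected no-effect drop is larger for the stricter threshold and for the less reliable measure, which is the pattern the paragraph describes.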

Do not turn the guardrail into fatalism. Genuine treatment effects, persistent extremes, structural harm, and durable improvement exist. The objective is to demand an appropriate counterfactual and calibrated claim, not to insist that every observed change is random.

Structural Problem

Human attention and institutional action are triggered by extremes. A patient seeks care when symptoms peak; a school is targeted after a bad year; a manager praises a record month or sanctions a disastrous one; an inspector intervenes after a defect spike. The timing creates a compelling before-intervention/after-intervention story.

That story is structurally biased because the selection observation is not an ordinary baseline. The expected no-effect path already points toward a less extreme observation, and without repeated baseline data, reliability evidence, a similarly selected comparison, and temporal context, the size of that movement cannot be estimated. Raw change therefore combines reversion with any genuine effect.

The central tension is urgent response versus credible learning. Extreme cases often need action precisely when clean experimentation is hardest. The archetype supports action while separating service or safety decisions from overconfident attribution and preserving evidence for later learning.

Intervention Logic

For every step, record the owner, selection timing, evidence, uncertainty, and the downstream claim or decision constrained by the result.

Guardrail step 1. Define the causal claim and decision consequence.
Guardrail step 2. Record whether selection depended on an extreme noisy observation.
Guardrail step 3. Define the relevant population, subgroup, time regime, and mean target.
Guardrail step 4. Estimate measurement reliability and transient variation.
Guardrail step 5. Collect repeated pre-intervention baselines where feasible.
Guardrail step 6. Map trend, seasonality, maturation, recovery, and concurrent events.
Guardrail step 7. Construct a similarly selected concurrent counterfactual.
Guardrail step 8. Preserve assignment and exposure integrity.
Guardrail step 9. Estimate a no-effect reversion benchmark.
Guardrail step 10. Decompose observed movement and report residual uncertainty.
Guardrail step 11. Report controlled effect magnitude rather than raw change.
Guardrail step 12. Review subgroup and persistent-tail cases.
Guardrail step 13. Stabilize outcome measurement.
Guardrail step 14. Grade attribution language.
Guardrail step 15. Hold or limit consequential decisions when design is weak.
Guardrail step 16. Replicate across time and cohorts.

The lifecycle begins before outcome review when possible. Once an extreme baseline is celebrated or condemned, later analysis is vulnerable to selective comparison. Pre-specifying selection, outcomes, comparators, and claim thresholds preserves both learning and fairness.
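
One way to make pre-specification tangible is an immutable record that is fixed before outcomes are reviewed. The sketch below is illustrative only: the class `GuardrailRecord`, its field names, and the example values are hypothetical and not defined by this archetype.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: fields cannot be edited after registration
class GuardrailRecord:
    """One pre-specified evaluation record (all field names are illustrative)."""
    owner: str
    causal_claim: str
    selection_rule: str
    selected_on: date
    outcomes: tuple[str, ...]
    comparators: tuple[str, ...]
    claim_threshold: str
    registered_on: date

    def registered_before(self, outcome_review: date) -> bool:
        """True when the record was fixed before outcomes were reviewed."""
        return self.registered_on < outcome_review


# Made-up example values, for illustration only.
record = GuardrailRecord(
    owner="evaluation lead",
    causal_claim="coaching reduced the defect rate",
    selection_rule="defect rate above the monthly threshold",
    selected_on=date(2024, 1, 10),
    outcomes=("defect rate at the next monthly audit",),
    comparators=("sites above the same threshold, not yet coached",),
    claim_threshold="controlled change beyond the no-effect reversion range",
    registered_on=date(2024, 1, 15),
)
print(record.registered_before(date(2024, 3, 1)))
```

Freezing the record is a design choice that mirrors the point above: once the extreme baseline has been celebrated or condemned, the comparison terms should no longer be adjustable.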

Key Components

The Regression-to-the-Mean Guardrail protects causal interpretation when a case enters the story precisely because it was extreme. The Extreme Selection Check asks why the case received attention now; if the answer is "because the measure was unusually high or low," ordinary reversion must be part of the evaluation rather than a surprise. The Baseline Distribution replaces a single dramatic value with context (historical variation, subgroup norms, and the ordinary range) so later movement can be judged against what the measure usually does. The Expected Reversion Reference states what movement back toward the ordinary range would be plausible without any intervention at all, supplying the no-intervention counterpoint against which credit will be evaluated. The Subgroup Reversion Boundary keeps that reference anchored to the right comparison population, since a subgroup may have a different baseline level or variability than the overall average.

The remaining components add evidence that can separate true effect from ordinary rebound and discipline the conclusions drawn. The Comparison Group provides a reference trajectory of similar untreated or differently treated extreme cases; if they also rebound, the focal intervention deserves less credit than the before/after story implies. The Repeated Measurement Plan prevents one noisy baseline and one later reading from carrying the whole causal claim, making stability and timing visible. The Pre-Intervention Stability Check asks whether the extreme value was a stable condition or a temporary spike, which determines how much rebound to expect. The Measurement Error Check tests whether the original extreme reading was distorted by instrument, coding, or timing noise, a critical question because noisy extremes are especially likely to look improved on remeasurement. The Evaluation Window sets when post-intervention change will be judged, balancing the risk of capturing only ordinary rebound against the risk of mixing in unrelated changes. Finally, the Causal Claim Guard translates the strength of this evidence into language and decisions, preventing reports and reviews from saying the intervention caused the change when the design only supports a weaker claim.
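
The Comparison Group logic reduces to simple arithmetic: subtract the change seen in similarly selected comparison cases from the change seen in treated cases. The sketch below uses made-up scores and hypothetical helper names (`mean_change`, `controlled_effect`); it reports a point contrast only and does not quantify uncertainty, which matters with samples this small.

```python
from statistics import fmean


def mean_change(pairs):
    """Average follow-up minus baseline over (baseline, follow_up) pairs."""
    return fmean(follow_up - baseline for baseline, follow_up in pairs)


def controlled_effect(treated, comparison):
    """Treated change minus the change in similarly selected comparison cases."""
    return mean_change(treated) - mean_change(comparison)


# Made-up scores for cases that all crossed the same "unusually high" threshold.
treated = [(92, 70), (88, 75), (95, 72), (90, 71)]
comparison = [(91, 80), (89, 79), (94, 83), (90, 78)]

raw = mean_change(treated)
shared = mean_change(comparison)
effect = controlled_effect(treated, comparison)
print(f"raw treated change:  {raw:+.2f}")
print(f"comparison change:   {shared:+.2f}")
print(f"controlled contrast: {effect:+.2f}")
```

With these illustrative numbers the raw treated change is -19.25, the comparison cases move -11.00 without treatment, and the controlled contrast is -8.25, so more than half of the before/after story is shared rebound.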

Component	Description	Notes
Causal Claim and Decision Scope (`causal_claim_and_decision_scope`)	Defines the claimed improvement or deterioration, the intervention or event credited, the affected decision, and the consequence of a false attribution.	The guardrail is strongest when it starts before results are seen. It distinguishes descriptive change from a causal claim and identifies which promotion, treatment, sanction, funding, or policy decision could be distorted.
Extreme Selection and Trigger Rule (`extreme_selection_and_trigger_rule`)	Documents whether cases entered attention, treatment, evaluation, or follow-up because an initial measurement was unusually high, low, bad, good, or threshold-crossing.	Regression to the mean is especially relevant when selection is conditioned on an extreme noisy observation. The trigger must include thresholds, ranking windows, retries, discretion, and who was not selected.
Reference Population and Mean Target (`reference_population_and_mean_target`)	Defines the population, process, subgroup, time regime, and reference expectation toward which repeated observations may revert.	There is no meaningful regression-to-mean claim without a stable comparison target. Population drift, subgroup mixing, seasonality, or changing measurement can move the relevant mean.
Signal Reliability and Noise Decomposition (`signal_reliability_and_noise_decomposition`)	Separates persistent case differences from transient variation, measurement error, random shocks, and context-specific noise.	Greater unreliability and more extreme selection imply stronger expected reversion. Reliability evidence should be local to the measure, interval, and population rather than imported from an unrelated setting.
Pre-Intervention Repeated Baseline (`pre_intervention_repeated_baseline`)	Uses multiple observations before the intervention or focal event to estimate the case's typical level and determine whether the selection observation was exceptional.	One baseline cannot reveal whether an extreme value is persistent, trending, seasonal, or transient. Repeated baselines reduce dependence on a lucky or unlucky selection moment.
Time-Process and Regime Map (`time_process_and_regime_map`)	Maps trend, seasonality, maturation, decay, recurrence, recovery cycles, policy timing, and other temporal processes that can resemble or mask ordinary reversion.	Regression to the mean is not a universal pull toward a timeless average. Temporal structure can change the expected counterfactual and must be separated from reversion.
Concurrent Counterfactual Comparison (`concurrent_counterfactual_comparison`)	Constructs a credible untreated, unexposed, not-yet-treated, or otherwise comparable path subject to the same selection and measurement environment.	If both selected treated and comparable untreated cases improve, the shared movement is evidence against attributing all change to treatment. Comparators must mirror selection and observation, not merely resemble average cases.
Assignment and Exposure Integrity (`assignment_and_exposure_integrity`)	Records randomization, staggered timing, eligibility, crossover, adherence, co-interventions, and contamination that determine whether outcome differences support attribution.	Random or plausibly exogenous assignment can separate intervention effect from expected reversion, but only when exposure, follow-up, and analysis preserve the assignment contrast.
Expected Reversion Benchmark (`expected_reversion_benchmark`)	Estimates how much movement would be expected from extreme selection and imperfect reliability even if no causal intervention occurred.	The benchmark may use repeated historical episodes, untreated extremes, simulation, shrinkage, or reliability-based expectations. It should be a range, not a rhetorical label.
Observed Change Decomposition (`observed_change_decomposition`)	Partitions observed movement among expected reversion, secular trend, maturation, seasonality, measurement change, concurrent events, intervention effect, and unresolved uncertainty.	The decomposition prevents the residual from being treated as known merely because named alternatives were considered. Some shares may remain unidentified and should stay explicit.
Effect Size and Uncertainty Contract (`effect_size_and_uncertainty_contract`)	Reports the intervention contrast, uncertainty, practical meaning, denominator, and sensitivity to selection and baseline choices rather than celebrating raw before-after movement.	A large raw change among extreme cases can coexist with a small or uncertain causal effect. The contract keeps magnitude, precision, and decision relevance separate.
Subgroup, Heterogeneity, and Tail-Case Review (`subgroup_heterogeneity_and_tail_case_review`)	Checks whether expected reversion, reliability, treatment effect, or decision harm differs across subgroups, severity levels, sites, cohorts, or rare cases.	Pooled correction can over-shrink genuine persistent extremes or erase groups with different baselines. Guardrails must not turn regression-to-mean awareness into automatic disbelief of severe cases.
Outcome Measurement and Observer Independence (`outcome_measurement_and_observer_independence`)	Protects follow-up measurement from changed instruments, awareness, incentives, differential scrutiny, and selective outcome choice after an extreme trigger.	If measurement changes after selection, apparent reversion may reflect instrument or observer behavior. Stable definitions and independent measurement preserve the comparison.
Attribution Language and Evidence Grade (`attribution_language_and_evidence_grade`)	Calibrates claims from observed change through association to credible causal effect according to design strength and unresolved regression risk.	Communication should name selection, expected reversion, comparator evidence, and uncertainty. It should avoid both triumphal causal language and blanket dismissal of real improvement.
Decision Hold, Release, and Reversal Guardrail (`decision_hold_release_and_reversal_guardrail`)	Sets rules for delaying, limiting, reversing, or independently reviewing decisions when raw change is likely to be dominated by reversion or when evidence cannot separate causes.	The guardrail gives analysis operational force. High-stakes reward, punishment, treatment, or continuation decisions should not rest on a single extreme-before/ordinary-after sequence.
Replication, Monitoring, and Learning Loop (`replication_monitoring_and_learning_loop`)	Tracks whether effects persist across later measurements, new cohorts, less extreme entrants, and repeated implementation while updating the expected-reversion benchmark.	True effects should demonstrate durability or reproducible contrasts appropriate to the process. Monitoring also detects when reliability, population, or selection rules change.

All sixteen components are required structural components (`required_structural_component`) with cross-domain reuse scope and provisional maturity, hosted by the `regression_to_mean_guardrail` archetype. The evidence expectation is the same for each: record selection timing, comparison logic, uncertainty, owner, affected decision, and evidence that distinguishes reversion from intervention or stable change. Each is a component rather than a mechanism because it is a persistent attribution responsibility, not a particular control group, repeated measure, model, plot, simulation, or statistical test.
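
The Attribution Language and Evidence Grade and the Decision Hold components can be pictured as a small gate. The sketch below is a hypothetical illustration: the three grade labels come from the component description (observed change, association, credible causal effect), but the rules that map design features to a grade, and the names `ClaimGrade`, `grade_claim`, and `hold_decision`, are assumptions made for this example rather than rules defined by the archetype.

```python
from enum import Enum


class ClaimGrade(Enum):
    OBSERVED_CHANGE = "observed change"
    ASSOCIATION = "association"
    CREDIBLE_CAUSAL_EFFECT = "credible causal effect"


def grade_claim(has_selected_comparator: bool,
                assignment_integrity_preserved: bool,
                exceeds_reversion_range: bool) -> ClaimGrade:
    """Map design strength to the strongest claim language allowed.

    The three rules below are illustrative assumptions, not rules from the source.
    """
    # No similarly selected comparator, or change within plausible reversion:
    # only a descriptive statement is supported.
    if not (has_selected_comparator and exceeds_reversion_range):
        return ClaimGrade.OBSERVED_CHANGE
    # Comparator evidence exists, but the assignment contrast was not preserved.
    if not assignment_integrity_preserved:
        return ClaimGrade.ASSOCIATION
    return ClaimGrade.CREDIBLE_CAUSAL_EFFECT


def hold_decision(grade: ClaimGrade, high_stakes: bool) -> bool:
    """Hold consequential decisions unless the evidence supports a causal claim."""
    return high_stakes and grade is not ClaimGrade.CREDIBLE_CAUSAL_EFFECT


grade = grade_claim(has_selected_comparator=True,
                    assignment_integrity_preserved=False,
                    exceeds_reversion_range=True)
print(grade.value, hold_decision(grade, high_stakes=True))
```

In this example the design supports the language "association", and a high-stakes decision would be held. A real grading scheme would need thresholds and review rules agreed in advance by the decision owner.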

Common Mechanisms

The mechanisms in this table are provisional in maturity and instantiate the `regression_to_mean_guardrail` archetype. Each is a mechanism rather than an archetype because it is a concrete flag, measurement protocol, comparison, assignment design, simulation, estimator, series analysis, falsification check, or review gate within the broader attribution safeguard. They share the same selection constraints: mirror the extreme selection process; preserve timing and measurement comparability; report uncertainty and unresolved causal alternatives.

Mechanism	Description
Extreme-Selection Risk Flag (`extreme_selection_risk_flag`)	Type: `screening`. Marks evaluations triggered by threshold crossing, top/bottom ranking, crisis entry, exceptional performance, or repeated testing after an extreme result.
Multi-Baseline Measurement Protocol (`multi_baseline_measurement_protocol`)	Type: `measurement`. Collects repeated pre-intervention observations to estimate typical level, reliability, trend, and the extremeness of the selection observation.
Matched Extreme-Case Comparator (`matched_extreme_case_comparator`)	Type: `comparison`. Compares treated extreme cases with untreated or not-yet-treated cases selected by the same threshold and observed on the same schedule.
Randomized or Staggered Assignment (`randomized_or_staggered_assignment`)	Type: `experimental_design`. Creates an intervention contrast that is less confounded by expected reversion after extreme eligibility.
Reliability-Based Reversion Simulation (`reliability_based_reversion_simulation`)	Type: `simulation`. Uses plausible reliability, variance, and selection thresholds to estimate the no-effect distribution of follow-up movement.
Shrinkage-Aware Expectation (`shrinkage_aware_expectation`) ↗	Type: `estimation` Combines noisy case evidence with a relevant group expectation so extreme predictions are not treated as perfectly persistent. Selection constraints and retained implementation evidence: mirror the extreme selection process; preserve timing and measurement comparability; report uncertainty and unresolved causal alternatives; Complete legacy mechanism record retained: {"slug":"shrinkage_aware_expectation","name":"Shrinkage-Aware Expectation","mechanism_type":"estimation","role":"Combines noisy case evidence with a relevant group expectation so extreme predictions are not treated as perfectly persistent.","maturity":"provisional","instantiates_archetypes":["regression_to_mean_guardrail"],"selection_constraints":["mirror the extreme selection process","preserve timing and measurement comparability","report uncertainty and unresolved causal alternatives"],"not_an_archetype_because":"This is a concrete flag, measurement protocol, comparison, assignment design, simulation, estimator, series analysis, falsification check, or review gate within the broader attribution safeguard."}
Controlled Before–After Contrast (`controlled_before_after_contrast`) ↗	Type: `evaluation` Compares change over the same interval between selected treated and credible comparison paths. Selection constraints and retained implementation evidence: mirror the extreme selection process; preserve timing and measurement comparability; report uncertainty and unresolved causal alternatives; Complete legacy mechanism record retained: {"slug":"controlled_before_after_contrast","name":"Controlled Before–After Contrast","mechanism_type":"evaluation","role":"Compares change over the same interval between selected treated and credible comparison paths.","maturity":"provisional","instantiates_archetypes":["regression_to_mean_guardrail"],"selection_constraints":["mirror the extreme selection process","preserve timing and measurement comparability","report uncertainty and unresolved causal alternatives"],"not_an_archetype_because":"This is a concrete flag, measurement protocol, comparison, assignment design, simulation, estimator, series analysis, falsification check, or review gate within the broader attribution safeguard."}
Interrupted Series with Pretrend Check (`interrupted_series_with_pretrend_check`) ↗	Type: `time_series` Uses multiple pre- and post-event observations to separate a level or slope change from trend, seasonality, and transient extremes. Selection constraints and retained implementation evidence: mirror the extreme selection process; preserve timing and measurement comparability; report uncertainty and unresolved causal alternatives; Complete legacy mechanism record retained: {"slug":"interrupted_series_with_pretrend_check","name":"Interrupted Series with Pretrend Check","mechanism_type":"time_series","role":"Uses multiple pre- and post-event observations to separate a level or slope change from trend, seasonality, and transient extremes.","maturity":"provisional","instantiates_archetypes":["regression_to_mean_guardrail"],"selection_constraints":["mirror the extreme selection process","preserve timing and measurement comparability","report uncertainty and unresolved causal alternatives"],"not_an_archetype_because":"This is a concrete flag, measurement protocol, comparison, assignment design, simulation, estimator, series analysis, falsification check, or review gate within the broader attribution safeguard."}
Placebo Time, Outcome, or Threshold Check (`placebo_time_outcome_or_threshold_check`) ↗	Type: `falsification` Tests whether apparent effects also appear where the intervention should not operate or at alternative selection moments. Selection constraints and retained implementation evidence: mirror the extreme selection process; preserve timing and measurement comparability; report uncertainty and unresolved causal alternatives; Complete legacy mechanism record retained: {"slug":"placebo_time_outcome_or_threshold_check","name":"Placebo Time, Outcome, or Threshold Check","mechanism_type":"falsification","role":"Tests whether apparent effects also appear where the intervention should not operate or at alternative selection moments.","maturity":"provisional","instantiates_archetypes":["regression_to_mean_guardrail"],"selection_constraints":["mirror the extreme selection process","preserve timing and measurement comparability","report uncertainty and unresolved causal alternatives"],"not_an_archetype_because":"This is a concrete flag, measurement protocol, comparison, assignment design, simulation, estimator, series analysis, falsification check, or review gate within the broader attribution safeguard."}
Attribution-Claim Review Gate (`attribution_claim_review_gate`) ↗	Type: `governance` Requires selection, counterfactual, expected-reversion, uncertainty, subgroup, and persistence evidence before approving a causal success or failure claim. Selection constraints and retained implementation evidence: mirror the extreme selection process; preserve timing and measurement comparability; report uncertainty and unresolved causal alternatives; Complete legacy mechanism record retained: {"slug":"attribution_claim_review_gate","name":"Attribution-Claim Review Gate","mechanism_type":"governance","role":"Requires selection, counterfactual, expected-reversion, uncertainty, subgroup, and persistence evidence before approving a causal success or failure claim.","maturity":"provisional","instantiates_archetypes":["regression_to_mean_guardrail"],"selection_constraints":["mirror the extreme selection process","preserve timing and measurement comparability","report uncertainty and unresolved causal alternatives"],"not_an_archetype_because":"This is a concrete flag, measurement protocol, comparison, assignment design, simulation, estimator, series analysis, falsification check, or review gate within the broader attribution safeguard."}


Parameter / Tuning Dimensions¶

Selection extremity: Distance from the relevant expectation and number of selection opportunities.
Measurement reliability: Persistent signal relative to transient noise in the actual population and interval.
Baseline depth: Number, spacing, and comparability of pre-intervention observations.
Reference mean: Population, subgroup, cohort, site, and time regime used for expected reversion.
Comparator fidelity: Similarity in selection rule, timing, measurement, opportunity, and concurrent conditions.
Assignment strength: Random, staggered, quasi-random, matched, or observational exposure contrast.
Temporal complexity: Trend, seasonality, maturation, recurrence, recovery, and regime changes.
Reversion benchmark width: Uncertainty in the expected no-effect movement.
Effect threshold: Controlled magnitude needed for practical or causal claims.
Follow-up horizon: Time needed to distinguish transient rebound from durable change.
Subgroup granularity: Extent to which reliability, baseline, and effect differ across groups.
Measurement independence: Protection against changed instruments, observers, incentives, and scrutiny.
Decision reversibility: Ability to delay, limit, reverse, or monitor action while evidence develops.
Claim grade: Observed change, association, suggestive effect, credible effect, or replicated effect.
Replication breadth: New cohorts, less extreme entrants, sites, and time periods.
Ethical comparison constraint: Limits on withholding, delay, randomization, or data use.

Tune jointly. More extreme selection raises reversion concern; higher reliability reduces it. Better comparators strengthen attribution but may be unavailable or unethical. Longer follow-up improves durability evidence but delays decisions. In urgent contexts, act for safety while limiting the causal claim.
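The interaction between selection extremity and reliability can be illustrated with the classical linear expectation: the expected no-effect follow-up equals the reference mean plus reliability times the baseline deviation. The sketch below assumes a stable reference mean, equal variance on both occasions, and a test-retest correlation equal to the reliability. The function names and the numbers are illustrative, not part of the catalog.

```python
def expected_followup(baseline: float, reference_mean: float, reliability: float) -> float:
    """Expected no-effect remeasurement for a case observed at `baseline`.

    Assumes a stable reference mean, equal variance on both occasions, and a
    test-retest correlation equal to `reliability` (between 0 and 1).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return reference_mean + reliability * (baseline - reference_mean)


def expected_reversion(baseline: float, reference_mean: float, reliability: float) -> float:
    """Expected no-effect movement (follow-up minus baseline)."""
    return expected_followup(baseline, reference_mean, reliability) - baseline


# Hypothetical cases selected for low scores against a reference mean of 50.
mild = expected_reversion(baseline=45.0, reference_mean=50.0, reliability=0.8)
extreme = expected_reversion(baseline=30.0, reference_mean=50.0, reliability=0.8)
noisy = expected_reversion(baseline=30.0, reference_mean=50.0, reliability=0.5)
print(mild, extreme, noisy)
```

Under these assumptions the more extreme baseline is expected to move further than the mild one, and the same extreme baseline is expected to move further still when reliability is lower. The result is a benchmark for expected movement, not evidence about any single case.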

Invariants to Preserve¶

Selection rule remains explicit.
Reference mean is population- and time-relevant.
Reliability and measurement change remain visible.
Comparison mirrors extreme selection.
Raw change remains distinct from controlled effect.
Unresolved causes remain unresolved.
Persistent extremes and vulnerable cases are not dismissed.
Causal language matches design strength.
Decision safeguards respond to evidence quality.
Learning extends beyond one cohort.
Regression to the mean remains a probabilistic expectation, not a deterministic law for each case.
A weak causal design never becomes strong merely by applying a statistical correction.
Urgent service and safety obligations remain distinct from claims that the intervention caused improvement.

These invariants allow mechanisms to vary while protecting attribution, dignity, and learning. They prevent both causal exaggeration and misuse of statistical caution to dismiss real effects or persistent need.

Target Outcomes¶

Fewer false success and failure claims.
Better treatment and program evaluation.
Fairer reward, sanction, and personnel decisions.
More credible effect estimates.
Less overreaction to threshold recrossing.
Better protection of genuine persistent extremes.
Clearer uncertainty and counterfactual reasoning.
More reproducible causal learning.

Success means decisions and claims reflect what the evidence can distinguish. The system may still act under uncertainty, but it no longer converts an extreme-before/ordinary-after sequence into automatic proof.

Tradeoffs¶

Rapid action versus stronger baseline evidence.
Ethical treatment access versus untreated comparison.
Simple explanation versus causal uncertainty.
Individual evidence versus pooled shrinkage.
Local comparator fit versus sample size.
Repeated measurement versus burden.
Conservative claims versus learning speed.
Randomization strength versus operational feasibility.
Follow-up duration versus decision latency.
Stable outcome definitions versus evolving practice.
False positive effect claims versus false dismissal of real change.
Standardized guardrails versus domain context.

Resolve each tradeoff using the same considerations: outcome reliability, selection extremity, causal stakes, ethical constraints, decision reversibility, and who bears the cost of false attribution.

Evaluation rigor has cost, but poor attribution also has cost: ineffective programs are scaled, useful ones are abandoned, people are rewarded or punished for noise, severe cases are disbelieved, and future learning is trained on false stories.

Failure Modes¶

Extreme Selection Omitted¶

Cause. The analysis never records that intervention or attention began because the baseline was unusually high or low.

Mitigation. Make the trigger rule and unselected cases explicit before outcome review. Recheck the attribution grade and the downstream decision after remediation.

Single Baseline Attribution¶

Cause. One noisy extreme observation defines the baseline and all later movement is credited to treatment or management.

Mitigation. Use repeated pre-intervention measures and a reliability estimate. Recheck the attribution grade and the downstream decision after remediation.
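A minimal sketch of this mitigation, assuming two comparable pre-intervention waves are available for a set of units and several baselines for the flagged case. All data are hypothetical, and the test-retest correlation is only one of several possible reliability estimates.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical: the same six units measured in two pre-intervention waves.
wave1 = [42.0, 55.0, 61.0, 48.0, 70.0, 39.0]
wave2 = [47.0, 52.0, 58.0, 50.0, 64.0, 45.0]
reliability = pearson(wave1, wave2)  # test-retest estimate

# Hypothetical flagged case: three earlier baselines, then the trigger value.
baselines = [39.0, 45.0, 41.0]
selection_observation = 30.0
typical_level = mean(baselines)
extremeness = selection_observation - typical_level

print(round(reliability, 2), round(typical_level, 1), round(extremeness, 1))
```

Later movement is then judged against the typical level and the reliability estimate rather than against the single trigger value.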

Average Comparator Mismatch¶

Cause. Selected extreme cases are compared with ordinary average cases rather than similarly selected untreated cases.

Mitigation. Mirror selection threshold, timing, opportunity, and measurement in the comparator. Recheck the attribution grade and the downstream decision after remediation.

Timeless Mean Fallacy¶

Cause. Reversion is asserted toward a historical mean despite trend, seasonality, maturation, or regime change.

Mitigation. Define the relevant population-time expectation and map temporal structure. Recheck the attribution grade and the downstream decision after remediation.
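To illustrate why the reference expectation must be time-relevant, the sketch below fits a straight line to hypothetical pre-intervention observations and compares the projected value at follow-up with the plain historical mean. A linear pretrend is an assumption made for brevity; seasonality, maturation, and regime change would need their own terms.

```python
from statistics import mean


def fit_line(ts, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * t."""
    mt, my = mean(ts), mean(ys)
    slope = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / sum(
        (t - mt) ** 2 for t in ts
    )
    return slope, my - slope * mt


# Hypothetical pre-intervention series with an upward trend.
pre_t = [0, 1, 2, 3, 4, 5]
pre_y = [50.0, 52.0, 53.0, 56.0, 57.0, 60.0]

slope, intercept = fit_line(pre_t, pre_y)
followup_time = 7
historical_mean = mean(pre_y)  # the "timeless" expectation
trend_expectation = intercept + slope * followup_time  # time-relevant expectation

print(round(historical_mean, 1), round(trend_expectation, 1))
```

In this example a case would be expected to sit well above the historical mean at follow-up even with no intervention, so reversion toward the historical mean would be the wrong benchmark.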

Raw Change Equals Effect¶

Cause. Before-after movement is reported as causal effect without expected-reversion or counterfactual decomposition.

Mitigation. Report the controlled contrast, expected reversion range, and unresolved alternatives. Recheck the attribution grade and the downstream decision after remediation.
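The difference between raw change and a controlled contrast can be sketched as follows, assuming comparators were selected by the same threshold and measured on the same schedule. The data are hypothetical, and the uncertainty interval that a real report would require is omitted for brevity.

```python
from statistics import mean


def mean_change(pairs):
    """Average follow-up minus baseline over (before, after) pairs."""
    return mean(after - before for before, after in pairs)


# Hypothetical (before, after) scores for units selected by the same low threshold.
treated = [(30.0, 41.0), (28.0, 37.0), (33.0, 40.0)]
comparators = [(31.0, 38.0), (29.0, 35.0), (32.0, 37.0)]  # not yet treated

raw_change = mean_change(treated)
comparator_change = mean_change(comparators)  # movement observed without treatment
controlled_contrast = raw_change - comparator_change

print(raw_change, comparator_change, controlled_contrast)
```

Here most of the raw improvement also appears in the untreated comparators, so only the remaining contrast is a candidate for the intervention effect, and it is still subject to the unresolved alternatives the mitigation asks analysts to report.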

Overcorrection To Null¶

Cause. Regression-to-mean awareness becomes a reason to dismiss every improvement, decline, or persistent extreme.

Mitigation. Use design evidence and persistence; do not treat the guardrail as proof of no effect. Recheck the attribution grade and the downstream decision after remediation.

Unfair Shrinkage¶

Cause. Individuals or subgroups with real persistent extremes are pulled mechanically toward an inappropriate pooled mean.

Mitigation. Use relevant subgroup expectations, heterogeneity review, and protected tail-case escalation. Recheck the attribution grade and the downstream decision after remediation.

Measurement Change After Trigger¶

Cause. Follow-up uses different instruments, observers, incentives, or scrutiny than the selection measurement.

Mitigation. Stabilize outcome definitions and audit observer and instrument independence. Recheck the attribution grade and the downstream decision after remediation.

Selective Follow-Up And Survivorship¶

Cause. Only cases with convenient follow-up or continued exposure remain in the apparent success sample.

Mitigation. Preserve cohort flow, missing outcomes, exposure, and intention-based contrasts. Recheck the attribution grade and the downstream decision after remediation.

Counterfactual Contamination¶

Cause. Comparison units receive the intervention, face different concurrent events, or are observed on different schedules.

Mitigation. Track assignment, exposure, spillover, co-intervention, and measurement parity. Recheck the attribution grade and the downstream decision after remediation.

Claim Language Outpaces Design¶

Cause. A weak before-after pattern is communicated as proof, cure, accountability, or ability.

Mitigation. Grade attribution language to design strength and disclose selection and reversion risk. Recheck the attribution grade and the downstream decision after remediation.

One Cycle Learning¶

Cause. A single cohort's rebound becomes permanent policy without replication across later, less extreme, or differently timed cases.

Mitigation. Require persistence, repeated cohorts, and updated reversion benchmarks. Recheck the attribution grade and the downstream decision after remediation.

Neighbor Distinctions¶

`effect_size_standardization`¶

Standardizes magnitude for interpretation. It does not by itself distinguish causal effect from ordinary reversion after extreme selection.

`effort_based_vs_inherent_ability_attribution`¶

Calibrates performance explanations among effort, ability, strategy, and luck. Regression-to-the-Mean Guardrail is a general evaluation structure for any extreme-selected intervention or event.

`heuristic_calibration_and_confidence_judgment`¶

Improves judgment confidence and heuristic use. This archetype focuses specifically on repeated outcomes after extreme selection and the risk of causal misattribution.

`knowledge_threshold_crossing_communication`¶

Communicates when evidence crosses a knowledge threshold. The guardrail determines whether apparent post-selection change supplies valid causal evidence in the first place.

`time_series_cross_section_analysis`¶

Combines temporal and cross-sectional comparisons as an analysis design. It can implement the guardrail but does not own its selection, reversion, attribution, and decision safeguards.

`trend_detection_and_removal`¶

Separates systematic trend from residual movement. Trend is one competing temporal explanation; regression to the mean arises from extreme selection plus imperfect persistence.

`counterfactual_comparison`¶

Compares actual and plausible alternative paths broadly. This guardrail is a distinct frozen neighbor specialized to extreme-selection reversion and its attribution hazards.

`confounder_control`¶

Controls common causes of exposure and outcome. Regression to the mean can occur without a third-variable confounder and requires explicit selection and reliability evidence.

`selection_bias_mitigation`¶

Addresses nonrepresentative inclusion and conditioning generally. This archetype focuses on movement expected when the inclusion variable is itself an extreme noisy measure.

`effect_size_reporting`¶

Reports magnitude and practical meaning. A well-reported raw before-after effect can still be mostly expected reversion.

`stationarity_validation`¶

Checks whether past predictive conditions persist. The guardrail uses a relevant mean and temporal map but owns extreme-selection causal attribution rather than general regime stability.

`distributional_assumption_governance`¶

Governs probability-family commitments. It may support a reversion benchmark, while this archetype governs design and attribution after extreme selection.

Frozen reconciliation provides the strongest boundary evidence: Counterfactual Comparison is a broad causal/value comparison archetype and explicitly lists `regression_to_mean_guardrail` as a neighbor risk. The guardrail survives because extreme-selection reversion has its own triggers, evidence requirements, misuse risks, and decision protections.

Cross-Domain Examples¶

Clinical Care¶

Estimate expected symptom reversion after crisis-triggered enrollment before crediting all improvement to treatment.

Severity selection, natural fluctuation, recovery, and high-stakes continuation decisions coincide. The guardrail documents selection, reliability, relevant expectation, counterfactual, expected reversion, controlled effect, claim grade, and decision response.

Education¶

Compare lowest-scoring schools with similarly selected not-yet-treated schools before crediting intervention for score rebound.

Noisy rankings and threshold targeting create predictable convergence. The guardrail documents selection, reliability, relevant expectation, counterfactual, expected reversion, controlled effect, claim grade, and decision response.

Workforce And Sales¶

Avoid crediting a warning for an unusually poor month or praise for an unusually strong month without repeated performance evidence.

Reward and sanction follow extreme metrics with substantial transient noise. The guardrail documents selection, reliability, relevant expectation, counterfactual, expected reversion, controlled effect, claim grade, and decision response.

Public Safety¶

Evaluate hotspot intervention against matched extreme areas and pretrends rather than raw decline after a spike.

Sites are selected after exceptional incident counts and face common temporal forces. The guardrail documents selection, reliability, relevant expectation, counterfactual, expected reversion, controlled effect, claim grade, and decision response.

Quality Management¶

Distinguish natural return after an extreme defect period from durable process improvement.

Inspection and corrective action are triggered by control-limit breaches. The guardrail documents selection, reliability, relevant expectation, counterfactual, expected reversion, controlled effect, claim grade, and decision response.

Risk Screening¶

Require stable retest and outcome evidence before treating threshold recrossing as risk elimination.

Measurement error and threshold selection can produce apparent normalization. The guardrail documents selection, reliability, relevant expectation, counterfactual, expected reversion, controlled effect, claim grade, and decision response.

Extended school-improvement example¶

A school district directs support to the ten schools with the largest one-year score decline. A naïve evaluation would compare the unusually low selection year with the next year and credit all rebound to support. The guardrail records the ranking and threshold, reconstructs multiple prior years, estimates score reliability, maps cohort and policy changes, and selects comparison schools using the same extreme-decline rule in later rollout waves. Analysts estimate expected no-effect rebound from historical extreme episodes and compare controlled changes, subgroup outcomes, attendance, and independent measures. The result shows substantial ordinary reversion plus a smaller uncertain intervention contrast. Leaders continue low-risk support, avoid punitive claims about schools whose scores remain volatile, pre-register the next cohort, and require persistence before scaling expensive elements.
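The expected no-effect rebound in a case like this can be approximated by simulation when historical extreme episodes are scarce. The sketch below mirrors the selection rule (the ten largest one-year declines) in a simple model where each score is a persistent school level plus independent yearly noise, in standardized units. The model, the parameter values, and the function name are assumptions for illustration, not the district's method.

```python
import random
from statistics import mean


def simulate_no_effect_rebound(n_units=200, n_selected=10, reliability=0.6,
                               n_runs=500, seed=0):
    """Mean, 5th and 95th percentile of the next-year rebound with no intervention.

    Scores have total variance 1: a share `reliability` is persistent and the
    rest is independent yearly noise. Units are selected on the largest decline.
    """
    rng = random.Random(seed)
    signal_sd = reliability ** 0.5
    noise_sd = (1.0 - reliability) ** 0.5
    rebounds = []
    for _ in range(n_runs):
        level = [rng.gauss(0.0, signal_sd) for _ in range(n_units)]
        year0 = [p + rng.gauss(0.0, noise_sd) for p in level]
        year1 = [p + rng.gauss(0.0, noise_sd) for p in level]
        year2 = [p + rng.gauss(0.0, noise_sd) for p in level]
        # Mirror the selection rule: most negative change from year0 to year1.
        worst = sorted(range(n_units), key=lambda i: year1[i] - year0[i])[:n_selected]
        rebounds.append(mean(year2[i] - year1[i] for i in worst))
    rebounds.sort()
    return mean(rebounds), rebounds[int(0.05 * n_runs)], rebounds[int(0.95 * n_runs)]


avg, low, high = simulate_no_effect_rebound()
print(round(avg, 2), round(low, 2), round(high, 2))
```

The selected schools rebound on average in this model even though nothing was done to them. An observed rebound inside the simulated range would not, on its own, distinguish support from ordinary reversion.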

Across domains, the guardrail's core moves stay the same: reveal selection on an extreme, estimate ordinary reversion, strengthen comparison, constrain attribution, and preserve action proportional to evidence.

Non-Examples¶

Calling every post-crisis improvement a treatment cure
Calling every improvement mere regression without comparison
Comparing worst performers with average performers
Reporting raw before-after change as controlled effect
Mechanically shrinking every rare or severe case to a pooled mean
Using a single reversion correction with no decision safeguard

A warning about regression to the mean is not sufficient. The archetype requires a relevant mean, reliability, comparison, uncertainty, attribution grade, and enforceable decision safeguard.
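The required elements listed above can be treated as a checklist that a claim must complete before it is approved. The sketch below is one hypothetical representation; the class, field, and function names are illustrative and do not come from the catalog.

```python
from dataclasses import dataclass, fields


@dataclass
class AttributionEvidence:
    """Whether each required element has been documented for a claim."""
    relevant_mean: bool = False
    reliability: bool = False
    comparison: bool = False
    uncertainty: bool = False
    attribution_grade: bool = False
    decision_safeguard: bool = False


def missing_evidence(evidence: AttributionEvidence) -> list[str]:
    """Names of required elements that are still undocumented."""
    return [f.name for f in fields(evidence) if not getattr(evidence, f.name)]


# A bare warning about regression to the mean documents none of the elements.
warning_only = AttributionEvidence()
partial = AttributionEvidence(relevant_mean=True, reliability=True, comparison=True)

print(missing_evidence(warning_only))
print(missing_evidence(partial))
```

A review gate could refuse a causal success or failure claim while the returned list is non-empty. Whether each element is adequate remains a matter of judgment that a boolean cannot capture.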

Abstractions this archetype builds on — directly (a source ingredient) or as a related pattern. Links follow the typed catalog namespace.

Built directly on (1)

Regression to the Mean: Extreme observations tend to be followed by values closer to the average.

Also references 22 related abstractions

Bayesian Updating: Update beliefs with evidence.
Bias: Systematic, directional error distinct from random noise.
Calibration: Aligning a system's output to a trusted reference by measuring deviation, adjusting to reduce it, and monitoring for drift.
Causality: Cause-effect relationships.
Confounding: Hidden variable interference.
Counterfactual Reasoning: Hypothetical alternatives.
Counterfactuals: Alternate hypothetical scenarios.
Data Integrity: Accuracy and consistency preserved.
Effect Size: Magnitude of effect.
Experimental Design: Structuring an investigation through deliberate intervention, controlled assignment, and measurement so that causation can be distinguished from mere correlation and confounding.


Variants¶

Narrower or domain-specific specializations that share this archetype's core structure. Recognized variants are established; candidate variants are provisional.

Clinical and Service Recovery Guardrail · domain variant · recognized

Guard treatment or service claims when people enter care at unusually severe moments and may improve partly through ordinary fluctuation or recovery.

Distinct from parent: The parent also governs performance, policy, quality, screening, and organizational evaluations.
Use when: entry follows symptom or performance crisis; outcomes fluctuate naturally; before-after improvement drives continuation or funding.
Typical domains: medicine, mental health, social services, rehabilitation
Common mechanisms: Multi-Baseline Measurement Protocol; Matched Extreme-Case Comparator; Controlled Before–After Contrast

Performance Reward and Sanction Guardrail · governance variant · recognized

Guard reward, punishment, promotion, dismissal, coaching, or accountability claims triggered by unusually high or low performance.

Distinct from parent: Other applications evaluate treatments, policies, or processes rather than people or units under reward and sanction.
Use when: actors are selected from rankings or threshold breaches; performance measures contain substantial noise; follow-up movement is credited to praise, sanction, or management.
Typical domains: education, sports, sales, workforce management
Common mechanisms: Extreme-Selection Risk Flag; Shrinkage-Aware Expectation; Attribution-Claim Review Gate

Threshold Screening and Reclassification Guardrail · risk or failure variant · recognized

Guard interpretation when eligibility, diagnosis, alarm, inspection, or reclassification begins after one measurement crosses a threshold.

Distinct from parent: Selection can also arise from rankings, crises, discretion, or retrospective extreme-case analysis.
Use when: measurement noise moves cases across a cutoff; retesting follows a positive or extreme screen; crossing back below the threshold is treated as intervention success.
Typical domains: screening, quality control, risk alerts, regulatory inspection
Common mechanisms: Extreme-Selection Risk Flag; Multi-Baseline Measurement Protocol; Placebo Time, Outcome, or Threshold Check

Policy and Program Extreme-Unit Guardrail · scale variant · recognized

Guard program claims when jurisdictions, sites, schools, teams, or periods are targeted because they recently had exceptional outcomes.

Distinct from parent: Individual and simple measurement settings may not require clustered policy counterfactuals.
Use when: intervention targets worst or best units; unit outcomes vary over time; policy success is inferred from post-selection convergence.
Typical domains: public policy, school improvement, public safety, organizational quality
Common mechanisms: Randomized or Staggered Assignment; Controlled Before–After Contrast; Interrupted Series with Pretrend Check

Near names: Regression-to-the-Mean Guardrail, Extreme-Selection Attribution Check, Statistical Regression Guardrail.