A system-wide change produces an aggregate outcome that conceals
systematically heterogeneous unit-level changes — the same intervention yields a
vector of per-unit effects, often in different directions, whose distribution
matters separately from its summary statistic.
Imagine your class gets candy and on average everyone got more. But that average hides that some kids got a big pile and some kids actually got less than before. Just knowing the average doesn't tell you who won and who lost. You have to look at how the candy was split up.
What The Average Hides
Distributional effects arise when a single change affects different people by different amounts, and even in opposite directions, while the overall number hides all of that. Suppose a new rule makes the town "one percent richer on average." That average could mean the richest families gained a lot while the poorest actually lost money. The single summary number can't tell you that; you have to look at how the effect is spread across everyone. And how you choose to summarize it (the average, the middle person, or something that cares more about the worst-off) is really a choice about what you think is fair.
Spread Behind The Summary
Distributional Effects name the pattern where a system-wide change produces an aggregate outcome that hides systematically different changes at the unit level: different people or groups are affected by different amounts and often in opposite directions. The same shock yields a whole vector of unit-level effects, and the shape of that distribution matters separately from any single summary number. A policy that raises average welfare by one percent might raise the top tenth by five percent and lower the bottom by four, and the aggregate signal conceals this. Three things make it specific. First, the differences are structural, keyed to identifiable properties like income, location, or age rather than random. Second, the aggregate is genuinely uninformative about the distribution, since a positive average is fully compatible with most people losing if the gains are concentrated. Third, the choice of how to aggregate (mean, median, weighted sum, a fairness-weighted measure) is itself a value judgment that picks different "right answers" from the same effects. That's why the prime carries a value-laden surface exactly at the point where the vector gets collapsed into a single number.
Distributional Effects name the pattern in which a system-wide change produces an aggregate outcome that conceals systematically heterogeneous changes at the unit level: different subpopulations, components, or instances are affected to different degrees and often in different directions. The structural commitment is that the same intervention or shock yields a vector of unit-level effects whose distribution across the population matters separately from its summary statistic. A policy that raises average welfare by one percent may raise the top decile by five percent and lower the bottom by four; the aggregate signal hides this. The signature is therefore an intervention, a population of units, a per-unit effect function keyed to unit-specific properties, and a distribution of effects whose shape changes the appropriate evaluation in ways no scalar summary can recover. Three details distinguish it from siblings. First, the heterogeneity is structural, not random: it is keyed to identifiable unit properties (income, location, age, genotype, network position), and the same unit type re-experiences the same effect direction under repetition. Second, the aggregate measure is not informative about the distribution: a positive average is fully compatible with majority loss when gains are concentrated. Third, the choice of aggregation rule (mean, median, weighted sum, a social-welfare function with curvature, a Pareto criterion) is itself a normative commitment, and different rules pick different right answers from the same unit-level effect vector. The prime thus carries an explicit value-laden surface exactly where the vector is collapsed to a scalar.
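The claim that a positive average is compatible with majority loss can be checked with a minimal sketch. The per-decile effects below are made-up illustrative numbers (chosen so the bottom decile loses four percent and the top gains five), and weighting every decile equally is an assumption of this sketch rather than something the text specifies.

```python
from statistics import mean, median

# Hypothetical per-unit effects, in percent, keyed to income decile
# (1 = poorest, 10 = richest). Illustrative values only.
effect_by_decile = {
    1: -4.0, 2: -2.0, 3: -1.0, 4: -1.0, 5: -0.5,
    6: -0.5, 7: 4.5, 8: 4.5, 9: 5.0, 10: 5.0,
}


def summarize(effects):
    """Collapse the effect vector under several scalar summaries."""
    values = list(effects.values())
    return {
        "mean": mean(values),            # equal-weight average effect
        "median": median(values),        # effect on the middle unit
        "worst_off": min(values),        # effect on the hardest-hit unit
        "share_losing": sum(v < 0 for v in values) / len(values),
    }


summary = summarize(effect_by_decile)
print(summary)
```

With these numbers the mean is positive while the median is negative and six of the ten deciles lose, so each summary gives a different verdict on the same effect vector.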
It exposes aggregate measures as projections of a higher-dimensional object,
and separates "raises the average" from "is good" — surfacing the load-bearing
aggregation rule.
It compresses heterogeneous-unit analysis to three primitives (intervention,
population with properties, per-unit effect function), so the same
questions apply across economics, epidemiology, and machine learning.
Disaggregate to recover the vector, conditionally aggregate by subgroup, and
recognise that the aggregation rule (utilitarian sum, Rawlsian worst-off,
Pareto) encodes a value judgment selecting a different optimum.
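The three moves above can be sketched in a few lines, as one possible reading rather than a definitive procedure. The policy names, subgroup labels, and effect values are hypothetical, and the rules are applied to subgroup means under the assumption that subgroups are equally sized.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical unit-level records: (policy, subgroup, welfare change).
records = [
    ("policy_a", "low", -3.0), ("policy_a", "low", -1.0),
    ("policy_a", "high", 9.0), ("policy_a", "high", 11.0),
    ("policy_b", "low", 1.0), ("policy_b", "low", 1.0),
    ("policy_b", "high", 2.0), ("policy_b", "high", 2.0),
]


def subgroup_means(rows):
    """Disaggregate, then aggregate conditionally on subgroup."""
    buckets = defaultdict(lambda: defaultdict(list))
    for policy, group, effect in rows:
        buckets[policy][group].append(effect)
    return {
        policy: {group: mean(vals) for group, vals in groups.items()}
        for policy, groups in buckets.items()
    }


def utilitarian(effects):
    """Sum of subgroup effects (assumes equally sized subgroups)."""
    return sum(effects.values())


def rawlsian(effects):
    """Effect on the worst-off subgroup."""
    return min(effects.values())


def pareto_improving(effects):
    """No subgroup loses and at least one gains."""
    return all(v >= 0 for v in effects.values()) and any(
        v > 0 for v in effects.values()
    )


def best(by_group, rule):
    """Pick the policy that maximizes the given aggregation rule."""
    return max(by_group, key=lambda policy: rule(by_group[policy]))


by_group = subgroup_means(records)
print(best(by_group, utilitarian))  # policy_a
print(best(by_group, rawlsian))     # policy_b
print({p: pareto_improving(e) for p, e in by_group.items()})
```

The unit-level effects never change between the calls; only the aggregation rule does, and the selected policy changes with it.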
Welfare economics → ML fairness: group-conditional accuracy is a distributional-effects analysis, with the income decile becoming the demographic group.
Drug trials → policy: responder-stratification becomes targeting an intervention on the population it helps.
Reliability → UX: the Weibull-tail mindset — the average part is fine but the failure-driving subpopulation is not — sharpens retention work on the struggling long tail.
A classifier at 95% overall accuracy can post 99% on a large majority group and
60% on a small minority, the headline number concealing the gap entirely — and a
worst-group rule selects a different "best" model than the mean does.
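A small sketch makes the arithmetic concrete. The 90/10 group split and the second model's accuracies are assumptions introduced here for illustration; the text only gives the 99%, 60%, and roughly 95% figures.

```python
# Assumed population shares; the text does not state the split.
group_share = {"majority": 0.90, "minority": 0.10}

# Per-group accuracy. model_a matches the figures in the text;
# model_b is a hypothetical alternative added for comparison.
accuracy = {
    "model_a": {"majority": 0.99, "minority": 0.60},
    "model_b": {"majority": 0.94, "minority": 0.90},
}


def overall(acc):
    """Population-weighted mean accuracy (the headline number)."""
    return sum(group_share[g] * acc[g] for g in group_share)


def worst_group(acc):
    """Accuracy on the group the model serves worst."""
    return min(acc.values())


def select(rule):
    """Pick the model that maximizes the given rule."""
    return max(accuracy, key=lambda name: rule(accuracy[name]))


for name, acc in accuracy.items():
    print(name, round(overall(acc), 3), worst_group(acc))

print(select(overall))      # model_a
print(select(worst_group))  # model_b
```

Under these assumptions model_a wins on the headline number (about 0.951 against 0.936) while model_b wins on worst-group accuracy (0.90 against 0.60).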
Parents (1) — more general patterns this builds on
Distributional Effects presupposes Aggregation: Distributional Effects is the critical recognition of what the aggregation operation conceals (the vector behind the scalar); it presupposes aggregation as the collapsing step.
Distributional Effects is not Effect Size because distributional effects are the vector behind the scalar, whereas effect size is the scalar magnitude that collapses it (and can hide a majority harmed).
Distributional Effects is not Aggregation because distributional effects are the recognition that the collapse discards a value-laden distribution, whereas aggregation is the neutral operation of collapsing units to a summary.
Distributional Effects is not Selection Bias because distributional effects assume the per-unit effects are honestly measured and ask how they distribute, whereas selection bias corrupts the estimate through non-representative sampling.