Correlated-Source Attribution Failure

Prime #: 755
Origin domain: Statistics & Experimental Design
Subdomain: identification and estimation → Statistics & Experimental Design

Core Idea

When an estimator decomposes an observed effect across several sources that share underlying variation, the joint inference stays strong while attribution to any individual source collapses — becoming unstable, sign-flipping, or arbitrary. Identifying N sources needs N independent dimensions of variation; shared variation leaves the missing dimensions in a null space.

How would you explain it like I'm…

Two Friends, One Project

Imagine two friends who always say the exact same thing at the same time. Together they tell you a lot, but you can never tell which friend's idea it really was. Guessing 'it was Sam's idea, definitely' isn't fair, because Alex always said it too. The team's answer is clear, but who-gets-the-credit is a coin flip.

Good Total, Shaky Blame

Sometimes you try to figure out how much each cause added to a result. This works only if the causes change separately, so you can see each one's own effect. When several causes always move together, something strange happens: your guess about the whole group can stay rock-solid while your guess about each single cause wobbles all over the place. The total prediction is great, but the blame you hand to any one cause could swing widely or even switch direction if you measured again. So a sharp claim like "this one thing was the reason" sounds confident, but the data cannot actually back it up.

Strong Whole, Unstable Parts

Correlated-source attribution failure occurs when an estimator combines several information sources to explain an effect, but those sources share underlying variation, so the joint inference stays strong while the attribution to any individual source becomes unstable, contradictory, or manipulable. Telling distinct sources apart requires them to vary independently; when they move together, individual identifiability collapses even though the joint predictive content is untouched. The system still looks well-determined in aggregate: the prediction is good and the joint fit is high. Yet the per-source shares swing wildly, swap signs when you resample, or carry error bars wide enough to span any story. The trap is that naive readers hear sharp single-source claims, like "witness A was decisive" or "ad spend is the driver," that the data cannot support, because joint strength and marginal identifiability are two different things and only the second has quietly failed.

Correlated-source attribution failure is the structural pattern in which an estimator combines several sources to attribute an observed effect to its inputs, but the sources share underlying variation, so the joint inference can stay strong while attribution to any individual source becomes unstable, contradictory, or manipulable. The core is geometric. Identifying N sources requires N independent dimensions of input variation; if the inputs span only K < N dimensions, only K joint contrasts are identified, and the remaining N - K dimensions of attribution lie in a null space where any assignment is consistent with the data. The joint fit lives in the column space, which the data constrain; the marginal attribution lives in the basis chosen within that space, which the data do not. The system therefore keeps looking well-determined in aggregate, with good prediction and high joint likelihood, even as per-source attributions oscillate, swap signs on resampling, or carry error bars wide enough to span any interpretation. This afflicts any decomposition of an effect into contributions of correlated inputs: regression coefficients, feature-importance scores, a Bayesian posterior over input weights, or a tribunal's per-witness credibility all inherit the same non-identifiability. Naive practice fuses two structurally distinct things, joint inferential strength and marginal identifiability, and the failure is the silent collapse of the second while the first stays healthy.
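The null-space geometry can be seen in a deliberately extreme toy case. The sketch below assumes two sources where the second is an exact copy of the first (N = 2, K = 1); the data, the coefficient 3, and the helper names `predict` and `sse` are illustrative assumptions, not taken from any real analysis.

```python
# Two sources that share all of their variation: x2 is an exact copy of x1,
# so the inputs span K = 1 dimension although N = 2 sources are named.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = list(x1)
y = [3.0 * v for v in x1]  # joint effect: 3 per unit of shared movement


def predict(b1, b2):
    """Fitted values for a given split of credit between the two sources."""
    return [b1 * a + b2 * b for a, b in zip(x1, x2)]


def sse(b1, b2):
    """Sum of squared errors for that split of credit."""
    return sum((p - t) ** 2 for p, t in zip(predict(b1, b2), y))


# Every split with b1 + b2 = 3 differs only along the null direction (1, -1)
# and fits the data equally well, including a sign-flipped one.
splits = [(3.0, 0.0), (0.0, 3.0), (1.5, 1.5), (10.0, -7.0)]
errors = [sse(b1, b2) for b1, b2 in splits]
print(errors)

# A split that changes the identified contrast b1 + b2 is penalised.
print(sse(2.0, 2.0))
```

The data pin down the sum of the two coefficients and nothing else; which source receives the credit is a choice of basis that the fit cannot distinguish.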

Broad Use

Regression and econometrics: multicollinearity — correlated regressors leave the fit intact while each coefficient becomes unstable and flips sign on resampling.
Multi-sensor fusion: correlated noise across redundant sensors yields false precision while each sensor's calibration stays unconstrained.
Multi-witness testimony: witnesses sharing a briefing or article sound corroborated, yet each per-witness contribution is thin.
Rater aggregation: raters trained on one rubric produce correlated scores that overstate reliability and hide each rater's marginal contribution.
Ensemble forecasting: similarly trained models have an effective size smaller than their count, weakening per-model attributions.
Causal mediation: with correlated mediators, total mediation is identified but per-mediator contributions cannot be separated.
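The regression case in the list above can be illustrated with a small bootstrap. This is a minimal sketch on simulated data, assuming two nearly identical regressors, a true coefficient of 1 on each, and no intercept; the seed, sample size, noise levels, and the helper `ols_two_sources` are all assumptions chosen for illustration.

```python
import random
import statistics


def ols_two_sources(x1, x2, y):
    """Least-squares fit of y ~ b1*x1 + b2*x2 (no intercept),
    solved directly from the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * t for a, t in zip(x1, y))
    s2y = sum(b * t for b, t in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


rng = random.Random(0)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [a + 0.05 * rng.gauss(0, 1) for a in x1]  # nearly a copy of x1
y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Resample rows with replacement and refit each time.
b1_draws, sum_draws = [], []
for _ in range(300):
    idx = [rng.randrange(n) for _ in range(n)]
    b1, b2 = ols_two_sources(
        [x1[i] for i in idx], [x2[i] for i in idx], [y[i] for i in idx]
    )
    b1_draws.append(b1)
    sum_draws.append(b1 + b2)

spread_b1 = statistics.stdev(b1_draws)    # per-source attribution
spread_sum = statistics.stdev(sum_draws)  # joint contrast b1 + b2
sign_flips = sum(1 for b in b1_draws if b < 0)

print(f"spread of b1 across resamples:      {spread_b1:.3f}")
print(f"spread of b1 + b2 across resamples: {spread_sum:.3f}")
print(f"resamples where b1 turns negative:  {sign_flips} of {len(b1_draws)}")
```

Under these assumptions the combined coefficient should barely move between resamples while the individual coefficient spreads far more widely, which is the joint-strong, marginal-unstable signature described above.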

Clarity

It separates two things naive practice fuses — joint inferential strength and marginal identifiability — and redirects the diagnostic from "is my model wrong?" to "are my sources sharing variation I have not acknowledged?"

Manages Complexity

It compresses multicollinearity, common-method bias, ensemble redundancy, and group-think corroboration into one shape with a four-move catalogue: decorrelate by design, quantify the shared variance, shift the attribution target, or regularise.
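Two of the four moves, regularising and shifting the attribution target, can be sketched on the exact-duplicate toy case. Ridge regression stands in here for "regularise" as one possible choice, not a method the text prescribes; the penalty value 1.0, the data, and the helper `solve_two_sources` are illustrative assumptions.

```python
def solve_two_sources(x1, x2, y, ridge=0.0):
    """Solve (X'X + ridge*I) b = X'y for two sources.

    Returns None when the system is singular. The exact-zero check is only
    adequate for this exact-duplicate toy case, not for real data.
    """
    a11 = sum(a * a for a in x1) + ridge
    a22 = sum(b * b for b in x2) + ridge
    a12 = sum(a * b for a, b in zip(x1, x2))
    r1 = sum(a * t for a, t in zip(x1, y))
    r2 = sum(b * t for b, t in zip(x2, y))
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        return None  # no unique split: attribution lies in a null space
    return ((a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det)


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [3.0 * v for v in x]

# Without a penalty, two identical sources admit no unique attribution.
unregularised = solve_two_sources(x, x, y)

# Regularise: the penalty selects one split (an equal one) out of the
# infinitely many that fit. The split comes from the penalty, not the data.
ridge_split = solve_two_sources(x, x, y, ridge=1.0)

# Shift the target: attribute to the shared block x1 + x2, which is identified.
block = [a + b for a, b in zip(x, x)]
block_effect = sum(g * t for g, t in zip(block, y)) / sum(g * g for g in block)

print(unregularised, ridge_split, block_effect)
```

The regularised split is stable but conventional: it reports how the penalty breaks the tie, whereas the block-level effect is a quantity the data actually constrain.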

Abstract Reasoning

It trains a reasoner to distrust per-source confidence precisely when the joint inference looks healthiest, because the health of the joint fit is exactly what masks the marginal collapse.

Knowledge Transfer

Across estimation fields: checking variance-inflation factors, crossing trait with method, and blinding two reviewers are the same move — assess and break source independence.
The portable warning: a confident-looking joint result (high R², a corroborated chorus) is the warning sign, not the reassurance, that the per-source story has hollowed out.
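Checking a variance-inflation factor is the most mechanical version of "assess source independence". The sketch below assumes only two sources, in which case the factor reduces to 1 / (1 - r²), with r their Pearson correlation; the sample values and function names are made up for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def vif_two_sources(a, b):
    """Variance-inflation factor for either of two regressors.

    With exactly two regressors, each one's R^2 on the other equals r^2.
    Perfectly correlated inputs give r^2 = 1 and a division by zero,
    which is the fully non-identified case.
    """
    r = pearson(a, b)
    return 1.0 / (1.0 - r * r)


independent = vif_two_sources([-1.0, 0.0, 1.0, 0.0], [0.0, -1.0, 0.0, 1.0])
shared = vif_two_sources([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.9, 5.1])

print(f"VIF, independent sources:  {independent:.1f}")
print(f"VIF, near-duplicate sources: {shared:.1f}")
```

A factor near 1 means the source carries its own variation; a commonly quoted rule of thumb treats values above roughly 10 as a sign that the per-source coefficient is poorly determined, although any such threshold is a convention.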

Example

A legal tribunal hears a chorus of corroborating witnesses and reads a sharp single-witness claim from it — but because the witnesses all drew on the same news article, the bundle carries no more independent weight than that one shared source, and the fix is to sequester witnesses, not re-weigh credibility.
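The claim that the chorus weighs no more than its single shared source can be put in numbers under a simplifying assumption: every pair of witnesses is equally correlated, with correlation `rho`, and all are equally noisy. In that case the standard effective-sample-size formula for an average applies; the witness count of 6 and the function name are hypothetical.

```python
def effective_count(n, rho):
    """Effective number of independent sources among n equally reliable
    sources with common pairwise correlation rho (assumes rho >= 0).

    Follows from Var(mean) = sigma^2 / n * (1 + (n - 1) * rho).
    """
    return n / (1.0 + (n - 1) * rho)


# Six witnesses who all repeat one article behave like a single witness.
print(effective_count(6, 1.0))

# Six sequestered, genuinely independent witnesses count as six.
print(effective_count(6, 0.0))

# Partial overlap lands in between.
print(effective_count(6, 0.5))
```

Re-weighing credibility leaves `rho` untouched, which is why the remedy in the example is to sequester the witnesses: only a change of design lowers the shared variation.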

Relationships to Other Primes

Parents (2) — more general patterns this builds on

Correlated-Source Attribution Failure is a kind of Confounding — A hidden common driver couples supposedly independent sources and hollows out per-source attribution, which is structurally a confounding situation: a shared latent cause induces spurious cross-dependence. Because this cluster's canonical hub, Correlation, is itself detached, the bridge is drawn to Confounding, a canonical giant prime. The child_of link to Confounding is directionally sound (this is confounding read as an attribution and dimensionality failure) and bridges the cluster; cross_dimensional_leakage (a single shared-variance source inflating apparent cross-covariance) is its near-twin and stays internal. The link strength is medium because this prime adds an attribution and identifiability angle beyond bare confounding rather than being a textbook sub-case.
Correlated-Source Attribution Failure presupposes Correlation — The downstream estimation PATHOLOGY that correlation produces: when sources share variation, joint inference stays strong while per-source attribution collapses into a null space. Presupposes correlation as its cause; the file is explicit that 'correlation is the cause, the attribution failure is the structural effect'.

Path to root: Correlated-Source Attribution Failure → Correlation

Not to Be Confused With

Correlated-Source Attribution Failure is not Correlation because correlation is the bare cause (sources co-vary), whereas this prime is the downstream estimation effect — collapse of per-source identifiability while prediction stays intact.
Correlated-Source Attribution Failure is not Triangulation because triangulation combines independent sources to converge on truth, whereas this is what happens when that independence fails and the corroboration is illusory.
Correlated-Source Attribution Failure is not Selection Bias because selection bias corrupts the joint inference through non-random sampling, whereas this leaves the joint inference valid and corrupts only the per-source attribution.