Garbage In, Garbage Out
Core Idea
The quality of a transformation's output is bounded above by the quality of its inputs: no internal sophistication can repair defects already present in the input. The load-bearing claim is that downstream sophistication cannot substitute for input quality. Once the input is the binding constraint, an output-quality problem is structurally an input problem.
Broad Use
- Computing and data engineering: bad inputs produce bad outputs regardless of program correctness.
- Machine learning: label noise and biased corpora set the ceiling on accuracy and fairness — "you can't model your way out of bad data."
- Statistics and meta-analysis: a synthesis inherits the bias of its primary studies, so evidence frameworks grade the underlying trials.
- Accounting and audit: reports are bounded by the integrity of transaction records; many major audit failures are GIGO at the data layer.
- Intelligence analysis: assessment quality is bounded by source quality, with notorious failures traceable to bad source intelligence.
- Policy modelling: sophisticated models on bad input parameters yield high-confidence wrong answers.
- Legal adjudication: verdicts are bounded by evidence quality, hence chain-of-custody rules.
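The computing and data-engineering case can be illustrated with a minimal sketch. It assumes a hypothetical log of body temperatures in degrees Celsius in which one reading was keyed in as Fahrenheit. The readings, the function names, and the plausibility bounds are all invented for illustration. The averaging code is correct; only the input is wrong.

```python
from statistics import mean

# Hypothetical readings, nominally in degrees Celsius. The third value was
# entered in Fahrenheit by mistake. All numbers are invented for illustration.
readings_c = [36.9, 37.1, 98.6, 37.0]


def average_temperature(values):
    """A correct implementation: the arithmetic mean of whatever it is given."""
    return mean(values)


def plausible_body_temp_c(value, low=30.0, high=45.0):
    """Input-side check. The bounds are an assumed plausibility range."""
    return low <= value <= high


# Correct code on a defective input gives a wrong answer (about 52.4).
raw_average = average_temperature(readings_c)

# Intervening at the input: reject implausible records before processing.
accepted = [v for v in readings_c if plausible_body_temp_c(v)]
rejected = [v for v in readings_c if not plausible_body_temp_c(v)]
clean_average = average_temperature(accepted)

print(f"raw average:   {raw_average:.1f}")
print(f"clean average: {clean_average:.1f} (rejected: {rejected})")
```

Rejecting the record does not recover the true reading; it only stops the defect from propagating. Nothing inside `average_temperature` could have detected or repaired it.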
Clarity
Re-orders the diagnosis: when output quality disappoints, ask "What is the quality of the inputs?" before "What is wrong with the processing?" It also names the false-confidence danger: polished outputs that hide input defects.
Manages Complexity
Relocates a confusing class of expensive "model/report/assessment failure" surprises to the input layer, turning them into one checkable question: what quality ceiling do the inputs set?
Abstract Reasoning
Rests on the data-processing inequality: for any Markov chain X → Y → Z, I(X; Z) ≤ I(X; Y), so no processing of Y alone can increase the information it carries about X. Downstream effort therefore yields zero marginal return in fidelity once the input ceiling is binding.
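The inequality can be checked numerically on a small discrete example. The sketch below is an illustration under stated assumptions, not a proof: X is a fair bit, Y is X passed through an assumed 10% bit-flip channel, and Z is produced from Y alone by a handful of hypothetical "processors". All names and probabilities are invented for the example.

```python
from math import log2


def mutual_information(p_x, channel):
    """I(X;Y) in bits, for input distribution p_x and channel[x][y] = P(y | x)."""
    n_y = len(channel[0])
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x))) for y in range(n_y)]
    total = 0.0
    for x, px in enumerate(p_x):
        for y in range(n_y):
            joint = px * channel[x][y]
            if joint > 0:  # zero-probability cells contribute nothing
                total += joint * log2(channel[x][y] / p_y[y])
    return total


def compose(first, second):
    """Channel for X -> Z obtained by chaining X -> Y with Y -> Z."""
    return [
        [sum(first[x][y] * second[y][z] for y in range(len(second)))
         for z in range(len(second[0]))]
        for x in range(len(first))
    ]


p_x = [0.5, 0.5]
x_to_y = [[0.9, 0.1], [0.1, 0.9]]  # assumed input defect: 10% of bits flipped

# Hypothetical processors. Each sees only Y, which is the Markov condition.
processors = {
    "identity": [[1.0, 0.0], [0.0, 1.0]],
    "invert": [[0.0, 1.0], [1.0, 0.0]],
    "extra_noise": [[0.8, 0.2], [0.2, 0.8]],
    "collapse": [[1.0, 0.0], [1.0, 0.0]],
}

i_xy = mutual_information(p_x, x_to_y)
i_xz = {
    name: mutual_information(p_x, compose(x_to_y, proc))
    for name, proc in processors.items()
}

print(f"I(X;Y) = {i_xy:.3f} bits")
for name, value in i_xz.items():
    print(f"I(X;Z) via {name:<11} = {value:.3f} bits")
```

Lossless processors such as `identity` and `invert` preserve the information in Y; the others lose some or all of it. None of them exceeds I(X; Y), which is the ceiling the input sets.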
Knowledge Transfer
- Computing → statistics → ML: the principle became a methodological refrain, and data-centric AI later gave it programmatic form.
- Across substrates: auditors cite ML failures, ML researchers cite intelligence failures, and intelligence analysts cite replication failures, all applying one shared diagnosis.
- Anywhere a transformation maps quality-bearing inputs: locate the ceiling the inputs set, intervene at the input, and recognize that sophistication is not a substitute.
Example
A validated clinical-decision-support model under-flags one patient population. The fix is not a bigger model but a better input: the target proxy, "future spending", diverges from "future need" along access lines, so only redefining the target raises the ceiling.
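The mechanism can be sketched with a toy cohort. Everything below is invented for illustration: two groups with identical need, an assumed access factor that halves group B's spending for the same need, and a `flag_top` helper that stands in for a model predicting its target perfectly. It is a sketch of the proxy problem, not a reconstruction of any real system.

```python
# Synthetic cohort: both groups have exactly the same distribution of need.
NEEDS = [2, 4, 6, 8, 10]

# Assumption: group B converts need into spending at half the rate of group A.
ACCESS = {"A": 1.0, "B": 0.5}

patients = [
    {"group": group, "need": need, "spending": need * ACCESS[group]}
    for group in ("A", "B")
    for need in NEEDS
]


def flag_top(cohort, target, k):
    """Flag the k patients with the highest `target` value.

    Stands in for a model that predicts `target` with no error at all.
    """
    return sorted(cohort, key=lambda p: p[target], reverse=True)[:k]


def count_by_group(flagged):
    counts = {"A": 0, "B": 0}
    for patient in flagged:
        counts[patient["group"]] += 1
    return counts


by_spending = count_by_group(flag_top(patients, "spending", k=4))
by_need = count_by_group(flag_top(patients, "need", k=4))

print("flagged when the target is spending:", by_spending)
print("flagged when the target is need:    ", by_need)
```

Even a perfect predictor of spending flags three patients from group A and one from group B, although need is identical across the groups. Switching the target to need restores a two-and-two split, and no change to the predictor itself would have done so.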
Relationships to Other Primes
Parents (1) — more general patterns this builds on
- Garbage In, Garbage Out presupposes Transformation: GIGO is a quality-monotonicity constraint on a transformation (by the data-processing inequality, fidelity to ground truth cannot rise across a single-input map), so it presupposes the transformation whose output quality it bounds. Transformation is its genus; GIGO is "a constraint on one dimension of it".
Path to root: Garbage In, Garbage Out → Transformation
Not to Be Confused With
- Garbage In, Garbage Out is not Transformation in general because a transformation is any input-to-output mapping with no claim about quality direction, whereas GIGO is the specific quality-monotonicity constraint that fidelity cannot rise across a single-input map.
- Garbage In, Garbage Out is not the negation of Refinement because refinement improves an artifact against a goal (which processing genuinely can do), whereas GIGO bounds the artifact's fidelity to ground truth (which a function of the defective input cannot raise).
- Garbage In, Garbage Out is not a Robustness deficit because robustness is graceful degradation under perturbed inputs, whereas GIGO is the orthogonal claim that no processing recovers fidelity the input never carried.