Predictive Coding
Core Idea
Predictive coding is the structural pattern in which a system maintains an internal generative model that continuously predicts its incoming signal, compares the prediction against the actual input, and then transmits, stores, or acts upon only the residual — the prediction error. The essential commitment is that the expected part is suppressed and only the surprising part propagates; the model is then updated by the error so that future predictions improve. It is a predict–compare–correct loop, not merely a smaller encoding.
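The loop can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the "generative model" is a single running estimate of the signal level, and the names `predictive_loop` and `learning_rate` are hypothetical choices made for this example.

```python
def predictive_loop(signal, learning_rate=0.5):
    """Run a predict-compare-correct loop over a 1-D signal.

    The model is deliberately tiny (an assumption for illustration):
    one number holding the current estimate of the signal level.
    Returns the prediction errors, one per sample.
    """
    estimate = 0.0
    errors = []
    for sample in signal:
        prediction = estimate              # predict
        error = sample - prediction        # compare
        errors.append(error)               # only the residual propagates
        estimate += learning_rate * error  # correct the model with the error
    return errors


# A level of 10 that jumps to 14 halfway through.
signal = [10.0] * 6 + [14.0] * 6
errors = predictive_loop(signal)
print(errors)
```

With this input the error halves at each step while the level is stable, then spikes at the jump: the expected part is suppressed and the surprise stands out.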
Broad Use
- Computational neuroscience: cortical hierarchies pass prediction errors upward while higher levels send predictions downward (Rao & Ballard; Friston's free-energy account).
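- Signal processing: differential pulse-code modulation (DPCM) transmits the difference between each predicted and actual sample, and linear predictive coding (LPC) transmits predictor coefficients plus a compact description of the residual; both lower the bit rate when neighboring samples are correlated.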
- Control and estimation (non-obvious): the Kalman filter advances a state prediction and corrects it with the innovation (the measurement minus the predicted measurement), weighted by the Kalman gain; this is the same residual loop.
- Machine learning: autoregressive and self-supervised models learn by predicting the next token/frame and back-propagating the error.
- Perception and reading: expectation fills in the predicted; attention and effort spike at violated predictions (garden-path sentences, visual surprise).
- Organizations: forecast-and-variance management reports only deviations from plan ("management by exception").
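The signal-processing case in the list above is the easiest to show concretely. The following is a simplified, lossless sketch of the DPCM idea, assuming integer samples and a previous-sample predictor; practical DPCM codecs also quantize the residual and predict from reconstructed samples, which this sketch omits. The function names are hypothetical.

```python
def dpcm_encode(samples):
    """Encode integers as residuals against a previous-sample predictor."""
    residuals = []
    prediction = 0  # encoder and decoder agree on this starting prediction
    for sample in samples:
        residuals.append(sample - prediction)  # send only the error
        prediction = sample                    # next prediction = last sample
    return residuals


def dpcm_decode(residuals):
    """Rebuild the samples by running the same predictor on the decoder side."""
    samples = []
    prediction = 0
    for residual in residuals:
        sample = prediction + residual
        samples.append(sample)
        prediction = sample
    return samples


samples = [100, 102, 105, 105, 104, 101, 99]
residuals = dpcm_encode(samples)
print(residuals)               # [100, 2, 3, 0, -1, -3, -2]
print(dpcm_decode(residuals))  # matches the original samples
```

After the first sample, the residuals are much smaller in magnitude than the samples themselves, so they can be represented with fewer bits.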
Clarity
Naming predictive coding lets practitioners see that information lives in the unexpected: a system can be efficient precisely because it spends resources only where reality departs from its model. It distinguishes the model (what is expected) from the error channel (what must be explained), making "surprise" a first-class, measurable quantity.
Manages Complexity
It bounds processing and bandwidth to the residual stream rather than the full signal, and it localizes learning to wherever predictions fail. A high-dimensional input is reduced to (stable model) + (sparse error), so attention, memory, and computation concentrate on the small, informative remainder.
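As a rough illustration of the (stable model) + (sparse error) split, the sketch below treats the previous frame as the model's prediction and keeps only the positions where the new frame departs from it. The frame contents and the helper names `sparse_residual` and `apply_residual` are made up for this example.

```python
def sparse_residual(predicted, actual):
    """Return {index: delta} only where actual departs from predicted."""
    return {
        i: a - p
        for i, (p, a) in enumerate(zip(predicted, actual))
        if a != p
    }


def apply_residual(predicted, residual):
    """Correct the prediction with the sparse error to recover the input."""
    corrected = list(predicted)
    for i, delta in residual.items():
        corrected[i] += delta
    return corrected


previous_frame = [0] * 1000          # the prediction: "nothing changed"
current_frame = list(previous_frame)
current_frame[17] = 5                # two small departures from the prediction
current_frame[402] = -3

residual = sparse_residual(previous_frame, current_frame)
print(residual)  # {17: 5, 402: -3}
```

Here a 1000-element input is fully described by the shared prediction plus two residual entries, and `apply_residual(previous_frame, residual)` recovers the current frame.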
Abstract Reasoning
The pattern licenses reasoning about prediction error as the engine of both perception and learning, about hierarchical message-passing (predictions down, errors up), and about pathologies of mis-set precision (e.g., over- or under-weighting surprise). It also frames "explaining away": once a signal has been predicted, it needs no further transmission.
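One way to make "mis-set precision" concrete is the standard precision-weighted update for Gaussian beliefs. The Gaussian assumption and the symbols are introduced here for illustration: $\mu$ is the predicted value with precision (inverse variance) $\pi_\mu$, $y$ is the observation with precision $\pi_y$, and $\varepsilon$ is the prediction error.

```latex
\begin{aligned}
\varepsilon &= y - \mu, \\
\mu' &= \mu + \frac{\pi_y}{\pi_\mu + \pi_y}\,\varepsilon, \\
\pi_\mu' &= \pi_\mu + \pi_y .
\end{aligned}
```

The fraction acts as a gain on the error. If $\pi_y$ is set too high relative to $\pi_\mu$, the gain approaches 1 and the model chases every fluctuation (over-weighting surprise); if it is set too low, the gain approaches 0 and errors barely update the model (under-weighting surprise).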
Knowledge Transfer
The Kalman innovation, the DPCM residual, and the cortical prediction error are recognizably one structure, so estimator-design intuitions (precision-weighting, gain) transfer to models of attention and to anomaly-detection systems that flag only deviations from a learned baseline.
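The transfer from estimator design to anomaly detection can be sketched with a scalar Kalman filter whose normalized innovation doubles as a surprise score. This is a minimal example under stated assumptions: a random-walk state model, hand-picked noise variances, a threshold of 3 standard deviations, and made-up sensor readings. The function names are hypothetical.

```python
import math


def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict-correct step of a scalar Kalman filter (random-walk model).

    Returns (new_mean, new_var, innovation, innovation_var).
    """
    # Predict: the state is assumed to persist, while uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Compare: innovation = measurement minus predicted measurement.
    innovation = measurement - pred_mean
    innovation_var = pred_var + measurement_var
    # Correct: the gain sets how much weight the innovation receives.
    gain = pred_var / innovation_var
    new_mean = pred_mean + gain * innovation
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var, innovation, innovation_var


def flag_anomalies(measurements, threshold=3.0):
    """Return indices whose normalized innovation exceeds the threshold."""
    mean, var = measurements[0], 1.0  # assumed initial belief
    flagged = []
    for i, z in enumerate(measurements[1:], start=1):
        mean, var, innov, innov_var = kalman_step(
            mean, var, z, process_var=0.01, measurement_var=0.25
        )
        if abs(innov) / math.sqrt(innov_var) > threshold:
            flagged.append(i)
    return flagged


readings = [20.0, 20.1, 19.9, 20.2, 20.0, 25.0, 20.1, 19.8]
print(flag_anomalies(readings))  # [5]
```

Only the spike at index 5 is reported; the ordinary readings are "explained away" by the prediction. Note that this sketch still corrects the estimate with the flagged sample, so a practical detector might choose to skip or down-weight that update.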
Relationships to Other Primes
Parents (2) — more general patterns this builds on
- Predictive Coding presupposes Compression — Predictive coding presupposes compression because transmitting only the prediction error exploits the predictable signal's redundancy to shorten its representation.
- Predictive Coding presupposes Feedback — Predictive coding presupposes feedback because the predict-compare-correct loop routes prediction-error output back to update the generative model.
Children (1) — more specific cases that build on this
- Pattern Completion (Filling the Incomplete) presupposes Predictive Coding — Pattern completion presupposes predictive coding because filling incomplete input requires a generative model whose predictions span the missing parts.
Path to root: Predictive Coding → Feedback
Not to Be Confused With
- Predictive coding is not compression (top neighbor, 0.684): compression minimizes the size of a representation by removing redundancy statically, whereas predictive coding is a dynamic forward-model-and-correct loop in which the residual, not the code length, is the object of interest (compression is one downstream use).
- Predictive coding is not foreseeing/prediction: prediction merely forms a belief about a future state, whereas predictive coding additionally compares that belief to reality and propagates only the error.
- Predictive coding is not Pattern Completion (Filling the Incomplete) (its referrer): pattern completion fills missing parts of a stored pattern from partial cues, while predictive coding is the ongoing error-driven correction of a generative model against live input.