Locality Of Reference¶
Core Idea¶
Locality of reference is the empirical structural regularity that accesses, events, or interactions are not uniformly spread across their possible targets but cluster — recently-used items are likely to be used again soon (temporal locality), and items near a used one are likely to be used next (spatial locality). The pattern is a statement about the distribution of access, not about any single access: usage is autocorrelated in time and space.
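As a rough illustration of "usage is autocorrelated in time and space", the sketch below scores an access trace on both kinds of locality. The function names, the `window` and `radius` parameters, and the use of integer addresses are assumptions made for this example; they are not standard metrics from any particular tool.

```python
def temporal_reuse_fraction(trace, window):
    """Fraction of accesses whose target also appears in the previous `window` accesses."""
    if not trace:
        return 0.0
    hits = 0
    for i, addr in enumerate(trace):
        recent = trace[max(0, i - window):i]  # the last `window` accesses before position i
        if addr in recent:
            hits += 1
    return hits / len(trace)


def spatial_neighbor_fraction(trace, radius):
    """Fraction of consecutive access pairs that land within `radius` of each other.

    Exact repeats (distance 0) are excluded so that this score does not
    double-count temporal reuse.
    """
    if len(trace) < 2:
        return 0.0
    near = sum(1 for prev, cur in zip(trace, trace[1:]) if 0 < abs(cur - prev) <= radius)
    return near / (len(trace) - 1)


# Hypothetical traces: a sequential scan and a tight loop over four addresses.
sequential_scan = list(range(100))
tight_loop = [0, 1, 2, 3] * 25

print(temporal_reuse_fraction(sequential_scan, window=8))  # no address repeats in a scan
print(spatial_neighbor_fraction(sequential_scan, radius=1))  # every step moves to a neighbor
print(temporal_reuse_fraction(tight_loop, window=4))  # the loop revisits the same addresses
```

A sequential scan scores high on spatial locality and zero on temporal reuse, while the tight loop scores high on temporal reuse, which is why the two kinds of locality are worth measuring separately.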
Broad Use¶
- Computer science: memory and disk accesses cluster, which is the reason caches, prefetching, and paging work at all.
- Geography (non-obvious): Tobler's first law — "everything is related to everything else, but near things are more related than distant things" — is spatial locality of interaction.
- Epidemiology: contacts and infections cluster spatially and temporally (households, neighborhoods, recent exposure windows), which is why local containment and contact-tracing are effective.
- Library / information science: a small set of recently and frequently consulted works dominates circulation; reshelving and reserve collections exploit this.
- Linguistics / text: word reuse is bursty — a word just used is disproportionately likely to recur soon, underpinning cache-based language models.
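The bursty word reuse mentioned in the last item can be sketched as a unigram model that mixes a fixed base distribution with the frequencies in a window of recently seen words. This is a minimal illustration, not a reproduction of any specific published cache language model; `CacheUnigramModel`, `cache_size`, and `cache_weight` are hypothetical names and parameters.

```python
from collections import Counter, deque


class CacheUnigramModel:
    """Interpolates a static unigram distribution with recent-word frequencies."""

    def __init__(self, base_probs, cache_size=100, cache_weight=0.2):
        self.base_probs = base_probs           # assumed: word -> static probability
        self.cache = deque(maxlen=cache_size)  # the most recent words, oldest first
        self.counts = Counter()                # word counts inside the cache window
        self.cache_weight = cache_weight

    def prob(self, word):
        base = self.base_probs.get(word, 0.0)
        if not self.cache:
            return base  # nothing observed yet, so fall back to the base model
        cache_p = self.counts[word] / len(self.cache)
        return (1 - self.cache_weight) * base + self.cache_weight * cache_p

    def observe(self, word):
        if len(self.cache) == self.cache.maxlen:
            # The deque is about to drop its oldest word; keep the counts in sync.
            self.counts[self.cache[0]] -= 1
        self.cache.append(word)
        self.counts[word] += 1


# Hypothetical toy vocabulary.
model = CacheUnigramModel({"the": 0.5, "tree": 0.49, "cache": 0.01},
                          cache_size=4, cache_weight=0.5)
before = model.prob("cache")
model.observe("cache")
after = model.prob("cache")
print(before < after)  # a word that was just used becomes more likely
```

Once the word falls out of the window its probability drops back toward the base estimate, which is the temporal-locality assumption stated in model form.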
Clarity¶
Locality names why a fast-but-small local store can stand in for a slow-but-large source: not because the local store is complete, but because the access distribution is concentrated. It lets practitioners distinguish workloads with exploitable structure from genuinely random-access ones, where a cache's hit rate can be roughly no better than the fraction of the data it holds.
Manages Complexity¶
By asserting that the working set at any moment is small relative to the whole address space, locality justifies maintaining and reasoning about only a tiny active subset, collapsing a vast space of possible accesses into a manageable hot region.
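One way to make "the working set at any moment is small" concrete is to count the distinct items touched in a sliding window of recent accesses. The sketch below assumes a window measured in number of accesses (`tau`) and a hypothetical trace; `working_set_sizes` is an illustrative name.

```python
from collections import Counter


def working_set_sizes(trace, tau):
    """For each position, count the distinct items among the last `tau` accesses."""
    counts = Counter()
    sizes = []
    for i, item in enumerate(trace):
        counts[item] += 1
        if i >= tau:
            # The access at position i - tau has just left the window.
            old = trace[i - tau]
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        sizes.append(len(counts))
    return sizes


# Hypothetical trace: a phase on items {1, 2}, then a phase on items {7, 8}.
trace = [1, 2, 1, 2, 1, 2, 7, 8, 7, 8]
print(working_set_sizes(trace, tau=4))
# → [1, 2, 2, 2, 2, 2, 3, 4, 3, 2]
```

The size briefly rises at the phase change and then settles again, so the active subset stays small even if the items are drawn from a much larger address space.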
Abstract Reasoning¶
Recognizing locality supports the inference "if accesses cluster, then keeping the recent/near subset close yields a high hit rate". Conversely, it warns that when locality is absent (uniform random access), caching, prefetching, and local intervention gain little, because the hit rate then only tracks the fraction of items kept close. It turns "should we cache/prefetch/localize?" into an empirical question about the access distribution's concentration.
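The inference can be checked empirically by replaying a trace through a small cache and measuring the hit rate. The sketch below uses a least-recently-used (LRU) policy and two synthetic workloads; the trace generators and the `hot_items` and `hot_prob` parameters are assumptions chosen for illustration, not measurements of any real system.

```python
import random
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Replay `trace` through an LRU cache of `capacity` items and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / len(trace) if trace else 0.0


def uniform_trace(n_items, length, rng):
    """No locality: every item is equally likely on every access."""
    return [rng.randrange(n_items) for _ in range(length)]


def clustered_trace(n_items, length, rng, hot_items=50, hot_prob=0.9):
    """Concentrated access: most accesses go to a small hot subset."""
    return [
        rng.randrange(hot_items) if rng.random() < hot_prob else rng.randrange(n_items)
        for _ in range(length)
    ]


rng = random.Random(0)  # fixed seed so the comparison is repeatable
n_items, capacity, length = 10_000, 100, 20_000
print("uniform  :", lru_hit_rate(uniform_trace(n_items, length, rng), capacity))
print("clustered:", lru_hit_rate(clustered_trace(n_items, length, rng), capacity))
```

Under these assumptions the uniform workload's hit rate should stay near `capacity / n_items`, while the clustered workload's hit rate should approach `hot_prob`, since the same cache serves both and only the access distribution differs.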
Knowledge Transfer¶
The CPU-cache insight transfers directly to epidemiology: both treat a clustered access/contact distribution as the lever — caches keep the hot working set local, contact tracing keeps containment local to the recent/near cluster. It also transfers to recommender and search systems, which prefetch the predictably-near-next items.
Relationships to Other Primes¶
Parents (2) — more general patterns this builds on
- Locality Of Reference is a kind of Recurrence — Locality of reference is a kind of recurrence in which recently or nearby accessed items reappear with predictable frequency.
- Locality Of Reference presupposes Heavy-Tailed Distributions — Locality of reference presupposes heavy-tailed distributions because clustered access concentrates most accesses on a small hot subset, which appears as heavy-tailed access frequencies across the address space.
Children (1) — more specific cases that build on this
- Caching presupposes Locality Of Reference — Caching presupposes locality of reference because exploiting a fast-local copy only pays off when accesses cluster temporally or spatially.
Path to root: Locality Of Reference → Recurrence
Not to Be Confused With¶
Locality of reference is not frame of reference, which is a chosen coordinate system, not a clustering of accesses. It is not caching, which is the technique that exploits locality; locality is the underlying empirical pattern that makes caching work. It is not buffering, which absorbs rate mismatch over time rather than asserting that accesses are spatially/temporally clustered.