Locality Of Reference

Origin domain: Computer Science & Software Engineering
Also from: Library Information Science
Aliases: Locality, Reference Locality

Core Idea

Locality of reference is the empirical structural regularity that accesses, events, or interactions are not uniformly spread across their possible targets but cluster — recently-used items are likely to be used again soon (temporal locality), and items near a used one are likely to be used next (spatial locality). The pattern is a statement about the distribution of access, not about any single access: usage is autocorrelated in time and space.

How would you explain it like I'm…

Stuff Near Stuff Gets Used Together

When you play with toys, you grab the same one again and again, and the ones next to it too. You don't run around the whole house picking a different toy each time. Computers do the same thing with what they look at. So we keep the favorite stuff close, where it's fast to reach.

Recently-Used and Nearby Stuff Repeats

If you watch what someone uses next, it's almost never random. People (and computers, and animals) keep returning to whatever they touched recently, and they tend to reach for things right next to whatever they just used. This is called locality of reference. It's why a small shelf of go-to items can cover most of what you need, even when there's a huge warehouse in the back.

Access Clustering in Time and Space

Locality of reference says that uses of a resource cluster instead of spreading evenly. Two flavors: temporal locality means something you used recently is likely to be used again soon, and spatial locality means things stored near a thing you just used are likely to be used next. Computer programs show this strongly: at any moment, a running program only touches a small slice of its memory. That clustering is what makes caches work. If accesses were spread evenly, a small fast cache would catch almost nothing; because they cluster, a tiny cache catches most of the action.

Locality of reference is an empirical regularity about access distributions: when an agent or process repeatedly draws from a space of targets, the draws are not uniform but strongly autocorrelated in time and in position. Temporal locality means recently accessed items have an elevated probability of being accessed again soon; spatial locality means items adjacent (in address space, on disk, on a shelf) to a recently accessed item have an elevated probability of being accessed next. Peter Denning formalized this for operating systems and virtual memory via the working-set model (1968), showing that a running program touches only a small, slowly drifting subset of its address space over any short interval. The conceptual payoff is general: locality is the precondition that makes any cache, working set, or hot-subset strategy worth attempting. Where access is concentrated, a tiny resident store captures most demand; where it is uniform, no such shortcut exists, and the absence of locality is itself a diagnostic.
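The claim that a tiny resident store captures most demand only when access is concentrated can be checked with a small simulation. The sketch below replays two synthetic traces through an LRU cache. The workload shape (a hot window of 50 items receiving 90% of accesses and drifting every 2000 accesses), the sizes, and all function names are illustrative assumptions, not measurements from a real system.

```python
import random
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay an access trace through an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used key
    return hits / len(trace)

def uniform_trace(n_items, length, rng):
    """Every item is equally likely on every access (no locality)."""
    return [rng.randrange(n_items) for _ in range(length)]

def clustered_trace(n_items, length, rng, hot_size=50, p_hot=0.9, drift_every=2000):
    """Assumed workload: most accesses fall in a small hot window that drifts slowly."""
    trace = []
    base = 0
    for i in range(length):
        if i > 0 and i % drift_every == 0:
            base = (base + hot_size // 2) % n_items   # slide the hot window
        if rng.random() < p_hot:
            trace.append((base + rng.randrange(hot_size)) % n_items)
        else:
            trace.append(rng.randrange(n_items))
    return trace

rng = random.Random(0)
N, LENGTH, CAPACITY = 10_000, 50_000, 100   # the cache holds 1% of the items
uniform_rate = lru_hit_rate(uniform_trace(N, LENGTH, rng), CAPACITY)
clustered_rate = lru_hit_rate(clustered_trace(N, LENGTH, rng), CAPACITY)
print(f"uniform hit rate:   {uniform_rate:.3f}")
print(f"clustered hit rate: {clustered_rate:.3f}")
```

Under these assumptions the uniform trace should hit at roughly the cache's share of the item space, while the clustered trace should hit on most accesses. The exact figures depend on the chosen parameters.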

Broad Use

Computer science: memory and disk accesses cluster, which is the reason caches, prefetching, and paging work at all.
Geography (non-obvious): Tobler's first law — "everything is related to everything else, but near things are more related than distant things" — is spatial locality of interaction.
Epidemiology: contacts and infections cluster spatially and temporally (households, neighborhoods, recent exposure windows), which is why local containment and contact-tracing are effective.
Library / information science: a small set of recently and frequently consulted works dominates circulation; reshelving and reserve collections exploit this.
Linguistics / text: word reuse is bursty — a word just used is disproportionately likely to recur soon, underpinning cache-based language models.
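The bursty word reuse in the last item can be made measurable. The sketch below counts how often a token already appeared among the previous k tokens, then compares two toy sequences with identical word frequencies but different orderings. The function name `recent_reuse_rate` and the toy sequences are illustrative assumptions.

```python
def recent_reuse_rate(tokens, k):
    """Fraction of tokens that also occur among the k tokens just before them."""
    hits = 0
    for i, tok in enumerate(tokens):
        if tok in tokens[max(0, i - k):i]:   # look back at most k positions
            hits += 1
    return hits / len(tokens)

# Same word counts in both sequences; only the ordering differs.
bursty = "cache cache cache model model model text text text".split()
interleaved = "cache model text cache model text cache model text".split()

bursty_rate = recent_reuse_rate(bursty, k=2)
interleaved_rate = recent_reuse_rate(interleaved, k=2)
print(bursty_rate, interleaved_rate)
```

Because both sequences use each word three times, the gap between the two rates comes from ordering alone, which is the sense in which locality describes autocorrelation and not only frequency.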

Clarity

Locality names why a fast-but-small local store can stand in for a slow-but-large source: not because the local store is complete, but because the access distribution is concentrated. It lets practitioners distinguish workloads with exploitable structure from genuinely random-access ones where no caching strategy can help.

Manages Complexity

By asserting that the working set at any moment is small relative to the whole address space, locality justifies maintaining and reasoning about only a tiny active subset, collapsing a vast space of possible accesses into a manageable hot region.
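One way to make "small working set" concrete is to count the distinct items referenced in a sliding window of the most recent accesses, in the spirit of Denning's working-set definition. The sketch below is a minimal version under assumptions: the window length `tau` is counted in references, not in time, and the trace is invented for illustration.

```python
from collections import Counter, deque

def working_set_sizes(trace, tau):
    """For each position, count the distinct items among the last `tau` references."""
    window = deque()
    counts = Counter()
    sizes = []
    for item in trace:
        window.append(item)
        counts[item] += 1
        if len(window) > tau:
            old = window.popleft()      # drop the reference that left the window
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        sizes.append(len(counts))
    return sizes

# Invented trace: the program works on items {1, 2, 3}, then moves to {7, 8, 9}.
trace = [1, 2, 1, 3, 2, 1, 7, 8, 7, 7, 8, 9]
sizes = working_set_sizes(trace, tau=4)
print(sizes)  # [1, 2, 2, 3, 3, 3, 4, 4, 3, 2, 2, 3]
```

The trace touches six distinct items overall, yet the window never holds more than four, and the count rises only briefly while the program moves from one cluster to the next.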

Abstract Reasoning

Recognizing locality supports the inference "if accesses cluster, then keeping the recent/near subset close yields a high hit rate". Conversely, it warns that when locality is absent (uniform random access), caching, prefetching, and local intervention gain almost nothing: a store holding a small fraction of the targets serves only about that fraction of requests. It turns "should we cache/prefetch/localize?" into an empirical question about the access distribution's concentration.
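The inference can be stated as a rough calculation. The symbols are assumptions introduced here for illustration: N is the number of possible targets, C is the number the local store can hold, and in the clustered case a fraction p of accesses goes to a hot set of H items. If accesses are independent and uniform, any policy hits at the store's share of the space. If the hot set fits in the store and the policy keeps it resident, the hit rate is at least p.

```latex
h_{\text{uniform}} = \frac{C}{N},
\qquad
h_{\text{clustered}} \ge p \quad \text{when } H \le C .

% Illustrative numbers (assumed, not measured):
% N = 10^{6},\ C = 10^{3} \;\Rightarrow\; h_{\text{uniform}} = 10^{-3}
% p = 0.9,\ H \le C \;\Rightarrow\; h_{\text{clustered}} \ge 0.9
```

The same store size gives hit rates that differ by orders of magnitude, and the only thing that changed is the concentration of the access distribution.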

Knowledge Transfer

The CPU-cache insight transfers directly to epidemiology: both treat a clustered access/contact distribution as the lever — caches keep the hot working set local, contact tracing keeps containment local to the recent/near cluster. It also transfers to recommender and search systems, which prefetch the predictably-near-next items.

Relationships to Other Abstractions

Current abstraction Locality Of Reference Prime

Parents (3) — more general patterns this builds on

Locality Of Reference is a kind of Recurrence Prime

Locality of reference is a kind of recurrence in which recently or nearby accessed items reappear with predictable frequency.
Locality Of Reference presupposes Heavy-Tailed Distributions Prime

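Locality of reference presupposes heavy-tailed distributions in the sense that clustered access patterns typically show up as skewed, heavy-tailed access frequencies across the address space: a few targets absorb most accesses while most targets are rarely touched.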
Locality Of Reference decompose Spatial Indexing Prime

Spatial indexing exploits locality of reference (the access-pattern property) for its output-sensitive payoff.

Children (1) — more specific cases that build on this

Caching Prime presupposes Locality Of Reference

Caching presupposes locality of reference because exploiting a fast-local copy only pays off when accesses cluster temporally or spatially.

Not to Be Confused With

Locality of reference is not frame of reference, which is a chosen coordinate system rather than a clustering of accesses; the apparent resemblance between the two is a high-similarity embedding artifact. It is not caching, which is the technique that exploits locality; locality is the underlying empirical pattern that makes caching work. It is not buffering, which absorbs rate mismatch over time rather than asserting that accesses are spatially or temporally clustered.