Attention¶
Core Idea¶
Attention is the selective allocation of a limited cognitive, organizational, or computational resource to a subset of available information, options, or tasks — what James (1890) classically described as "the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought." [1] It is the gating mechanism that permits selected items to be processed deeply, while unselected items are filtered, delayed, or discarded, as Broadbent (1958) formalized in his filter theory of selective attention. [2] The scarcity is absolute: no agent (human, organization, algorithm) can process everything simultaneously; attention is how scarcity surfaces upstream of decision-making.
Structural Signature¶
Attention exhibits consistent structural properties across domains:
Resource bottleneck. Processing capacity is finite and fixed per unit time, a constraint Kahneman (1973) developed into a unified capacity model of attentional effort. [3]
Allocation mechanism. Inputs compete for selection via salience, goals, emotion, learned filters, or interrupt protocols, a duality Corbetta and Shulman (2002) mapped onto distinct goal-directed (top-down) and stimulus-driven (bottom-up) brain networks. [4]
Consequence asymmetry. Selected items receive deep processing and influence decisions; unselected items have effectively no causal effect on the decision, regardless of their objective value, a consequence Mack and Rock (1998) demonstrated empirically in their studies of inattentional blindness. [5]
Cost of filter failure. Misallocation (attending to noise, ignoring signal) degrades outcomes; opportunity cost of wrongly allocated attention compounds, as Wickens (2008) shows in his multiple resource theory of mental workload and dual-task interference. [6]
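The four properties above can be sketched as a toy gate. This is a minimal illustration, not a model from the cited literature: the `Item` fields, the salience-only selection rule, and all numbers are assumptions chosen to make the asymmetry visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    salience: float  # how strongly the item competes for selection (assumed scale 0..1)
    value: float     # objective worth, which the gate never sees


def gate(items, capacity):
    """Select at most `capacity` items by salience; the rest are filtered out."""
    ranked = sorted(items, key=lambda it: it.salience, reverse=True)
    return ranked[:capacity], ranked[capacity:]


def outcome(selected):
    """Only selected items contribute; unselected value never enters the sum."""
    return sum(it.value for it in selected)


items = [
    Item("loud alert", salience=0.9, value=1.0),
    Item("routine email", salience=0.6, value=0.5),
    Item("faint anomaly", salience=0.2, value=10.0),
]

selected, filtered = gate(items, capacity=2)
realized = outcome(selected)

# Opportunity cost: compare against a gate that had selected by value instead.
best_possible = outcome(sorted(items, key=lambda it: it.value, reverse=True)[:2])
misallocation_cost = best_possible - realized
```

In this hypothetical run the high-value "faint anomaly" is filtered out because it loses the salience competition, so its value contributes nothing to the outcome.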
What It Is Not¶
Attention is distinct from but often conflated with:
— Focus: a narrower outcome referring to concentration quality and depth on a single object. Attention gates what can be focused on; focus measures the quality of that gate's output.
— Prioritization: an ordering or ranking of options by value. Prioritization determines which items should receive attention; attention determines which items actually do. An excellent priority list with poor attention allocation yields poor results.
— Vigilance: sustained readiness to detect a rare signal. Vigilance is a mode of attention deployment; attention is the broader resource-allocation phenomenon.
— Consciousness: the subjective experience of processing. Attention is a functional mechanism; consciousness may or may not accompany it.
Broad Use¶
Cognitive psychology (Broadbent 1958, Treisman, Posner & Petersen 1990): selective attention in perception, attention spans, dichotic listening, attentional bottlenecks, top-down (goal-driven) vs bottom-up (stimulus-driven) attention networks, attentional control disorders. [7]
Economics & finance (Simon 1971, Davenport & Beck 2001): attention economy, "a wealth of information creates a poverty of attention," bounded rationality, market anomalies driven by retail investor attention, algorithmic attention to market microstructure. [8]
Machine learning & AI (Vaswani et al. 2017, transformer architectures): attention mechanisms in sequence-to-sequence models, self-attention, multi-head attention, cross-attention, scaled dot-product attention as differentiable allocation. [9]
Computer science & software engineering (Tanenbaum & Bos 2014): task scheduling, interrupt handling, event-driven systems, cache coherence (memory attention), resource allocation in operating systems. [10]
Organizational management (Ocasio 1997 attention-based view of the firm): executive attention as the scarcest organizational resource, how leadership focus shapes strategic decisions, the role of information channels and issue interpretation in directing attention. [11]
Neuroscience (Desimone & Duncan 1995): neural correlates of selective attention (parietal and frontal networks), attention as gain modulation in sensory cortex, cholinergic attention system, salience networks, attentional disorders (ADHD, neglect syndrome). [12]
Advertising & marketing (Davenport & Beck 2001): eyeballs as a commodity, attention capture as core business function, ad placement optimization, algorithmic feeds designed to maximize engagement (i.e., attention extraction). [13]
Clarity¶
Attention is the gating mechanism—the structural process that selects which inputs are processed. Distinguishing it from related concepts clarifies where scarcity operates and what drives outcomes. A decision-maker with excellent prioritization but poor attention allocation will implement the wrong strategy. An organization with clear objectives but no attention management will scatter effort across low-value signals. Naming this layer makes the cost of inattention visible and measurable.
Manages Complexity¶
Attention transforms an overwhelming problem into a tractable framework. Instead of "I can't process everything" (paralyzing), the framework asks: What is the resource limit? What triggers or guides allocation—salience, emotion, learned heuristics, organizational norms, interrupt protocols? What is the filter rule, and what does it exclude? What is the opportunity cost of misallocation? This applies whether managing human workload, network bandwidth, GPU memory, or organizational focus.
The framework also clarifies where intervention can occur: alter the input stream, change salience cues, modify the allocation rule, design better filters, or measure and reward correct attention patterns.
Abstract Reasoning¶
Attention invites thinking in terms of:
— Signal-to-noise ratios and how salience distorts them.
— Opportunity cost: the value of what is not attended to.
— Attentional capture: involuntary shifts (e.g., sudden loud noise, threat stimulus) that override top-down goals.
— Attention as a competitive landscape: information, stimuli, and claims fight for processing capacity.
— Attentional debt: the accumulation of unprocessed important signals until the system fails (e.g., diagnostic miss, strategic surprise).
— Domain transfer: a mechanism effective in one domain (e.g., interrupt prioritization in operating systems) may transfer to another (e.g., meeting protocols in organizations).
Knowledge Transfer¶
The structural insight recurs across personal productivity, clinical diagnostics, military command, financial markets, neural networks, and organizational hierarchy. A radiologist allocates attention to anatomical regions; a CEO allocates attention to board reports and market signals; a neural network allocates attention weights to input features; a manufacturing plant allocates supervisor attention to bottleneck lines. The mechanism is identical: bounded capacity, selective allocation, feedback loops, consequences of misallocation.
Mechanisms from one domain transfer across others: salience weighting (how a clinical flag redirects a radiologist's attention), interrupt protocols (how urgent signals preempt routine processing), curated feeds (how information architecture shapes allocation), and measurement of attention metrics (e.g., dwell time, engagement, processing latency).
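Of the mechanisms listed above, the interrupt protocol is the easiest to make concrete. The sketch below uses Python's standard `heapq` module as a priority queue; the task labels, the priority numbers, and the one-item-per-step processing model are illustrative assumptions rather than anything specified in the source.

```python
import heapq
import itertools

_counter = itertools.count()  # tie-breaker keeps equal priorities first-in, first-out


def push(queue, priority, label):
    """Lower number = more urgent, mirroring interrupt priority levels."""
    heapq.heappush(queue, (priority, next(_counter), label))


def run(queue, arrivals, steps):
    """Process one item per step; `arrivals` maps step -> (priority, label)."""
    processed = []
    for step in range(steps):
        if step in arrivals:
            push(queue, *arrivals[step])
        if queue:
            _, _, label = heapq.heappop(queue)
            processed.append(label)
    return processed


queue = []
push(queue, 5, "weekly report")
push(queue, 5, "budget review")
push(queue, 5, "vendor email")

# An urgent signal arrives at step 1 and preempts the remaining routine work.
order = run(queue, arrivals={1: (0, "critical finding")}, steps=4)
```

The same structure could stand in for a radiologist's worklist or a meeting agenda: routine items wait in arrival order, and a sufficiently urgent item jumps the queue.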
Examples¶
Formal/abstract¶
In formal terms, attention is a weight vector w ∈ [0, 1]^n where Σw_i = 1, applied to an input set x to produce y = w ⊙ x. The weight distribution is determined by a query q and the input features, via a function f: w = softmax(f(q, x)), where f can be a learned neural function or a heuristic rule. The allocation rule f encodes the salience function, goal alignment, and learned filters. In humans, this is approximated by the Posner attention networks; in transformers, by scaled dot-product attention; in organizations, by meeting agendas and reporting hierarchies.
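As a rough illustration of w = softmax(f(q, x)), the sketch below takes f to be the scaled dot product used in transformers. Two choices are assumptions beyond the text: keys and values are supplied as separate lists standing in for features of x, and the weighted inputs are summed into a single output vector, as transformer attention does, rather than kept as the elementwise product w ⊙ x. All numbers are made up.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into weights in [0, 1] that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    # Weighted sum over inputs: each value contributes in proportion to its weight.
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
weights, output = attend(query, keys, values)
```

The key most aligned with the query receives the largest weight, and because the weights sum to 1, raising one input's share necessarily lowers the others': the scarcity constraint in differentiable form.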
Applied/industry¶
Clinical radiology: A radiologist reads 40 chest X-rays per hour. Each image contains ~50 anatomically distinct regions and millions of pixels. Attention capacity is the bottleneck. A small nodule might be present but escape notice amid vascular shadows unless a clinical flag (patient smoking history, prior imaging, AI detector highlighting the region) redirects attention. The radiologist's diagnostic ability is the same in both cases; what differs is the allocation of attention. False negatives of this kind are attention misallocations: the signal was visible but not selected for processing.
Market trading: A hedge fund manager cannot monitor all 5,000 stocks simultaneously. Attention gets allocated to a curated watchlist, sector rotation signals, and macroeconomic indicators. When a major news event (earnings surprise, regulatory change) breaks, attention shifts via interrupt protocol. Profitable trading depends not on perfect knowledge but on attention being allocated to the right signals at the right time. Opportunity cost of attention is quantifiable: every minute spent analyzing Stock A is not spent on Stock B.
Executive strategy: A CEO receives 200+ emails per day and has 8 hours of meeting capacity. Attention allocation is filtered by executive assistants, meeting agendas, and board priorities. A disruptive competitive threat might be missed entirely if no information channel surfaces it (lack of salience) or if attention is directed elsewhere (competing priorities). The CEO's strategic decisions are constrained not by analytical ability but by which signals received attention.
Mapped back: In all three cases, the outcome (diagnosis, trade return, strategic decision) depends on what was selected for processing, not on the average quality of all available information. Attention acts as a gate. Improving outcomes requires either improving the filter rule (what to attend to) or expanding the bottleneck (increasing processing capacity, e.g., via assistants, AI tools, delegation). Ignoring attention as a structural constraint leads to strategies that look rational on paper but fail in practice because key information was never processed.
Structural Tensions¶
T1: Attention speed vs. attention accuracy. Fast allocation (snap judgments, interrupt-driven) risks attending to noise. Slow, deliberative allocation is more accurate but cannot keep pace with input flow. The tension is resolved by multi-level architectures: rapid filtering of obviously low-value inputs, careful processing of borderline cases. In organizations: quick triage by junior staff, careful review by experts.
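The multi-level resolution of T1 can be sketched as a two-stage filter: a cheap score settles the obvious cases, and only borderline inputs pay for the slow, careful pass. The function names, the thresholds, and the `quick`/`true` fields are hypothetical.

```python
def triage(inputs, cheap_score, careful_score, reject_below, accept_above):
    """Two-level filter: a fast pass settles clear cases, a slow pass the rest."""
    accepted, rejected, careful_calls = [], [], 0
    for item in inputs:
        quick = cheap_score(item)
        if quick < reject_below:
            rejected.append(item)       # obviously low value: drop immediately
        elif quick > accept_above:
            accepted.append(item)       # obviously important: pass straight through
        else:
            careful_calls += 1          # borderline: spend the expensive review
            if careful_score(item) >= 0.5:
                accepted.append(item)
            else:
                rejected.append(item)
    return accepted, rejected, careful_calls


inputs = [
    {"id": "spam", "quick": 0.05, "true": 0.0},
    {"id": "alarm", "quick": 0.95, "true": 1.0},
    {"id": "borderline-signal", "quick": 0.5, "true": 0.9},
    {"id": "borderline-noise", "quick": 0.4, "true": 0.1},
]

accepted, rejected, careful_calls = triage(
    inputs,
    cheap_score=lambda it: it["quick"],
    careful_score=lambda it: it["true"],
    reject_below=0.2,
    accept_above=0.8,
)
```

Widening the gap between the two thresholds buys accuracy at the cost of more careful reviews, which is the speed-accuracy trade restated as a tunable parameter.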
T2: Top-down goal alignment vs. bottom-up salience capture. Goal-driven attention selects items that support current objectives; salience-driven attention (loud noise, threat stimulus, emotional content) captures processing regardless of goals. Neither is optimal alone. Evolutionary and organizational design resolves this via weighted combination: goals set baseline allocation, but sufficiently salient inputs override (e.g., a threat stimulus interrupts productive work; new market data interrupts existing strategy). The tension is healthy; resolution requires transparency about the weighting.
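One way to picture the weighted combination described in T2 is a blend with an override threshold. This is a sketch under stated assumptions: the linear blend, the `goal_weight` of 0.7, and the `override_at` cutoff of 0.9 are illustrative choices, not values from the literature.

```python
def allocation_score(goal_relevance, salience, goal_weight=0.7, override_at=0.9):
    """Blend goal relevance and salience; very salient inputs bypass the blend."""
    if salience >= override_at:
        return 1.0  # interrupt: captures attention regardless of current goals
    return goal_weight * goal_relevance + (1.0 - goal_weight) * salience


# Each candidate is (goal_relevance, salience), both on an assumed 0..1 scale.
candidates = {
    "quarterly plan": (0.9, 0.2),
    "chat notification": (0.1, 0.6),
    "fire alarm": (0.0, 0.95),
}

scores = {name: allocation_score(g, s) for name, (g, s) in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Writing the weighting down as explicit parameters is one form of the transparency the paragraph calls for: anyone can see how much salience is allowed to count and where the override sits.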
T3: Selective attention depth vs. breadth coverage. Deep focus on one item (high processing depth, high output quality) necessarily reduces coverage of other items. Distributed attention across many items (broad monitoring) necessarily reduces depth on each. This is not a trade-off to be "solved" but a fundamental constraint. Different roles require different positions on this spectrum: a surgeon needs extreme focus; a director of strategy needs broad monitoring. Team composition and task design reflect this.
T4: Attention scarcity vs. information abundance. Information production has accelerated (digital systems, social media, real-time data feeds) while human and organizational attention has not. The mismatch is structural. Naive responses (trying to attend to all information) fail; effective responses accept scarcity and design better filters and curation mechanisms. This creates a market for attention direction: news curation, analyst reports, algorithmic recommendations, information design. The tension is unlikely to resolve; managing it is a core organizational competency.
T5: Measurable attention vs. implicit attention. Attention that is conscious and explicit (task focus, listed priorities) can be managed and communicated; implicit attention (heuristic salience filters, learned associations, emotional reactions) is harder to measure and justify but often drives allocation. Organizations often manage only explicit attention while implicit attention dominates outcomes. Resolution requires making implicit allocation transparent: auditing which signals actually received processing, tracing how allocation decisions were made, surfacing hidden salience functions.
T6: Individual attention vs. collective attention. In groups, attention is fractured across members: the team's coverage is the union of what its members attend to, while its jointly held attention is only the intersection. A team can therefore attend to more signals in parallel but loses depth and coherence. A single individual can achieve deep focus but has narrow coverage. Organizations oscillate between decentralized attention (individual teams focus deeply) and centralized attention (aligned priorities, loss of specialist focus). The tension is structural to hierarchy and scale.
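The contrast between parallel coverage and jointly held attention in T6 can be made concrete with sets. The roles and signals below are invented for illustration.

```python
# Hypothetical team: each member attends to a small set of signals.
team_attention = {
    "analyst": {"pricing", "churn", "competitor launch"},
    "engineer": {"outage risk", "churn", "tech debt"},
    "manager": {"churn", "hiring", "pricing"},
}

# Coverage: signals that at least one member is attending to.
covered = set().union(*team_attention.values())

# Shared attention: signals that every member is attending to.
shared = set.intersection(*team_attention.values())
```

In this toy case the team covers six signals but holds only one in common, which is why adding members widens monitoring without automatically producing a coherent group view.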
Structural–Framed Character¶
Attention is a hybrid on the structural–framed spectrum, and it leans structural with only a light frame. Part of it is a bare pattern that holds in any system: a limited resource selectively allocated to some inputs while others are filtered out. Part of it is a vocabulary inherited from psychology, where it was first studied as a property of the mind.
The structural core — a fixed processing bottleneck, gating, and selective allocation under a capacity limit — applies unchanged to cognition, to organizations triaging tasks, and to computational systems weighting inputs, and stating it requires no appeal to human norms. That pattern is largely descriptive and value-neutral. The residual frame comes from the prime's psychological home: the language of the mind "taking possession" of objects, of vivid awareness and trains of thought, carries assumptions about an experiencing subject that a bare resource-allocation pattern does not need. Because the resource-bottleneck pattern carries most of the weight while the mentalistic vocabulary adds only a thin layer, it sits just on the structural side of the middle.
Substrate Independence¶
Attention is about as substrate-independent as a prime can be: composite 5 / 5 on the substrate-independence scale. Its signature is fully substrate-agnostic: a limited resource selectively allocated to a subset of available information, with a bottleneck, an allocation mechanism, and an asymmetric consequence. It spans psychology, economics, computer science, organizational management, and neuroscience, and the examples pair clinical radiology with the formal weight vector of an ML attention mechanism. That breadth, backed by strong, concrete evidence, marks it as a canonical cross-substrate prime.
- Composite substrate independence — 5 / 5
- Domain breadth — 5 / 5
- Structural abstraction — 5 / 5
- Transfer evidence — 5 / 5
Relationships to Other Primes¶
Foundational — no parent edges in the catalog.
Children (8) — more specific cases that build on this
- Cognitive Load is a kind of Attention
Cognitive load is a specialization of attention. The general pattern is the selective allocation of a limited cognitive resource to a subset of available information, with absolute scarcity at the gating point. Cognitive load instantiates this as the working-memory budget: the total mental effort consumed by the current allocation, decomposed into intrinsic, extraneous, and germane components. It is attention's scarcity quantified as a load measure with characteristic capacity bounds, where exceeding capacity degrades performance. Manipulating load by chunking or scaffolding is manipulating where the attentional resource gets spent.
- Curiosity is a kind of Attention
Curiosity is a specialization of attention. The general pattern is the selective allocation of a limited cognitive resource to a subset of available information, gating what gets processed deeply. Curiosity instantiates this with the allocation criterion being a perceived information gap: the reasoner orients selective attention toward stimuli that promise to close the gap between current and possible knowledge, with the gap-resolution itself intrinsically rewarding. Berlyne and Loewenstein's information-gap framing makes curiosity the attentional bias whose targeting rule is gap salience, particularly within the Goldilocks zone of optimal challenge.
- Attentional Capacity presupposes Attention
Attentional capacity names the finite pool of selective-attention bandwidth a bounded agent can deploy at one moment. It presupposes the prior pattern of attention itself: the selective allocation of a limited cognitive resource that gates which inputs are processed deeply. Without attention as a gating mechanism enforcing scarcity, there is no resource-pool to measure and no characteristic failure modes (interference, slowing, capture) to predict. Attentional capacity quantifies the bound that attention's framing already commits to as absolute.
- Emphasis presupposes Attention
Emphasis presupposes attention because the mechanism of foregrounding selected information against a contrast background is operationally aimed at directing scarce attentional allocation toward the foregrounded element. Without attention's gating function — the selective allocation of limited processing to a subset of available information — emphasis vehicles like stress, typography, position, and syntactic marking would have no asymmetric effect to produce, since all input would be processed equally. Attention supplies the scarcity that makes selective allocation meaningful; emphasis supplies the technique that biases the allocation toward chosen content.
- Movement (Visual Movement) presupposes Attention
Visual movement presupposes attention because compositional flow operates by selectively gating the viewer's perceptual resource: implied lines, diagonals, and rhythmic repetitions only carry meaning when they capture and route the limited gaze the viewer can deploy. Without attention's filter mechanism that selects a subset of available information for deep processing, the directional cues would have no purchase. The eye's traced path through a work is precisely an attentional allocation, sequenced by the artist.
- Oversight Capacity presupposes Attention
Oversight capacity is the structural invariant that any single overseeing entity can handle only a finite number of direct sub-units before coordination and decision quality decay. The binding source of that finitude is bounded attention — the inherently scarce cognitive resource that must be allocated across reports, signals, and decisions. Attention names the gating mechanism by which a limited resource is selectively assigned across competing inputs; oversight capacity is the management-side consequence of attention's scarcity, presupposing the attentional constraint as the upstream cause of any span-of-control limit.
- Priming presupposes Attention
Priming presupposes attention because the spreading-activation mechanism that makes a prior stimulus facilitate processing of related ones operates on the selective allocation of cognitive resource. The prime transiently raises activation of related representations, biasing what attention selects and processes deeply downstream. Without attention's gating function, in which a limited resource is directed at some items at the expense of others, there is no foreground for the prime's activation residue to bias. Priming exploits and depends on the same selection bottleneck attention names.
- Flow State is a decomposition of Attention
Attention is the selective allocation of a limited cognitive resource to a subset of available inputs, gating what gets processed deeply. Flow state is the particular shape this allocation takes when challenge meets skill at the edge of capability with clear goals and immediate feedback: attention locks onto the task to the exclusion of self-monitoring, time-tracking, and distraction, and action and awareness merge. It is a structurally-particularized instance of selective allocation whose specific configuration is total task-absorption produced by the challenge-skill match.
Neighborhood in Abstraction Space¶
Attention sits among the more crowded primes in the catalog (13th percentile for distinctiveness): several abstractions describe nearly the same structure, so a description that fits it will tend to fit its neighbors too — transporting it usually means disambiguating within this family rather than landing on it exactly.
Family — Perception, Memory & Pattern (13 primes)
Nearest neighbors
- Attentional Capacity — 0.89
- Cognitive Resource Depletion — 0.83
- Cognitive Load — 0.81
- Priming — 0.81
- Processing Fluency — 0.81
Computed from structural-signature embeddings · 2026-05-29
Not to Be Confused With¶
Attention must be distinguished from Emphasis (similarity 0.708), a closely related prime. Emphasis is the rhetorical, perceptual, or design technique of making a selected item more salient, through formatting (bold text, bright colors, large size), prominence (top placement, repeated mention), or emotional framing (urgency, fear appeal). Emphasis is about how to make something stand out. Attention is about what gets processed in the first place. The difference is crucial: emphasis works on items already in the attended field, amplifying or highlighting them within the processing that has already been allocated. Attention is the gate that determines which items enter the attended field. You can emphasize something that no one attends to (for example, a faint highlight among brighter colors in a region that is unattended); the emphasis has no effect because attention was never allocated to that region. Conversely, an item can receive deep attention without any special emphasis (a stock ticker in the corner of a trader's screen receives attention but no visual emphasis). Emphasis is a tool for increasing the salience of already-attended items; attention is the selection mechanism that determines what gets attended. In practice, emphasis tries to capture attention (by making something more salient, one hopes to trigger attention), but the relationship is asymmetrical: emphasis presumes some attention, while attention determines what emphasis can affect. A well-designed information interface combines both: it allocates attention through filtering and curation (attention architecture) and then emphasizes the most critical items within the attended field (emphasis design).
Attention is not Cognitive Load, though the two interact. Cognitive load is the total processing demand placed on a finite working-memory budget — the sum of mental effort required by all items currently being processed. Attention is the allocation mechanism that decides which items from the total available set receive processing capacity. Cognitive load measures the pressure on the system; attention determines how that capacity is distributed. A task can have low cognitive load (simple items, easy processing) but poor attention allocation (focus on irrelevant items, missing important signals). A task can have high cognitive load (complex items, many competing demands) but good attention allocation (focus on critical items despite the pressure). The relationship is sequential: attention determines which items get capacity; cognitive load measures the resulting burden. Too-high cognitive load indicates that either (1) too many items were allocated attention, or (2) the items allocated attention are inherently demanding. In the first case, attention allocation is the problem; in the second, it may be unavoidable (a surgeon in an emergency must attend to many life-critical items simultaneously, creating high cognitive load). The error is treating cognitive load as if it determines attention — in fact, attention choices create the cognitive load. An individual with good attention discipline can manage high-load tasks better than someone with poor attention who attends to low-priority but cognitively-demanding tasks. Thus, attention discipline reduces experienced cognitive load by filtering what receives capacity in the first place.
Attention is distinct from Prioritization, which is the ranking of items by value, urgency, or importance: determining which items should receive resources. Prioritization is a ranking, a list, a plan. Attention is the actual moment-to-moment allocation of processing to items right now. A perfectly prioritized list (items ranked by importance, constraints, deadlines, strategic value) with poor attention allocation (the person is focused on low-priority items, distracted by interruptions, or attending to items in the wrong order) yields poor outcomes. The priority list is aspirational and static; attention is the dynamic, real-time mechanism. A classic failure pattern is the holder of a well-prioritized to-do list whose actual attention is driven by email urgency, meeting schedules, or emotional salience rather than the stated priorities. The list says "Item A is highest priority," but attention goes to Item B because it is demanding an immediate response. Prioritization describes what should be processed; attention describes what is processed. Improving outcomes requires aligning attention with priorities: actually processing the high-priority items, not just ranking them. This distinction clarifies why many productivity systems fail: they produce good prioritization without managing the attention allocation that makes prioritization actionable. The solution is not a better priority list but better attention architecture: eliminating distractions, curating inputs, and designing notification systems to surface priorities rather than letting urgency dominate. Thus, prioritization and attention are sequential: one prioritizes items, then must manage attention to ensure the high-priority items actually get processed.
Solution Archetypes¶
Solution archetypes in the catalog that build on this prime — directly (this prime is a source ingredient) or as a related prime.
Built directly on this prime (4)
- Activation Decay Measurement
- Cascaded Hierarchical Recognition
- Negative Priming Avoidance
- Novelty-Driven Attention Capture
Also a related prime in 13 archetypes
- Ambiguity-Exploitation in Visual Metaphor
- Fluency-Based Preference Exploitation
- Gestalt Continuation and Grouping Activation
- Goal Valence Decomposition and Separation
- Negative Space as Structural Element
- Negative-Mere-Exposure Reversal for Disliked Targets
- Participation Equity and Inclusion Design
- Sacred Object or Totem Introduction
- Symbol-System Coherence in Visual Art
- Synchrony Induction and Rhythm Alignment
Notes¶
Attention is often treated as an individual cognitive trait ("she has poor attention") rather than a structural constraint shared across all agents. This obscures its power as an explanatory lens and mislocates the problem. Poor organizational decisions are often not due to poor intelligence or analysis but to poor attention allocation; the signal was available but not processed.
The rise of digital information abundance has made attention management increasingly central to strategy. Organizations that excel at filtering, curation, and directing attention to high-value signals outcompete those that try to process everything. Similarly, individuals who master attention management (via systems, delegation, prioritization) are more effective than those with higher raw cognitive ability but poor attention discipline.
Attention is also a site of power and asymmetry. Those who control what information is salient (media, algorithms, organizational gatekeepers) effectively control what receives attention and thus what is decided. Attention engineering can be benign (good design) or manipulative (dark patterns, manufactured consent). Transparency about what shapes salience—what algorithms are optimizing for, what narratives are being pushed, what voices are muted—is a prerequisite for defensible attention allocation.
The structural asymmetry between attended and unattended items, namely that what is not selected has effectively zero causal influence on downstream decisions, is the empirical core of Simons and Chabris's (1999) "Gorillas in our midst" demonstrations of sustained inattentional blindness. [14]
The transfer mechanism, namely that salience weighting, interrupt protocols, curated feeds, and attention metrics take structurally analogous forms across cognitive, organizational, and computational substrates, is supported by Itti and Koch's (2001) computational saliency-map framework, which mechanizes the bottom-up control of attentional deployment in a form that carries over between biological and artificial systems. [15]
References¶
[1] James, W. (1890). The Principles of Psychology (Vol. 1, Ch. 11: Attention). Henry Holt and Company. Foundational psychological treatise: defines attention as the mind's "taking possession" of one out of several simultaneously possible objects of thought; canonical statement of selective allocation as the essence of attention. ↩
[2] Broadbent, D. E. (1958). Perception and Communication. Pergamon Press. Foundational filter model of selective attention: the limited-capacity channel forces early selection among competing stimuli, establishing attention as a scarce bottleneck around which perception and cognition are organized. ↩
[3] Kahneman, D. (1973). Attention and Effort. Prentice-Hall. Canonical capacity model of attention: argues that attention is a limited mental resource (effort) flexibly allocated across tasks, replacing strict-bottleneck models with a graded-capacity account of finite per-unit-time processing. ↩
[4] Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3(3), 201–215. Identifies dorsal (intraparietal/superior-frontal) and ventral (temporoparietal/inferior-frontal) attention networks underlying top-down goal-directed selection and bottom-up stimulus-driven reorienting; the neural substrate for attentional capacity allocation. ↩
[5] Mack, A., & Rock, I. (1998). Inattentional Blindness. MIT Press. Empirical demonstration that observers routinely fail to perceive unexpected, fully-visible stimuli when attention is engaged elsewhere; canonical evidence that unattended items have effectively no causal effect on conscious processing. ↩
[6] Wickens, C. D. (2008). Multiple resources and mental workload. Human Factors, 50(3), 449–455. Multiple-resource theory: dual-task interference and alert salience depend on whether tasks share modality (visual/auditory), processing code (spatial/verbal), and stage; predicts that alerts in unused modalities cut through workload more effectively. ↩
[7] Posner, M. I., & Petersen, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13(1), 25–42. Foundational framework decomposing attention into alerting, orienting, and executive networks; canonical reference for the cognitive-psychology and neuroscience treatment of attentional control. ↩
[8] Simon, H. A. (1971). Designing organizations for an information-rich world. In M. Greenberger (Ed.), Computers, Communications, and the Public Interest (pp. 37–72). Johns Hopkins University Press. Coining of the attention-economy concept: "a wealth of information creates a poverty of attention"; foundational analogue for treating expert capacity as a finite human resource bounded by carrier, not by motivation or willingness. ↩
[9] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017) (pp. 5998–6008). Introduces the Transformer architecture with multi-head attention as the sole sequence-mixing mechanism; attention heads constitute a bounded per-layer per-token selection budget — a non-biological instance of the attentional-capacity pattern. ↩
[10] Tanenbaum, A. S., & Bos, H. (2014). Modern Operating Systems (4th ed.). Pearson. Standard operating-systems textbook: develops process scheduling, interrupt handling, event-driven I/O, and resource allocation as the OS-level analogue of attentional gating across competing computational demands. ↩
[11] Ocasio, W. (1997). Towards an attention-based view of the firm. Strategic Management Journal, 18(S1), 187–206. Treats firm behavior as the outcome of how an organization channels and distributes the bounded attention of its decision-makers; foundational for board-level monitoring-capacity and committee-structure attention-allocation analyses. ↩
[12] Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18(1), 193–222. Influential biased-competition account of selective attention: develops how parietal and frontal networks bias sensory-cortex competition to amplify behaviorally relevant stimuli — the neural-level signature of attentional gain modulation. ↩
[13] Davenport, T. H., & Beck, J. C. (2001). The Attention Economy: Understanding the New Currency of Business. Harvard Business School Press. Canonical business-strategy treatment of attention as a scarce commodity: develops attention capture, attention markets, and engagement-based business models in advertising and consumer technology. ↩
[14] Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28(9), 1059–1074. Classic demonstration of sustained inattentional blindness: roughly half of observers tracking a basketball-passing task fail to notice a person in a gorilla suit walking through the scene; canonical evidence of consequence asymmetry between attended and unattended events. ↩
[15] Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience, 2(3), 194–203. Canonical saliency-map framework: formalizes bottom-up attentional deployment as a substrate-independent computational process, enabling transfer of salience-weighting, winner-take-all selection, and inhibition-of-return mechanisms across biological and artificial systems. ↩