Conceptual

Theoretical essays about the Encyclopedia of Abstractions — what the catalog is, why it works as it does, and how its primes behave as objects of cross-domain reasoning. Companion to Applications, which is about putting the catalog to use.

Foundations

Structural and Framed Primes — A Typology for Cross-Domain Portability of Abstractions

A prime abstraction is operationally defined as one applying across at least three domains of human knowledge. The threshold settles a binary question — is this prime? — but leaves untouched a further asymmetry: some primes slot into new domains as pure relational structure, while others import an interpretive context (vocabulary, evaluative commitments, institutional assumptions) along with them. This paper develops the refinement that names that asymmetry: structural primes were already domain-stripped at the moment they were named; framed primes carry an interpretive overlay that resists clean extraction. The distinction admits degree, correlates with the disciplinary character of a prime's home domain, and predicts that "super-primes" (those applying across nearly all domains) cluster on the structural end. A decomposition operation extracts a structural core from a framed prime, yielding one of three outcomes: unification with an existing structural prime, a new analytical tool, or loss of identity (a fate related to Bernard Williams's account of thick ethical terms). A bidirectional dynamic — structural primes acquiring frames when deployed in normative contexts — complicates the basic distinction productively. The paper proposes schema additions for the catalog and outlines a research agenda for testing the typology empirically.
Why Analogies Break: Projection and Residue — Structural sharing is necessary but never sufficient — and a catalog of abstractions is what makes the boundary locatable

Everyone knows analogies break down. Almost no one can say where or why with any precision, and the usual explanation — that we carelessly carry over surface details that don't belong — captures only the shallow half of the phenomenon. This essay argues for the deeper half: that even a transfer which shares nothing but genuine relational structure, and shares all of it that was ever checked, will still break when pushed. The reason is not contamination but incompleteness. An abstraction is a projection: it keeps a specified set of relations and discards everything else. Two situations that instantiate the same abstraction are guaranteed to share that projected structure and — if they are genuinely different situations — guaranteed to differ everywhere the projection was silent. Pushing an analogy means drawing inferences that lean on the discarded part. So the breakdown is not a flaw in the analogy; it is the projection's residue reasserting itself, and it occurs precisely at the edge of what was shared.

None of this is a new discovery — structure-mapping theory and the philosophy of scientific analogy have circled it for decades. The claim here is narrower and, we think, more useful: an explicit catalog of abstractions changes what kind of understanding of the phenomenon is available. Without a named inventory of the pure structural cores, a situation cannot be cleanly factored into "the shared abstraction" plus "the rest," and every breakdown looks like the same fog. With one, the boundary becomes a thing you can point at.
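
The projection-and-residue picture can be made concrete with a toy sketch. The relation triples, the two example situations, and the helper names below are illustrative assumptions, not catalog content: an abstraction is modeled as the set of relation names it keeps, and an inference is safe only if every relation it leans on was kept.

```python
from typing import FrozenSet, Set, Tuple

# Hypothetical encoding: (relation name, first role, second role).
Relation = Tuple[str, str, str]

def project(situation: FrozenSet[Relation], kept: Set[str]) -> FrozenSet[Relation]:
    """Keep only the relations the abstraction specifies; discard the rest."""
    return frozenset(r for r in situation if r[0] in kept)

def residue(situation: FrozenSet[Relation], kept: Set[str]) -> FrozenSet[Relation]:
    """Everything the projection was silent about."""
    return situation - project(situation, kept)

def leans_only_on_shared(needed: Set[str], kept: Set[str]) -> bool:
    """An inference transfers safely only if it uses kept relations alone."""
    return needed <= kept

# Toy abstraction: "a potential difference drives a flow that resistance opposes".
KEPT = {"drives", "opposes"}

# Two toy situations, already written in role terms (an assumption of this sketch).
hydraulic = frozenset({
    ("drives", "potential_difference", "flow"),
    ("opposes", "resistance", "flow"),
    ("spills_from", "carrier", "cut_conduit"),
})
circuit = frozenset({
    ("drives", "potential_difference", "flow"),
    ("opposes", "resistance", "flow"),
    ("induces", "flow", "magnetic_field"),
})

shared = project(hydraulic, KEPT) == project(circuit, KEPT)      # same projection
differ = residue(hydraulic, KEPT) != residue(circuit, KEPT)      # different residue
pushed_too_far = not leans_only_on_shared({"spills_from"}, KEPT)  # leans on residue
```

In this sketch the analogy holds for any inference built from `drives` and `opposes`, and breaks exactly where an inference needs a relation that only one situation's residue contains.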
Substrate Independence

How to read the substrate-independence scores on each prime page.
The Structural Reasoning Substrate — What a Catalog of Abstractions Provides That a Neural Network Alone Does Not

Augmented Abstract Reasoning (AAR) pairs a large language model with a curated catalog of cross-domain primes and archetypes, and empirically produces structurally faithful, auditable reasoning beyond what bare chain-of-thought reliably yields. This paper asks what the catalog provides conceptually. It argues that the catalog functions as a structural reasoning substrate: named patterns with rigorous structural signatures, explicit relations to neighbors, documented cross-domain instantiations, and load-bearing negative-space definitions. The substrate enables structural rigor (pattern composition, coherence under configuration, consistency-checking against documented constraints), a flavor of cognitive rigor distinct from the propositional rigor (truth-functional, brittle, narrow) that the formal-logic tradition has anchored on. The load-bearing claim is that the substrate's rigor depends primarily on its negative-space content: what each pattern is not, how it differs from neighbors, where it overextends, and what its failure modes are. Without rigorous negative space, a catalog of abstractions collapses into pattern-matching without real constraint. The paper positions the approach against classical neurosymbolic AI, schema-based reasoning, pattern languages, and pure neural reasoning, and surfaces implications for LLM evaluation, training, and human-AI collaboration.
The Calculus of Abstraction — From a Lexicon of Primes to a Grammar of Operations for Cross-Domain Reasoning

The Encyclopedia of Abstractions began as a catalog: a curated set of prime abstractions — patterns recurring across at least three domains — and solution archetypes. A catalog is a dictionary. This paper argues that a dictionary isn't enough, and that the more interesting object is a grammar: an explicit set of operations for manipulating abstractions, with the catalog as its lexicon. If primes are the nouns, the operations are the verbs, and the pair constitutes a calculus of abstraction. The paper enumerates the verbs already implicit in the Augmented Abstract Reasoning pipeline (lift, lower, compose, decompose, transport, salience-rank, prune, match, evaluate-fit, reconcile); brings category theory to bear (functors as cross-domain transport, free/forgetful adjunction as the lift/lower pair, Yoneda as the formalization of "an abstraction is its neighborhood"); situates the project against pattern catalogs (Alexander, Gang of Four, TRIZ), general systems theory, analogy work (Gentner), structural realism, and structured-prompting / neurosymbolic literature; and characterizes what we have built — structure as scaffold, not structure as solver — marking the frontier where scaffold could become verifier. The argument is anchored to small, confounded experiments rather than assertion; the most consequential finding is that on problems in well-covered regions, the protocol carries more of the value than the catalog.
The Verb Grammar of Abstraction Operations — a reference

Companion reference to The Limits of Runtime Scaffolding and to The Calculus of Abstraction (§5). This document specifies the operations the verb engine performs on abstractions, and grades each one honestly.

Catalog organization

From Candidate to Catalog — How Prime Abstractions Are Identified, Drafted, and Refined

The Encyclopedia of Abstractions is a catalog of prime abstractions — recurring structural patterns that travel across at least three domains of human knowledge. The catalog's force depends on more than the entries themselves; it depends on the procedure by which a candidate concept becomes an accepted, fully specified entry. This paper documents that procedure. We describe how a candidate is evaluated against the seven inclusion criteria and the six guidance rules for distinguishing prime from domain-specific patterns; how an accepted candidate is drafted in a concise form that states the thesis; how that draft is then elaborated into a long-form entry with structural signature, neighbor distinctions, formal and applied examples, structural tensions, and verified citations; and how the entry is integrated into the catalog's category, origin-domain, hierarchy, and learnability views. The procedure has evolved across roughly fifty drafting cycles, has been operated end-to-end by a single curator (Kurt Zoglmann) with the substantial assistance of triangulated LLM agents, and remains in tension with itself in places. The point is not that the procedure is finished; the point is that the catalog is the output of something articulable, repeatable, and improvable, rather than the output of taste.
The Hierarchy DAG — A Type System for Prerequisite Relations Between Primes

A catalog of prime abstractions is a set of nouns. To do real work, the nouns must be related — but is-related-to is underspecified, and managing cross-domain reasoning with a single edge type produces a brittle, undifferentiated mess. This paper formalizes the relational type system that emerged from 28 rounds of curation in the Encyclopedia of Abstractions, in which ~920 directed prerequisite edges were proposed, reviewed, committed, and (in many cases) revised. The system has four edge types — subsumption (kind-of), composition (built-from / presupposes, with two flavors), decompose (framed-applied to structural-core), and mutual (bidirectional, outside the acyclic topology) — one metadata attribute (the qualifier field: strict / typical / conditional), and two node classes (primes and connectors, the latter for composite-under-a-gloss bundles on a separate aspect_of edge layer). The system was not designed from first principles; it emerged from disciplined wiring, each refinement traceable to a specific failure mode (subsumption alone insufficient for feedback vs damping; composition alone couldn't name what signaling does to information_asymmetry; the connector layer exists because justice is real and useful but not a single structural pattern). The paper documents the system as it stands and what it enables — search moderation, curriculum tiering, faithfulness auditing, and a legible DAG visualization.
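
As a rough illustration of the typed-edge idea, the sketch below models the four edge types and the qualifier field as enums and checks acyclicity while leaving mutual edges outside the topology, as the abstract describes. The class names, field names, edge direction convention, and placeholder slugs are assumptions made for this example; the connector layer and its aspect_of edges are omitted.

```python
from dataclasses import dataclass
from enum import Enum
from graphlib import CycleError, TopologicalSorter
from typing import List

class EdgeType(Enum):
    SUBSUMPTION = "subsumption"
    COMPOSITION = "composition"
    DECOMPOSE = "decompose"
    MUTUAL = "mutual"  # bidirectional, kept outside the acyclic topology

class Qualifier(Enum):
    STRICT = "strict"
    TYPICAL = "typical"
    CONDITIONAL = "conditional"

@dataclass(frozen=True)
class Edge:
    prerequisite: str  # hypothetical field name: the node that comes first
    dependent: str     # hypothetical field name: the node that builds on it
    kind: EdgeType
    qualifier: Qualifier = Qualifier.TYPICAL

def acyclic_order(edges: List[Edge]) -> List[str]:
    """Order nodes so prerequisites precede dependents, ignoring MUTUAL edges.

    Raises graphlib.CycleError if the non-mutual edges contain a cycle.
    """
    sorter = TopologicalSorter()
    for edge in edges:
        if edge.kind is not EdgeType.MUTUAL:
            sorter.add(edge.dependent, edge.prerequisite)
    return list(sorter.static_order())

# Placeholder slugs, not real catalog entries.
EDGES = [
    Edge("prime_a", "prime_b", EdgeType.SUBSUMPTION, Qualifier.STRICT),
    Edge("prime_b", "prime_c", EdgeType.COMPOSITION),
    Edge("prime_c", "prime_a", EdgeType.MUTUAL),  # would be a cycle if it counted
]
order = acyclic_order(EDGES)
```

Retyping the last edge as a composition edge would make `acyclic_order` raise `CycleError`, which is one way a curation pass could catch a miswired edge.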
Distinctiveness and the Neighborhood Structure of Abstraction Space — Why some primes are easy to find and others get crowded out

Two primes can be equally apt for a problem and yet differ enormously in how easily a reasoner finds them. The reason is geometric: the primes do not sit in isolation but in a space of neighbors, and some regions of that space are crowded while others are sparse. A prime in a crowded region — surrounded by near-synonyms that describe almost the same structure — is hard to pin down, because a description that fits it fits its neighbors too. A prime in a sparse region stands alone, so a faithful description lands on it precisely. This paper names that property distinctiveness (equivalently, neighborhood density), explains why it, rather than a prime's position on the structural–framed spectrum, is what governs cross-domain retrievability, and describes how the Encyclopedia measures it and surfaces it on each prime's page.
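
The abstract does not say how the Encyclopedia computes distinctiveness, so the following is only one plausible reading of "neighborhood density": count how many other primes lie within a similarity threshold of a given prime. The vector representation, the cosine measure, the 0.9 threshold, and the slugs are all assumptions of this sketch.

```python
import math
from typing import Dict, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def neighborhood_density(slug: str,
                         vectors: Dict[str, Sequence[float]],
                         threshold: float = 0.9) -> int:
    """Number of other primes at least `threshold`-similar to `slug`."""
    target = vectors[slug]
    return sum(
        1
        for other, vec in vectors.items()
        if other != slug and cosine(target, vec) >= threshold
    )

# Toy 2-D vectors standing in for whatever representation is actually used.
VECTORS = {
    "crowded_a": (1.0, 0.0),
    "crowded_b": (0.99, 0.1),
    "crowded_c": (0.98, -0.1),
    "sparse": (0.0, 1.0),
}
crowded_density = neighborhood_density("crowded_a", VECTORS)
sparse_density = neighborhood_density("sparse", VECTORS)
```

A lower count corresponds to a more distinctive prime: a description that fits `sparse` fits nothing else nearby, while one that fits `crowded_a` also fits its two neighbors.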
Learnability and Curriculum Construction over a Prime Catalog — A tiering of the primes by first-encounter difficulty, with the algorithm and its honest ceiling

A catalog of 1,402 cross-domain primes is not a curriculum. To turn the catalog into something a learner can actually walk through, the primes need an ordering — a sequence in which the easier ones are met before the harder ones, with prerequisites honored. This paper describes how the Encyclopedia constructs that ordering. The approach is a difficulty-weighted topological sort that combines three kinds of evidence: a per-prime learning-age assessment derived from a triangulated LLM "ELI ladder" (explanations at 5, 10, 15, 18, and specialist levels), a set of word-level signals over the prime's slug and catalog text (Kuperman age-of-acquisition norms, SUBTLEX frequency, Brysbaert concreteness, Flesch-Kincaid-style readability), and the catalog's own typed prerequisite DAG, honored as a hard topological constraint. The output is a single linear order chunked into five display tiers, with the top tier holding the most intuitive umbrella primes and the bottom tier reserved for the small set of primes where no faithful kindergarten explanation exists. This paper says what each signal contributes, what it misses, and where the algorithm hits an honest ceiling that no purely-objective signal can break past.
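
A difficulty-weighted topological sort can be sketched with a priority queue: among the primes whose prerequisites are already placed, always emit the easiest next. The sketch assumes the three kinds of evidence have already been combined into a single difficulty number per prime (how they are combined is not stated here), and the function names and equal-size chunking rule are illustrative choices.

```python
import heapq
from typing import Dict, List, Set

def curriculum_order(difficulty: Dict[str, float],
                     prerequisites: Dict[str, Set[str]]) -> List[str]:
    """Easiest-first order that never places a prime before its prerequisites.

    Assumes every prerequisite slug also appears as a key of `difficulty`.
    """
    remaining = {s: set(prerequisites.get(s, ())) for s in difficulty}
    dependents: Dict[str, List[str]] = {s: [] for s in difficulty}
    for slug, pres in remaining.items():
        for pre in pres:
            dependents[pre].append(slug)

    # Start with primes that have no prerequisites, keyed by difficulty.
    ready = [(difficulty[s], s) for s, pres in remaining.items() if not pres]
    heapq.heapify(ready)

    order: List[str] = []
    while ready:
        _, slug = heapq.heappop(ready)
        order.append(slug)
        for dep in dependents[slug]:
            remaining[dep].discard(slug)
            if not remaining[dep]:
                heapq.heappush(ready, (difficulty[dep], dep))

    if len(order) != len(difficulty):
        raise ValueError("prerequisite graph contains a cycle")
    return order

def chunk_into_tiers(order: List[str], n_tiers: int = 5) -> List[List[str]]:
    """Split the linear order into at most `n_tiers` near-equal display tiers."""
    size = max(1, -(-len(order) // n_tiers))  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]

# Placeholder slugs: "c" is easy on its own but requires the harder "b" first.
DIFFICULTY = {"a": 1.0, "b": 5.0, "c": 2.0, "d": 3.0}
PREREQS = {"c": {"b"}}
ORDER = curriculum_order(DIFFICULTY, PREREQS)
```

Here `c` lands last even though its difficulty score is low, which shows the prerequisite DAG acting as a hard constraint rather than one more weighted signal.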
Training and Measuring Cross-Domain Transfer: The Design of Abstractopia

The central skill of expertise, insight, and analogical reasoning is the ability to see the structure a situation shares with a distant one, independent of the surface features in which that structure is dressed. This ability is also the classic failure point of education: skills learned in one context notoriously fail to transfer to another. Abstractopia is an attempt to attack that failure directly. It treats cross-domain structural recognition as a trainable meta-skill and builds the training on three commitments: (1) teach structure, never surface; (2) measure transfer by requiring recognition of the same abstraction in an ever-farther domain, with a justification of why; and (3) teach the limits of each abstraction (where it breaks) as explicitly as the abstraction itself. The content is organized around a curated set of substrate-neutral "primes" drawn from the Encyclopedia of Abstractions. A late module reframes cognitive biases (themselves abstractions, some prime and some domain-specific) not as defects but as failure modes of good reasoning tools, each a tool used on ground it does not fit. The module trains the learner to read that fit, with a mastery gate that structurally refuses to certify a reflexive "bias-spotter." We describe the pedagogy, the assessment model, and the item-design discipline that gives the assessment its validity, situate the design against the transfer and learning-science literatures, state the specific predictions it makes, and note honestly the substantial risks, chief among them that far transfer is famously hard to produce and that a well-motivated design is not the same as a demonstrated result.

Operator-driven discovery

Operator-Driven Discovery of Prime Abstractions — A saturating search over concept-space, its yield law, and what it found

The Encyclopedia of Abstractions catalogs prime abstractions — domain-general structural patterns of thought, such as feedback, intersection, and path dependence, that recur across unrelated fields. This paper reports a way to grow that catalog by treating discovery as a search over concept-space: the states are primes and the moves are a small, closed set of operators, deterministic transformations that take an existing prime (or a tuple of them) and propose a new one — dualization flips a prime's polarity, generalization strips a constraint to reach its genus, analogy completion extrapolates a fourth term. Because the move-set is closed and the corpus finite, applying every operator to every prime is a finite, checkable job, and completeness becomes something one can measure rather than assert. We built a cheap-to-expensive screening funnel and ran all fourteen operators over a 1,325-prime corpus — on the order of 20,000 applications — surfacing 127 candidate new primes (several of them conspicuous gaps, such as consequentialism absent beside virtue ethics), along with two byproducts that may be worth as much: roughly 295 candidate hierarchy edges and a body of hard negatives for training analogical reasoning. The central result is a yield law: operators that merely recombine existing primes return nothing a mature catalog does not already imply, while only the moves that reach genuinely new structural positions — cross-domain analogy and symmetry-completion — keep paying. Acceptance counts are provisional pending final human curation.
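
The exhaustive sweep over a closed move-set can be sketched as a double loop with a screening funnel. This is a toy under stated assumptions: the lookup-table operators stand in for the real operators (which the paper describes as prompt-driven), only unary operators are shown, and the slugs, screen, and function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

Operator = Callable[[str], Optional[str]]  # prime slug -> proposed slug, or None
Screen = Callable[[str], bool]             # cheap-to-expensive filters, in order

def sweep(catalog: Set[str],
          operators: Dict[str, Operator],
          screens: Sequence[Screen]) -> Tuple[int, Dict[str, List[Tuple[str, str]]]]:
    """Apply every operator to every prime; return (applications, candidates).

    `candidates` maps each surviving proposal to the (operator, source prime)
    pairs that produced it, so convergent discoveries are visible.
    """
    applications = 0
    candidates: Dict[str, List[Tuple[str, str]]] = {}
    for op_name, op in operators.items():
        for prime in sorted(catalog):
            applications += 1
            proposal = op(prime)
            if proposal is None or proposal in catalog:
                continue  # no output, or it lands on an existing prime
            # all() short-circuits, so later (costlier) screens run only
            # on proposals that passed the earlier ones.
            if all(screen(proposal) for screen in screens):
                candidates.setdefault(proposal, []).append((op_name, prime))
    return applications, candidates

# Toy stand-ins for two operators, as lookup tables.
DUALS = {"scarcity": "abundance", "abundance": "scarcity",
         "centralization": "decentralization"}
GENERA = {"centralization": "concentration"}
OPERATORS: Dict[str, Operator] = {"dualize": DUALS.get, "generalize": GENERA.get}

def is_well_formed_slug(slug: str) -> bool:
    """Hypothetical cheap screen: lowercase letters and underscores only."""
    return slug.replace("_", "").isalpha() and slug == slug.lower()

CATALOG = {"scarcity", "abundance", "centralization"}
N_APPLICATIONS, CANDIDATES = sweep(CATALOG, OPERATORS, [is_well_formed_slug])
```

Because the catalog and the operator set are both finite, the application count is simply their product, which is what makes coverage something that can be counted rather than asserted.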

Data & reproducibility. The raw material behind every quantitative claim below — the per-application model outputs for all fourteen operators (from which every NEW-rate, fixed-point census, and convergence count can be independently recomputed), the per-operator result ledgers, the fixed-point/audit censuses, the 128 candidate primes, the ~295 candidate hierarchy edges, the retained negatives, and the exact Stage-0 prompt for each operator — is published as a single bundle: Operator-driven prime discovery — raw data (~2.3 MB). A README inside maps each file to the paper claim it substantiates. See also Appendix B.
The Operator Compendium — A thorough treatment of the fourteen discovery operators

The discovery program in the companion paper is driven by fourteen operators — deterministic moves that transform an existing prime, or combine several, into a candidate new one. This document treats each operator in turn: what the move actually is (built up from everyday intuition), where it comes from, how you concretely apply it, worked examples, and what running it across the catalog taught us. It is written to be legible to a reader not already immersed in the project, and it doubles as a reference for anyone wanting to reproduce or extend the search.

Empirical work

The Limits of Runtime Scaffolding: A Null Result for Abstraction Pipelines at the Frontier

The Encyclopedia of Abstractions began as a bet that cross-domain reasoning could be improved by giving a model an explicit architecture for handling abstractions: recognize the operative primes, build a typed relational model, lift it to a domain-stripped meta-model, and transport a solution pattern from a curated catalog. This retrospective reports what happened when we tested the runtime form of that bet — scaffolding a frontier model's reasoning with the architecture at inference time — under blinded, pre-registered evaluation with deliberate confound control. The short answer: the runtime scaffold is largely inert. Varying the control structure (fixed pipeline vs. free planner vs. enforced coverage discipline) did not move design quality, did not raise coverage of load-bearing components even for a weaker solver, and did not change the faithfulness of the model's stated reasoning. We did not find the catalog or the abstraction idea worthless; we found that one of its two original arms — runtime scaffolding — meets the bitter lesson at the frontier on the problems we could construct. The value most plausibly survives one level up: as a target for synthetic training data, and as a curriculum for teaching abstraction to humans. Those two arms remain open, and are where the evidence now points.

Related Work & References — verified prior-art map

Source material for the retrospective's Related Work section. Built from the focused literature scan (2026-05-26) after a citation-verification pass. Each entry carries a relation flag and a verification status.

Other

Focused Related Work and Prior Art for an Encyclopedia of Abstractions

The closest prior art does not support a broad novelty claim of “abstraction-first prompting helps LLM reasoning at runtime.” That territory is already crowded by step-back, analogical, plan-and-solve, least-to-most, self-discovered, and graph-structured prompting, several of which reported gains on strong frontier models. But the literature is also much friendlier to the project's skeptical framing than those headline gains suggest: CoT-style structure helps mainly on math/symbolic tasks, can hurt badly off that turf, often shrinks with stronger models, and remains only weakly faithful even in modern reasoning systems. The strongest claim that still looks defensible is narrower: a hand-curated, cross-domain prime-abstraction + solution-archetype corpus, coupled to a typed relational/meta-model transport pipeline, and then tested through careful blinded evaluation, appears different from the existing mix of ontologies, prompting tricks, and process-supervision papers. The most serious pre-emption risk lies not in old knowledge graphs, but in newer work on analogical prompting, rationale distillation, multi-domain process supervision, and cross-domain latent adaptation.
Operator 13b — Cross-Namespace Meta-Model Genus Probe

A proposed, un-run variant of Operator 13 (Cluster-Without-Parent), with a concrete algorithm, one worked example, and honest yield expectations
The Compiler's Ceiling — Abstract

For most of recorded history the great reference works were, surprisingly often, the labor of a single obsessive mind. Pliny, Isidore of Seville, Vincent of Beauvais, William Smellie, Samuel Johnson, Noah Webster — each set out, more or less alone, to fold a large slice of the world's knowledge into one coherent object. And each ran into the same wall: a ceiling of a few million words assembled over a few decades, because the binding constraint was never the availability of knowledge. It was a human being's finite lifetime of reading a source and setting it down again. That ceiling has just been lifted, and by roughly an order of magnitude. This essay is about what the lifting does and does not mean — and it argues that the newly reachable volume is the least interesting thing about it, because the scarce ingredient in an encyclopedia was never words. It was the single sensibility that made a million entries feel like one work.