Conceptual

Theoretical essays about the Encyclopedia of Abstractions — what the catalog is, why it works as it does, and how its primes behave as objects of cross-domain reasoning. Companion to Applications, which is about putting the catalog to use.

Foundations

  • Structural and Framed Primes: A Typology for Cross-Domain Portability of Abstractions

    A prime abstraction is operationally defined as one applying across at least three domains of human knowledge. The threshold settles a binary question — is this prime? — but leaves untouched a further asymmetry: some primes slot into new domains as pure relational structure, while others import an interpretive context (vocabulary, evaluative commitments, institutional assumptions) along with them. This paper develops the refinement that names that asymmetry: structural primes were already domain-stripped at the moment they were named; framed primes carry an interpretive overlay that resists clean extraction. The distinction admits degree, correlates with the disciplinary character of a prime's home domain, and predicts that "super-primes" (those applying across nearly all domains) cluster on the structural end. A decomposition operation extracts a structural core from a framed prime, yielding one of three outcomes: unification with an existing structural prime, a new analytical tool, or loss of identity (a fate related to Bernard Williams's account of thick ethical terms). A bidirectional dynamic — structural primes acquiring frames when deployed in normative contexts — complicates the basic distinction productively. The paper proposes schema additions for the catalog and outlines a research agenda for testing the typology empirically.

  • Substrate Independence

    How to read the substrate-independence scores on each prime page.

  • The Structural Reasoning Substrate: What a Catalog of Abstractions Provides That a Neural Network Alone Does Not

    Augmented Abstract Reasoning (AAR) pairs a large language model with a curated catalog of cross-domain primes and archetypes, and empirically produces structurally faithful, auditable reasoning beyond what bare chain-of-thought reliably yields. This paper asks what the catalog provides conceptually. It argues that the catalog functions as a structural reasoning substrate: named patterns with rigorous structural signatures, explicit relations to neighbors, documented cross-domain instantiations, and load-bearing negative-space definitions. The substrate enables structural rigor (pattern composition, coherence under configuration, consistency-checking against documented constraints), a flavor of cognitive rigor distinct from the propositional rigor (truth-functional, brittle, narrow) on which the formal-logic tradition has anchored. The load-bearing claim: the substrate's rigor depends primarily on its negative-space content, that is, on what each pattern is not, how it differs from neighbors, where it overextends, and what its failure modes are. Without rigorous negative space, a catalog of abstractions collapses into pattern-matching without real constraint. The paper positions the approach against classical neurosymbolic AI, schema-based reasoning, pattern languages, and pure neural reasoning, and surfaces implications for LLM evaluation, training, and human-AI collaboration.

  • The Calculus of Abstraction: From a Lexicon of Primes to a Grammar of Operations for Cross-Domain Reasoning

    The Encyclopedia of Abstractions began as a catalog: a curated set of prime abstractions — patterns recurring across at least three domains — and solution archetypes. A catalog is a dictionary. This paper argues that a dictionary isn't enough, and that the more interesting object is a grammar: an explicit set of operations for manipulating abstractions, with the catalog as its lexicon. If primes are the nouns, the operations are the verbs, and the pair constitutes a calculus of abstraction. The paper enumerates the verbs already implicit in the Augmented Abstract Reasoning pipeline (lift, lower, compose, decompose, transport, salience-rank, prune, match, evaluate-fit, reconcile); brings category theory to bear (functors as cross-domain transport, free/forgetful adjunction as the lift/lower pair, Yoneda as the formalization of "an abstraction is its neighborhood"); situates the project against pattern catalogs (Alexander, Gang of Four, TRIZ), general systems theory, analogy work (Gentner), structural realism, and structured-prompting / neurosymbolic literature; and characterizes what we have built — structure as scaffold, not structure as solver — marking the frontier where scaffold could become verifier. The argument is anchored to small, confounded experiments rather than assertion; the most consequential finding is that on problems in well-covered regions, the protocol carries more of the value than the catalog.

  • The Verb Grammar of Abstraction Operations — a reference

    Companion reference to The Limits of Runtime Scaffolding and to The Calculus of Abstraction (§5). This document specifies the operations the verb engine performs on abstractions, and grades each one honestly.

Catalog organization

  • From Candidate to Catalog: How Prime Abstractions Are Identified, Drafted, and Refined

    The Encyclopedia of Abstractions is a catalog of prime abstractions — recurring structural patterns that travel across at least three domains of human knowledge. The catalog's force depends on more than the entries themselves; it depends on the procedure by which a candidate concept becomes an accepted, fully specified entry. This paper documents that procedure. We describe how a candidate is evaluated against the seven inclusion criteria and the six guidance rules for distinguishing prime from domain-specific patterns; how an accepted candidate is drafted in a concise form that states the thesis; how that draft is then elaborated into a long-form entry with structural signature, neighbor distinctions, formal and applied examples, structural tensions, and verified citations; and how the entry is integrated into the catalog's category, origin-domain, hierarchy, and learnability views. The procedure has evolved across roughly fifty drafting cycles, has been operated end-to-end by a single curator (Kurt Zoglmann) with the substantial assistance of triangulated LLM agents, and remains in tension with itself in places. The point is not that the procedure is finished; the point is that the catalog is the output of something articulable, repeatable, and improvable, rather than the output of taste.

  • The Hierarchy DAG: A Type System for Prerequisite Relations Between Primes

    A catalog of prime abstractions is a set of nouns. To do real work, the nouns must be related — but is-related-to is underspecified, and managing cross-domain reasoning with a single edge type produces a brittle, undifferentiated mess. This paper formalizes the relational type system that emerged from 28 rounds of curation in the Encyclopedia of Abstractions, in which ~920 directed prerequisite edges were proposed, reviewed, committed, and (in many cases) revised. The system has four edge types — subsumption (kind-of), composition (built-from / presupposes, with two flavors), decompose (framed-applied to structural-core), and mutual (bidirectional, outside the acyclic topology) — one metadata attribute (the qualifier field: strict / typical / conditional), and two node classes (primes and connectors, the latter for composite-under-a-gloss bundles on a separate aspect_of edge layer). The system was not designed from first principles; it emerged from disciplined wiring, each refinement traceable to a specific failure mode (subsumption alone insufficient for feedback vs damping; composition alone couldn't name what signaling does to information_asymmetry; the connector layer exists because justice is real and useful but not a single structural pattern). The paper documents the system as it stands and what it enables — search moderation, curriculum tiering, faithfulness auditing, and a legible DAG visualization.
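
    The edge types and qualifier described above can be pictured with a minimal Python sketch. It is illustrative only and not the catalog's actual schema: the class names, the field names, the edge direction (source depends on target), the default qualifier, and the node names `a`, `b`, `c` are all assumptions made for the example. The one property taken from the description is that mutual edges sit outside the acyclic topology, so they are skipped when checking for cycles.

```python
from dataclasses import dataclass
from enum import Enum
from graphlib import CycleError, TopologicalSorter


class EdgeType(Enum):
    SUBSUMPTION = "subsumption"  # kind-of
    COMPOSITION = "composition"  # built-from / presupposes
    DECOMPOSE = "decompose"      # framed-applied to structural-core
    MUTUAL = "mutual"            # bidirectional, outside the acyclic topology


class Qualifier(Enum):
    STRICT = "strict"
    TYPICAL = "typical"
    CONDITIONAL = "conditional"


@dataclass(frozen=True)
class Edge:
    source: str  # assumed direction: `source` depends on `target`
    target: str  # the prerequisite
    type: EdgeType
    qualifier: Qualifier = Qualifier.TYPICAL  # assumed default


def prerequisite_order(edges):
    """Return node names with prerequisites first.

    Mutual edges register their endpoints but add no ordering constraint.
    Raises graphlib.CycleError if the remaining edges contain a cycle.
    """
    sorter = TopologicalSorter()
    for edge in edges:
        if edge.type is EdgeType.MUTUAL:
            sorter.add(edge.source)
            sorter.add(edge.target)
        else:
            sorter.add(edge.source, edge.target)
    return list(sorter.static_order())


# Hypothetical nodes: b is a kind of a, c is built from b, a and c are mutual.
edges = [
    Edge("b", "a", EdgeType.SUBSUMPTION, Qualifier.STRICT),
    Edge("c", "b", EdgeType.COMPOSITION),
    Edge("a", "c", EdgeType.MUTUAL),
]
order = prerequisite_order(edges)
```

    If the last edge were typed as composition instead of mutual, the three edges would form a cycle and `prerequisite_order` would raise `CycleError`, which is one way a separate mutual type keeps the rest of the graph acyclic.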

  • Distinctiveness and the Neighborhood Structure of Abstraction Space: Why some primes are easy to find and others get crowded out

    Two primes can be equally apt for a problem and yet differ enormously in how easily a reasoner finds them. The reason is geometric: the primes do not sit in isolation but in a space of neighbors, and some regions of that space are crowded while others are sparse. A prime in a crowded region — surrounded by near-synonyms that describe almost the same structure — is hard to pin down, because a description that fits it fits its neighbors too. A prime in a sparse region stands alone, so a faithful description lands on it precisely. This paper names that property distinctiveness (equivalently, neighborhood density), explains why it, rather than a prime's position on the structural–framed spectrum, is what governs cross-domain retrievability, and describes how the Encyclopedia measures it and surfaces it on each prime's page.
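
    One way to make the crowded-versus-sparse intuition concrete is the toy Python sketch below. The abstract does not say how the Encyclopedia measures distinctiveness, so everything here is an assumption: primes are represented as embedding vectors, similarity is cosine similarity, and distinctiveness is taken to be one minus the mean similarity to the `k` nearest neighbors. The vectors and names are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def distinctiveness(vectors, k=2):
    """Map each name to 1 - mean cosine similarity to its k nearest neighbors.

    Higher scores mean a sparser neighborhood. A name with no neighbors
    at all is scored 1.0 by convention (an assumption of this sketch).
    """
    scores = {}
    for name, vec in vectors.items():
        sims = sorted(
            (cosine(vec, other) for other_name, other in vectors.items()
             if other_name != name),
            reverse=True,
        )
        nearest = sims[:k]
        scores[name] = 1.0 - sum(nearest) / len(nearest) if nearest else 1.0
    return scores


# Hypothetical 2-D embeddings: three near-synonyms and one isolated prime.
toy_vectors = {
    "crowded_a": (1.0, 0.0),
    "crowded_b": (0.99, 0.1),
    "crowded_c": (0.98, 0.15),
    "sparse": (0.0, 1.0),
}
scores = distinctiveness(toy_vectors, k=2)
```

    In this toy setup the isolated vector scores far higher than the three near-synonyms, mirroring the claim that a faithful description lands precisely on a prime in a sparse region but is shared among neighbors in a crowded one.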

  • Learnability and Curriculum Construction over a Prime Catalog: A tiering of the primes by first-encounter difficulty, with the algorithm and its honest ceiling

    A catalog of 655 cross-domain primes is not a curriculum. To turn the catalog into something a learner can actually walk through, the primes need an ordering: a sequence in which the easier ones are met before the harder ones, with prerequisites honored. This paper describes how the Encyclopedia constructs that ordering. The approach is a difficulty-weighted topological sort that combines three kinds of evidence: a per-prime learning-age assessment derived from a triangulated LLM "ELI ladder" (explanations pitched at ages 5, 10, 15, and 18, plus a specialist level); a set of word-level signals over the prime's slug and catalog text (Kuperman age-of-acquisition norms, SUBTLEX frequency, Brysbaert concreteness, Flesch-Kincaid-style readability); and the catalog's own typed prerequisite DAG, honored as a hard topological constraint. The output is a single linear order chunked into five display tiers, with the top tier holding the most intuitive umbrella primes and the bottom tier reserved for the small set of primes where no faithful kindergarten explanation exists. This paper says what each signal contributes, what it misses, and where the algorithm hits an honest ceiling that no purely objective signal can break past.
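
    A difficulty-weighted topological sort of the kind described above can be sketched in Python as Kahn's algorithm with a priority queue. This is a sketch under stated assumptions, not the Encyclopedia's implementation: it assumes the signals have already been combined into one difficulty number per prime (the abstract does not say how), that ties break alphabetically, and that tiers are equal-sized chunks of the linear order. All names and values are hypothetical.

```python
import heapq
from collections import defaultdict


def curriculum_order(difficulty, prerequisites):
    """Order primes easiest-first while honoring prerequisites.

    difficulty: name -> float (lower means easier; assumed precomputed).
    prerequisites: name -> set of names that must appear earlier.
    Every name used in `prerequisites` must also be a key of `difficulty`.
    """
    remaining = {name: len(prerequisites.get(name, ())) for name in difficulty}
    dependents = defaultdict(list)
    for name, prereqs in prerequisites.items():
        for prereq in prereqs:
            dependents[prereq].append(name)

    # Start with primes that have no prerequisites, easiest on top.
    ready = [(difficulty[n], n) for n, count in remaining.items() if count == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for dependent in dependents[name]:
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                heapq.heappush(ready, (difficulty[dependent], dependent))

    if len(order) != len(difficulty):
        raise ValueError("prerequisite graph contains a cycle")
    return order


def chunk_into_tiers(order, tiers=5):
    """Split a linear order into at most `tiers` consecutive chunks."""
    if not order:
        return []
    size = -(-len(order) // tiers)  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]


# Hypothetical primes: d is the easiest but must wait for a.
toy_difficulty = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}
toy_prerequisites = {"d": {"a"}}
toy_order = curriculum_order(toy_difficulty, toy_prerequisites)
toy_tiers = chunk_into_tiers(toy_order, tiers=2)
```

    In the toy data the easiest prime, `d`, still comes last because its prerequisite `a` is the hardest. The prerequisite constraint is hard and difficulty only chooses among primes that are currently unlocked.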

Empirical work

  • The Limits of Runtime Scaffolding: A Null Result for Abstraction Pipelines at the Frontier

    The Encyclopedia of Abstractions began as a bet that cross-domain reasoning could be improved by giving a model an explicit architecture for handling abstractions: recognize the operative primes, build a typed relational model, lift it to a domain-stripped meta-model, and transport a solution pattern from a curated catalog. This retrospective reports what happened when we tested the runtime form of that bet — scaffolding a frontier model's reasoning with the architecture at inference time — under blinded, pre-registered evaluation with deliberate confound control. The short answer: the runtime scaffold is largely inert. Varying the control structure (fixed pipeline vs. free planner vs. enforced coverage discipline) did not move design quality, did not raise coverage of load-bearing components even for a weaker solver, and did not change the faithfulness of the model's stated reasoning. We did not find the catalog or the abstraction idea worthless; we found that one of its two original arms — runtime scaffolding — meets the bitter lesson at the frontier on the problems we could construct. The value most plausibly survives one level up: as a target for synthetic training data, and as a curriculum for teaching abstraction to humans. Those two arms remain open, and are where the evidence now points.

  • Related Work & References — verified prior-art map

    Source material for the retrospective's Related Work section. Built from the focused literature scan (2026-05-26) after a citation-verification pass. Each entry carries a relation flag and a verification status.

Other

  • Focused Related Work and Prior Art for an Encyclopedia of Abstractions

    The closest prior art does not support a broad novelty claim of “abstraction-first prompting helps LLM reasoning at runtime.” That territory is already crowded by step-back, analogical, plan-and-solve, least-to-most, self-discovered, and graph-structured prompting, several of which reported gains on strong frontier models. But the literature is also much friendlier to the project's skeptical framing than those headline gains suggest: CoT-style structure helps mainly on math/symbolic tasks, can hurt badly off that turf, often shrinks with stronger models, and remains only weakly faithful even in modern reasoning systems. The strongest claim that still looks defensible is narrower: a hand-curated, cross-domain prime-abstraction + solution-archetype corpus, coupled to a typed relational/meta-model transport pipeline, and then tested through careful blinded evaluation, appears different from the existing mix of ontologies, prompting tricks, and process-supervision papers. The most serious pre-emption risk lies not in old knowledge graphs, but in newer work on analogical prompting, rationale distillation, multi-domain process supervision, and cross-domain latent adaptation.