Theory Of Mind

Prime #: 1236
Origin domain: Cognitive Science
Subdomain: social cognition → Cognitive Science
Aliases: ToM, Mentalizing, Mindreading

Core Idea

Theory of mind is the structural pattern in which an agent maintains and updates a model of another agent's hidden internal states — beliefs, desires, intentions, knowledge, attention, ignorance — and uses that model to predict, anticipate, and respond to the other's behaviour. The modeller's own world-state and the modelled agent's world-state are kept formally separate: the modeller can hold the belief "I know X, but they do not yet know X" and reason about behaviour that depends on the other's belief rather than on the truth. The commitment is that an agent in a social, adversarial, cooperative, or instructional context cannot rely on its own world-model alone; it must run a second model, indexed to the other agent, with potentially divergent contents, and route decisions through that second model rather than through ground truth.

Three structural moves follow. There is perspective-taking — representing what the other can perceive given their position, attention, and sensors; false-belief tracking — representing what the other believes even when that belief is incorrect; and recursive embedding — representing what the other thinks the modeller thinks they think, to some depth k. Every theory-of-mind operation specifies four parameters: the target agent being modelled, the mental-state type (belief, knowledge, intention, desire, attention, emotion), the content (the proposition the state is directed at), and the depth of recursion. Many real failures are right about three of these and wrong about one — correct agent and content but wrong belief, or correct at depth-1 but wrong at depth-2.

The pattern is the dual of omniscient or ground-truth-only reasoning. An agent that acts on the world as it is, ignoring what others know about it, fails predictably at deception, persuasion, instruction, negotiation, and any task in which the other's behaviour depends on their information state rather than the underlying state of affairs. The operational signature of having the machinery at all is the ability to model a state known to be false — to represent the other looking where they believe the object is, not where it actually is.

How would you explain it like I'm…

Inside Their Head

Imagine you hide a cookie in the blue box while your friend is watching, but then they leave the room and you sneak it into the red box. When your friend comes back, where will they look? They'll look in the blue box — because *they* don't know you moved it, even though *you* do. Knowing that other people can think something different from what you know is the whole trick.

What They Believe

Theory Of Mind is keeping a little model in your head of what *someone else* is thinking — what they know, want, or believe — and using it to guess what they'll do. The key is that their picture of the world can be *different* from yours. Suppose you watch a toy get moved while your friend is away; you know where it really is, but you can also figure out that your friend will look in the old spot because *they* still believe it's there. You have to keep two pictures separate: the true one in your head, and the one inside your friend's head. Without this, you'd be bad at things like surprises, jokes, teaching, or knowing when someone's been tricked — because all of those depend on what the *other person* knows, not just what's true.

False-Belief Tracking

Theory of Mind is the pattern in which an agent *maintains and updates a model of another agent's hidden internal states* — beliefs, desires, intentions, knowledge, attention, ignorance — and uses it to predict and respond to their behavior. The key is that your own world-state and the other agent's are kept *formally separate*: you can hold 'I know X, but they don't know X yet' and reason about behavior that depends on their belief rather than on the truth. Three moves follow: *perspective-taking* (what can they perceive from where they are?), *false-belief tracking* (what do they believe even when it's wrong?), and *recursive embedding* (what do they think I think they think?, to some depth). Every operation specifies four things: the target agent, the mental-state type, the content, and the depth of recursion — and many real failures get three right and one wrong. It is the dual of *omniscient* reasoning that acts only on the world as it truly is; the giveaway that you have the machinery is being able to model a state *known to be false* — picturing someone looking where they *believe* the object is, not where it actually is.

Structural Signature

the modeller running its own world-model — the separately-indexed second model of the target agent's hidden state — the four parameters (target, state-type, content, recursion depth) — the channel-indexed updating of the second model — the false-belief capacity to represent a state known to be false — the recursion bottoming out at tractable depth

A system has theory of mind when each of the following holds:

A modeller with its own world-model. An agent that carries a representation of the world as it is.
A separately-indexed second model. A model of another agent's hidden internal states — beliefs, desires, intentions, knowledge, attention, ignorance — kept formally separate from the modeller's own, so it can hold "I know X, but they do not yet know X."
Four parameters. Each operation specifies a target agent, a mental-state type, a content (the proposition the state is directed at), and a depth of recursion; many failures are right on three and wrong on one.
Channel-indexed updating. New information updates only the belief-models of agents who actually had access to its channel; it changes neither the ground truth nor the models of agents who were not on that channel.
A false-belief capacity. The system can represent a state known to be false — the operational signature of having the machinery, e.g. modelling the other looking where they believe the object is, not where it is.
Bounded recursion. The modeller can embed the other's model of the modeller's own states to some depth k; most real reasoning bottoms out at depth-2 or depth-3, beyond which it becomes computationally hard and error-prone.

The components compose the dual of ground-truth-only reasoning: behaviour is routed through the second model rather than the underlying state of affairs, which is what makes deception, persuasion, instruction, and negotiation tractable — and failures localise to the specific parameter on which the second model diverged from the target's actual state.
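The components above can be sketched as a small data structure. This is a minimal illustration, not a reference implementation: the names `BeliefTracker`, `observe`, `believed`, and `holds_false_belief` are hypothetical, and the sketch assumes for simplicity that the modeller's own world-model coincides with ground truth.

```python
from dataclasses import dataclass, field


@dataclass
class BeliefTracker:
    """Hypothetical sketch: one ground-truth store plus one belief store per agent."""

    world: dict = field(default_factory=dict)    # assumed equal to the modeller's own world-model
    beliefs: dict = field(default_factory=dict)  # agent name -> that agent's believed facts

    def observe(self, key, value, witnesses):
        """Record a change in the world; update only the agents on the channel."""
        self.world[key] = value
        for agent in witnesses:
            self.beliefs.setdefault(agent, {})[key] = value

    def believed(self, agent, key):
        """What the agent believes; None if they never had access."""
        return self.beliefs.get(agent, {}).get(key)

    def holds_false_belief(self, agent, key):
        """True when the agent's belief exists and diverges from ground truth."""
        belief = self.believed(agent, key)
        return belief is not None and belief != self.world.get(key)


tracker = BeliefTracker()
tracker.observe("object_location", "blue box", witnesses={"friend"})
tracker.observe("object_location", "red box", witnesses=set())  # friend is out of the room

print(tracker.world["object_location"])                         # red box
print(tracker.believed("friend", "object_location"))            # blue box
print(tracker.holds_false_belief("friend", "object_location"))  # True
```

The design choice that matters is that `observe` never writes to an agent's store unless that agent is listed as a witness, so the friend's model can keep contents the modeller knows to be false.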

What It Is Not

Not a mental model. mental_model is any internal representation of how something works — a system, a device, a process. Theory of mind is the specific case where the modelled thing is another agent's hidden mental states, kept separately indexed from one's own world-model so it can hold contents known to be false.
Not perspective. perspective is a vantage-dependent view of the world. Theory of mind is a second-order model of another's view — representing not just a different angle on the world but the other agent's beliefs, knowledge, and ignorance, including beliefs the modeller knows to be wrong.
Not the curse of knowledge. curse_of_knowledge is a failure of theory of mind — over-attributing one's own knowledge to a less-informed other. Theory of mind is the capacity whose breakdown produces that bias; the curse is the symptom of an un-separated second model, not the prime.
Not belief formation. belief_formation is how an agent forms its own beliefs from evidence. Theory of mind is modelling another agent's beliefs (possibly false, possibly divergent from one's own), updated by that agent's channel access, not by the modeller's evidence.
Not empathy. Affective empathy is sharing another's emotional state. Theory of mind's core (cognitive) sense is representing another's informational state; the two dissociate (autism versus psychopathy profiles), so modelling beliefs is distinct from feeling-with.
Common misclassification. Routing a social prediction through ground truth rather than the target's belief — the "they must already know" error. If new information updates the target's model because the modeller received it, channel-indexed updating has failed; predict where they believe the object is, not where it is.

Broad Use

Developmental psychology: the false-belief task — a child who predicts that Sally will look where she believes the marble is, not where it actually is, has passed the canonical milestone (around age four); failure is diagnostic of certain developmental profiles.
AI and language models: planning agents, dialogue systems, and language models must model what a user knows, does not know, or believes incorrectly; relevant techniques include belief-state tracking in POMDPs and opponent modelling in multi-agent RL.
Negotiation, diplomacy, and game theory: any bargaining or signalling game requires modelling the counterparty's information set, beliefs about your type, and beliefs about your beliefs — common knowledge and signalling equilibria rest on iterated theory-of-mind reasoning.
Education and pedagogy: a teacher must model what a student already knows, what they wrongly think they know, and what moves are accessible from their current model; the curse of knowledge is theory-of-mind failure toward a less-knowing interlocutor.
Security and red-teaming: defenders model what attackers know about the system, attackers model what defenders expect to see, and deception, honeypots, and operational security require nested modelling.
Animal cognition, clinical assessment, and interface design: gaze-following and selective caching probe the capacity in non-humans; autism and some stroke profiles show selective deficits; and a designer must model what a user expects a control to do, since mismatches produce predictable error patterns regardless of code correctness.

Clarity

The pattern makes visible a precise distinction that ordinary language collapses: the distinction between the state of the world and another agent's model of that state. Many failures look like miscommunication or stupidity when they are predictable consequences of one party acting on the wrong model of the other's model. Naming theory of mind forces the question "whose mind, in what state of knowledge?" every time a social-cognitive prediction is made, and so relocates a class of failures from "they're being difficult" to "I modelled their information state incorrectly."

Naming the four parameters — target, state-type, content, recursion depth — lets the analyst pinpoint which of them a failure got wrong, since many failures are right about three and wrong about one. The pattern also makes the information-asymmetry geometry of any social situation explicit: who has access to what, when, and through what channel. A theory-of-mind-equipped agent reasons about that geometry directly; an agent without it confuses its own access for everyone's. The clarifying force is to convert diffuse interpersonal or design failures into a structured question about a separately-indexed second model and the specific parameter on which it diverged from the target's actual state.
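The parameter audit described above can be sketched as a field-by-field comparison. This is a hedged illustration only: `MentalStateAttribution` and `diverging_parameters` are hypothetical names, the example values are invented, and in practice the target's actual state is not directly observable and has to be estimated by probing.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class MentalStateAttribution:
    """Hypothetical record of the four parameters named in the text."""

    target: str      # which agent is being modelled
    state_type: str  # belief, knowledge, intention, desire, attention, emotion
    content: str     # the proposition the state is directed at
    depth: int       # recursion depth


def diverging_parameters(attributed, actual):
    """Return the names of the parameters on which the two records differ."""
    return [
        f.name
        for f in fields(MentalStateAttribution)
        if getattr(attributed, f.name) != getattr(actual, f.name)
    ]


# Illustrative case: right at depth-1, but the situation called for depth-2.
attributed = MentalStateAttribution("counterparty", "belief", "the deadline is firm", 1)
actual = MentalStateAttribution("counterparty", "belief", "the deadline is firm", 2)

print(diverging_parameters(attributed, actual))  # ['depth']
```

A result with exactly one entry is the "right about three, wrong about one" case; an empty list means the attribution matched on all four parameters.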

Manages Complexity

The pattern collapses a family of cross-substrate problems — deception detection, instructional design, negotiation strategy, multi-agent planning, autism diagnosis, interface ergonomics, espionage — into one structural problem: maintain a separate, updateable model of another agent's information state, and route behaviour through that model rather than through ground truth. A practitioner facing any of these works the same machinery, instantiated to the substrate.

It also collapses a family of failure modes (curse of knowledge, false-consensus effect, transparency illusion, overconfidence in shared understanding, hostile attribution) into one structural failure: the agent failed to keep its own world-model separate from the modelled agent's, or updated both models when only one of the agents had received the new information. The fix family is correspondingly uniform: explicitly represent and audit the other's model. A structural bound on the complexity is that most real reasoning bottoms out at depth-2 or depth-3 recursion; deeper nesting becomes computationally hard and error-prone, so a key design move is to structure tasks so that they do not require depth beyond 2. The complexity the pattern manages is the complexity of social prediction, which it reduces to maintaining and updating one indexed second model at a tractable recursion depth.

Abstract Reasoning

The pattern licenses several characteristic moves. Index by agent: every belief in the system carries an explicit owner, so updates change the owner's belief, not the truth. Track information by channel: when new information arrives, update only the agents who actually had access to its channel — failing to track which agents share which channels is the structural source of "they must already know" failures. Reason about false beliefs: an agent can hold a model in which the target's belief is known to be false, the operational signature of the machinery.

Three further moves complete the toolkit. Recursion management: structure tasks so they bottom out at depth-2, because deeper nesting is computationally hard and error-prone. Asymmetric-knowledge exploitation: deception, surprise, dramatic irony, and educational scaffolding all exploit asymmetries between what the modeller knows and what the other believes — the structural move is the same, only the valence differs. And coordination via shared belief: cooperation often relies on common knowledge, so manufacturing it through public announcements, ceremonies, and visible commitments is a theory-of-mind-targeted intervention. The reasoner asks, at every turn: whose belief is this, what channel gave them access, does my model of their state diverge from ground truth, and how deep must the recursion go?
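Recursion management can be made concrete by representing nested attributions explicitly and measuring their depth. The sketch below is an assumption-laden illustration: the `Believes` type, the `depth` and `within_budget` helpers, and the default budget of 2 are hypothetical choices that follow the text's depth-2 guideline rather than any established API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Believes:
    """Hypothetical node: `agent` believes `content`, which may itself be a belief."""

    agent: str
    content: Union[str, "Believes"]


def depth(state):
    """Count levels of nesting; a bare proposition has depth 0."""
    levels = 0
    while isinstance(state, Believes):
        levels += 1
        state = state.content
    return levels


def within_budget(state, max_depth=2):
    """Flag attributions that exceed the depth the text treats as reliable."""
    return depth(state) <= max_depth


first_order = Believes("attacker", "the server is real")
second_order = Believes("attacker", Believes("defender", "the traffic looks normal"))
fourth_order = Believes("a", Believes("b", Believes("a", Believes("b", "p"))))

print(depth(first_order), depth(second_order), depth(fourth_order))  # 1 2 4
print(within_budget(second_order), within_budget(fourth_order))      # True False
```

Each `Believes` node carries an explicit owner, which is the "index by agent" move; a strategy whose attribution fails `within_budget` is a candidate for restructuring rather than deeper nesting.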

Knowledge Transfer

Theory of mind transfers because the abstract structure — a separately indexed, updateable model of another agent's information state, usable for prediction and intervention — recurs across substrates, though its cognitive-psychology vocabulary and the word "mind" import a cognitive-substrate frame that keeps it from the fully structural end of the spectrum. The role mapping is consistent: the modeller maps to the child, the tutoring system, the negotiator, the defender; the target maps to Sally, the student, the counterparty, the attacker; the mental-state type and content map to whatever hidden state is being represented; and the recursion depth maps identically to the levels of nesting across all of them.

The transfers are documented. False-belief tasks and Sally-Anne protocols have been ported directly into evaluations of large language models and embodied agents, revealing analogous failure modes. Iterated-belief reasoning and signalling-game equilibria from game theory provide explicit apparatus for negotiation and diplomacy training. The curse of knowledge in teaching ports to interface design, where designers systematically over-estimate what users know and explicit user-model audits are the intervention. Red-team thinking (model what the adversary believes about your defences) generalises to competitor-modelling in product strategy and to reasoning about regulators' beliefs. Theory-of-mind deficits identified clinically predict communication failures in collaborative work, where perspective-taking exercises are the cross-substrate intervention. And perspective-taking protocols developed for testing apes have been adapted as evaluation suites for socially-deployed robots.

Three internal distinctions travel with the prime and should be preserved: cognitive versus affective theory of mind (modelling beliefs versus sharing emotion, which dissociate in autism versus psychopathy), implicit versus explicit theory of mind (automatic gaze-following versus verbalised false-belief reasoning, with different substrates), and depth of recursion (most practice bottoms out at depth-2, with depth-3 and beyond expensive and error-prone).

The unifying transfer move is always: build a separately-indexed model of the target agent's hidden state, update it by channel rather than by truth, keep it distinct from one's own world-model, and route the prediction or intervention through it at a tractable recursion depth.

Examples

Formal/abstract

The Sally-Anne false-belief task is the canonical worked instance and isolates every parameter of the prime in a single controlled vignette. Sally places a marble in a basket and leaves the room; while she is gone, Anne moves the marble to a box; Sally returns, and the child is asked, "Where will Sally look for her marble?" The modeller is the child; the target agent is Sally; the mental-state type is belief; the content is the marble's location; and the depth of recursion is one (the child models Sally's belief, not Sally's belief about the child). The structurally decisive move is channel-indexed updating: the world changed (the marble is now in the box) and the child witnessed the change (the child's own world-model is updated), but Sally had no access to the channel carrying that information — she was out of the room — so her belief-model must not be updated. A child with theory of mind keeps the two models formally separate and predicts Sally will look in the basket, where she still believes the marble is; this is the false-belief capacity in its purest form, representing a state the child knows to be false. A child who fails routes the prediction through ground truth, answering "the box," confusing its own access for Sally's — exactly the prime's omniscient-reasoning failure mode. The prime's parameter-localization claim is illustrated cleanly: the failing child is correct on three parameters (right target, right state-type, right content) and wrong on one (it updated Sally's belief when only the world and the child had access). The intervention the prime names is the developmental and diagnostic one: probe whether the separate second model exists by constructing a situation where it must diverge from truth, since only divergence reveals the machinery.

Mapped back: Sally-Anne is theory of mind stripped to its core — child as modeller, Sally as target, belief as the state-type, marble-location as content, depth-one recursion, and channel-indexed updating that must withhold the world-change from Sally's model — confirming that the false-belief capacity (representing a state known to be false) is the operational signature of the machinery.
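The vignette can be replayed as an event log to show the two answers side by side. This is a minimal sketch: the events and agent names follow the vignette, while `true_location`, `believed_location`, `predict_search`, and the `use_theory_of_mind` flag are hypothetical names introduced for illustration.

```python
# Each event: (new marble location, set of agents who witnessed it).
EVENTS = [
    ("basket", {"Sally", "Anne", "child"}),  # Sally places the marble
    ("box", {"Anne", "child"}),              # Anne moves it while Sally is away
]


def true_location(events):
    """Ground truth: the location after the last event."""
    return events[-1][0]


def believed_location(agent, events):
    """The agent's belief: the location after the last event they witnessed."""
    seen = [location for location, witnesses in events if agent in witnesses]
    return seen[-1] if seen else None


def predict_search(agent, events, use_theory_of_mind=True):
    """Predict where the agent will look, with or without a separate belief model."""
    if use_theory_of_mind:
        return believed_location(agent, events)
    return true_location(events)  # the ground-truth-only failure mode


print(predict_search("Sally", EVENTS))                            # basket
print(predict_search("Sally", EVENTS, use_theory_of_mind=False))  # box
```

The two predictions differ only because the second event's witness set excludes Sally; if she had watched the move, both calls would return the same location and the probe would reveal nothing about the machinery.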

Applied/industry

Two applied domains, instructional design in education and adversarial red-teaming in security, run the same separately-indexed-second-model structure (with the prime's caveat that the "mind" vocabulary imports a cognitive frame).

In teaching, the modeller is the instructor, the target is the student, and the prime's central discipline is the antidote to the curse of knowledge: a theory-of-mind failure in which the expert routes explanation through its own world-model (where the concept is obvious) instead of the student's (where it is not). The four parameters localize the pedagogical failure precisely: the teacher is usually right about the target and content but wrong about the student's knowledge state, assuming a prerequisite the student lacks, or wrong about which move is accessible from the student's current model. The prime's channel-indexed-updating insight is the design lever: the instructor must update its model of the student only by what the student has actually been exposed to, not by what the instructor knows, and the intervention is an explicit student-model audit (formative assessment) that reveals where the second model diverges from the student's real state.

Security red-teaming maps cleanly: the defender is the modeller and the attacker is the target, and effective defense requires modeling what the attacker believes about the system's defenses. This is a depth-2 operation, since the attacker is in turn modeling what the defender expects to see. Deception techniques (honeypots, decoys) are direct theory-of-mind interventions: they exploit the asymmetry between what the defender knows (this server is fake) and what the attacker believes (this server is real and valuable), which is the prime's asymmetric-knowledge-exploitation move with adversarial valence.

The prime's recursion bound is the practical constraint in both: pedagogy and red-teaming both bottom out around depth-2 or depth-3, beyond which the nested modeling becomes intractable and error-prone, so good design structures the task to stay within tractable depth. In both, the prime's diagnostic applies: when an explanation or a defense fails, ask which of the four parameters of the second model diverged from the target's actual state.

Mapped back: Instructional design and security red-teaming both instantiate a modeller maintaining a separately-indexed model of a target's hidden knowledge state (student; attacker), failing by the prime's curse-of-knowledge and own-model-confusion modes, and both are bounded by its tractable recursion depth, so the intervention — audit the second model, exploit or close the knowledge asymmetry — transfers from education to security with the cognitive frame translated.

Structural Tensions

T1 — Ground Truth versus Modelled Belief (scopal). The pattern routes behavior through a model of the other's information state, not the underlying state of affairs; the two must be kept formally separate. The failure mode is collapsing them — acting on what is true rather than what the other believes, so deception, persuasion, and instruction misfire. Diagnostic: ask whether the prediction depends on the world's state or the target's belief about it. If reasoning uses ground truth where the other's behavior is driven by their (possibly false) belief, the second model has been confused with one's own; the false-belief test (predict where they think it is) reveals whether the separation holds.

T2 — Recursion Depth versus Tractability (scalar). Modelling can nest — what they think I think they think — but real reasoning bottoms out at depth-2 or depth-3 before becoming computationally hard and error-prone. The failure mode lives at both ends: stopping at depth-1 where the situation needs depth-2 (missing that the other anticipates you), or attempting deep nesting that exceeds reliable capacity. Diagnostic: ask how many levels the task actually requires and whether that exceeds tractable depth. If a strategy needs depth-4 to work, it is fragile; good design restructures the task to bottom out at depth-2 rather than relying on unreliable deep recursion.

T3 — Channel-Indexed Update versus Universal Update (measurement). New information should update only the agents who had access to its channel, not the truth and not every agent. The failure mode is the "they must already know" error — updating the target's model with information they never received, because the modeller received it. Diagnostic: for each new fact, ask which agents actually had access to its channel. If the target's model is updated by what the modeller learned rather than by what the target was exposed to, the indexing has failed; the curse of knowledge is exactly this universal-update error toward a less-informed party.

T4 — Cognitive versus Affective Modelling (scopal). Modelling another's beliefs (cognitive) is dissociable from sharing or representing their emotion (affective); the two run on different substrates and fail independently. The failure mode is assuming competence at one implies the other — accurate belief-tracking with no emotional resonance, or strong empathy with poor belief-modelling. Diagnostic: ask whether the task needs the target's knowledge state or their emotional state, and whether the modeller is equipped for that one. If a design assumes a single undifferentiated "theory of mind," it will mispredict where cognitive and affective capacities dissociate (autism versus psychopathy profiles); the two parameters must be tracked separately.

T5 — Correct Modelling versus Parameter Divergence (measurement). Every operation specifies four parameters — target, state-type, content, recursion depth — and many failures are right on three and wrong on one. The failure mode is treating a modelling failure as global ("they're being difficult") when it localizes to a single wrong parameter — right agent and content but wrong belief, or right at depth-1 but wrong at depth-2. Diagnostic: when a social prediction fails, audit each of the four parameters separately. If three are correct, the fault is the fourth; diffuse attributions of stupidity or hostility usually mask a precise single-parameter divergence in the second model.

T6 — Imported Mind-Frame versus Substrate Fit (scopal/framed-boundary). The abstract second-model-of-an-agent structure travels, but the "mind" vocabulary imports a cognitive-substrate frame — beliefs, desires, intentions — that may not fit non-cognitive targets. The failure mode is over-attributing rich mental states where a simpler information-state model suffices, or assuming the cognitive framing's machinery (emotion, intention) applies to a system that merely has an information set. Diagnostic: ask whether the target genuinely has the mental-state types being attributed, or only an information state. If modelling a market, an algorithm, or an institution as if it had beliefs and desires, the cognitive frame may mislead; the portable core is the separately-indexed information model, not the full mentalistic vocabulary.

Structural–Framed Character

Theory of mind sits just on the structural side of the middle of the structural–framed spectrum, with a mixed-structural label and an aggregate of 0.4. Its abstract core (a separately-indexed, updateable model of another agent's hidden information state, with behaviour routed through that model rather than through ground truth) is a genuine relational structure that travels, but the cognitive-psychology origin and the word "mind" import a cognitive-substrate frame that pulls four diagnostics partway toward framed.

Walking the diagnostics with this prime's substrates: vocabulary travels with effort, scored 0.5. The home lexicon — "belief," "desire," "intention," "false belief," "mind" — is cognitive-psychological, and reaching AI, game theory, or security requires translating into "information set," "belief-state tracking in a POMDP," or "adversary model"; yet the second-model-of-an-agent structure is recognizably the same across the false-belief task, multi-agent RL, signalling equilibria, and red-teaming, so the relational skeleton travels even as the mentalistic words need translating. Evaluative weight is absent (scored 0): modelling another's belief is neither good nor bad; deception and pedagogy are the same machinery with opposite valence. Institutional origin is partial (0.5): the bare second-model structure is formal, but the prime's framing and richest vocabulary come from the institutional discipline of cognitive science and social cognition. Human-practice-boundness is likewise 0.5: the abstract structure can run in an AI planner or a POMDP that no human practice mediates, yet the canonical cases require agents with genuine mental states, and the prime's own T6 warns that attributing "beliefs and desires" to a market or algorithm may mislead, so the richest instances are bound to cognitive agents. And import-versus-recognize sits at 0.5: invoking theory of mind partly recognizes a real separately-indexed information model one can test with a false-belief probe, and partly imports a mentalistic frame of beliefs and desires. The genuinely portable second-model-of-an-agent structure keeps the prime on the structural side of the middle; the cognitive vocabulary and "mind" framing that travel only by translation lift the aggregate to 0.4, faithful to the mixed-structural label and to the prime's own caveat that the portable core is the indexed information model, not the full mentalistic vocabulary.
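The stated aggregate is consistent with an unweighted mean of the five diagnostic scores. The text does not state the aggregation rule, so the mean used below is an assumption, and the dictionary keys are illustrative labels rather than names from the source.

```python
# Diagnostic scores as stated in the text; the unweighted mean is an assumption.
scores = {
    "vocabulary_travels": 0.5,
    "evaluative_weight": 0.0,
    "institutional_origin": 0.5,
    "human_practice_boundness": 0.5,
    "import_versus_recognize": 0.5,
}

aggregate = sum(scores.values()) / len(scores)
print(round(aggregate, 2))  # 0.4
```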

Substrate Independence

Theory of Mind is a moderately substrate-independent prime — composite 3 / 5 on the substrate-independence scale. Its abstract core — modeling another system's hidden internal states in order to predict and respond to its behavior — does travel: it reappears in AI and machine learning (opponent modeling, user modeling, inverse reinforcement learning), in game theory (reasoning about others' beliefs and types), in pedagogy (the teacher inferring the learner's misconception), and in adversarial security (modeling an attacker's intentions). That recurrence gives it reasonable domain breadth, scored 4, and a transfer-evidence sub-score of 4 reflecting these concrete, named instances. What caps the composite at 3 is structural abstraction: the prime's vocabulary leans heavily cognitive and psychological — "mind," beliefs, desires, intentions, false-belief understanding — and that mentalistic framing imports a cognitive-substrate context rather than presenting a clean medium-neutral relation. Every application presupposes an agent capable of representing another agent's representations, so there is no physical or biological reading; the substrate ceiling is the set of belief-attributing systems, and the inherited cognitive frame keeps the abstraction sub-score at 3. Genuine cross-domain reach against a mentalistic-substrate ceiling places the composite squarely at 3.

Composite substrate independence — 3 / 5
Domain breadth — 4 / 5
Structural abstraction — 3 / 5
Transfer evidence — 4 / 5

Relationships to Other Primes

Parents (1) — more general patterns this builds on

Theory Of Mind is a kind of Mental Model

Theory of mind is the specific case of mental_model where the modelled thing is another agent's hidden mental states, kept separately indexed from one's own world-model (the false-belief capacity). mental_model is "any internal representation"; theory of mind is the agent-targeted, dual-indexed specialization.

Children (1) — more specific cases that build on this

Curse Of Knowledge is a kind of (typical) Theory Of Mind

Note: curse_of_knowledge is a CANDIDATE (CAND-R2-154-08), not a canonical prime. It is recorded as links_to_other_candidates below, NOT as a corpus reparent. The file: the curse is a FAILURE MODE of theory of mind (a breakdown of the separately-indexed second model), not a parallel prime.

Path to root: Theory Of Mind → Mental Model → Representation → Abstraction

Neighborhood in Abstraction Space¶

Theory Of Mind sits in a sparse region of abstraction space (70th percentile for distinctiveness): few abstractions share its structure, so a faithful description tends to retrieve it precisely rather than landing on a neighbor.

Family — Representation & Mental Models (7 primes)

Nearest neighbors

Mental Model — 0.75
Second-Order Cybernetics (Second-Order Observation) — 0.71
Object Permanence — 0.70
Inconsistent Shared Model — 0.68
Shared Mental Model — 0.68

Computed from structural-signature embeddings · 2026-06-14

Not to Be Confused With¶

Theory of mind's nearest neighbor is mental_model, and the confusion is natural because theory of mind is a kind of model held in the mind — but the two differ in what is modelled and in a critical structural commitment. A mental model is any internal representation of how something works: a person's model of how a thermostat regulates temperature, how a market clears, how a codebase is organized. Theory of mind is the specific case where the modelled object is another agent's hidden mental states — beliefs, knowledge, intentions, attention — and it carries a commitment that generic mental models do not: the second model is separately indexed from the modeller's own world-model, so the modeller can simultaneously hold "I know X" and "they do not know X," and can represent contents known to be false. A mental model of a thermostat has no such dual-indexing requirement; it is simply the modeller's best representation of the device. The decisive test is the false-belief capacity: theory of mind must be able to model an agent looking where they believe an object is, not where it actually is, which requires keeping the agent's belief-model formally separate from ground truth. Conflating the two loses this separation — treating another's mind as "just another system to model" misses that the representation must track whose belief it is and what channel updated it, and that the contents can diverge from what the modeller knows to be true.
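The dual-indexing requirement and the false-belief test can be made concrete with a small sketch. It assumes a toy world in which each agent's belief store is updated only by events that agent witnessed; the `World` class, its method names, and the marble scenario are illustrative choices, not part of the prime's definition.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Ground truth plus one separately indexed belief store per agent."""
    truth: dict = field(default_factory=dict)
    beliefs: dict = field(default_factory=dict)  # agent name -> {object: location}

    def move(self, obj, location, witnesses):
        # Ground truth always changes.
        self.truth[obj] = location
        # Channel-indexed update: only agents who witnessed the event learn of it.
        for agent in witnesses:
            self.beliefs.setdefault(agent, {})[obj] = location

    def will_search(self, agent, obj):
        # The prediction is routed through the agent's belief, not through truth.
        return self.beliefs.get(agent, {}).get(obj)


world = World()
world.move("marble", "basket", witnesses=["sally", "anne"])
world.move("marble", "box", witnesses=["anne"])  # Sally is out of the room

print(world.truth["marble"])                 # box
print(world.will_search("sally", "marble"))  # basket
print(world.will_search("anne", "marble"))   # box
```

A generic mental model would need only `truth`. The separate `beliefs` index is what lets the modeller predict that Sally searches the basket while knowing the marble is in the box.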

Theory of mind must also be distinguished from perspective, to which it is closely related because both involve representing how things look from somewhere other than one's own standpoint. Perspective is a vantage-dependent view of the world: what is visible, salient, or accessible from a particular position, angle, or role. Theory of mind goes a level higher: it is a model of another agent's mind, which includes but is not limited to their perceptual perspective. Perspective-taking ("what can they see from where they stand?") is in fact one component of theory of mind, but theory of mind additionally tracks the agent's beliefs (including false ones), knowledge (including channel-indexed updates they did or did not receive), intentions, and recursive models of the modeller's own mind. The difference is between representing a viewpoint on the world and representing a mind that has beliefs about the world. A practitioner who reduces theory of mind to perspective will capture the "what can they perceive" question but miss the false-belief and channel-updating machinery: they will correctly model what the other sees yet fail to model what the other wrongly believes after the world changed outside their view, which is exactly the failure probed by the Sally-Anne false-belief task.

A third important confusion is with curse_of_knowledge, but here the relationship is not similarity but symptom-to-capacity: the curse of knowledge is a failure mode of theory of mind, not a parallel prime. The curse of knowledge is the systematic bias by which an informed agent over-attributes its own knowledge to a less-informed other — the expert who explains as if the novice already shares their background, the writer who assumes the reader knows what the writer knows. Structurally, this is precisely a breakdown of the separately-indexed second model: the modeller has failed to keep the target's knowledge-state distinct from its own, updating the target's model with information only the modeller received (the channel-indexed-updating failure). Theory of mind is the capacity whose correct operation prevents the curse and whose breakdown produces it. Keeping them distinct matters because the curse of knowledge names a specific recurring error with its own debiasing interventions (explicit audience modelling, formative assessment), while theory of mind names the general machinery of which that error is one localized failure (wrong on the target's knowledge-state parameter). Treating the curse as a separate, unrelated phenomenon obscures that the fix — explicitly represent and audit the other's separately-indexed model — is the general theory-of-mind discipline applied to one parameter.
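The channel-indexed-updating failure described above can be sketched as a one-line bug. This is an illustrative toy, assuming knowledge is a set of facts and that each learning event records whether the novice also received it; the function names and the example facts are hypothetical.

```python
def update_cursed(own, other, fact, other_received):
    """Faulty update: writes the fact into the target's model regardless of channel."""
    own.add(fact)
    other.add(fact)  # bug: ignores other_received, so the two models collapse


def update_channel_indexed(own, other, fact, other_received):
    """Correct update: the target's model changes only if the target had a channel."""
    own.add(fact)
    if other_received:
        other.add(fact)


def needs_explaining(own, other):
    """Facts the modeller holds that the modelled agent is not believed to hold."""
    return sorted(own - other)


# Each event: (fact the expert learns, whether the novice also received it).
events = [("what a POMDP is", False), ("today's topic", True)]


def run(update):
    expert, novice_model = set(), set()
    for fact, novice_received in events:
        update(expert, novice_model, fact, novice_received)
    return needs_explaining(expert, novice_model)


cursed_gap = run(update_cursed)
audited_gap = run(update_channel_indexed)
print(cursed_gap)   # []
print(audited_gap)  # ['what a POMDP is']
```

With the faulty update the expert sees nothing left to explain. With the channel-indexed update the gap is visible, which is the audit that explicit audience modelling performs.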

These distinctions matter because each isolates a different facet: a mental model is any representation-of-a-system (where theory of mind adds the agent-target and the separate, false-belief-capable indexing), perspective is a viewpoint-on-the-world (where theory of mind adds belief, knowledge, and recursion above mere perception), and the curse of knowledge is a specific failure (a breakdown of the capacity that theory of mind names). A practitioner who conflates them models another mind as a mere system, captures perception but not false belief, or treats a debiasing problem as unrelated to the underlying machinery. Holding theory of mind as the specific separately-indexed-second-model-of-an-agent's-hidden-state structure keeps the analyst asking its real questions: whose belief is this, what channel gave them access, does my model of their state diverge from ground truth, and how deep must the recursion go?

Solution Archetypes¶

No catalogued solution archetypes reference this prime yet.