Identifiability¶

Prime #: 903
Origin domain: Statistics & Experimental Design
Subdomain: causal inference and system identification → Statistics & Experimental Design

Core Idea¶

The condition under which an internal unknown — a parameter, mechanism, causal effect, or latent variable — is in principle recoverable from the observable signal the system makes available. The core is a uniqueness claim: the map from internals to observations is one-to-one within what the observation can see. When two distinct internals produce identical observations, the target is unidentified, and no amount of same-kind data can fix it.

How would you explain it like I'm…

Can You Even Tell?

Imagine I tell you two numbers add up to 10, and ask which two numbers I started with. You can't know — it could be 4 and 6, or 1 and 9. The answer is hidden because adding hides which pieces went in. Identifiability is asking: can you actually figure out the hidden answer from what you're allowed to see, or is it impossible no matter how hard you look?

Is the Answer Reachable?

Sometimes you want to know something hidden inside a system, but all you get to see is what comes out. Identifiability asks whether the hidden thing can even be figured out from what you can observe. If two totally different hidden setups would produce the exact same observations, then you simply cannot tell them apart — and collecting more of the same kind of data won't help, because the data can never separate them. The only fix is to change what you can see: a new kind of measurement, a real experiment, or an extra assumption. So identifiability is about whether the answer is reachable at all, before you even worry about getting it precisely.

One Cause or Two?

Identifiability is the condition under which something internal — a parameter, mechanism, cause, or hidden state — can in principle be recovered from the observable signal a system makes available. The core claim is uniqueness: among all the internal models you'd allow, the mapping from internals to observations is one-to-one within whatever the observation can actually see. If two distinct internal setups produce the very same observable distribution, the thing is unidentified, and no amount of additional same-kind data can separate them; only a structural change — a new measurement channel, an experiment, a parametric restriction, a prior — can. It is crucial to keep this apart from estimation: identifiability is whether the destination even exists, while estimation noise is the difficulty of getting there once it does. More data, fancier estimators, and bigger computers help only when the object is identifiable in the first place.

Identifiability is the structural condition under which an internal unknown (a parameter, mechanism, causal effect, hidden state, or latent variable) is in principle recoverable from the observable signal the system makes available. The defining commitment is a uniqueness claim: across the space of admissible internal models, the mapping from internals to observations is one-to-one within whatever subspace the observation can see. When two distinct internal configurations produce the same observable distribution, the object is unidentified, and no amount of additional data of the same kind can distinguish them; only a structural intervention (a new measurement channel, an experimental manipulation, a parametric restriction, a prior commitment) can.

The shape is the same across substrates: a target object you want to know, an observation map the world makes available, and an equivalence class on the target space induced by that map. Identifiability is the property that this equivalence class is a singleton for the value of interest. The diagnostic question (could two structurally distinct internals produce identical observations?) is substrate-neutral, and a 'yes' tells you, before any data arrive, that the model-and-channel pair cannot answer the question as posed. That is a different failure from misspecification, where the true mechanism lies outside the admissible models.

What the prime provides is a separation of an information-theoretic upper bound from estimation difficulty: failure to identify is not solved by more data, better estimators, or more computation. Estimation noise lives downstream, while identifiability gates whether the destination exists at all.
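
The equivalence-class description above can be written compactly. This is a sketch in one common notation, not the only formalization; the symbols \Theta (admissible internals), P_\theta (distribution of observations under \theta), and \psi (the target of interest) are assumptions introduced here for illustration.

```latex
\begin{align*}
\theta \sim \theta' &\iff P_{\theta} = P_{\theta'} \\
[\theta_0] &= \{\theta \in \Theta : P_{\theta} = P_{\theta_0}\} \\
\psi \text{ is identified at } \theta_0 &\iff \psi(\theta) = \psi(\theta_0) \ \text{ for all } \theta \in [\theta_0]
\end{align*}
```

When \psi is the identity map, the last line reduces to the requirement that the class [\theta_0] be a singleton.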

Broad Use¶

Statistics and econometrics: parameter identifiability, rank conditions, instrumental variables.
Causal inference: the do-calculus identification problem (decidable, with an algorithm).
Control theory: structural and practical identifiability of dynamic-system parameters.
Systems biology and pharmacokinetics: structurally non-identifiable compartmental models.
Cryptography: one-way functions as the deliberate engineering of non-identifiability (the same test with the goal reversed).
Philosophy of mind: the underdetermination of mental content by behavioural output.
Machine learning: identifiability of factors in nonlinear ICA and disentangled representations.

Clarity¶

Separates "can the answer in principle be recovered from this kind of data?" from "how well can we estimate it?", routing a stuck inference to the right diagnosis instead of blaming sample size for a degenerate channel.

Manages Complexity¶

Collapses "can we learn X from these data?" into a tractable structural-algebraic check on the model-and-channel pair, performed before any computation on the data, and recognized as one recurring test across a dozen substrate-specific methods.
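
One familiar instance of such a structural check is the rank condition for a linear model. The sketch below is illustrative only: the design matrix `X` and the coefficient vectors `beta_a` and `beta_b` are hypothetical, chosen so that the third column is the sum of the first two. The check uses the design alone and never touches outcome data.

```python
import numpy as np

# Hypothetical design matrix: column 3 = column 1 + column 2,
# so the map from coefficients to predictions is not one-to-one.
X = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [2.0, 1.0, 3.0],
])

# Rank condition: coefficients are identifiable only if X has full column rank.
rank = np.linalg.matrix_rank(X)
identifiable = rank == X.shape[1]

# Two distinct coefficient vectors that differ by a null-space direction (-1, -1, 1).
beta_a = np.array([1.0, 2.0, 0.0])
beta_b = np.array([0.0, 1.0, 1.0])

print("rank:", rank, "of", X.shape[1], "columns; identifiable:", identifiable)
print("same predictions:", np.allclose(X @ beta_a, X @ beta_b))
```

Adding more rows with the same collinearity leaves the rank unchanged, which is the linear-model version of "more data of the same kind does not help".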

Abstract Reasoning¶

Licenses portable moves: enumerate the equivalence class the channel induces on the target; enrich the channel, restrict by prior, or intervene to collapse it to a singleton — and recognize that the dual is anonymity, the same test wanting the class large.
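
These moves can be sketched on the "two numbers add up to 10" example from earlier. The finite grid of candidates, the helper `equivalence_class`, and the choice of the difference as the extra measurement are assumptions made for illustration.

```python
from itertools import product


def equivalence_class(observe, candidates, observed_value):
    """Return every candidate internal state that yields the observed value."""
    return [c for c in candidates if observe(c) == observed_value]


def sum_only(pair):
    """Original channel: only the sum is visible."""
    return pair[0] + pair[1]


def sum_and_difference(pair):
    """Enriched channel: the sum and the difference are both visible."""
    return (pair[0] + pair[1], pair[0] - pair[1])


# Hypothetical target space: ordered pairs of integers from 0 to 10.
candidates = list(product(range(11), repeat=2))
truth = (4, 6)

# Enumerate the class induced by the original channel.
class_sum = equivalence_class(sum_only, candidates, sum_only(truth))

# Move 1: enrich the channel.
class_enriched = equivalence_class(
    sum_and_difference, candidates, sum_and_difference(truth)
)

# Move 2: keep the channel but restrict by prior (assume the first number is 4).
class_restricted = [pair for pair in class_sum if pair[0] == 4]

print(len(class_sum), "pairs are indistinguishable from the sum alone")
print("enriched channel:", class_enriched)
print("restricted by prior:", class_restricted)
```

For an anonymity goal the same enumeration applies, but a large `class_sum` is the desired outcome.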

Knowledge Transfer¶

Causal graphs → systems biology: the identification algorithm drives experimental-design choices in pharmacology.
Cryptography → privacy: indistinguishability's decision-theoretic framing carries to data-privacy guarantees.
Econometrics → ML: weak identification maps onto flat minima and practically unidentified disentangled representations.

Example¶

A two-compartment pharmacokinetic model can be structurally non-identifiable when only blood concentration is observed, for instance when the drug is eliminated from both compartments: distinct sets of rate constants produce identical blood-concentration curves, so denser sampling cannot separate them; only enriching the channel (a tissue biopsy) or restricting by prior can.
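
A minimal numerical sketch of this example follows. It assumes one particular variant: a linear two-compartment model with a unit bolus into blood and elimination from both compartments. The rate-constant names (`k10`, `k12`, `k21`, `k20`) and the two parameter sets are hypothetical values chosen for illustration, not taken from any dataset.

```python
import numpy as np


def simulate(k10, k12, k21, k20, times, dose=1.0):
    """Drug amounts in blood (row 0) and tissue (row 1) after a bolus into blood.

    Assumed model: dx/dt = A x with x(0) = (dose, 0), solved by eigendecomposition.
    """
    A = np.array([
        [-(k10 + k12), k21],
        [k12, -(k21 + k20)],
    ])
    eigvals, eigvecs = np.linalg.eig(A)
    # Express the initial state in the eigenvector basis.
    weights = np.linalg.solve(eigvecs, np.array([dose, 0.0]))
    modes = np.exp(np.outer(eigvals, times))  # shape (2, len(times))
    return ((eigvecs * weights) @ modes).real


times = np.linspace(0.0, 24.0, 49)

# Two distinct parameter sets sharing k10 + k12, k21 + k20, and k12 * k21,
# which are the only combinations the blood curve depends on in this model.
blood_a, tissue_a = simulate(k10=0.3, k12=0.2, k21=0.3, k20=0.1, times=times)
blood_b, tissue_b = simulate(k10=0.2, k12=0.3, k21=0.2, k20=0.2, times=times)

print("blood curves identical:", np.allclose(blood_a, blood_b))
print("tissue curves identical:", np.allclose(tissue_a, tissue_b))
```

Under these assumptions the blood curves coincide at every time point, so sampling more densely changes nothing, while the tissue curves differ, which is why a tissue measurement can separate the two parameter sets.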

Not to Be Confused With¶

Identifiability is not Falsifiability, because falsifiability asks whether a hypothesis could be refuted by some observation, whereas identifiability asks whether a unique internal value can be recovered.
Identifiability is not Observability, because observability asks whether a system's current state can be reconstructed from its output, whereas identifiability asks whether the model parameters can be recovered from behavior. The two are closely related (parameters can be treated as constant states to be observed), but the strict control-theoretic dual of observability is controllability.
Identifiability is not Confounding, because confounding is one cause of non-identifiability (an open back-door path), whereas identifiability is the broader property that can fail for many reasons besides confounding.