Missing Data Mechanisms (MCAR, MAR, MNAR)¶
Core Idea¶
Missing data mechanisms classify why values are absent into three types: MCAR (missing completely at random), where the probability of missingness is unrelated to any variable, observed or unobserved; MAR (missing at random), where missingness depends only on observed variables; and MNAR (missing not at random), where missingness depends on the unobserved values themselves, even after accounting for the observed variables.
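The three definitions can be written compactly in the standard notation of the missing-data literature. The symbols are assumptions introduced here, not part of the original text: R is the missingness indicator, and Y_obs and Y_mis are the observed and unobserved parts of the data.

```latex
% R: missingness indicator; Y_obs, Y_mis: observed and unobserved parts of the data
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}}) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) \text{ still depends on } Y_{\text{mis}}
\end{align*}
```

Read this way, MCAR is a special case of MAR, and MNAR is whatever remains when conditioning on the observed data does not remove the dependence on the missing values.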
Broad Use¶
- Medical Studies: Some patients don't return for follow-ups (missing data) based on symptoms or side effects, affecting how outcomes can be analyzed.
- Online Surveys: Nonresponse may be higher among certain demographics, indicating MAR or even MNAR.
- Credit Scoring: Individuals with incomplete financial records might systematically differ in risk profile from those with full data.
- Longitudinal Research: Attrition (dropouts) can bias results if the reason for leaving correlates with the study's main variable.
Clarity¶
Differentiates benign missingness (MCAR) from more problematic patterns (MNAR), each requiring distinct methods (imputation, weighting, or specialized models) to ensure valid results.
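As one illustration of the weighting approach, the sketch below simulates MAR data and applies inverse-probability weighting. Everything in it is a hypothetical setup chosen for illustration: the two groups, the response rates of 0.9 and 0.3, and the function names `simulate_mar` and `ipw_mean`. It is a minimal sketch under those assumptions, not a general-purpose estimator.

```python
import random
import statistics

def simulate_mar(n=20_000, seed=1):
    """Return (group, y, observed) rows; missingness depends only on group (MAR)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.random() < 0.5                   # fully observed covariate
        y = rng.gauss(10.0 if group else 0.0, 1.0)   # outcome differs by group
        p_observed = 0.9 if group else 0.3           # assumed response rates
        rows.append((group, y, rng.random() < p_observed))
    return rows

def ipw_mean(rows):
    """Weighted mean of observed y, with weight 1 / (response rate of the row's group)."""
    rate = {}
    for g in (True, False):
        flags = [obs for grp, _, obs in rows if grp == g]
        rate[g] = sum(flags) / len(flags)            # estimated from observed data only
    num = sum(y / rate[g] for g, y, obs in rows if obs)
    den = sum(1 / rate[g] for g, _, obs in rows if obs)
    return num / den

rows = simulate_mar()
full_mean = statistics.fmean(y for _, y, _ in rows)  # unknowable in practice
complete_case_mean = statistics.fmean(y for _, y, obs in rows if obs)
weighted_mean = ipw_mean(rows)
print(f"full data: {full_mean:.2f}  "
      f"complete cases: {complete_case_mean:.2f}  IPW: {weighted_mean:.2f}")
```

Because the low-response group is under-represented among complete cases, the unweighted mean drifts toward the high-response group, while the weighted mean should land close to the full-data value. The correction works here only because missingness depends on an observed variable; under MNAR the same weights would not remove the bias.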
Manages Complexity¶
Appropriately diagnosing how data vanish helps researchers or analysts choose robust strategies (like multiple imputation or sensitivity analyses), preventing skewed conclusions.
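A simple first diagnostic, sketched below, compares a fully observed covariate between rows whose outcome is missing and rows whose outcome is present. The data and the helper name `missingness_gap` are hypothetical. A clear gap is evidence against MCAR, but this check cannot separate MAR from MNAR, because that distinction depends on values that were never observed.

```python
import statistics

def missingness_gap(covariate, outcome):
    """Difference in mean covariate between rows with missing and observed outcomes.

    A gap far from zero is evidence against MCAR. A gap near zero does not
    prove MCAR, and no check on observed data alone can rule out MNAR.
    """
    missing = [x for x, y in zip(covariate, outcome) if y is None]
    present = [x for x, y in zip(covariate, outcome) if y is not None]
    if not missing or not present:
        raise ValueError("need both missing and observed outcomes")
    return statistics.fmean(missing) - statistics.fmean(present)

# Hypothetical data: age is fully observed, income is sometimes missing.
age = [23, 35, 41, 52, 58, 64, 29, 47]
income = [31_000, 48_000, None, None, None, None, 36_000, 55_000]

gap = missingness_gap(age, income)
print(f"mean age gap (missing - observed): {gap:.2f} years")
```

In this toy dataset the respondents with missing income are noticeably older, so treating the missingness as MCAR would be hard to defend. In practice such a gap would be judged with a formal test and alongside subject-matter knowledge about why values go missing.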
Abstract Reasoning¶
Reveals that missingness itself can be structured and correlated with other variables, so it must be modeled deliberately, much like a hidden factor or confounder but with its own mechanism.
Knowledge Transfer¶
- Software User Data: High churn among certain user segments can produce MNAR patterns if those leaving had negative experiences.
- Ecommerce: People who abandon checkout forms might systematically differ from those who complete purchases.
Example¶
In a weight-loss trial, dropout may be strongly correlated with poor results (participants quit because they saw no improvement). The missing final weights are then MNAR: analyzing only the participants who stayed can overestimate the program's effectiveness.
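The overestimate can be reproduced with a small simulation. The numbers below are assumptions made for illustration only: a true mean change of -2 kg, a standard deviation of 4 kg, and dropout rates of 70% for participants who lost no weight versus 10% for those who did.

```python
import random
import statistics

def simulate_trial(n=10_000, seed=0):
    """Simulate weight change (kg) and MNAR dropout in a hypothetical trial."""
    rng = random.Random(seed)
    true_changes = []      # every participant's weight change (negative = loss)
    observed_changes = []  # only participants who stayed to the final weigh-in
    for _ in range(n):
        change = rng.gauss(-2.0, 4.0)   # assumed true mean loss of 2 kg
        true_changes.append(change)
        # MNAR: the dropout probability depends on the outcome itself.
        p_dropout = 0.7 if change >= 0 else 0.1
        if rng.random() >= p_dropout:
            observed_changes.append(change)
    return true_changes, observed_changes

true_changes, observed_changes = simulate_trial()
true_mean = statistics.fmean(true_changes)
completer_mean = statistics.fmean(observed_changes)
print(f"true mean change:     {true_mean:.2f} kg")
print(f"completers-only mean: {completer_mean:.2f} kg")
```

Under these assumptions the completers-only mean shows a larger weight loss than the true mean across all participants, which is the bias described above. Since the dropout rule depends on the unrecorded outcome, no adjustment using observed variables alone would recover the true mean here.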
Relationships to Other Primes¶
Parents (3) — more general patterns this builds on
- Missing Data Mechanisms (MCAR, MAR, MNAR) is a kind of Classification — Missing-data mechanisms is a specific kind of classification, sorting missingness processes into three categories that determine valid handling.
- Missing Data Mechanisms (MCAR, MAR, MNAR) is a decomposition of Bias — Missing-data mechanisms are the specific shape bias takes when systematic data absence skews inferences from observed values.
- Missing Data Mechanisms (MCAR, MAR, MNAR) is a decomposition of Observability — MCAR/MAR/MNAR is the specific shape observability takes when the unobservable elements are missing data entries and the inference problem is reconstructing them.
Path to root: Missing Data Mechanisms (MCAR, MAR, MNAR) → Bias
Not to Be Confused With¶
- Missing Data Mechanisms (MCAR, MAR, MNAR) is not Markov Decision Processes (MDPs) because Missing Data Mechanisms characterize how data come to be absent from a dataset (patterns of missingness), while MDPs model decision-making over time with probabilistic transitions and rewards.
- Missing Data Mechanisms (MCAR, MAR, MNAR) is not Pattern Completion (Filling the Incomplete) because Missing Data Mechanisms describe the structural conditions under which data are missing (randomness, dependence on observed values, dependence on unobserved values), while Pattern Completion is the cognitive or algorithmic process of inferring missing information.
- Missing Data Mechanisms (MCAR, MAR, MNAR) is not Black Box vs. White Box Distinction because Missing Data Mechanisms classify statistical properties of missingness, while Black Box vs. White Box Distinction contrasts whether system internals are visible or opaque.
- Missing Data Mechanisms (MCAR, MAR, MNAR) is not Failure Mode and Effects Analysis (FMEA) because Missing Data Mechanisms describe why data values are absent, while FMEA systematically identifies component failures and propagates their effects through a system.
- Missing Data Mechanisms (MCAR, MAR, MNAR) is not Information Cascade because Missing Data Mechanisms characterize statistical properties of absence in a dataset, while Information Cascade is the social/informational phenomenon where sequential actors adopt observed choices without accessing full information.