A phenomenon where a model, system, or individual
becomes overly attuned to specific details or patterns in training
data, reducing generalizability to new situations.
Imagine you memorize the exact answers to last week's math quiz, including the funny doodle on question 4. You'd ace last week's quiz again — but on a new quiz, you'd be lost, because you learned the wrong stuff. That's overfitting: learning the quirks of one specific test instead of the actual math.
Learning the Practice, Failing the Test
Overfitting happens when a model, like a guessing program or a student studying, learns its practice examples so well that it picks up tiny details that do not matter, including random mistakes. Then when it faces brand new problems, it does much worse than it did on practice. The trick is that the model needs to be flexible enough to catch real patterns, but careful enough not to chase noise. The gap between practice scores and real scores is the clue something went wrong.
Fitting Noise, Not Pattern
Overfitting is when a model captures patterns in its training data that do not really exist in the wider world, including random noise, accidents, or quirks unique to that sample. The result is that the model looks great on training data but performs much worse on new, unseen examples drawn from the same target population. It is not a flaw in the model alone or the data alone; it lives in the relationship between them and the population you actually care about. The core tension is balancing flexibility, enough capacity to catch real structure, against restraint, enough discipline to ignore noise.
Overfitting is the structural condition in which a model or learned procedure captures patterns in its training data that do not correspond to generalizable structure — noise, idiosyncratic coincidences, or features specific to the training distribution — such that performance on training data is disproportionately good relative to performance on new cases drawn from the target population. Crucially, overfitting is a relational property of the model-data-target triple, not of the model or data alone: the same model may be overfit on one dataset and well-calibrated on another. The diagnostic is the gap between in-sample and out-of-sample (held-out, cross-validated, or future) performance. Behind the phenomenon lies the bias-variance trade-off, formalized by Geman, Bienenstock, and Doursat (1992): too little capacity yields bias (systematic miss); too much yields variance (sensitivity to sample-specific noise). Every overfitting diagnosis specifies the model and its capacity, the training sample and its relation to the target distribution, the measured performance gap, and the mechanism by which training-specific patterns were absorbed (e.g., excess parameters, insufficient regularization, leakage, multiple testing).
Customer Behavior Prediction: A marketing algorithm
overly tailored to past campaigns might fail to predict responses to
novel offers, reflecting overfitting.
Overfitting is not Variance because Overfitting is the phenomenon where a model captures noise in training data and loses generalization, whereas Variance is a component of prediction error measuring sensitivity to training-data fluctuations; overfitting results partly from high variance.
Overfitting is not Complexity because Overfitting is the mismatch between model complexity and available training data, whereas Complexity is a property of the model itself (number of parameters, depth, etc.); a complex model can generalize well if trained appropriately.
Overfitting is not Bias because Overfitting is learning noise specific to training data, whereas Bias is systematic error from model assumptions; high bias and high variance are both learning failures but from different sources.