FMEA (Failure Mode and Effects Analysis) is a systematic,
step-by-step method for identifying potential failure modes in a
product, process, or system and evaluating their causes and effects,
so that designers can prioritize and mitigate the most severe or
likely issues before they occur.
Before a big trip, a careful grown-up imagines all the things that could go wrong: flat tire, no gas, lost map, dead phone. For each one they ask, "how bad would that be? How likely is it? Would we even notice?" Then they pack a spare tire and charger for the worst, most likely, sneakiest problems. That's what engineers do for rockets and cars, but with a checklist.
Listing What Could Go Wrong
FMEA is a careful checklist engineers run *before* they build something, to list every way a part could break, why it would break, and what would happen if it did. For each failure they give three scores: how bad it is, how often it might happen, and how hard it is to catch before it hurts anyone. Multiply those scores and you get a "risk number" that says which problems to fix first: the bigger the number, the sooner it gets fixed. The whole point is to find scary problems on paper, when they're cheap to fix, instead of discovering them in a real crash.
Systematic Failure Audit
Failure Mode and Effects Analysis is a structured way of asking, *before deployment*, "what could go wrong, what would happen, and which problems deserve attention first?" A team walks through every component and subsystem, lists each way it could fail (a *failure mode*), traces the *cause* and the *effect* on the wider system, and scores each failure on three dimensions: severity (how bad if it happens), occurrence (how likely), and detection (how hard it would be to catch before harm, so a hard-to-catch failure scores higher). Multiply the three to get a Risk Priority Number (RPN), and you have a ranked list telling you where to spend your mitigation budget. Developed in U.S. military reliability work and adopted early by NASA's manned spaceflight program, FMEA is now standard in aerospace, automotive, and medical devices.
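The scoring and ranking step can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the `FailureMode` class, the `rank_by_rpn` helper, the 1 to 10 scale, and every score below are assumptions made up for the example (the failure modes are borrowed from the road-trip analogy).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of a hypothetical FMEA worksheet."""
    name: str
    severity: int    # 1 = negligible harm, 10 = catastrophic
    occurrence: int  # 1 = very rare, 10 = almost certain
    detection: int   # 1 = almost always caught, 10 = almost never caught

    def __post_init__(self):
        # Reject scores outside the assumed 1-10 scale.
        for field_name in ("severity", "occurrence", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be 1-10, got {value}")

    @property
    def rpn(self) -> int:
        # Risk Priority Number: product of the three scores.
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Illustrative scores only; a real team would agree on them together.
modes = [
    FailureMode("flat tire", severity=6, occurrence=4, detection=2),
    FailureMode("dead phone", severity=4, occurrence=5, detection=3),
    FailureMode("no gas", severity=7, occurrence=2, detection=1),
]

ranked = rank_by_rpn(modes)
for mode in ranked:
    print(f"{mode.name}: RPN {mode.rpn}")
```

With these made-up scores the "dead phone" mode ranks first, even though "no gas" has the highest severity, because it is both more likely and harder to notice in time.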
FMEA (Failure Mode and Effects Analysis) is a systematic, structured methodology for identifying and evaluating potential failure modes in a product, process, or system before deployment. It comprises (1) exhaustive enumeration of the ways a component or subsystem can fail, (2) tracing each failure mode back to its root causes and forward to its effects on system operation and safety, (3) scoring each failure on severity (consequence to user or mission), occurrence (likelihood), and detection (how unlikely the failure is to be caught before reaching the user, so harder-to-detect failures score higher), (4) computing a Risk Priority Number (RPN) as the product of those three scores to prioritize mitigation, and (5) designing and implementing countermeasures for high-RPN failures, then re-scoring to verify effectiveness. The deeper commitment is *systematic exhaustiveness*: rather than design-and-hope (reactive discovery via test or field failure), FMEA mandates that the team explicitly map what can go wrong, evaluate consequences upfront, and design controls before deployment. The practice converts the unbounded question "what could go wrong?" into a bounded, enumerable problem: walk through components, apply patterns from prior failures and design standards, rate each mode, and focus resources on high-impact mitigations. The method originated in U.S. military reliability procedures of the late 1940s (MIL-P-1629), was adopted for NASA's manned spaceflight programs in the 1960s, and was later formalized in MIL-STD-1629A; it is now standard in automotive (AIAG-VDA Handbook, whose 2019 edition replaces the RPN with Action Priority ratings), medical devices (FDA guidance), and other safety-critical domains. FMEA does not by itself prevent failures; it makes failure analysis systematic and repeatable so common modes are not overlooked and high-consequence ones receive proportional attention.
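Step (5), re-scoring after a countermeasure, can be illustrated with a small before-and-after comparison. The sketch below is hypothetical throughout: the `verify_mitigation` helper, the `ACTION_THRESHOLD` cutoff of 100, and the scores are assumptions for illustration, since FMEA itself does not prescribe a universal RPN cutoff and teams set their own action criteria.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number: product of the three 1-10 scores."""
    return severity * occurrence * detection


# Hypothetical cutoff above which a team requires a countermeasure.
ACTION_THRESHOLD = 100

# Illustrative scores for one failure mode before mitigation.
before = {"severity": 9, "occurrence": 4, "detection": 6}

# After adding a redundant part and an inspection step (assumed effects):
# occurrence and detection improve, while severity is left unchanged
# because the consequence of the failure itself is the same.
after = dict(before, occurrence=2, detection=2)


def verify_mitigation(before, after, threshold):
    """Compare RPN before and after a countermeasure."""
    rpn_before = rpn(**before)
    rpn_after = rpn(**after)
    return {
        "rpn_before": rpn_before,
        "rpn_after": rpn_after,
        "reduced": rpn_after < rpn_before,
        "below_threshold": rpn_after < threshold,
    }


result = verify_mitigation(before, after, ACTION_THRESHOLD)
print(result)
```

In this example the RPN falls from 216 to 36. Because severity is still 9, a team might reasonably keep the item under review even though the number is now below the chosen cutoff.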
Automotive & Aerospace: Engineers apply FMEA to each
subsystem (brakes, engine, avionics) to detect critical points
of failure and reduce safety risks.
Software Development: Teams analyze possible ways a feature
or module could fail (e.g., input errors, load spikes),
assessing impact and likelihood to guide protective measures.
Healthcare: Hospital staff might evaluate how a medication
administration process could fail at each step, preventing
dangerous errors.
FMEA emphasizes a proactive mindset: find and assess
weaknesses up front rather than reacting after a disaster. It helps
teams move systematically through each element, cause, and effect.
FMEA breaks a large system down into manageable
chunks, lists the failure modes of each, and quantifies their
severity, occurrence probability, and detection difficulty. This
structured approach reduces confusion and oversight.
FMEA demonstrates how mapping possible failure
modes creates a conceptual model of risk, clarifying relationships
among components, environment, and user interactions.
In car seat design, an FMEA might list failure modes
like latch not securing, foam degrading, or harness tension issues,
then rank each by potential injury risk to ensure the highest risks
are addressed first.
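The car seat example can be worked through with made-up numbers. The sketch below is only an illustration: every score is hypothetical, and the severity-first ordering is one possible reading of "rank by potential injury risk", shown next to a plain RPN ranking for comparison.

```python
# (failure mode, severity, occurrence, detection); all scores hypothetical,
# on an assumed 1-10 scale where a higher detection score means harder to catch.
car_seat_modes = [
    ("latch not securing", 10, 2, 5),
    ("foam degrading", 6, 5, 6),
    ("harness tension issues", 8, 4, 4),
]


def rpn(mode):
    """Risk Priority Number for one (name, S, O, D) tuple."""
    _, severity, occurrence, detection = mode
    return severity * occurrence * detection


# Plain RPN ranking: highest product first.
by_rpn = sorted(car_seat_modes, key=rpn, reverse=True)

# Severity-first ranking: injury potential decides, RPN breaks ties.
by_injury_risk = sorted(
    car_seat_modes, key=lambda m: (m[1], rpn(m)), reverse=True
)

print("By RPN:        ", [m[0] for m in by_rpn])
print("By injury risk:", [m[0] for m in by_injury_risk])
```

With these numbers the two orderings disagree: the latch failure has the lowest RPN (100) but the highest severity, so a severity-first ranking moves it to the top. That is one reason teams often review high-severity items regardless of their RPN.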
Failure Mode and Effects Analysis (FMEA) is not Stakeholder Analysis because FMEA maps failure modes and their effects on system function, whereas Stakeholder Analysis identifies who is affected by, and who influences, decisions or outcomes.
Failure Mode and Effects Analysis (FMEA) is not Pareto Effect (80/20 Rule) because FMEA catalogs all failure modes and their consequences and prioritizes them by severity, occurrence, and detection, whereas the Pareto Effect is the observation that a small subset of causes (roughly 20%) drives most (roughly 80%) of the outcomes.
Failure Mode and Effects Analysis (FMEA) is not Cross-Impact Analysis because FMEA systematically enumerates failure modes and their direct effects on specific functions, whereas Cross-Impact Analysis explores indirect consequences and second-order interactions between events.