Imagine you write a phone number on a piece of paper and pass it around the room. By the end, has anyone changed a digit? Data integrity means making sure the number stays exactly right from the first person to the last - no smudges, no copy mistakes, no sneaky changes.
Information Stays Correct
Computers store and move lots of information - photos, messages, bank balances - and stuff can go wrong: bits get flipped, copies get messed up, bugs change values, or someone tries to sneak in a change. Data integrity is the promise that information stays accurate and unchanged from when it's made until it's used. Computers use tricks like checksums (little math fingerprints) and rules that block bad edits to catch mistakes and prove nothing snuck in.
Trustworthy, Unaltered Data
Data integrity is the property that data stays accurate, consistent with its intended meaning and rules, and free from unauthorized, erroneous, or accidental changes throughout its whole lifecycle - creation, storage, transmission, processing, archival, retrieval. Without explicit protection, data drifts: bits rot in storage, transmission flips bits, bugs corrupt records, operators mistype, and attackers tamper. Detecting corruption requires either redundancy (extra copies you can compare) or cryptographic verification (a math fingerprint only the legitimate writer could produce). Protections combine technical mechanisms (checksums, error-correcting codes, digital signatures, database constraints, transactions) with organizational mechanisms (validation rules, audits, change control, provenance tracking). Different threats need different defenses.
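The redundancy route to detecting corruption can be sketched with three replicas and a majority vote. This is a minimal illustration, not a storage design: the replica contents, the single flipped bit, and the helper name `majority_vote` are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(copies: list[bytes]) -> bytes:
    """Return the value held by a strict majority of copies, or raise."""
    value, count = Counter(copies).most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no strict majority: corruption cannot be resolved")
    return value


original = b"balance=1000"

# Simulate one replica suffering a single flipped bit in storage.
damaged = bytes([original[0] ^ 0x01]) + original[1:]
replicas = [original, damaged, original]

# Comparing the copies both recovers the agreed value and shows which replica drifted.
recovered = majority_vote(replicas)
corrupted_indexes = [i for i, copy in enumerate(replicas) if copy != recovered]
```

With only one copy there is nothing to compare against, which is why a lone record cannot reveal its own corruption.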
Data integrity is the property that data remains accurate, consistent with its intended meaning and internal rules, and free from unauthorized, erroneous, or accidental modification throughout its entire lifecycle - creation, storage, transmission, processing, archival, and retrieval. It is enforced through a combination of technical mechanisms (checksums, error-correcting codes, digital signatures, database constraints, ACID transactions) and organizational mechanisms (validation rules, audit trails, change control, provenance tracking). The essential commitment of the concept is threefold: that data without explicit integrity protection is progressively corrupted by bit rot, transmission errors, software bugs, operator mistakes, and adversarial manipulation; that detecting corruption requires either redundancy or cryptographic verification, since corrupted data does not announce itself; and that different threat classes demand different mechanisms - a checksum catches random transmission errors but cannot stop an attacker who recomputes it after tampering, while a digital signature exposes tampering but cannot repair storage degradation the way an error-correcting code can. Integrity is distinct from confidentiality (whether unauthorized parties can read the data) and availability (whether legitimate parties can access it); together the three form the classical CIA triad of information security.
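The point that different threats need different mechanisms can be illustrated with a short sketch. It uses CRC-32 as the checksum and, as a stand-in for a digital signature, an HMAC from the Python standard library; a real signature would use an asymmetric key pair, so this is a simplification. The key, the messages, and the helper names are hypothetical.

```python
import hashlib
import hmac
import zlib

# Hypothetical shared secret; in this sketch only the writer and verifier know it.
SECRET_KEY = b"demo-key-known-only-to-writer-and-verifier"


def crc_tag(data: bytes) -> int:
    """Unkeyed checksum: anyone can compute it."""
    return zlib.crc32(data)


def mac_tag(data: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Keyed tag: computing a matching value requires the key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def mac_ok(data: bytes, tag: bytes, key: bytes = SECRET_KEY) -> bool:
    return hmac.compare_digest(mac_tag(data, key), tag)


message = b"pay alice 10"
stored_crc = crc_tag(message)
stored_mac = mac_tag(message)

# Random error: one bit flips in transit, and the checksum no longer matches.
flipped = bytes([message[0] ^ 0x04]) + message[1:]
crc_detects_flip = crc_tag(flipped) != stored_crc

# Attacker: replaces the message AND the checksum. No secret is needed,
# so the forged pair is internally consistent and passes a CRC check.
forged = b"pay mallory 9999"
forged_crc = crc_tag(forged)
crc_accepts_forgery = crc_tag(forged) == forged_crc

# The same attacker cannot produce a valid keyed tag without SECRET_KEY.
forged_mac = mac_tag(forged, key=b"attacker-guess")
mac_accepts_forgery = mac_ok(forged, forged_mac)
```

Here the checksum catches the accidental bit flip, the forged message passes the checksum, and only the keyed tag rejects the forgery.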
Checksum validation when downloading software confirms that no bits were altered during transmission, paralleling tamper-evident seals on physical goods that confirm the contents are untouched.
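A download check of this kind might look like the following sketch, which assumes the publisher lists a SHA-256 digest next to the file. The file name `installer.bin`, the payload, and the helper names are hypothetical, and the "download" is simulated with a temporary file so the example is self-contained.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, published_hex: str) -> bool:
    """Compare the computed digest with the one the publisher listed."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())


with tempfile.TemporaryDirectory() as tmp:
    installer = Path(tmp) / "installer.bin"
    payload = b"pretend this is a software release" * 1000
    installer.write_bytes(payload)

    # The digest the publisher would list alongside the download.
    published = hashlib.sha256(payload).hexdigest()
    ok_when_intact = verify_download(installer, published)

    # Flip a single bit to simulate corruption in transit.
    damaged = bytearray(payload)
    damaged[100] ^= 0x01
    installer.write_bytes(bytes(damaged))
    ok_when_damaged = verify_download(installer, published)
```

A digest fetched from the same place as the file mainly guards against accidental corruption; it says something about tampering only if the digest itself arrives over a channel the attacker cannot alter.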
Parents (2) — more general patterns this builds on
Data Integrity is a kind of Verification: checksums, signatures, and audits confirm conformance to the data's intended specification.
Data Integrity presupposes Invariance: preserving accuracy across the data lifecycle is the preservation of intended content under storage, transmission, and processing operations.
Data Integrity is not Legitimacy because Data Integrity concerns whether data is accurate, complete, and unaltered as intended, while Legitimacy concerns whether authority or decisions are justly grounded and broadly accepted—integrity is a technical property, legitimacy is a normative-political property.
Data Integrity is not Provenance because Data Integrity is the condition that data has remained unchanged and accurate from creation to use, while Provenance is the documented history of where data originated and how it has been handled—integrity is about present state, provenance is about historical chain of custody.
Data Integrity is not Validation because Data Integrity ensures that data has not been corrupted or altered, while Validation ensures that data meets specified standards or requirements for a particular use—integrity concerns preservation of existing state, validation concerns conformance to purpose.
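The difference between integrity and validation can be made concrete with a small sketch. The record fields, the age rule in `is_valid`, and the helper names are hypothetical; the fingerprint is a SHA-256 hash of a canonical JSON encoding, chosen only for illustration.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Integrity check: hash a canonical encoding of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_valid(record: dict) -> bool:
    """Validation check (hypothetical rule): age is an integer from 0 to 150."""
    age = record.get("age")
    return isinstance(age, int) and 0 <= age <= 150


# A record that was wrong when created but has been stored faithfully.
stored = {"name": "Ada", "age": -5}
stored_fp = fingerprint(stored)
intact_but_invalid = (fingerprint(stored) == stored_fp, is_valid(stored))

# A record that someone changed to a plausible value after storage.
tampered = {"name": "Ada", "age": 36}
altered_but_valid = (fingerprint(tampered) == stored_fp, is_valid(tampered))
```

The first record passes the integrity check and fails validation; the second does the opposite, which is why neither check substitutes for the other.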