
Data Integrity

Prime #: 172
Origin domain: Computer Science & Software Engineering
Also from: Information Theory, Accounting Auditing, Systems Thinking & Cybernetics
Aliases: Integrity, Data Correctness, Tamper Resistance
Related primes: Consistency, Checksum, Digital Signature, Transaction, Provenance

Core Idea

Data Integrity ensures the accuracy, consistency, and trustworthiness of data throughout its lifecycle by preventing, or at least detecting, unauthorized alteration and corruption.

How would you explain it like I'm…

Keeping Information Right

Imagine you write a phone number on a piece of paper and pass it around the room. By the end, has anyone changed a digit? Data integrity means making sure the number stays exactly right from the first person to the last - no smudges, no copy mistakes, no sneaky changes.

Information Stays Correct

Computers store and move lots of information - photos, messages, bank balances - and stuff can go wrong: bits get flipped, copies get messed up, bugs change values, or someone tries to sneak in a change. Data integrity is the promise that information stays accurate and unchanged from when it's made until it's used. Computers use tricks like checksums (little math fingerprints) and rules that block bad edits to catch mistakes and prove nothing snuck in.

Trustworthy, Unaltered Data

Data integrity is the property that data stays accurate, consistent with its intended meaning and rules, and free from unauthorized, erroneous, or accidental changes throughout its whole lifecycle - creation, storage, transmission, processing, archival, retrieval. Without explicit protection, data drifts: bits rot in storage, transmission flips bits, bugs corrupt records, operators mistype, and attackers tamper. Detecting corruption requires either redundancy (extra copies you can compare) or cryptographic verification (a math fingerprint only the legitimate writer could produce). Protections combine technical mechanisms (checksums, error-correcting codes, digital signatures, database constraints, transactions) with organizational mechanisms (validation rules, audits, change control, provenance tracking). Different threats need different defenses.
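The redundancy route to detection can be illustrated with a small sketch. It assumes three replicas of the same record are available and that a strict majority of them is intact; the function name `reconcile` and the sample values are hypothetical, chosen only for illustration.

```python
from collections import Counter

def reconcile(replicas: list[bytes]) -> tuple[bytes, list[int]]:
    """Return the majority value and the indexes of replicas that disagree with it."""
    if not replicas:
        raise ValueError("need at least one replica")
    value, votes = Counter(replicas).most_common(1)[0]
    # Without a strict majority there is no way to tell which copy is correct.
    if votes <= len(replicas) // 2:
        raise ValueError("no strict majority among replicas")
    corrupted = [i for i, replica in enumerate(replicas) if replica != value]
    return value, corrupted

# Hypothetical data: the third copy has silently changed.
copies = [b"balance=100", b"balance=100", b"balance=900"]
value, corrupted = reconcile(copies)
print(value, corrupted)  # b'balance=100' [2]
```

With only two copies that disagree, the function raises instead of guessing: redundancy shows that something is wrong, but identifying the correct copy takes a majority or some independent fingerprint.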

 

Data integrity is the property that data remains accurate, consistent with its intended meaning and internal rules, and free from unauthorized, erroneous, or accidental modification throughout its entire lifecycle - creation, storage, transmission, processing, archival, and retrieval. It is enforced through a combination of technical mechanisms (checksums, error-correcting codes, digital signatures, database constraints, ACID transactions) and organizational mechanisms (validation rules, audit trails, change control, provenance tracking). The concept rests on three commitments. First, data without explicit integrity protection is progressively corrupted by bit rot, transmission errors, software bugs, operator mistakes, and adversarial manipulation. Second, detecting corruption requires either redundancy or cryptographic verification, since corrupted data does not announce itself. Third, different threat classes demand different mechanisms: a checksum catches random transmission errors but cannot stop an attacker who alters the data and recomputes the checksum, while a digital signature exposes tampering but cannot repair the damage, which takes redundancy such as error-correcting codes or replicas. Integrity is distinct from confidentiality (whether unauthorized parties can read) and availability (whether legitimate parties can access); together the three form the classical CIA triad in information security.
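The difference between threat classes can be sketched by comparing an unkeyed checksum with a keyed tag. This sketch uses CRC-32 from `zlib` and HMAC-SHA256 from the standard library; HMAC is a symmetric stand-in here, not a digital signature, and the key, messages, and helper names are assumptions made for illustration.

```python
import hashlib
import hmac
import zlib

# Assumption: a hypothetical key shared only by the legitimate writer and the verifier.
SECRET_KEY = b"hypothetical-shared-key"

def crc_tag(data: bytes) -> int:
    """Unkeyed checksum: anyone can compute it."""
    return zlib.crc32(data)

def mac_tag(data: bytes, key: bytes) -> bytes:
    """Keyed tag: computing a valid one requires the secret key."""
    return hmac.new(key, data, hashlib.sha256).digest()

def verify_crc(data: bytes, tag: int) -> bool:
    return zlib.crc32(data) == tag

def verify_mac(data: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(mac_tag(data, SECRET_KEY), tag)

original = b"pay alice 10"
stored_crc = crc_tag(original)
stored_mac = mac_tag(original, SECRET_KEY)

# Accidental corruption: one flipped bit. Neither stored tag matches any more.
flipped = bytes([original[0] ^ 0x01]) + original[1:]
print(verify_crc(flipped, stored_crc))  # False
print(verify_mac(flipped, stored_mac))  # False

# Deliberate tampering: the attacker replaces the data and recomputes the tag.
forged = b"pay mallory 9999"
print(verify_crc(forged, crc_tag(forged)))                  # True: the forgery passes
print(verify_mac(forged, mac_tag(forged, b"wrong-guess")))  # False: no key, no valid tag
```

Both tags detect the flipped bit, but only the keyed tag resists an adversary who can rewrite the data and its accompanying tag together. Neither mechanism repairs the damage; that still requires a redundant copy or an error-correcting code.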

Broad Use

  • Databases: Constraints (e.g., foreign keys) enforce internal consistency, while checksums detect corrupted storage.

  • Communication: Error-correcting codes detect and repair transmission errors, while signatures verify message authenticity.

  • Finance: Bank statements must remain unaltered to ensure correct balances and transactions.

  • Healthcare: Patient records must stay correct and consistent across different systems.
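The database case can be sketched with Python's built-in `sqlite3` module. The schema, table names, and the `record_transfer` helper below are hypothetical; the sketch only assumes an in-memory SQLite database with foreign-key enforcement switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key enforcement off by default
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute(
    "CREATE TABLE transfers ("
    " id INTEGER PRIMARY KEY,"
    " account_id INTEGER NOT NULL REFERENCES accounts(id),"
    " amount INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.commit()

def record_transfer(account_id: int, amount: int) -> bool:
    """Log a withdrawal and debit the account as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if any statement fails
            conn.execute(
                "INSERT INTO transfers (account_id, amount) VALUES (?, ?)",
                (account_id, amount),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    record_transfer(1, 30),   # valid
    record_transfer(1, 500),  # would overdraw: CHECK constraint fails, insert is rolled back
    record_transfer(99, 10),  # unknown account: foreign key fails
]
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
transfer_rows = conn.execute("SELECT COUNT(*) FROM transfers").fetchone()[0]
print(results, balance, transfer_rows)  # [True, False, False] 70 1
```

The constraints reject rows that break the rules, and the transaction ensures a rejected operation leaves no half-written record behind.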

Clarity

Maintains reliable information by detecting and preventing tampering or corruption, reinforcing trust in data-driven decisions.

Manages Complexity

Introduces mechanisms (e.g., checksums, audits, validation rules) that keep data from drifting or being silently damaged.

Abstract Reasoning

Encourages considering lifecycle perspectives—how data moves and transforms—while ensuring every stage preserves correctness.

Knowledge Transfer

Maintaining data integrity is crucial in supply chains, blockchain systems, scientific research (verifiable datasets), and beyond.
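One idea shared by tamper-evident audit logs and, in much simplified form, blockchain ledgers is hash chaining: each entry commits to the hash of the entry before it. The following is a minimal sketch under that assumption; the entry layout, the `GENESIS` constant, and the function names are hypothetical.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"

def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()

def append(log: list, payload: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})

def first_broken_link(log: list) -> Optional[int]:
    """Return the index of the first entry that fails verification, or None."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return i
        prev = entry["hash"]
    return None

log: list = []
append(log, {"item": "widget", "qty": 5})
append(log, {"item": "gadget", "qty": 2})
append(log, {"item": "widget", "qty": 1})
print(first_broken_link(log))  # None

log[1]["payload"]["qty"] = 999  # edit a past entry in place
print(first_broken_link(log))  # 1
```

A chain like this only exposes edits if the latest hash is stored or witnessed somewhere the editor cannot reach; someone able to rewrite every later entry could otherwise recompute the whole chain.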

Example

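Checksum validation when downloading software confirms that no bits were altered during transmission, much as a tamper-evident seal on physical goods shows whether a package has been opened. The check confirms authenticity only if the expected checksum itself comes from a trusted source, for example a signed release page.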
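A download check of this kind might look like the sketch below. It assumes the publisher provides a SHA-256 digest as a hex string through some trusted channel; the function names and the temporary file standing in for a download are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile
from pathlib import Path

def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the publisher's expected value."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())

# Stand-in for a downloaded file and its published digest.
fd, name = tempfile.mkstemp()
os.close(fd)
download = Path(name)
download.write_bytes(b"installer bytes")
published = hashlib.sha256(b"installer bytes").hexdigest()

print(matches_published_digest(download, published))  # True

download.write_bytes(b"installer bytez")  # simulate corruption in transit
print(matches_published_digest(download, published))  # False
download.unlink()
```

If the expected digest is fetched from the same untrusted location as the file, the check still catches accidental corruption but not a deliberate swap of both.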

Relationships to Other Primes

One-hop neighborhood (diagram): Data Integrity, with parents Verification (subsumption) and Invariance (composition).

Parents (2) — more general patterns this builds on

  • Data Integrity is a kind of Verification: checksums, signatures, and audits confirm that data conforms to its intended specification.
  • Data Integrity presupposes Invariance: preserving accuracy across the data lifecycle means preserving the intended content under storage, transmission, and processing operations.

Path to root: Data Integrity → Verification

Not to Be Confused With

  • Data Integrity is not Legitimacy because Data Integrity concerns whether data is accurate, complete, and unaltered as intended, while Legitimacy concerns whether authority or decisions are justly grounded and broadly accepted—integrity is a technical property, legitimacy is a normative-political property.
  • Data Integrity is not Provenance because Data Integrity is the condition that data has remained unchanged and accurate from creation to use, while Provenance is the documented history of where data originated and how it has been handled—integrity is about present state, provenance is about historical chain of custody.
  • Data Integrity is not Validation because Data Integrity ensures that data has not been corrupted or altered, while Validation ensures that data meets specified standards or requirements for a particular use—integrity concerns preservation of existing state, validation concerns conformance to purpose.