Equivalence-Preserving Rewriting¶

Prime #: 833
Origin domain: Computer Science & Software Engineering
Subdomain: program transformation → Computer Science & Software Engineering

Core Idea¶

Equivalence-preserving rewriting transforms an object into a behaviourally equivalent but operationally different form under an explicit equivalence relation, then selects among the admissible rewrites by a cost criterion the equivalence itself does not adjudicate — the discipline lies in keeping the two factored: the equivalence carves a space of safe moves, the cost picks one point in it.

How would you explain it like I'm…

Same Thing, Easier Way

You can say the same thing two different ways and have it still mean exactly the same. "Two plus three" and "three plus two" both make five, but one might be quicker to count. So you pick the way that's easier, knowing the answer stays the same.

Swap Without Changing Meaning

Equivalence-Preserving Rewriting is changing something into a different form that still means or does exactly the same thing, then picking the version that is cheaper, like faster or shorter or easier to read. Saying "6 times 4" instead of "4 + 4 + 4 + 4 + 4 + 4" gives the same answer but is quicker to work out. The rule you follow tells you which changes are allowed because they keep the meaning the same, and a separate rule about cost helps you pick the best allowed version. If you sneak in a change that the rule didn't actually allow, you've broken it, because now it means something different from before.

Meaning-Safe Rewrites

Equivalence-Preserving Rewriting is transforming an expression, program, proof, or document into a behaviorally equivalent but operationally different form under an explicit equivalence relation, then choosing among the allowed rewrites by a cost criterion the equivalence itself does not settle. It has three parts: an object with a specified meaning, an equivalence relation that names which transformations preserve that meaning, and a cost criterion (speed, length, readability) orthogonal to the equivalence under which one admissible rewrite is chosen. What makes this different from plain optimization or rewording is that the equivalence and the cost are factored: the equivalence defines a space of safe moves, and the cost picks one point in it. The discipline is keeping them separate, because slipping in a move outside the equivalence changes the meaning, while a too-coarse equivalence permits rewrites that change behavior users care about. The remedy is to write the equivalence down, because an implicit one drifts and accumulates rewrites it never actually permitted.

Equivalence-Preserving Rewriting is the structural move of transforming an expression, specification, derivation, or document into a behaviorally equivalent but operationally different form under an explicit equivalence relation, then selecting among the candidate rewrites by a cost criterion the equivalence relation itself does not adjudicate. It has three load-bearing parts: an object with a specified meaning (a query, a program fragment, a proof step, an algebraic term, a sentence), an equivalence relation naming which transformations are allowed because they preserve that meaning (relational-algebra equivalence, operational equivalence, logical equivalence, meaning-preserving paraphrase), and a cost criterion orthogonal to the equivalence (execution time, instruction count, proof length, readability) under which one of the many admissible rewrites is chosen. What distinguishes it from optimization, simplification, or rewording in general is that the equivalence and the cost are factored: the equivalence defines a space of safe moves and the cost picks one point in that space, and the discipline lives in keeping the two separate. Slip in a transformation outside the equivalence and the rewrite no longer means what the original meant; admit cost-driven shortcuts the equivalence permits but the user did not intend, and the system silently changes expected behavior. The recurring failure mode is keyed to the equivalence's grain: too coarse admits rewrites that change behavior users care about, too fine excludes safe ones. The canonical case is the compiler optimization that is correct under the language standard but wrong under the programmer's intuitive model, so the remedy is to write the equivalence down, because an implicit equivalence drifts under maintenance.

Broad Use¶

Database query optimisation: relational-algebra rewrites (selection push-down, join reordering) transform a query into result-equivalent plans, and the optimiser picks the lowest estimated cost.
Compiler optimisation: loop unrolling, dead-code elimination, and peephole rewrites, each program-equivalence-preserving and scored by an execution-cost model.
Theorem proving: normal-form rewriting and proof restructuring under preserved logical content.
Symbolic computation: factoring, expanding, and simplifying under symbolic equivalence, scored by length or depth.
Natural-language paraphrase: meaning-preserving rewordings scored by clarity, register, or length.
Legal drafting: restating a clause to clarify or consolidate without changing its legal effect.
Hardware synthesis: equivalent-circuit rewrites under logical equivalence, scored by gate count or power.

Clarity¶

Forces two questions ordinary language conflates — under what equivalence relation are these rewrites admissible? and what cost are we minimising? — exposing the grain problem: a too-coarse equivalence admits behaviour-changing rewrites ("correct by the spec, wrong by the user"); a too-fine one forbids safe ones.

Manages Complexity¶

Separates safe-space construction from cost minimisation — generating a good rewrite is hard, checking one preserves meaning is mechanisable, and choosing by cost is separate — so the equivalence relation, once written down, is reusable across many swappable cost models.

Abstract Reasoning¶

Licenses equivalence-as-design-choice, verification asymmetry (checking beats generating, so generate-then-verify architectures dominate), cost-equivalence orthogonality (re-tune by swapping cost models), and the non-obvious move of treating the equivalence relation — not the visible cost work — as the primary object of scrutiny, since correctness lives there.

Knowledge Transfer¶

Compiler to legal drafting: declare the "preserved legal effects" relation up front and screen rewrites against it, rather than ad-hoc redrafting.
Database optimiser to paraphrase: the search-then-verify architecture ports to controlled-language and assisted-editing tooling.
Term rewriting to product design: the normal-form idea underwrites style guides, code formatters, and design-system standardisation.

Example¶

A database query optimiser generates candidate plans, each provably inside relational-algebra equivalence so correctness is never at issue, then ranks them by an orthogonal cost model — so re-tuning to new hardware swaps the cost model over a fixed, declared equivalence rather than rebuilding the safe space.

Relationships to Other Primes¶

Parents (1) — more general patterns this builds on

Equivalence-Preserving Rewriting is a kind of Transformation — The MEANING-PRESERVING subclass of transformation: an explicit equivalence relation carves a space of safe moves, plus an orthogonal cost criterion that selects within it. The file: 'this prime is the disciplined subclass... preserved under what relation?' Generic transformation may change meaning freely; this constrains it by a named equivalence.

Path to root: Equivalence-Preserving Rewriting → Transformation

Not to Be Confused With¶

Equivalence-Preserving Rewriting is not Transformation in general because a transformation may change meaning freely, whereas this prime is the meaning-preserving subclass constrained by an explicit equivalence relation.
Equivalence-Preserving Rewriting is not Optimization generically because this prime factors the search into safe-space construction (the equivalence) and within-space selection (the cost), which generic optimisation fuses.
Equivalence-Preserving Rewriting is not Canonical Form because a canonical form is a single distinguished representative, whereas this prime is cost-driven navigation among many admissible forms that need not converge on any one.