Equivariance
Core Idea
Equivariance is the structural pattern in which transforming the input of a map produces a correspondingly transformed output: the map commutes with a group of transformations rather than ignoring them. Formally, f(g·x) = g·f(x) for every group element g and every input x, where the dot on the left is the action on the input space and the dot on the right is the action on the output space. The modern abstract form of this relation descends from the representation-theoretic notion of an intertwining map between group actions, as Serre (1977) develops in his canonical treatment of linear representations. [1] The output does not stay fixed (that would be invariance) but changes in lockstep with the input, so the transformation can be applied before or after the map with the same result. The defining feature is that two actions of the same group, one on the input space and one on the output space, are tied together by the map, so that the map "respects" the symmetry rather than destroying or merely tolerating it. [2] The concept originates in pure mathematics (G-sets, equivariant functions, natural transformations) and recurs, under different names, as covariance in physics and as the design principle of geometric deep learning in machine learning, where Bronstein and colleagues (2021) treat it as a unifying organizing principle for neural architectures. [3]
Structural Signature
Equivariance encodes a structural pattern: two coupled group actions → a map between their carrier spaces → the constraint that the map commutes with both actions. It separates a transformation of the input from the corresponding transformation of the output, and asserts that these two transformations are matched by the map. The commuting square expressed by f(g·x) = g·f(x) is the load-bearing relation; everything else is interpretation, a point Mac Lane (1971) makes structurally precise by recasting such commuting squares as naturality conditions. [4]
Equivalent framings:
- A map that commutes with a group action
- Transforming the input yields a correspondingly transformed output
- Symmetry of the input is tracked, not discarded, by the map
- Apply-then-map equals map-then-apply
- Two coupled group representations linked by an intertwiner
- Structure-preserving response to a transformation group
- Covariance of a relation under change of frame
The structural insight is robust: a representation-theoretic intertwiner, a physical law written so it holds in every coordinate frame, a convolutional feature extractor whose output shifts when the image shifts, and a linear time-invariant filter whose response merely delays when the input is delayed all have the same shape. Cohen and Welling (2016) make the convolutional case explicit by deriving group-equivariant convolutional networks directly from the commuting-square condition. [5] Equivariance lets one reason about an entire family of transformed inputs from a single representative, because the map's values on the whole orbit are determined once its value at one point of the orbit is fixed.
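The commuting relation and the orbit argument can be checked numerically. The sketch below is only an illustration under stated assumptions: the group is the cyclic group acting on length-8 arrays by rotation, and `local_average` is a hypothetical map chosen because it is easy to read, not a map taken from any of the cited works.

```python
import numpy as np

def shift(x, g):
    """Action of the cyclic group Z_n on length-n arrays: rotate by g places."""
    return np.roll(x, g)

def local_average(x):
    """Hypothetical map f: average each entry with its two cyclic neighbours."""
    return (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0

def is_equivariant(f, x, group_elements):
    """Check f(g.x) == g.f(x) for every listed group element g."""
    return all(np.allclose(f(shift(x, g)), shift(f(x), g)) for g in group_elements)

rng = np.random.default_rng(0)
x = rng.normal(size=8)
group = range(8)

commutes = is_equivariant(local_average, x, group)

# One evaluation of f fixes its value on the whole orbit of x:
# the transported values need no further calls to f.
fx = local_average(x)
orbit_values = [shift(fx, g) for g in group]
direct_values = [local_average(shift(x, g)) for g in group]
orbit_matches = all(np.allclose(a, b) for a, b in zip(orbit_values, direct_values))

print(commutes, orbit_matches)
```

A numerical check on one random input is evidence, not a proof; here the relation also holds exactly because every operation in `local_average` is itself built from cyclic rotations.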
What It Is Not
Equivariance is not a claim that the output is unchanged. That is invariance, the special case recovered when the group acts trivially on the output. A common misreading treats "equivariant" as a fancy synonym for "robust" or "symmetric," but equivariance makes a precise commutativity demand that most robust maps fail. A map can be insensitive to noise yet not equivariant to rotation; it can be approximately invariant yet violate the exact lockstep relation that equivariance requires.
Nor does equivariance assert that the input and output spaces are the same, or that the two group actions are identical. The action on the input and the action on the output can differ — they need only correspond through the map. A convolution maps an image to a feature field; both transform under translation, but the representation carried by the feature field can be richer than the raw pixel grid. The relation is between how each space transforms, not a demand that they transform identically.
Equivariance also does not, by itself, say anything about which group is relevant or whether a given symmetry is desirable. It is a relational property contingent on a chosen group G; the same map can be equivariant to one group and not another. Choosing the wrong symmetry group — imposing rotation-equivariance where the task actually breaks rotational symmetry (reading text, say) — produces a model that is provably structured but wrongly structured. The prime names the relation; it does not adjudicate the modeling choice, a caution Lyle and colleagues (2020) raise when they show invariance and equivariance help only when the assumed symmetry genuinely holds in the data. [6]
Finally, equivariance in a real system is not automatically exact. Real systems often achieve only approximate equivariance (a pooled or sampled grid breaks perfect continuous symmetry), and the gap between exact and approximate equivariance is itself a substantive engineering concern, not a definitional footnote, as Weiler and Cesa (2019) make explicit in characterizing the steerable filters that achieve exact equivariance under the Euclidean group E(2) and its subgroups. [7]
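One way to make the exact-versus-approximate gap measurable is to compute an equivariance error directly. The following is a minimal sketch, assuming a 1D periodic signal and a hypothetical stride-2 stage (`blur_and_subsample`); the error metric is an illustrative choice, not a standard from the cited literature.

```python
import numpy as np

def blur(x):
    """Small cyclic smoothing filter; on its own it is exactly shift-equivariant."""
    return (np.roll(x, 1) + 2 * x + np.roll(x, -1)) / 4.0

def blur_and_subsample(x):
    """Hypothetical stride-2 stage: smooth, then keep every second sample."""
    return blur(x)[::2]

def equivariance_error(x, in_shift, out_shift):
    """Relative gap between f(g.x) and g'.f(x) for input shift g and output shift g'."""
    lhs = blur_and_subsample(np.roll(x, in_shift))
    rhs = np.roll(blur_and_subsample(x), out_shift)
    return np.linalg.norm(lhs - rhs) / np.linalg.norm(rhs)

rng = np.random.default_rng(7)
x = rng.normal(size=16)

# Shifting the input by the stride (2) shifts the output by exactly 1.
err_even = equivariance_error(x, in_shift=2, out_shift=1)

# An odd input shift has no matching output shift; take the better of the two candidates.
err_odd = min(equivariance_error(x, 1, 0), equivariance_error(x, 1, 1))

print(err_even, err_odd)
```

In this sketch the subsampled stage stays equivariant to the subgroup of even shifts and loses equivariance to the full shift group, which is the kind of gap that has to be measured rather than assumed away.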
Broad Use
Mathematics: Equivariant maps between G-sets and G-spaces; intertwining operators in representation theory; natural transformations in category theory (a naturality square is a commuting diagram of the same shape); equivariant cohomology and equivariant K-theory, where the group action is carried through every construction; Noether's theorem, which ties continuous symmetries of a system to conserved quantities and rests on the invariance of the action functional under those symmetries, a connection Olver (1986) develops systematically through the theory of symmetry groups of differential equations. [8]
Physics: Covariance of physical laws under coordinate change — the equations transform consistently so the physics is the same in every reference frame. Lorentz covariance in special relativity, general covariance in general relativity, and gauge equivariance in gauge field theory all instantiate the same demand: the dynamical relations must commute with the relevant transformation group, a principle Weyl (1952) traces from geometry into physics in his classic study of symmetry. [9]
Machine learning: Convolutional layers are translation-equivariant (shift the image, the feature map shifts identically), the founding insight of geometric deep learning; group-equivariant CNNs extend this to rotations and reflections; equivariant graph neural networks respect permutation symmetry of nodes; equivariant transformers and message-passing networks for molecules respect the rotation/translation symmetry of 3D space, with Satorras and colleagues (2021) showing that E(n)-equivariant graph networks predict molecular properties more sample-efficiently than unconstrained models. [10]
Signal processing: A linear time-invariant filter is the canonical equivariant operator — delay the input, and the output is delayed by exactly the same amount. The entire theory of convolution and Fourier analysis can be read as the study of operators equivariant to the translation (shift) group, a viewpoint Oppenheim and Schafer (1989) build their treatment of discrete-time systems around. [11]
Robotics and vision: Pose-equivariant representations, where rotating or translating an object rotates or translates its encoding correspondingly, so a grasp or trajectory computed in one pose transfers predictably to the transformed pose without retraining.
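For the machine-learning entry above, permutation equivariance of a graph layer can be sketched in a few lines. This is an assumption-laden toy: `graph_layer` is a hypothetical one-step message-passing layer (sum over neighbours plus self, channel mixing, `tanh`), not the architecture of any cited paper.

```python
import numpy as np

def graph_layer(adj, feats, weight):
    """Hypothetical message passing: sum neighbour and self features, then mix channels."""
    return np.tanh((adj + np.eye(len(adj))) @ feats @ weight)

rng = np.random.default_rng(1)
n_nodes, d_in, d_out = 5, 3, 4

upper = np.triu(rng.integers(0, 2, size=(n_nodes, n_nodes)), 1)
adj = upper + upper.T                      # symmetric adjacency, no self loops
feats = rng.normal(size=(n_nodes, d_in))
weight = rng.normal(size=(d_in, d_out))

perm = rng.permutation(n_nodes)
P = np.eye(n_nodes)[perm]                  # permutation matrix: (P @ feats)[i] == feats[perm[i]]

# Relabelling the nodes acts on the adjacency as P A P^T and on the features as P X.
out_then_permute = P @ graph_layer(adj, feats, weight)
permute_then_out = graph_layer(P @ adj @ P.T, P @ feats, weight)
node_equivariant = np.allclose(out_then_permute, permute_then_out)

print(node_equivariant)
```

The check works because the same weights are applied at every node and the nonlinearity is elementwise, so relabelling nodes before or after the layer gives the same result.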
Clarity
Naming equivariance separates two notions that are constantly confused in design discussions: a quantity left unchanged by a transformation (invariance) versus a quantity that tracks the transformation predictably (equivariance). Without the distinction, engineers and theorists conflate "the symmetry doesn't matter to the output" with "the symmetry is preserved by the output," which are opposite commitments. The vocabulary lets a designer state precisely whether a representation should discard a symmetry (an invariant classifier label) or carry it forward (an equivariant feature map that later layers can still exploit), a layering distinction Cohen and colleagues (2019) formalize through the general theory of equivariant maps on homogeneous spaces. [12]
The clarity also surfaces an underappreciated dependency relation: invariance can be derived from equivariance by composing an equivariant map with a symmetric pooling step. This means the two are not competing alternatives but layers of a single pipeline — keep the symmetry tracked equivariantly through the early stages, then collapse it to an invariant at the end. Stating this explicitly prevents the common error of building invariance in too early, destroying information that later stages need.
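The dependency relation can be sketched as a two-stage pipeline. The code below is illustrative only: `equivariant_features` and `pool` are hypothetical stand-ins for an early equivariant stage and a final symmetric pooling step, with cyclic shifts as the assumed group.

```python
import numpy as np

def equivariant_features(x):
    """Hypothetical shift-equivariant stage: cyclic difference followed by ReLU."""
    return np.maximum(x - np.roll(x, 1), 0.0)

def pool(features):
    """Symmetric pooling: a sum over positions ignores where each value sits."""
    return features.sum()

rng = np.random.default_rng(2)
x = rng.normal(size=10)
shifted = np.roll(x, 3)

# Early stage: the features move with the input.
features_move = np.allclose(
    equivariant_features(shifted), np.roll(equivariant_features(x), 3)
)

# After pooling: the output no longer depends on the shift.
pooled_fixed = np.isclose(
    pool(equivariant_features(shifted)), pool(equivariant_features(x))
)

print(features_move, pooled_fixed)
```

Pooling first and extracting features afterwards would not work the same way: once position has been summed away, there is nothing left for a later stage to track.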
Manages Complexity
Equivariance lets one guarantee behavior across an entire orbit of transformed inputs from a single analysis of one representative, collapsing infinitely many cases into one. Once the map's value is known on a single point, its value on the whole G-orbit of that point is fixed by the commuting relation, so the analyst, prover, or trainer handles one case and transports the conclusion. [1] This is the structural reason equivariant models often need less data when the symmetry genuinely holds: the symmetry is built into the hypothesis class rather than learned from examples, so the model does not spend capacity re-learning that a rotated cat is still a cat.
It also bounds the design space of models and laws to those whose structure respects a known symmetry, drastically shrinking what must be learned, proven, or checked. The space of all maps is vast; the maps that are equivariant with respect to a given group form a structured and typically much smaller subset. In the linear case this subset is a linear subspace fully characterized by representation theory (Schur's lemma and the decomposition into irreducibles), turning an open-ended search into a constrained, enumerable one. Imposing equivariance is therefore a strong inductive bias: it does not merely regularize, it carves out the only admissible solutions in advance.
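The shrinkage in the linear case can be counted. The sketch below assumes the cyclic shift on a length-6 vector as the group generator and computes the dimension of the space of matrices T with T S = S T as the null space of a linear constraint; the row-major vectorization convention is an implementation choice of this sketch.

```python
import numpy as np

n = 6
S = np.roll(np.eye(n), 1, axis=0)      # cyclic shift matrix: S @ x == np.roll(x, 1)

# T commutes with S  <=>  T S - S T = 0.
# With row-major vec, vec(A T B) = kron(A, B.T) @ vec(T), so
#   T S -> kron(I, S.T)   and   S T -> kron(S, I).
I = np.eye(n)
constraint = np.kron(I, S.T) - np.kron(S, I)

n_all = n * n                                            # free parameters of an unconstrained map
n_equivariant = n_all - np.linalg.matrix_rank(constraint)  # dimension of the commuting subspace

print(n_all, n_equivariant)   # 36 6
```

The six surviving degrees of freedom are the entries of a length-6 convolution kernel: commuting with the shift forces the matrix to be circulant.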
Abstract Reasoning
Recognizing equivariance enables reasoning by symmetry reduction: solve the problem once on a representative, then transport the solution across the group rather than re-solving for each transformed instance. This is the conceptual engine behind reducing a PDE to its symmetry-invariant solutions, behind decomposing a representation into irreducibles, and behind the practice of working in a quotient or fundamental domain. The counterfactual "what would the answer be under this transformation?" is answered for free once equivariance is established, because the transformed answer is just the group element applied to the original. [8]
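A small linear-algebra sketch of "solve once, then transport": if the operator commutes with the group, the solution for a transformed right-hand side is the transformed solution. The circulant matrix and its first row below are hypothetical values chosen so the system is well conditioned.

```python
import numpy as np

n = 6
first_row = np.array([4.0, 1.0, 0.0, 0.0, 0.0, 1.0])
# Circulant matrix: each row is the previous one rotated, so A commutes with cyclic shifts.
A = np.stack([np.roll(first_row, i) for i in range(n)])

rng = np.random.default_rng(3)
b = rng.normal(size=n)
x = np.linalg.solve(A, b)                        # solve once, on a representative

g = 2
transported = np.roll(x, g)                      # apply the group element to the solution
resolved = np.linalg.solve(A, np.roll(b, g))     # re-solve from scratch, for comparison only
transport_ok = np.allclose(transported, resolved)

print(transport_ok)
```

The second solve is there only to confirm the transported answer; in practice the point of the reduction is that it can be skipped.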
Equivariance also licenses a precise form of inductive-bias reasoning. Choosing a symmetry group is choosing, in a structured way, what the model is forbidden to distinguish, and it makes clear exactly when invariance can be derived from equivariance followed by a symmetric pooling step rather than imposed independently. This converts a vague desideratum ("the model should handle rotations") into a checkable algebraic constraint on the architecture, and it travels: the same reasoning that decomposes a physical field into its symmetry sectors decomposes a neural feature space into its equivariant channels.
Knowledge Transfer
The physicist's covariant equations and the machine-learning engineer's translation-equivariant network are the same structure: a map that commutes with a symmetry group. This is not analogy but identity: the commuting square f(g·x) = g·f(x) is literally what both communities write, with G being the Lorentz group in one case and the translation group in the other. The insight that pooling an equivariant feature yields an invariant one transfers from harmonic analysis (averaging over a group to project onto the trivial representation) to deep learning (global pooling over spatial positions to obtain a translation-invariant classifier) entirely unchanged. [3] A researcher who understands intertwining operators in representation theory already understands, structurally, why the linear layers of an equivariant network must be (group) convolutions; a signal-processing engineer who understands LTI systems already understands why a linear layer respecting time-shift symmetry must be a temporal convolution. The vocabulary of group actions provides a shared interlingua across mathematics, physics, and computation that makes these transfers mechanical rather than metaphorical.
Examples
Formal/abstract
Representation theory (intertwining operators): Let G act linearly on two vector spaces V and W via representations ρ and σ. A linear map T: V → W is equivariant (an intertwiner) when T(ρ(g)v) = σ(g)T(v) for all g and v. Schur's lemma then tells us that if V and W are irreducible, T is either zero or an isomorphism, and over the complex numbers an intertwiner of an irreducible representation with itself is a scalar multiple of the identity. The equivariance constraint thus collapses the entire space of linear maps down to a tiny, fully characterized set. Mapped back: the abstract demand "the map must commute with both group actions" is exactly the demand that a convolutional layer respect translation, or that a physical law hold in every frame. Schur's lemma is the formal reason equivariant linear layers have so few free parameters: most of the map is determined by symmetry, not learned, which is why imposing equivariance is such a powerful constraint on a hypothesis space.
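The collapse described above can be observed by projecting an arbitrary linear map onto the intertwiners through group averaging. The sketch assumes a specific, small case: the permutation representation of S_3 on R^3, taken for both ρ and σ. That representation splits into the trivial representation and a two-dimensional irreducible one, each appearing once, so Schur's lemma predicts two free parameters.

```python
import itertools
import numpy as np

n = 3
# Permutation representation of S_3 on R^3: one permutation matrix per group element.
group = [np.eye(n)[list(p)] for p in itertools.permutations(range(n))]

rng = np.random.default_rng(4)
T = rng.normal(size=(n, n))            # arbitrary linear map, not equivariant in general

# Group averaging: T_avg = (1/|G|) * sum over g of rho(g)^{-1} T rho(g).
# For permutation matrices the inverse is the transpose.
T_avg = sum(P.T @ T @ P for P in group) / len(group)

commutes = all(np.allclose(T_avg @ P, P @ T_avg) for P in group)

# Only two numbers survive: one diagonal value a and one off-diagonal value b.
a, b = T_avg[0, 0], T_avg[0, 1]
two_parameters = np.allclose(T_avg, a * np.eye(n) + b * (np.ones((n, n)) - np.eye(n)))

print(commutes, two_parameters)
```

Nine entries go in and two parameters come out; the other seven degrees of freedom are fixed by the symmetry rather than left to be learned.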
General covariance in physics: Einstein's field equations are written so that they take the same form under any smooth change of coordinates: the equations transform as tensors, so applying a diffeomorphism and then evaluating the law gives the same physics as evaluating the law and then applying the diffeomorphism. The dynamical content is the equivalence class of solutions under the symmetry group, not any single coordinate representation. Mapped back: this is f(g·x) = g·f(x) with f the law, g a coordinate transformation, and the "lockstep" being the tensor transformation rule. The physicist's insistence that physics not depend on the observer's chart is structurally identical to the engineer's insistence that an image classifier's features shift when the image shifts — both are statements that a map commutes with a transformation group.
Applied/industry
Translation-equivariant convolutional networks: In a convolutional neural network, a convolutional layer satisfies the property that if the input image is shifted by some vector, the output feature map is shifted by the same vector. This is what lets a network trained to detect an object in one image region detect it anywhere, without seeing the object in every position during training. The symmetry is built into the architecture: weight-sharing across spatial positions is the translation-equivariance constraint made concrete. Mapped back: the network is an explicit engineering instantiation of the commuting square, with g a spatial shift. The data-efficiency payoff is the practical face of the abstract orbit-collapsing argument: because the layer is equivariant, the model needs to learn the appearance of a feature only once rather than once per position, exactly the "solve on one representative, transport across the orbit" reduction.
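The shift property of a convolutional layer can be verified directly. The sketch below makes two simplifying assumptions that real layers often do not share: wrap-around (periodic) boundaries and stride 1. With zero padding or larger strides the relation holds only approximately or only for a subgroup of shifts. `circular_conv2d` is a hypothetical helper, not a library function.

```python
import numpy as np

def circular_conv2d(image, kernel):
    """Cross-correlation with wrap-around boundaries and stride 1, built from np.roll."""
    out = np.zeros_like(image, dtype=float)
    for di in range(kernel.shape[0]):
        for dj in range(kernel.shape[1]):
            # np.roll by (-di, -dj) places image[i + di, j + dj] at position (i, j).
            out += kernel[di, dj] * np.roll(image, shift=(-di, -dj), axis=(0, 1))
    return out

rng = np.random.default_rng(5)
image = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))       # the same 9 weights are reused at every position
shift = (2, 5)

shift_then_conv = circular_conv2d(np.roll(image, shift, axis=(0, 1)), kernel)
conv_then_shift = np.roll(circular_conv2d(image, kernel), shift, axis=(0, 1))
translation_equivariant = np.allclose(shift_then_conv, conv_then_shift)

print(translation_equivariant)
```

The loop makes the weight sharing visible: each kernel entry multiplies a shifted copy of the whole image, so no weight is tied to a particular position.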
Equivariant networks for molecular and physical systems: Modern models that predict molecular energies, forces, or protein structure are built to be equivariant to the rotations and translations of 3D space (the Euclidean group E(3)): rotate the input molecule, and the predicted force vectors rotate identically, while scalar energies stay invariant. This guarantees physically consistent predictions (a molecule's energy cannot depend on its arbitrary orientation in the simulation box) and can substantially improve sample efficiency, since the model is not forced to learn rotational consistency from data. Mapped back: here the output is mixed, with some quantities (energy) invariant and others (forces, dipoles) equivariant, illustrating the layered relation from the Clarity section: equivariant intermediate representations are pooled or contracted to invariant scalars where appropriate, while vector outputs retain their equivariance. The same group acts on input and output, though through different representations for scalars and vectors; the architecture simply enforces that the map commutes with it.
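The mixed invariant/equivariant output can be illustrated without a neural network. The sketch below assumes a hypothetical toy potential (harmonic springs of rest length 1 between all pairs of points) in place of a learned model, and checks that the energy is unchanged and the forces rotate under a random rigid motion.

```python
import numpy as np

def energy(pos):
    """Hypothetical pair potential: harmonic springs of rest length 1 between all pairs."""
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(pos), k=1)
    return 0.5 * np.sum((dist[iu] - 1.0) ** 2)

def forces(pos):
    """Analytic negative gradient of `energy` with respect to each position."""
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)                  # avoid 0/0; the diagonal coefficient becomes 0
    coeff = (dist - 1.0) / dist
    return -np.sum(coeff[:, :, None] * diff, axis=1)

rng = np.random.default_rng(6)
pos = rng.normal(size=(5, 3))

# Random proper rotation from a QR decomposition, plus a random translation.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1
t = rng.normal(size=3)
moved = pos @ Q.T + t                            # each point x becomes Q x + t

energy_invariant = np.isclose(energy(moved), energy(pos))
forces_equivariant = np.allclose(forces(moved), forces(pos) @ Q.T)

print(energy_invariant, forces_equivariant)
```

Because the energy depends on the positions only through pairwise distances, invariance is automatic, and the forces inherit rotation equivariance as the gradient of an invariant scalar.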
Structural Tensions
T1: Equivariance versus expressivity. Imposing exact equivariance restricts the hypothesis class to maps that commute with the group, which is precisely the source of its sample-efficiency benefits but also a hard ceiling on what the map can represent. A strictly equivariant model cannot represent any relation that genuinely breaks the assumed symmetry, even when a small symmetry-breaking term is exactly what the problem needs. Practitioners face a real trade-off: more symmetry means fewer parameters and better generalization when the symmetry holds, but a brittle, mis-specified model when it does not.
T2: Exact versus approximate equivariance. The clean relation f(g·x) = g·f(x) presumes the group acts cleanly on both spaces, but discretization, sampling, and finite boundaries break exact symmetry. A pixel grid is only approximately rotation-equivariant; a finite simulation cell is only approximately translation-invariant. The tension is between the mathematical ideal that makes the reasoning clean and the engineered reality where equivariance holds only up to an error that must be measured, bounded, and sometimes deliberately tolerated.
T3: Choosing the group is choosing the prior, and the choice can be wrong. Equivariance is always relative to a group G, and the prime gives no guidance on which group to pick. Imposing rotation-equivariance on a task that depends on orientation (reading text, recognizing the digit 6 versus 9) bakes in a false symmetry that the model cannot escape. The strength of equivariance as an inductive bias is exactly what makes a wrong symmetry assumption so damaging: the error is structural, not just a matter of insufficient data.
T4: Equivariance preserves information that invariance discards, but at a cost. Carrying a symmetry forward equivariantly keeps more information available to downstream stages than collapsing it to an invariant immediately, which is often the right design. But equivariant representations are larger, more complex, and more expensive to compute and store than their invariant pooled summaries. The decision of when in a pipeline to collapse equivariance into invariance is a genuine architectural tension with no universal answer: too early destroys needed structure, too late wastes resources and may leak nuisance variation.
T5: The same commuting relation reads as a constraint to enforce or a property to discover. In machine learning, equivariance is a constraint deliberately built into an architecture to shrink the search space. In physics, covariance is closer to a discovered property that correct laws are observed to possess and that serves as a filter on candidate theories. The structural relation is identical, but its epistemic role inverts: designed-in versus read-off. Conflating the two leads to category errors — treating a discovered symmetry as freely adjustable, or a chosen architectural symmetry as a law of nature.
T6: Symmetry reduction simplifies analysis but can hide where the symmetry is broken. Reasoning by symmetry reduction (solve on one representative, transport across the orbit) is enormously economical, yet it presumes the symmetry is intact everywhere it is invoked. When a symmetry is spontaneously or locally broken — a phase transition, an adversarial input that violates the assumed group action, a boundary effect — the orbit-collapsing argument quietly fails, and the single-representative analysis no longer transports. The convenience of the reduction can mask the precise locations where the symmetry assumption no longer holds.
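Tension T3 can be made concrete with the 6-versus-9 case. The sketch assumes rough 5x5 binary glyphs and a hypothetical random template feature; it shows that once a feature is made invariant to 90-degree rotations, the two glyphs become indistinguishable by construction.

```python
import numpy as np

six = np.array([[0, 1, 1, 1, 0],
                [0, 1, 0, 0, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 0, 1, 0],
                [0, 1, 1, 1, 0]])      # rough, hypothetical glyph for "6"
nine = np.rot90(six, 2)                # the same glyph turned by 180 degrees

rng = np.random.default_rng(8)
template = rng.normal(size=(5, 5))

def raw_feature(img):
    """Orientation-sensitive feature: correlation with a fixed template."""
    return float(np.sum(img * template))

def rotation_invariant_feature(img):
    """Average the raw feature over the four 90-degree rotations of the input."""
    return float(np.mean([raw_feature(np.rot90(img, k)) for k in range(4)]))

raw_differs = not np.isclose(raw_feature(six), raw_feature(nine))
invariant_collides = np.isclose(
    rotation_invariant_feature(six), rotation_invariant_feature(nine)
)

print(raw_differs, invariant_collides)
```

No amount of extra data changes the second result: the collision follows from the averaging over rotations, which is the sense in which the error is structural.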
Structural–Framed Character
Equivariance sits at the structural end of the structural–framed spectrum: it is the pattern in which transforming the input of a map produces a correspondingly transformed output, so that the map commutes with a group of transformations rather than ignoring them. Formally f(g·x) = g·f(x): the output tracks the symmetry instead of staying fixed.
The vocabulary is purely mathematical, the origin lies in representation theory with no institutional referent, and the relation is fully definable without any reference to human practices. It carries no normative weight — a rotation-equivariant image filter and a translation-equivariant convolution are no more "correct" than a non-equivariant one — and applying it recognizes a commuting structure already present in the map. On every diagnostic, it reads structural.
Substrate Independence
Equivariance is a moderately substrate-independent prime, with a composite of 3 / 5 on the substrate-independence scale. Its formal signature, a map that commutes with a transformation group, f(g·x) = g·f(x), is maximally abstract, and the physicist's covariance and the ML engineer's translation-equivariant network are literally the same structure. But the genuine span is narrow: it lives in the formal, physical, and computational substrates and finds no biological, social, or cognitive instance. Despite a perfect score on abstraction, that confinement to the math-physics-computation cluster is what keeps the composite at the middle of the scale.
- Composite substrate independence — 3 / 5
- Domain breadth — 3 / 5
- Structural abstraction — 5 / 5
- Transfer evidence — 4 / 5
Relationships to Other Primes
Parents (3) — more general patterns this builds on
- Equivariance is a kind of Invariance. Equivariance is the property f(g·x) = g·f(x), so applying the group action before or after the map gives the same result. The preserved feature is the commutative-square relation between the map and the group action, and it is preserved under the named group of transformations. That is precisely the structure of Invariance, with the preserved feature being relational rather than pointwise. Equivariance specializes invariance to maps whose output transforms in lockstep with the input rather than ignoring the action.
- Equivariance is a kind of Symmetry. Equivariance is a specialization of symmetry. Specifically, it instantiates the transformation-group structure by tying two group actions, one on the input and one on the output, through a map satisfying f(g·x) = g·f(x). Like every symmetry claim, it specifies a group of transformations and how the system responds; equivariance is the subclass where the response is to transform in lockstep rather than to remain fixed (which would be invariance). The map respects the symmetry without being annihilated by it.
- Equivariance presupposes Function (Mapping). Equivariance is the relation f(g·x) = g·f(x), which only makes sense when f is a deterministic rule that assigns each domain element exactly one image, the defining commitment of a Function. Without single-valued dependency between input and output, the equation has no settled meaning and the commutative-square claim cannot be tested. Equivariance therefore presupposes Function as the underlying mathematical object on which the group-action commutation property is imposed.
Path to root: Equivariance → Invariance
Neighborhood in Abstraction Space
Equivariance sits among the more crowded primes in the catalog (10th percentile for distinctiveness): several abstractions describe nearly the same structure, so a description that fits it will tend to fit its neighbors too — transporting it usually means disambiguating within this family rather than landing on it exactly.
Family — Representation & Interpretive Mapping (25 primes)
Nearest neighbors
- Transformation — 0.86
- Asymmetry — 0.86
- Decomposition — 0.83
- Impartiality — 0.81
- Form and Content — 0.81
Computed from structural-signature embeddings · 2026-05-29
Not to Be Confused With
Equivariance must be distinguished first and most carefully from Invariance, with which it is constantly conflated. Invariance is the property that a quantity is left unchanged by a transformation: f(g·x) = f(x). Equivariance is the property that the output transforms correspondingly with the input: f(g·x) = g·f(x). The relationship between the two is exact and asymmetric — invariance is the special case of equivariance in which the group acts trivially on the output, so that g·f(x) = f(x) for every g. This means equivariance is the more general and more information-rich notion: an equivariant map keeps the symmetry "alive" in its output, where an invariant map deliberately erases it. The practical consequence is that one can manufacture invariance out of equivariance — compose an equivariant map with a symmetric pooling or averaging step and the result is invariant — but one cannot recover equivariance from an invariant map, because the symmetry information has already been discarded. Designers who reach for invariance too early in a pipeline destroy exactly the structure that later equivariant stages would have exploited; naming the distinction is what lets them see the error. An image classifier wants its final label to be translation-invariant (a cat anywhere is still "cat"), but its intermediate feature maps should be translation-equivariant (the cat's features should move with the cat) so that spatial reasoning remains possible until the final pooling step collapses position away.
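The two defining equations can be tested side by side. In this sketch the group is assumed to be permutations of a length-6 vector, and the three maps are hypothetical examples picked to land in the three possible cases; "equivariant" here means equivariant with the same permutation action on the output.

```python
import numpy as np

rng = np.random.default_rng(9)
x = rng.normal(size=6)
perm = np.array([3, 0, 5, 1, 4, 2])    # a fixed non-identity permutation g, with g.x = x[perm]

def total(x):
    """Invariant: f(g.x) == f(x)."""
    return np.sum(x)

def square(x):
    """Equivariant: f(g.x) == g.f(x)."""
    return x ** 2

def position_weighted(x):
    """Neither: the result depends on where each entry sits."""
    return x * np.arange(len(x))

def classify(f):
    """Return (invariant, equivariant under the same permutation action on the output)."""
    fx, fgx = f(x), f(x[perm])
    invariant = bool(np.allclose(fgx, fx))
    equivariant = bool(np.ndim(fx) > 0 and np.allclose(fgx, fx[perm]))
    return invariant, equivariant

results = {f.__name__: classify(f) for f in (total, square, position_weighted)}
print(results)
```

`total` discards the arrangement entirely, `square` carries it through unchanged, and `position_weighted` scrambles it, which is the three-way distinction the paragraph above draws in words.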
Equivariance is also not Symmetry itself. A symmetry is a transformation (or a group of them) under which a single system or object is left unchanged — a property of one thing relative to a group action: the square is symmetric under 90-degree rotations because rotating it yields the same square. Equivariance, by contrast, is a property of a map between two systems, asserting a relationship between their respective group actions. Symmetry says "this object looks the same after I act on it"; equivariance says "this map respects the action on its source and its target, translating one into the other." The two concepts are intimately linked — equivariance is defined relative to a symmetry group, and the existence of symmetries is what makes equivariance a meaningful constraint — but they live at different levels. Symmetry is a property of an object and its group; equivariance is a property of a morphism connecting two objects each carrying a (possibly different) group action. One can have a richly symmetric object that no interesting map treats equivariantly, and one can have an equivariant map between objects whose individual symmetries are modest. Confusing the two leads to the error of thinking that building a "symmetric" component automatically yields an equivariant system, when in fact equivariance is a constraint on how components interact under the group, not merely on their individual invariances.
Finally, equivariance is distinct from Conjugate Variables, a pairing concept with which it shares only a superficial sense of "two linked quantities." Conjugate variables (position and momentum, time and energy, a function and its Fourier transform) are pairs of complementary descriptions related by a transform and typically bound by a trade-off such as an uncertainty relation: sharpening knowledge of one blurs the other. The relationship there is between two representations of the same system and the cost of specifying them jointly. Equivariance involves no such complementarity and no trade-off between paired observables; it constrains how a single map respects a single transformation group, coupling the action on the input to the action on the output. The two ideas can even co-occur — the Fourier transform is itself an equivariant intertwiner that converts the translation action into a phase-multiplication action, and position/momentum are conjugate under it — but the conjugacy is a fact about the dual descriptions, while the equivariance is a fact about the map relating their group actions. Treating equivariance as a kind of conjugacy would wrongly import a notion of mutual-exclusion or uncertainty that the commuting-square relation simply does not contain.
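The remark about the Fourier transform can be checked with the discrete Fourier transform. The sketch assumes NumPy's `np.fft.fft` sign convention (a negative exponent in the forward transform), under which a cyclic shift by s on the signal side corresponds to multiplication by exp(-2πi·k·s/n) on the frequency side.

```python
import numpy as np

rng = np.random.default_rng(10)
n, s = 16, 3
x = rng.normal(size=n)

k = np.arange(n)
phase = np.exp(-2j * np.pi * k * s / n)      # action of "shift by s" on the frequency side

shift_then_transform = np.fft.fft(np.roll(x, s))
transform_then_phase = phase * np.fft.fft(x)
intertwines = np.allclose(shift_then_transform, transform_then_phase)

# The magnitude spectrum discards the phase, so it is shift-invariant.
magnitude_invariant = np.allclose(np.abs(shift_then_transform), np.abs(np.fft.fft(x)))

print(intertwines, magnitude_invariant)
```

The input and output carry different actions of the same group (shifting versus phase multiplication), which is an instance of the point made earlier that the two actions need only correspond through the map, not coincide.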
Solution Archetypes
No catalogued solution archetypes reference this prime yet.
Notes
Equivariance and invariance are best understood not as competitors but as adjacent layers of a single design vocabulary: keep the symmetry tracked equivariantly through intermediate stages, then collapse it to an invariant at the point where the symmetry genuinely no longer matters to the output. Many architectural and theoretical mistakes trace to collapsing too early (destroying needed structure) or never collapsing at all (leaving nuisance symmetry variation in a quantity that should be symmetry-blind).
The span of equivariance is genuinely narrower than its abstraction would suggest. The formal signature f(g·x) = g·f(x) is maximally substrate-agnostic in vocabulary, yet every well-attested instance lives in the mathematics-physics-computation cluster: representation theory and category theory, covariant physical law, geometric deep learning, and signal processing. There is no clean biological, social, or cognitive instance in which a group action and a commuting map are both literally present, which is why the substrate-independence composite is held at 3 despite a perfect 5 on structural abstraction. Loose analogies ("the policy responds proportionally to the input") are not equivariance unless an actual transformation group and a genuine commuting relation can be exhibited.
A recurring subtlety is the distinction between exact and approximate equivariance. Continuous-group equivariance (rotations, the Lorentz group, time-shift on a continuum) is exact in the idealized setting but only approximate once spaces are discretized or truncated. The engineering literature treats the gap between exact and approximate equivariance as a first-class concern — steerable filters, equivariant interpolation, and symmetry-regularization losses all exist to manage it — and analysts should resist the temptation to treat the idealized commuting square as if it held exactly in a sampled implementation.
References
[1] Serre, J.-P. (1977). Linear Representations of Finite Groups (Graduate Texts in Mathematics, Vol. 42, L. L. Scott, Trans.). Springer-Verlag. Canonical treatment of group representations and intertwining (equivariant) linear maps; the orbit of a representative under a group action and Schur's lemma ground both the intertwiner notion and the orbit-determines-everything reasoning.
[2] Fulton, W., & Harris, J. (1991). Representation Theory: A First Course (Graduate Texts in Mathematics, Vol. 129). Springer-Verlag. Standard introduction to representations of finite groups and Lie groups/algebras; develops how a map couples two distinct group actions on source and target so that it commutes with both.
[3] Bronstein, M. M., Bruna, J., Cohen, T., & Veličković, P. (2021). Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478. Establishes equivariance/invariance under symmetry groups as the unifying organizing principle of modern neural architectures; shows that pooling an equivariant feature yields an invariant one (group averaging onto the trivial representation).
[4] Mac Lane, S. (1971). Categories for the Working Mathematician (Graduate Texts in Mathematics, Vol. 5; 2nd ed., 1998). Springer-Verlag. Standard reference. Precursor: Eilenberg, S., & Mac Lane, S. (1945). General theory of natural equivalences. Transactions of the American Mathematical Society, 58(2), 231–294. DOI 10.2307/1990284.
[5] Cohen, T. S., & Welling, M. (2016). Group equivariant convolutional networks. In Proceedings of the 33rd International Conference on Machine Learning (ICML), PMLR 48, 2990–2999. Derives group-equivariant convolutional networks directly from the commuting (equivariance) condition, generalizing translation-equivariant CNNs to rotations and reflections.
[6] Lyle, C., van der Wilk, M., Kwiatkowska, M., Gal, Y., & Bloem-Reddy, B. (2020). On the benefits of invariance in neural networks. arXiv preprint arXiv:2005.00178. Analyzes when building in invariance/equivariance helps; shows the generalization benefit accrues only when the assumed symmetry genuinely holds in the data.
[7] Weiler, M., & Cesa, G. (2019). General E(2)-equivariant steerable CNNs. In Advances in Neural Information Processing Systems 32 (NeurIPS), 14334–14345. Characterizes steerable filters achieving exact equivariance under the Euclidean group E(2) and its subgroups, making the exact-versus-approximate equivariance distinction explicit.
[8] Olver, P. J. (1986). Applications of Lie Groups to Differential Equations (Graduate Texts in Mathematics, Vol. 107). Springer-Verlag. Systematic theory of symmetry groups of differential equations; develops Noether's theorem linking continuous symmetries to conservation laws via covariance of the action functional, and the symmetry-reduction method of solving on a representative and transporting across the group.
[9] Weyl, H. (1952). Symmetry. Princeton University Press. Canonical expository treatment covering discrete and continuous symmetries. Technical Lie-group treatment: Weyl, H. (1931). Gruppentheorie und Quantenmechanik. Hirzel; English translation The Theory of Groups and Quantum Mechanics (Dover, 1950).
[10] Satorras, V. G., Hoogeboom, E., & Welling, M. (2021). E(n) equivariant graph neural networks. In Proceedings of the 38th International Conference on Machine Learning (ICML), PMLR 139, 9323–9332. Shows that E(n)-equivariant graph networks predict molecular and physical-system properties more sample-efficiently than unconstrained models by building rotation/translation equivariance into the architecture.
[11] Oppenheim, A. V., & Schafer, R. W. (1989). Discrete-Time Signal Processing. Prentice-Hall. Canonical signal-processing text; develops convolution and Fourier analysis around linear time-invariant (LTI) systems, the canonical translation-equivariant operators whose output is delayed by exactly the input delay.
[12] Cohen, T. S., Geiger, M., & Weiler, M. (2019). A general theory of equivariant CNNs on homogeneous spaces. In Advances in Neural Information Processing Systems 32 (NeurIPS), 9145–9156. General theory of equivariant maps between fields on homogeneous spaces; formalizes the layered relation between equivariant intermediate representations and invariant outputs.