Load Shedding
Intent
Load Shedding preserves critical system viability under overload by deliberately dropping, denying, deferring, or deprioritizing lower-priority load that cannot be safely handled.
The archetype is useful when a system cannot serve all demand without collapsing, corrupting important work, or sacrificing essential function. Instead of attempting to process everything, the system makes sacrifice explicit: some load is removed so the remaining load can be handled within viable bounds.
In compact form:
When total load exceeds safe capacity, deliberately shed lower-priority load to preserve critical function, accepting reduced completeness, fairness tension, and denied or lost work.
Primes
Composed of: Prioritization, Admission Control, Selective Rejection, Threshold, Resource Management, Graceful Degradation, Boundary, Observability
Related primes: Flow, Constraint, Threshold, Queueing, Resource Management, Trade-offs, Resilience, Fault Tolerance, Fail-Safe, Coupling, Observability, Scheduling
Structural Signature
This archetype is a strong candidate when the following conditions co-occur:
- A system receives, holds, transmits, or is responsible for a flow of requests, work, traffic, demand, exposure, transactions, obligations, or tasks.
- Total load can exceed the system's safe capacity.
- Attempting to serve all load would degrade critical function, create unbounded queues, collapse the system, or produce unacceptable harm.
- Some load is less critical, less urgent, less valuable, more deferrable, or safer to reject than other load.
- The system can distinguish load classes well enough to make explicit sacrifice decisions.
- Shed load can be handled predictably: dropped, denied, deferred, expired, disconnected, rerouted, or retried later.
Load Shedding is especially relevant when the question is no longer “How do we serve everything?” but “What must be sacrificed so the system remains viable?”
Intervention Signature
Selectively drop, deny, defer, disconnect, or deprioritize lower-priority load according to an explicit policy so critical capacity remains available.
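The intervention changes the system from indiscriminate service, in which all load is treated as equally serviceable until capacity is exhausted, to: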
load classified by priority or safety
-> excess or lower-priority load shed
-> critical capacity preserved
The key move is controlled sacrifice of load.
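For a software setting, the classify-then-shed flow can be sketched as an admission function. This is a minimal illustration, not a reference implementation: the `Priority` classes, the `Decision` type, the utilization signal, and both threshold values are assumptions introduced here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical load classes; a lower value means more critical.
    CRITICAL = 0
    NORMAL = 1
    OPTIONAL = 2


@dataclass(frozen=True)
class Decision:
    admitted: bool
    reason: str


def admit(priority: Priority, utilization: float,
          shed_threshold: float = 0.80,
          severe_threshold: float = 0.95) -> Decision:
    """Decide whether to admit one unit of load at the current utilization."""
    if utilization < shed_threshold:
        return Decision(True, "capacity available")
    # Past the first threshold OPTIONAL load is shed; past the second,
    # NORMAL load is shed as well and only CRITICAL load is admitted.
    cutoff = Priority.NORMAL if utilization < severe_threshold else Priority.CRITICAL
    if priority <= cutoff:
        return Decision(True, "protected class")
    return Decision(False, f"shed {priority.name}: utilization {utilization:.2f}")
```

Returning a reason with every decision keeps the sacrifice explicit: a caller can log it or pass it back to the requester instead of dropping the work silently.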
Causal Logic
Overloaded systems often fail because they continue treating all load as equally serviceable after capacity has become scarce. Queues grow without bound. Critical work waits behind noncritical work. Downstream components saturate. Operators lose visibility. Recovery headroom disappears. Eventually, the system may fail for everyone.
Load Shedding works by changing the allocation of scarce capacity.
- Overload becomes visible. The system detects that total load threatens safe operation.
- Load is classified. Work, requests, flows, or obligations are ranked by urgency, value, safety, fairness, or recoverability.
- Lower-priority load is sacrificed. Some load is dropped, denied, deferred, disconnected, expired, or deprioritized.
- Critical capacity is protected. Essential functions, high-priority users, safety-critical flows, or recovery mechanisms retain capacity.
- Accumulation remains bounded. The system avoids unbounded backlog or collapse through explicit removal of work.
- Recovery remains possible. By reducing total load, the system can stabilize and eventually re-admit normal demand.
The archetype converts indiscriminate overload into bounded, policy-governed loss.
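One way to picture bounded, policy-governed loss is a fixed-capacity queue that evicts its least critical entry when full. The sketch below is illustrative only: the class name, the integer priority convention (lower number means more critical), and the tie-breaking rule that sheds the newest item within a class are all assumptions.

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """Holds at most `capacity` items; when full, the least critical item is shed."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Entries are (-priority, -sequence, item), so the heap's smallest
        # entry is the least critical and, within a class, the newest.
        self._heap = []
        self._seq = itertools.count()
        self.shed_count = 0

    def offer(self, priority: int, item):
        """Add an item; return the item that was shed to make room, or None."""
        entry = (-priority, -next(self._seq), item)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        # Full: insert the newcomer, then remove whichever entry is least critical.
        shed = heapq.heappushpop(self._heap, entry)
        self.shed_count += 1
        return shed[2]

    def drain(self) -> list:
        """Remove all items, most critical first, oldest first within a class."""
        ordered = sorted(self._heap, reverse=True)
        self._heap = []
        return [item for _, _, item in ordered]
```

Because `offer` returns the shed item, the caller decides whether it is rejected, deferred, or retried, and `shed_count` gives a simple shed-rate signal for observability.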
What It Is Not
Load Shedding is not Rate Limiting. Rate Limiting governs how much or how often flow may be admitted, often before overload has fully accumulated. Load Shedding removes, denies, drops, or defers load when capacity is or will be exceeded.
Load Shedding is not Backpressure. Backpressure propagates downstream capacity signals upstream so producers slow or pause. Load Shedding sacrifices load locally or at an admission boundary when the system cannot or should not rely solely on upstream cooperation.
Load Shedding is not Buffering. Buffering holds excess flow temporarily. Load Shedding removes or deprioritizes load so it does not continue consuming scarce capacity. A buffer may eventually shed load if it fills.
Load Shedding is not Circuit Breaker. Circuit Breaker interrupts or meters flow at a boundary under active overload or cascade risk. Load Shedding is the broader intervention of sacrificing selected load to preserve capacity. Circuit Breaker may use load shedding as a mechanism or component.
Load Shedding is not Graceful Degradation. Graceful Degradation reduces service capability or quality to preserve core function. Load Shedding reduces the amount or class of load being served. They often combine, but the intervention logic differs.
Load Shedding is not arbitrary denial. A mature load shedding policy names what is shed, why, when, and how harm is bounded.
Load Shedding is not capacity expansion. It preserves viability by reducing load, not by increasing capacity.
Composition
Load Shedding is composed from several lower-level abstractions:
- Flow — Something must enter, accumulate, demand attention, consume resources, or require processing.
- Constraint — Capacity is limited and may be exceeded.
- Threshold — A stress, utilization, queue, latency, risk, or demand level determines when shedding begins.
- Prioritization — Load must be classified by criticality, urgency, value, fairness, or safety.
- Selective rejection — The system must be able to drop, deny, expire, defer, disconnect, or deprioritize load.
- Resource management — Scarce capacity is reserved for protected functions.
- Boundary — Shedding often occurs at an admission point or service boundary.
- Observability — Operators or mechanisms must see overload, shed rate, and remaining capacity.
The composition matters. Without prioritization, shedding is arbitrary. Without explicit excess handling, work may be silently lost. Without observability, the system may shed too late, too much, or the wrong load. Without recovery rules, shedding can become permanent.
Mechanism Families
Common mechanism families include:
- Software request shedding — Lower-priority requests are rejected or degraded during overload to preserve core service.
- Power grid load shedding — Electrical load is disconnected or curtailed to preserve grid stability.
- Traffic or network packet dropping — Packets, sessions, or flows are dropped under congestion or policy constraints.
- Emergency service triage — Scarce care, attention, or response capacity is directed toward higher-priority cases.
- Organizational work intake deferral — Teams pause or defer noncritical work to preserve capacity for essential operations.
- Cloud or compute job preemption — Lower-priority jobs are stopped, delayed, or evicted to preserve capacity for higher-priority workloads.
- Queue discard or deadline expiration — Work that waits too long or exceeds deadline becomes invalid and is removed.
- Demand response and consumption curtailment — Lower-priority consumption is reduced during scarcity.
- Crisis-mode service reduction — Public or institutional services suspend lower-priority functions under emergency capacity limits.
These mechanisms differ by domain, but they preserve the same intervention logic: sacrifice selected load to preserve critical capacity.
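As a small software illustration of one family, queue discard or deadline expiration, the following sketch removes work that has waited past an age limit. The function name, the `(enqueued_at, item)` entry layout, and the seconds-based clock are assumptions made for the example.

```python
from collections import deque


def expire_stale(queue: deque, now: float, max_age: float) -> list:
    """Remove and return items that have waited longer than max_age seconds.

    Assumes the queue holds (enqueued_at, item) pairs in arrival order,
    so the oldest entry is always at the left end.
    """
    expired = []
    while queue and now - queue[0][0] > max_age:
        _, item = queue.popleft()
        expired.append(item)
    return expired
```

The expired items are returned rather than discarded inside the function, so the caller can notify the submitter or record the loss.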
Parameter Dimensions
Concrete mechanisms usually require tuning along dimensions such as:
- Shedding trigger threshold — What overload signal activates shedding?
- Shed fraction — How much load is removed?
- Priority class order — Which load is protected and which is sacrificed first?
- Drop vs. defer policy — Is load discarded permanently, delayed, retried, rerouted, or queued elsewhere?
- Queue age or deadline limit — When does waiting work expire?
- Critical capacity reserve — How much capacity is protected for essential work?
- Fairness allocation rule — How is sacrifice distributed across users, regions, classes, or sources?
- Notification policy — How are affected parties told that load was shed?
- Retry or reentry policy — When and how can shed work return?
- Shedding cadence — Is shedding continuous, periodic, event-triggered, or staged?
- Hysteresis band — What prevents repeated shedding and re-admission oscillation?
- Recovery threshold — When does normal admission resume?
- Maximum shed duration — How long may a class remain shed?
These are parameter dimensions, not the archetype itself.
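To make three of these dimensions concrete (trigger threshold, recovery threshold, and the hysteresis band between them), here is a minimal controller sketch. The class name and the choice of a single scalar overload signal are assumptions; a real system would likely combine several signals.

```python
class SheddingController:
    """Turns shedding on at `trigger` and off only once the signal falls to `recovery`."""

    def __init__(self, trigger: float, recovery: float):
        if recovery >= trigger:
            raise ValueError("recovery threshold must sit below the trigger threshold")
        self.trigger = trigger
        self.recovery = recovery
        self.shedding = False

    def update(self, signal: float) -> bool:
        """Feed the latest overload signal; return whether shedding is active."""
        if not self.shedding and signal >= self.trigger:
            self.shedding = True
        elif self.shedding and signal <= self.recovery:
            self.shedding = False
        # Inside the band (recovery < signal < trigger) the previous state is kept.
        return self.shedding
```

With `trigger=0.9` and `recovery=0.7`, a signal hovering around 0.85 does not flip the state back and forth, which is the oscillation the hysteresis band is meant to prevent.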
Invariants to Preserve
Load Shedding should preserve explicit invariants:
- Critical load remains serviceable — The protected function or class actually retains capacity.
- Shed load is handled explicitly — Removed work is dropped, rejected, expired, deferred, or retried according to policy.
- Accepted work is not silently lost — Work that the system claims to accept must be completed, cleanly rejected, or safely deferred.
- Shedding policy is auditable — Operators or stakeholders can understand what was shed and why.
- Safety and integrity are not sacrificed — Shedding should not corrupt state, hide acknowledged work, or endanger protected participants.
- Queue growth remains bounded — The system must not merely move overload into an invisible backlog.
- Recovery remains possible — The system can return to normal operation when overload subsides.
- Priority rules do not create unacceptable harm — Sacrifice must remain within ethical, legal, and operational constraints.
If these invariants cannot be preserved, load shedding may become arbitrary harm rather than controlled sacrifice.
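The first invariant, that critical load remains serviceable, can be enforced mechanically with a capacity reserve. The sketch below is one hypothetical way to do it: a slot counter in which non-critical work may never occupy the reserved slots. The names are illustrative, and the class is deliberately not thread-safe; concurrent use would need a lock around the counter.

```python
class ReservedCapacity:
    """Slot counter that keeps `reserved` slots usable only by critical work."""

    def __init__(self, total: int, reserved: int):
        if not 0 <= reserved <= total:
            raise ValueError("reserved must be between 0 and total")
        self.total = total
        self.reserved = reserved
        self.in_use = 0

    def try_acquire(self, critical: bool) -> bool:
        """Take a slot if this class of work is allowed one; otherwise shed."""
        limit = self.total if critical else self.total - self.reserved
        if self.in_use < limit:
            self.in_use += 1
            return True
        return False

    def release(self) -> None:
        """Return a slot when a unit of work finishes."""
        if self.in_use > 0:
            self.in_use -= 1
```

Since non-critical work is capped below the total, it cannot consume the protected headroom, which also guards against the priority inversion failure mode.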
Tradeoffs
Load Shedding accepts explicit loss or denial to prevent broader collapse.
Typical tradeoffs include:
- Some work is denied, dropped, or lost rather than completed.
- Completeness declines because the system no longer serves everything.
- Fairness tensions increase because some load classes are sacrificed before others.
- User or stakeholder experience degrades for affected parties.
- Low-priority value may still be real value and may be lost.
- Policy complexity rises because load classification and sacrifice rules must be maintained.
- Reputation or trust may suffer if shedding feels arbitrary or opaque.
- Misclassification can be costly when important load is incorrectly shed.
The archetype is therefore a survival-oriented intervention, not an optimization routine.
Contraindications
Load Shedding is a poor fit when load cannot be safely or legitimately sacrificed.
Use cautiously or avoid when:
- all load is equally critical,
- load priority cannot be determined,
- dropping or deferring load is more damaging than overload,
- rejected work cannot be cleanly handled,
- shedding would violate safety, legal, ethical, or contractual constraints,
- overload is caused by correctness, authorization, or data-integrity failure rather than excess load,
- shedding would mask a need for capacity expansion or structural redesign,
- actors can game priority labels to avoid being shed,
- the system cannot observe whether shedding is preserving critical function.
In such cases, backpressure, buffering, graceful degradation, failover, capacity expansion, repair, or redesign may be more appropriate.
Failure Modes
Common failure modes include:
- Arbitrary shedding — Load is sacrificed without explicit priority or fairness policy.
- Wrong load shed — Critical work is dropped while less important work continues.
- Priority inversion — Lower-priority work consumes protected capacity.
- Silent drop — Work disappears without acknowledgment or safe handling.
- Starvation — Some class of work is repeatedly shed and never served.
- Fairness collapse — Sacrifice falls disproportionately on a group, region, user class, or component.
- Hidden backlog — Shed or deferred work accumulates elsewhere.
- Oscillatory shedding — The system repeatedly sheds and re-admits load near a threshold.
- Reputational damage — Stakeholders lose trust because denial appears arbitrary.
- Shedding becomes normalized — Temporary sacrifice becomes ordinary operating practice.
- Gaming priority labels — Actors misclassify their work to avoid being shed.
- Collapse despite shedding — The policy sheds too little, too late, or the wrong load.
These failure modes should be treated as part of the archetype's design space.
Worked Example
A web service receives a sudden surge of requests during a major event. The service includes checkout, account login, product recommendations, analytics writes, and low-priority background refreshes. Under the surge, database latency rises and queues grow. If the system attempts to process every request and background task, checkout will fail for everyone.
The team implements Load Shedding.
- Checkout and login are classified as critical.
- Product recommendations are treated as optional.
- Analytics writes are buffered briefly and then dropped if the backlog grows too large.
- Background refreshes are suspended.
- Low-priority API requests receive explicit rejection responses.
- The system tracks shed rate, queue depth, latency, and checkout success.
- Once latency returns below the recovery threshold, lower-priority work is gradually re-admitted.
The system sacrifices completeness and some user experience. Recommendations may be missing. Analytics data may be incomplete. Some low-priority requests are denied. But the essential function of the service remains available.
The key move is not simply reducing load. It is choosing which load to sacrifice so critical function survives.
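The policy in this example could be expressed as a small dispatch table. Everything named below is hypothetical: the route names, the class labels, the backlog limit, and the choice to treat unknown routes as low priority are assumptions for illustration, and gradual re-admission is left out for brevity.

```python
# Hypothetical mapping from route to load class for the service described above.
ROUTE_CLASS = {
    "checkout": "critical",
    "login": "critical",
    "recommendations": "optional",
    "analytics_write": "bufferable",
    "background_refresh": "background",
    "bulk_api": "low",
}


def handle(route: str, overloaded: bool, analytics_backlog: int,
           max_backlog: int = 1000) -> str:
    """Return the action to take for a request under the shedding policy."""
    # Assumption: routes missing from the table are treated as low priority.
    load_class = ROUTE_CLASS.get(route, "low")
    if not overloaded or load_class == "critical":
        return "serve"
    if load_class == "optional":
        return "omit"      # the page renders without recommendations
    if load_class == "bufferable":
        # Buffer analytics briefly, then drop once the backlog grows too large.
        return "buffer" if analytics_backlog < max_backlog else "drop"
    if load_class == "background":
        return "suspend"
    return "reject"        # explicit rejection response to the caller
```

Each branch corresponds to one line of the policy list, so the code can be audited against the stated policy.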
Cross-Domain Instances
- Software and web services — Noncritical requests, background jobs, or optional features are dropped or denied under overload to protect core paths.
- Power grids — Portions of demand are disconnected or curtailed to prevent broader grid collapse.
- Networking and packet handling — Packets or flows are dropped under congestion to preserve network viability or priority traffic.
- Emergency medicine and triage — Scarce care capacity is allocated by urgency or survivability when demand exceeds available resources.
- Organizational work management — Teams defer or cancel lower-priority work during overload to preserve essential operations.
- Cloud and compute scheduling — Low-priority jobs are preempted, evicted, or delayed so critical workloads continue.
- Transportation and public services — Lower-priority routes, services, or operations may be suspended during emergency capacity constraints.
- Demand response and resource systems — Noncritical consumption is curtailed during scarcity to preserve critical supply or infrastructure stability.
These examples are structurally related because each explicitly sacrifices some load to preserve more critical function under capacity constraint.
Notes
Load Shedding should be reviewed alongside Rate Limiting, Backpressure, Buffering, Circuit Breaker, Graceful Degradation, Bulkhead Isolation, Failover, and Triage.
The main conceptual risk is collapse into nearby concepts:
- If the entry emphasizes governing admission rate, it becomes Rate Limiting.
- If the entry emphasizes upstream capacity signals changing producer behavior, it becomes Backpressure.
- If the entry emphasizes temporary holding, it becomes Buffering.
- If the entry emphasizes dynamic boundary interruption under cascade risk, it becomes Circuit Breaker.
- If the entry emphasizes reducing feature quality or capability, it becomes Graceful Degradation.
- If the entry emphasizes sorting scarce care or response by urgency, it may overlap with Triage as a mechanism family or neighboring archetype.
- If the entry lacks explicit policy, it becomes arbitrary denial rather than Load Shedding.
The current entry uses selective_rejection, admission_control, and prioritization as solution-side labels. These may need later normalization as lower-level archetypal components, prime abstractions, mechanisms, or informal component labels.