Queueing¶
Core Idea¶
Queueing is the structured accumulation of work items (requests, customers, packets, cars, patients, jobs) awaiting service at a resource with finite capacity, together with the rules (queue discipline: FIFO, LIFO, priority, random) and parameters (arrival process, service process, number of servers, buffer size) that determine how items wait, how long, and with what predictability — a ubiquitous phenomenon whose mathematical analysis (queueing theory) gives quantitative tools for predicting wait times, utilization, throughput, and loss under different workloads. The essential commitment is that whenever demand is stochastic and service capacity is finite, waiting occurs; that the shape and size of the queue depend on arrival process, service process, and queue discipline; that Little's law (L = λW — average number in system equals arrival rate times average time in system) provides a universal relationship independent of discipline; and that approaching 100% utilization causes wait times to grow without bound (queue explosion). Every queueing articulation specifies (1) the arrival process — deterministic, Poisson (M), general (G), bursty, with rate λ; (2) the service process — deterministic (D), exponential (M), general (G), with rate μ per server; (3) the number of servers and buffer capacity — M/M/1, M/M/c, M/M/c/K, M/G/1, G/G/1; and (4) the queue discipline — FIFO, LIFO, SJF, priority, processor-sharing, fair queueing. The field has foundations in Erlang's telephone-traffic work (1909), Kendall's notation (A/B/c/K/N/D; 1953), Little's law (1961), Jackson networks (1957), and extensive applications in computing, telecommunications, and service operations.
Structural Signature¶
A queueing system in Kendall's notation A/B/c/K with arrival process A, service distribution B, c servers, and buffer capacity K. Key metrics: utilization ρ = λ / (cμ), average number in system L, average time in system W, loss probability (in finite-buffer systems), server idle time. For M/M/1 (Poisson arrivals, exponential service, single server): L = ρ / (1 − ρ), W = 1 / (μ − λ); these both diverge as ρ → 1. Little's law (L = λW and L_q = λW_q) applies to any stable queueing system regardless of arrival / service distributions. Jackson networks extend analysis to interconnected queues; queueing networks are the foundational analytical tool for performance modeling of computer systems, telecommunications, call centers, and manufacturing systems. Simulation (discrete-event, Monte Carlo) extends analysis where analytic solutions are intractable.
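As a minimal sketch of the M/M/1 formulas above, the following Python computes utilization, L, and W and checks them against Little's law. The helper name `mm1_metrics` and the example rates (8 arrivals and 10 service completions per unit time) are illustrative assumptions, not part of the entry.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state M/M/1 metrics (hypothetical helper; rates per unit time)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # utilization with a single server
    if rho >= 1:
        raise ValueError("unstable: arrival_rate must be below service_rate")
    mean_in_system = rho / (1 - rho)                       # L
    mean_time_in_system = 1 / (service_rate - arrival_rate)  # W
    mean_wait_in_queue = mean_time_in_system - 1 / service_rate  # W_q
    mean_in_queue = arrival_rate * mean_wait_in_queue      # L_q = lambda * W_q
    return {
        "rho": rho,
        "L": mean_in_system,
        "W": mean_time_in_system,
        "Lq": mean_in_queue,
        "Wq": mean_wait_in_queue,
    }


# Illustrative rates only: 8 arrivals and 10 completions per unit time.
m = mm1_metrics(8.0, 10.0)
print(m)
```

With these example rates the closed forms and Little's law agree: L computed from ρ / (1 − ρ) matches λ × W.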
What It Is Not¶
Common misclassification: [1] Treating queueing as only the FIFO line. FIFO is one discipline among many (LIFO, priority, SJF, processor-sharing, EDF, fair queueing, weighted fair queueing). The choice of discipline substantially affects fairness, predictability, and mean wait time. Erlang's foundational work (1909) established the theoretical basis for analyzing discipline effects; later work by Kendall (1953) formalized the notation that allows comparison across disparate systems.
Not identical to scheduling: [2] queueing characterizes the formation, dynamics, and statistical properties of the waiting process; scheduling is the discipline-level decision of what to serve next from the queue. Scheduling policies operate within a queueing framework; queueing theory provides analytical tools for predicting scheduling performance. Little's law (1961), the cornerstone relationship L = λW, holds under every discipline: whichever discipline is chosen, the average number in system and the average time in system stay tied together by λ. It does not make L itself invariant to the discipline. Size-aware disciplines such as SJF lower the mean wait, whereas non-preemptive disciplines that ignore job size (FIFO, LIFO, random order) share the same mean and differ only in variance and tail behavior. See scheduling.
Not free of instability near saturation: [3] as utilization ρ → 1, queue length and wait time grow without bound (M/M/1: W = 1 / (μ − λ) → ∞). High utilization is therefore not free — it causes queueing latency. Real systems target utilization well below 100% to maintain responsive wait times; the "75% rule" (target utilization ≤ 75%) is a common heuristic. The non-linear scaling near saturation is one of the most consequential insights of queueing theory and explains why capacity planning cannot rely on average-case analysis alone.
Not always Markovian: [4] the M/M/1 and similar closed-form solutions assume Poisson arrivals and exponential service (memoryless). Real workloads often have bursty arrivals, heavy-tailed service times (web traffic, social media), and correlated events. G/G/1 and simulation are needed; closed-form intuition can mislead. Kingman's heavy-traffic approximation (1961) provides bounds for GI/G/1 when exact solutions are unavailable; Whitt's stochastic-process limits (2002) extend this machinery to complex networks.
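When the distributions are not memoryless, one simple simulation route for a single FIFO server is the Lindley recursion, W_{n+1} = max(0, W_n + S_n − A_{n+1}), where S_n is the n-th service time and A_{n+1} the next inter-arrival time. The sketch below is an assumption-laden illustration: the function name, the warm-up length, the seed, and the M/M/1 sanity check (λ = 0.5, μ = 1) are all chosen here for demonstration.

```python
import random
from typing import Callable


def simulate_gg1_mean_wait(
    sample_interarrival: Callable[[], float],
    sample_service: Callable[[], float],
    customers: int,
    warmup: int = 1000,
) -> float:
    """Estimate mean wait in queue for a single FIFO server (hypothetical helper)."""
    if customers <= warmup:
        raise ValueError("customers must exceed warmup")
    wait = 0.0      # waiting time of the current customer
    total = 0.0
    counted = 0
    for n in range(customers):
        if n >= warmup:          # discard the initial transient
            total += wait
            counted += 1
        # Lindley recursion: the next customer's wait
        wait = max(0.0, wait + sample_service() - sample_interarrival())
    return total / counted


# Sanity check against M/M/1, where W_q = rho / (mu - lambda) = 1.0 here.
rng = random.Random(42)  # fixed seed so the run is repeatable
estimate = simulate_gg1_mean_wait(
    lambda: rng.expovariate(0.5),  # inter-arrival times, rate lambda = 0.5
    lambda: rng.expovariate(1.0),  # service times, rate mu = 1.0
    customers=200_000,
)
print(f"simulated mean wait: {estimate:.3f} (M/M/1 theory: 1.000)")
```

Swapping the two sampler functions for heavy-tailed or bursty generators is the point of the exercise; the M/M/1 case is only there to confirm the simulator agrees with a known closed form.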
Not free of loss / blocking: [5] finite buffers cause arrivals to be rejected (blocked calls in telephony, dropped packets in networks, turned-away customers in retail). The Erlang B formula (loss probability as a function of offered load, introduced 1917) and the Erlang C formula (probability that an arrival must wait) are classical tools for sizing buffers and capacity. Both derive from the steady-state balance equations of Markovian multi-server systems: Erlang B for M/M/c/c, where blocked arrivals are rejected, and Erlang C for M/M/c, where they wait.
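A minimal sketch of the Erlang B blocking probability, using the standard recurrence B(0) = 1, B(k) = A·B(k−1) / (k + A·B(k−1)), which avoids large factorials. The function name `erlang_b` and the example loads are assumptions for illustration.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Blocking probability for M/M/c/c with offered load A = lambda / mu."""
    if servers < 0 or offered_load < 0:
        raise ValueError("servers and offered_load must be non-negative")
    blocking = 1.0  # B(0): with no servers every arrival is blocked
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


# Illustrative values: 1 erlang of offered load on 1, 2, and 5 servers.
for c in (1, 2, 5):
    print(c, erlang_b(c, 1.0))
```

Adding servers at fixed offered load drives the blocking probability down quickly, which is the sizing question the formula answers.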
Not always visible / explicit: [6] many systems have implicit queues — kernel receive buffers, application mailboxes, database connection pools, user-perceived latency. Performance debugging often involves identifying hidden queues. Observing queues requires instrumentation. In distributed systems with feedback loops and cascading queueing (Jackson networks, 1957), system behavior emerges from the superposition of interconnected queues, making diagnosis non-obvious even with instrumentation.
Not only quantitative: [7] queueing affects user experience (fairness perception, estimated wait times, anxiety from uncertainty); retail and service design considers psychological queueing (visible vs hidden queues, virtual queues, estimated wait signage) alongside quantitative optimization. Virtual queueing systems redistribute waiting across time and space without changing throughput (consistent with Little's law: with λ and W unchanged, L is unchanged), but dramatically alter customer experience.
Not linearly additive across systems: [8] in queueing networks, effects compound non-linearly. Cascading bottlenecks, feedback loops, and correlated failures produce emergent behavior poorly predicted by analyzing isolated queues. Jackson networks (1957) and Burke's theorem (1956, output theorem for M/M/c) provide analytical tools for networks, but the assumption of product-form solutions breaks under general inter-arrival and service distributions; simulation is often the only recourse.
Cross-references: see scheduling (the discipline-selection decision applied within queueing); see resource_management (the framework within which queueing discipline and capacity are set); see throughput (the primary performance target); see little_s_law (the fundamental queueing identity); see bottleneck (the root cause of queue buildup).
Broad Use¶
Queueing appears in telecommunications (the founding application: Erlang's circuit analysis; modern cellular networks, VoIP), in computer systems (CPU queues, disk IO queues, network packet queues, database connection pools, RPC request queues), in operating systems (process scheduling queues, message queues, interrupt queues), in distributed systems (message brokers: Kafka, RabbitMQ, SQS; workflow queues), in web services (HTTP request queues, load balancer backlogs, asynchronous job queues), in call centers and service operations (Erlang C for staffing, abandonment rates), in healthcare (emergency department triage, operating-room backlog, appointment scheduling), in retail (checkout lines, virtual queues at theme parks), in manufacturing (work-in-progress between stations, job-shop queues), in transportation (traffic at toll booths, airport security, boarding), in logistics (container yards, port berthing, warehouse picking), in graph algorithms (BFS queue, priority queue in Dijkstra), and in data structures (the queue ADT itself: FIFO, priority, deque).
Clarity¶
Queueing clarifies why high utilization causes wait times (the non-linear relationship between ρ and W), why targeting 100% utilization is self-defeating, why variability (bursty arrivals, variable service) amplifies queueing delays, why queueing discipline matters for fairness and mean latency, why hidden queues often contain the real performance problem, and why Little's law applies universally to stable queueing systems.
Manages Complexity¶
The construct manages the complexity of service under uncertain demand by providing formal models (M/M/1, M/M/c, G/G/1, Jackson networks) with analytical or simulation-based solutions, a standard notation (Kendall), and universal identities (Little's law) that hold across disciplines and substrates. Performance engineers, operations researchers, and service designers share a common analytical framework.
Abstract Reasoning¶
Queueing reasoning proceeds by identifying the arrival and service processes (rates and variability), the number of servers and buffer capacity, and the discipline; computing utilization; predicting wait time and queue length (analytical, simulation, or empirical); analyzing sensitivity to load (what happens if λ doubles?); and designing interventions (add capacity, change discipline, reduce service time variability, smooth arrivals). It supports capacity planning, service-level-agreement design, bottleneck analysis, and infrastructure sizing.
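The "what happens if λ doubles?" step can be sketched as a small sizing calculation: pick the smallest server count c that keeps ρ = λ / (cμ) at or below a target. The helper name, the example rates, and the 0.75 target are assumptions used only for illustration.

```python
import math


def servers_for_target_utilization(
    arrival_rate: float, service_rate: float, target_utilization: float
) -> int:
    """Smallest c with arrival_rate / (c * service_rate) <= target (hypothetical helper)."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    offered_load = arrival_rate / service_rate
    # Small epsilon guards against floating-point noise pushing ceil up by one.
    return max(1, math.ceil(offered_load / target_utilization - 1e-9))


# Illustrative sensitivity check: baseline load versus doubled arrival rate.
baseline = servers_for_target_utilization(30.0, 1.0, 0.75)
doubled = servers_for_target_utilization(60.0, 1.0, 0.75)
print(baseline, doubled)
```

Utilization alone is only a first cut; wait-time targets need a model such as Erlang C or a simulation on top of it.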
Knowledge Transfer¶
| Role | Call-center form | Computer-systems form | Retail-checkout form | Emergency-department form |
|---|---|---|---|---|
| Arrivals | Caller arrival process | Request rate (arrivals/sec) | Customer arrivals | Patient arrivals |
| Service | Agent handle time | CPU / IO / network service time | Cashier checkout time | Triage + treatment time |
| Servers | Number of agents | Number of threads / cores / instances | Cashier stations | ED bays, beds |
| Discipline | FIFO, priority (skill-based routing) | FIFO, priority, SJF, processor-sharing | FIFO | Triage (priority by severity) |
| Key metric | Average speed of answer, abandonment | p99 latency, throughput | Average wait time, line length | Door-to-doctor time, LWBS rate |
An operations researcher's queueing reasoning transfers to call centers, computer systems, retail, and emergency departments with reinterpretation of arrival / service / discipline. The structural core is stochastic arrival meeting finite service capacity; what varies is the substrate, the discipline, and the target metric.
Example¶
Formal case — Erlang C formula for call-center staffing: A call center receives calls at Poisson rate λ; each call requires exponential service time with rate μ; there are c agents. The probability that a caller must wait (Erlang C formula) is C(c, λ/μ) = (A^c / c!) / ((A^c / c!) + (1 − A/c) × Σ_{k=0}^{c−1} A^k / k!), where A = λ / μ is the offered load. From this, average speed of answer and service-level metrics (percentage of calls answered within T seconds) follow. The formula lets call-center managers answer questions like "given 10 calls/minute and 3-minute average handle time, how many agents are needed for 80% of calls to be answered within 20 seconds?" This is a canonical formal instance in operations research and service management, used daily across telecommunications, banking, airlines, and healthcare.
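The staffing question above can be sketched in Python as follows. The code assumes an M/M/c FIFO queue with no abandonment, for which the fraction of calls answered within T is 1 − C(c, A)·exp(−(cμ − λ)T); the function names are hypothetical, and rates are expressed per minute (so the 20-second threshold becomes 1/3 minute).

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability an arrival must wait in M/M/c, with A = lambda / mu."""
    if offered_load >= servers:
        return 1.0  # unstable regime: every caller waits
    blocking = 1.0  # Erlang B recurrence, then convert to Erlang C
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return servers * blocking / (servers - offered_load * (1 - blocking))


def service_level(servers: int, arrival_rate: float, service_rate: float,
                  threshold: float) -> float:
    """Fraction of calls answered within `threshold` (same time unit as the rates)."""
    offered_load = arrival_rate / service_rate
    if offered_load >= servers:
        return 0.0
    p_wait = erlang_c(servers, offered_load)
    return 1 - p_wait * math.exp(-(servers * service_rate - arrival_rate) * threshold)


def min_agents(arrival_rate: float, service_rate: float,
               threshold: float, target: float) -> int:
    """Smallest agent count whose service level meets `target`."""
    servers = max(1, math.floor(arrival_rate / service_rate) + 1)
    while service_level(servers, arrival_rate, service_rate, threshold) < target:
        servers += 1
    return servers


# 10 calls/minute, 3-minute handle time (mu = 1/3 per minute),
# 80% answered within 20 seconds (1/3 minute).
agents = min_agents(10.0, 1 / 3, 1 / 3, 0.80)
print("agents needed:", agents)
```

The search starts just above the offered load of 30 erlangs, since any smaller agent count leaves the queue unstable.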
Structurally-faithful non-formal case — modern amusement park virtual queues (Disney Genie+, Lightning Lane): Traditional theme-park rides had physical FIFO queues; popular rides had multi-hour waits. Disney introduced virtual queueing (FastPass, then Genie+, then Lightning Lane) whereby patrons reserve a return-window rather than standing in line. From a queueing-theory view: same arrival process, same service rate, same number of servers (ride capacity), but queue discipline changes from pure-FIFO-by-arrival to a hybrid (reservation window + shorter physical queue at return). Effects: total wait time per patron roughly the same (conservation under Little's law), but experienced wait moves out of the physical line into parallel park activities. This illustrates a policy-design point: queueing discipline redistributes waiting across visible / invisible and in-line / out-of-line slots without changing the fundamental throughput. The structural match is real: arrivals (park visitors), service (rides), servers (ride capacity), discipline (pure FIFO vs virtual queue), throughput preserved, experience transformed.
Structural Tensions and Failure Modes¶
T1: Utilization Approaching 1 Causes Queue Explosion: [9] Wait time W grows as 1 / (1 − ρ) in M/M/1; the non-linearity is severe (ρ = 0.9 gives 10× longer wait than ρ = 0). This fundamental instability near saturation was established by Erlang's early telephone-traffic analysis (1917) and formalized in steady-state probability equations. Systems targeting high utilization for cost reasons become fragile. Failure mode: an innocent 10% traffic increase on a 90%-utilized system pushes ρ to 0.99 and multiplies M/M/1 wait time roughly tenfold (even a 5% increase nearly doubles it); under bursty arrivals, queue length spikes; SLO violations follow; scaling up is the standard remediation but has cost, and systems without autoscaling collapse under load. Network effects compound: if the overloaded system feeds into downstream queues, cascading congestion propagates.
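To make the sensitivity near saturation concrete, the sketch below evaluates the M/M/1 multiplier 1 / (1 − ρ) and the factor by which wait time grows after a relative traffic increase. Both function names and the sample utilizations are illustrative assumptions.

```python
def wait_multiplier(rho: float) -> float:
    """M/M/1 time in system relative to the bare service time: 1 / (1 - rho)."""
    if not 0 <= rho < 1:
        raise ValueError("rho must be in [0, 1)")
    return 1 / (1 - rho)


def growth_after_increase(rho: float, relative_increase: float) -> float:
    """Factor by which M/M/1 wait grows when traffic rises by relative_increase."""
    new_rho = rho * (1 + relative_increase)
    if new_rho >= 1:
        return float("inf")  # the queue is no longer stable
    return wait_multiplier(new_rho) / wait_multiplier(rho)


for rho in (0.5, 0.75, 0.9, 0.95, 0.99):
    print(f"rho={rho:.2f}  wait multiplier={wait_multiplier(rho):.1f}")

print("5% more traffic at rho=0.9:", growth_after_increase(0.9, 0.05))
print("10% more traffic at rho=0.9:", growth_after_increase(0.9, 0.10))
```

The same relative traffic increase costs far more at high utilization than at moderate utilization, which is why headroom matters more than the average load suggests.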
T2: Variability Amplifies Queueing Delays: [10] Service-time variance and arrival variance both increase queue length. The Pollaczek-Khinchine formula (Pollaczek, 1930; Khinchine, 1932) shows that for M/G/1, L_q = ρ² (1 + C_s²) / (2(1 − ρ)), where C_s is the service-time coefficient of variation. Heavy-tailed service times (common in web, LLMs) amplify variance catastrophically. At fixed ρ, doubling C_s multiplies queue length by (1 + 4C_s²) / (1 + C_s²): 2.5× when going from C_s = 1 to C_s = 2, approaching 4× only when C_s ≫ 1. Failure mode: systems designed with mean-service-time assumptions collapse under realistic variance; tail latency (p99, p999) is dominated by variance contributions invisible in mean-response-time metrics; real-world workloads often violate M/M/1 assumptions severely. Coefficient of variation > 1 (super-Poisson variability) is the norm in production systems.
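A minimal sketch of the Pollaczek-Khinchine mean queue length for M/G/1, L_q = ρ² (1 + C_s²) / (2(1 − ρ)). The function name and the sample values of ρ and C_s are assumptions chosen for illustration.

```python
def mg1_mean_queue_length(rho: float, service_cv: float) -> float:
    """Pollaczek-Khinchine mean number waiting in an M/G/1 queue."""
    if not 0 <= rho < 1:
        raise ValueError("rho must be in [0, 1)")
    if service_cv < 0:
        raise ValueError("service_cv must be non-negative")
    return rho ** 2 * (1 + service_cv ** 2) / (2 * (1 - rho))


# Same utilization, increasing service-time variability.
for cv in (0.0, 1.0, 2.0, 4.0):
    print(f"C_s={cv:.1f}  L_q={mg1_mean_queue_length(0.8, cv):.2f}")
```

The factor (1 + C_s²) carries all of the variability: deterministic service (C_s = 0) gives half the queue of exponential service (C_s = 1) at the same utilization.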
T3: Hidden Queues Dominate Performance Problems: [11] Real systems have many implicit queues (OS receive buffers, connection pools, thread pools, send buffers, kernel run queues, driver queues) that are not explicitly named or monitored. Users see latency without obvious cause. Failure mode: performance debugging focuses on CPU profiling when the real issue is queueing upstream or downstream; instrumentation gaps hide the actual bottleneck; latency SLOs are missed because engineering effort is misdirected. The architectural invisibility of queues means that naive capacity planning (e.g., "add more servers") fails to reduce latency if the bottleneck is in an unobserved queue in the network path or application stack.
T4: Queueing Discipline Has Fairness and Psychological Consequences: [12] FIFO is "fair" in an arrival-order sense but may be unfair by service-need (a long service blocks many short ones); SJF (Shortest Job First) minimizes mean wait but can starve long jobs; priority can starve low-priority work; virtual queues change perception without changing throughput (throughput is set by arrival rate and capacity, not by where the waiting happens). Failure mode: discipline chosen for one axis (mean latency, throughput, predictability) silently degrades another (fairness, experience, trust); stakeholders complain without clear technical grounds; remediation requires understanding the full trade-off space. Processor-sharing and fair-queueing disciplines mitigate starvation but add overhead and latency for average jobs.
T5: Correlated Arrivals and Service Times Break Markovian Analysis: [13] The assumption of independent arrivals and service times (fundamental to M/M/1, M/G/1) is violated in real traffic. TCP congestion control creates correlated bursts; user behavior is self-similar (Zipf-like); database queries have feedback loops. Even GI/G/1 (general independent arrivals, general service, one server), which still assumes independence, has no general closed-form solution; Kingman's heavy-traffic approximation (1961) provides an approximate bound on mean wait, but simulation is needed for accuracy. Failure mode: analytical predictions derived from Markovian assumptions differ by 10–100× from observed latency; engineers distrust queueing theory and make ad-hoc decisions; misalignment between model and reality persists.
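For reference, Kingman's approximation is commonly written as W_q ≈ (ρ / (1 − ρ)) · ((C_a² + C_s²) / 2) · (1 / μ), where C_a and C_s are the coefficients of variation of inter-arrival and service times. The sketch below is a direct transcription under that form; the function name and the sample inputs are assumptions, and the formula still presumes independent arrivals and services.

```python
def kingman_mean_wait(arrival_rate: float, service_rate: float,
                      arrival_cv: float, service_cv: float) -> float:
    """Kingman's heavy-traffic approximation of mean wait in queue (single server)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("unstable: arrival_rate must be below service_rate")
    utilization_term = rho / (1 - rho)
    variability_term = (arrival_cv ** 2 + service_cv ** 2) / 2
    mean_service_time = 1 / service_rate
    return utilization_term * variability_term * mean_service_time


# With C_a = C_s = 1 the approximation reduces to the exact M/M/1 value.
print(kingman_mean_wait(8.0, 10.0, 1.0, 1.0))
# Burstier arrivals and heavier-tailed service at the same utilization.
print(kingman_mean_wait(8.0, 10.0, 2.0, 3.0))
```

Comparing the two calls shows how much of the predicted wait comes from variability rather than from utilization alone.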
T6: Jackson Network Assumptions and Scale Limitations: [14] Jackson networks (Jackson, 1957; Burke, 1956) assume Poisson external arrivals, exponential service times, and probabilistic routing; under those assumptions the equilibrium distribution has product form. These assumptions rarely hold in modern distributed systems with long-tailed service, request batching, and load balancing. The equilibrium computation scales poorly beyond 5–10 queues. Failure mode: attempting to model large systems (e.g., microservice mesh with 50+ services) via Jackson decomposition produces unvalidated predictions; engineers resort to simulation, but simulation scaling is also non-linear; neither analytical nor simulation method scales gracefully; system understanding becomes empirical and tribal rather than principled. Cohen's comprehensive treatise (1969) and Kleinrock's definitive textbook (1975–1976) remain the canonical references, but modern systems have outgrown their analytical reach.
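A minimal sketch of the Jackson decomposition for a small open network, assuming every station is a single-server M/M/1 node. It solves the traffic equations λ_j = γ_j + Σ_i λ_i p_ij by fixed-point iteration, then treats each node independently. The function names, the two-node topology, and all rates are hypothetical.

```python
def solve_traffic_equations(external, routing, max_iterations=10_000, tol=1e-12):
    """Effective arrival rate per node; routing[i][j] = P(job leaving i goes to j)."""
    n = len(external)
    rates = list(external)
    for _ in range(max_iterations):
        updated = [
            external[j] + sum(rates[i] * routing[i][j] for i in range(n))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, rates)) < tol:
            return updated
        rates = updated
    raise RuntimeError("traffic equations did not converge")


def jackson_mean_in_system(external, routing, service_rates):
    """Per-node mean number in system, assuming each node behaves as M/M/1."""
    rates = solve_traffic_equations(external, routing)
    means = []
    for lam, mu in zip(rates, service_rates):
        rho = lam / mu
        if rho >= 1:
            raise ValueError("a node is saturated")
        means.append(rho / (1 - rho))
    return rates, means


# Hypothetical network: jobs enter node 0, always go to node 1,
# then return to node 0 with probability 0.5 or leave the system.
external = [1.0, 0.0]
routing = [[0.0, 1.0],
           [0.5, 0.0]]
rates, means = jackson_mean_in_system(external, routing, service_rates=[4.0, 3.0])
total_in_system = sum(means)
mean_time_in_system = total_in_system / sum(external)  # Little's law on the whole network
print(rates, means, mean_time_in_system)
```

The feedback loop doubles the effective load on both nodes relative to the external arrival rate, which is the kind of effect that isolated per-queue analysis misses.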
Structural–Framed Character¶
Queueing sits at the structural end of the structural–framed spectrum: it is a pure relational pattern, the same in any domain where it appears, and nothing about its meaning depends on a particular field's vocabulary or assumptions.
At its core it is just work items accumulating before a resource of finite capacity, governed by an arrival process, a service process, a number of servers, and a queue discipline. Its home vocabulary is mathematical — Kendall's notation, utilization, average number in system, waiting time — and it carries no built-in verdict: a long queue is neither good nor bad, only a measurable state. Its origin is formal rather than institutional, and the pattern is fully definable without reference to any human practice; customers, packets, cars, and patients are interchangeable as the things that wait. To apply it is to recognize a structure already present in a system, not to import a perspective onto it. On every diagnostic, it reads structural.
Substrate Independence¶
Queueing is a highly substrate-independent prime, with a composite 4 / 5 on the substrate-independence scale. It is mathematically grounded, and Kendall's A/B/c/K notation is itself substrate-agnostic: the same parametrization of arrivals and service processes applies whether the entities are customers, packets, jobs, or parts, which is why it generalizes cleanly across operations research, computer science, and systems engineering. The formal abstraction is genuinely universal. What keeps it from the ceiling is thin worked evidence in the entry: the breadth is carried more by the mathematics and the alternate origin domains than by explicit cross-substrate examples.
- Composite substrate independence — 4 / 5
- Domain breadth — 4 / 5
- Structural abstraction — 4 / 5
- Transfer evidence — 3 / 5
Relationships to Other Primes¶
Parents (2) — more general patterns this builds on
- Queueing is a kind of Allocation. Queueing is a specialization of allocation: it assigns a limited supply (server time at a finite-capacity resource) across competing claimants (arriving jobs, requests, customers) under a feasibility constraint (one job at a time per server) guided by a discipline (FIFO, priority, LIFO). It inherits allocation's structural commitment (finite stuff flowing to multiple sinks) and particularizes it to the temporal case where the assignment is by wait order rather than instantaneous division, with arrival and service processes setting the dynamics.
- Queueing presupposes Flow. Queueing presupposes flow because its structural setting is a directional transfer of items from arrival to service to departure, with rates, conservation, and channel constraints all native to flow. Without the prior availability of flow as a continuous directional transfer through a system with defined source, channel, and sink, there is no inflow-and-outflow geometry against which finite service capacity can create accumulation. Little's law and the rest of queueing theory presuppose the flow conservation that ties arrival rate to throughput and time-in-system.
Path to root: Queueing → Flow
Neighborhood in Abstraction Space¶
Queueing sits in a sparse region of abstraction space (95th percentile for distinctiveness): few abstractions share its structure, so a faithful description tends to retrieve it precisely rather than landing on a neighbor.
Family — Allocation, Scheduling & Queues (9 primes)
Nearest neighbors
- Pipeline — 0.74
- Resource Management — 0.74
- Scalability — 0.73
- Load Balancing — 0.73
- Monte Carlo Simulation — 0.72
Computed from structural-signature embeddings · 2026-05-29
Not to Be Confused With¶
Queueing must be distinguished from Scheduling, its closest neighbor, despite both addressing the management of tasks and resources. Scheduling is the deterministic assignment of tasks to specific times, resources, and slots to minimize an objective (makespan—total elapsed time to complete all tasks; lateness—maximum delay from deadline; tardiness—total penalty for late completion). Scheduling assumes information is known in advance: you know all the tasks, their durations, their dependencies, and deadlines, so you can optimize a static assignment. Queueing, by contrast, addresses the stochastic waiting dynamics when arrivals are uncertain and service times are variable: items arrive unpredictably (Poisson process) and are serviced in finite time, creating random waiting. Scheduling says "arrange these 10 jobs on 3 machines to minimize makespan"; Queueing says "jobs arrive randomly at rate λ, each takes random service time, how long will jobs wait on average?" Scheduling is solved via algorithms (branch-and-bound, dynamic programming, heuristics) and produces a fixed plan; Queueing is analyzed via probability and produces statistical predictions (expected wait time, queue length distribution, probability of exceeding a threshold). A project manager using Gantt charts is scheduling; a call center operator analyzing whether 5 agents are enough for random call arrivals is queueing analysis. The distinction matters because scheduling cannot address randomness (it assumes perfect information), while queueing cannot produce optimal assignment (it assumes arrivals are given and uncontrolled).
Nor is queueing equivalent to Chunking, though both can address inefficiency. Chunking is an information-processing technique where discrete items are grouped into consolidated units to reduce cognitive load or structural complexity—a phone number 5551234567 is remembered more easily as "555-123-4567"; a user interface presents options in categories rather than one long list. Chunking restructures the representation of information itself. Queueing does not restructure items; it models their waiting behavior when demand exceeds capacity. A customer service center does not reduce wait time by chunking calls together; it reduces wait time by adding servers or reducing service time per call. Chunking might help an agent remember customer histories more efficiently, but that is a cognitive issue, not a queueing issue. The distinction matters because they address different problems: Chunking addresses information overload and cognitive processing; Queueing addresses resource bottleneck and waiting phenomena.
Queueing also differs from Layering, though both involve structural organization. Layering is the architectural pattern where systems are decomposed into horizontal strata with unidirectional dependencies (OSI model: physical, link, network, transport, session, presentation, application layers; each layer depends only on the layer below, not above). Layering separates concerns and reduces coupling. Queueing describes the temporal flow and waiting of discrete items moving through a system: how long do items wait, how deep is the backlog, what is the expected time in system? Queueing is about dynamics (how work flows, how long it sits); Layering is about structure (how components are organized). A layered architecture might have a queueing bottleneck at one of its layers (e.g., a network layer that becomes congested when request rate exceeds processing capacity), but the layering and the queueing are separate concerns. You can have good layering and poor queueing (good separation of concerns but long wait times), or poor layering and efficient queueing (tightly coupled system but fast throughput). The distinction matters because improving layering (better architectural separation) does not automatically improve queueing dynamics; you must analyze and tune both separately.
Finally, queueing is not Pipeline, though pipelines can exhibit queueing. A Pipeline is the staging of sequential steps, each operating in parallel, allowing work items to move through the stages concurrently—a manufacturing pipeline (assembly line), a processor pipeline (instruction fetching and executing in parallel), a data-processing pipeline (data passes through multiple transformation steps). Pipelines increase throughput by overlapping work. Queueing, by contrast, models waiting when demand exceeds instantaneous service capacity. A pipeline might have queueing at each stage (if work arrives faster than a stage processes, items queue before that stage), but the pipeline itself is the structural pattern of sequential, parallel stages. A pipeline without queueing would mean no item ever waits (work flows smoothly from stage to stage), but this is rare in practice—real pipelines do develop queues at bottleneck stages. Queueing analysis asks "given a pipeline stage with service rate μ and random arrivals at rate λ, what is the queue length?" Pipeline design asks "how many stages and what capacity per stage to achieve desired throughput?" The distinction matters because pipeline design optimizes structure (parallelism and stage capacity), while queueing analysis optimizes behavior (wait times and utilization) given a structure.
Solution Archetypes¶
Solution archetypes in the catalog that build on this prime — directly (this prime is a source ingredient) or as a related prime.
Built directly on this prime (16)
- Backlog Visibility
- Backpressure
- Bottleneck Identification and Relief
- Bounded Backlog
- Buffering
- Head-of-Line Blocking Relief
- Intake Queue Staging
- Load Leveling / Demand Smoothing
- Overcommitment Prevention
- Queue Aging and Starvation Prevention
- Queue Discipline Design
- Queue Draining
- Queue Partitioning
- Queue Reservation
- Service Rate Matching
- Work-in-Progress Limiting
Also a related prime in 22 archetypes
- Adaptive Scheduling
- Bottleneck Capacity Shadowing
- Complexity Scaling Assessment
- Coordination and Synchronization Across Reentry Phases
- Coupling Latency and Time-Delay Effects
- Cycle Staggering
- Deadlock Prevention
- Deadlock Resolution
- Decision Load Management
- Distraction Minimization for Deep Engagement
Notes¶
Held at High confidence. Foundational operations-research / CS construct with deep analytical apparatus (Erlang, Kendall, Little) and broad applicability. Entry emphasizes the non-linear response near saturation, Little's law as the universal identity, the Markovian vs general distinction, and the discipline / fairness trade-offs. Density-pass revision integrates 15 canonical sources spanning Erlang's telephone-traffic foundations (1909, 1917) through Whitt's modern heavy-traffic asymptotics (2002), including the Halfin-Whitt many-server regime (1981) [15], and contemporary applications across telecommunications, computing, and service operations.
References¶
[1] Kendall, D. G. (1953). "Stochastic processes occurring in the theory of queues and their analysis by the method of the imbedded Markov chain." Annals of Mathematical Statistics, 24(3), 338–354.
[2] Little, J. D. C. (1961). "A proof for the queueing formula: L = λW." Operations Research, 9(3), 383–387. Foundational result of queueing theory: in any stable queueing system, the mean number of items in the system equals arrival rate times mean residence time, providing the substrate-independent law that governs throughput-based liquidity in trading, networking, and operations.
[3] Kleinrock, L. (1975). Queueing Systems, Volume 1: Theory. Wiley-Interscience. Standard queueing-theory reference: develops the M/M/1 model (Poisson arrivals, exponential service, single server), deriving steady-state buffer occupancy ρ/(1−ρ) and characterizing stability, blocking, and delay distributions.
[4] Kingman, J. F. C. (1961). "The single server queue in heavy traffic." Mathematical Proceedings of the Cambridge Philosophical Society, 57(4), 902–904. Heavy-traffic approximation for queue waiting time: shows that mean wait grows as ρ/(1−ρ) times a variability factor, formalizing the latency–utilization trade-off that governs all buffered systems.
[5] Erlang, A. K. (1917). "Solution of some problems in the theory of probabilities of significance in automatic telephone exchanges." Elektroteknikeren, 13, 5–13.
[6] Jackson, J. R. (1957). "Networks of waiting lines." Operations Research, 5(4), 518–521.
[7] Allen, T. J. (1990). "Organizational structures, communication, and group innovation." In Research, Development, and Technological Innovation (pp. 189–207). Springer.
[8] Burke, P. J. (1956). "The output of a queuing system." Operations Research, 4(6), 699–704.
[9] Erlang, A. K. (1909). "The theory of probabilities and telephone conversations." Nyt Tidsskrift for Matematik B, 20, 33–39.
[10] Pollaczek, F. (1930). "Über eine Aufgabe der Wahrscheinlichkeitstheorie." Mathematische Zeitschrift, 32, 64–100.
[11] Gross, D., Shortle, J. F., Thompson, J. M., & Harris, C. M. (2008). Fundamentals of Queueing Theory (4th ed.). John Wiley & Sons.
[12] Khinchine, A. Y. (1932). "Mathematical theory of a stationary queue." Matematicheskii Sbornik, 39, 73–84.
[13] Whitt, W. (2002). Stochastic-Process Limits: An Introduction to Stochastic-Process Limits and their Application to Queues. Springer-Verlag.
[14] Cohen, J. W. (1969). The Single Server Queue (revised ed.). North-Holland Publishing Company.
[15] Halfin, S., & Whitt, W. (1981). "Heavy-traffic limits for queues with many exponential servers." Operations Research, 29(3), 567–588.