Periodic Little's Law - Columbia University

Periodic Little’s Law

Xiaopei Zhang

Submitted in partial fulfillment of the

requirements for the degree

of Doctor of Philosophy

in the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2019

©2019

Xiaopei Zhang

All Rights Reserved

ABSTRACT


Xiaopei Zhang

In this dissertation, we develop the theory of the periodic Little’s law (PLL) as well

as discussing one of its applications. As extensions of the famous Little’s law, the

PLL applies to the queueing systems where the underlying processes are strictly or

asymptotically periodic. We give a sample-path version, a steady-state stochastic

version and a central-limit-theorem version of the PLL in the first part. We also

discuss closely related issues such as sufficient conditions for the central-limit-theorem

version of the PLL and the weak convergence in countably infinite dimensional vector

space which is unconventional in queueing theory.

The PLL provides a way to estimate the occupancy level indirectly. We show how

to construct a real-time predictor for the occupancy level inspired by the PLL as an

example of its applications, which has better forecasting performance than the direct

estimators.

Table of Contents

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

1 Introduction 1

1.1 Little’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 Extensions of Little’s Law . . . . . . . . . . . . . . . . . . . . . . . . 5

1.2.1 Time-Varying Little’s Law . . . . . . . . . . . . . . . . . . . . 5

1.2.2 Central-Limit-Theorem version of Little’s Law . . . . . . . . . 8

1.3 Applied Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.4 Applications of the Periodic Little’s Law . . . . . . . . . . . . . . . . 11

1.5 Main Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

I Periodic Little’s Law 14

2 Sample-Path Version of the PLL 15

2.1 Notation and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2 The Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.3 The Assumptions in Theorem 2.2 . . . . . . . . . . . . . . . . . . . . 18

2.4 Indirect Estimation of Lk via the PLL . . . . . . . . . . . . . . . . . 20

2.5 Departure Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

i

2.6 Connection to Cumulative Processes . . . . . . . . . . . . . . . . . . 24

2.7 Different Orders of Events in Discrete Time . . . . . . . . . . . . . . 26

3 Steady-State Stochastic Versions of the PLL 30

3.1 Periodic Steady State . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.2 The Stochastic PLL . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

3.3 The Discrete-Time Periodic Gt/GIt/∞ Model . . . . . . . . . . . . . 33

3.4 Exploiting the Palm Theory in Continuous Time . . . . . . . . . . . 34

4 CLT Version of the PLL 36

4.1 More Notation and Definitions . . . . . . . . . . . . . . . . . . . . . . 37

4.2 A Practical Version for Applications . . . . . . . . . . . . . . . . . . 38

4.3 The General Version . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4.4 Statistical Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5 Sufficient Conditions for the CLT Version of the PLL 49

6 Weak Convergence of Random Elements of (ℓ1)d 54

6.1 A Criterion for Weak Convergence in (ℓ1)d . . . . . . . . . . . . . . . 55

6.2 Central Limit Theorem in ℓ1 . . . . . . . . . . . . . . . . . . . . . . . 59

II An Application of the Periodic Little’s Law 63

7 Real-Time Predictor for the Occupancy Level 64

7.1 The Data and the Problem . . . . . . . . . . . . . . . . . . . . . . . 65

7.2 Real-Time Predictor for the Occupancy Level . . . . . . . . . . . . . 66

7.2.1 Predicting Occupancy Level One Hour Ahead . . . . . . . . . 66

7.2.2 Predicting Occupancy Level More Than One Hour Ahead . . 71

ii

III Proofs 75

8 Proofs 76

8.1 Proofs for the Sample-Path Version of the Periodic Little’s Law . . . 76

8.1.1 Proof of Lemma 2.1 . . . . . . . . . . . . . . . . . . . . . . . 76

8.1.2 Proof of Theorems 2.2 and 2.3 . . . . . . . . . . . . . . . . . . 78

8.1.3 Proof of the Second Half of Corollary 2.4 . . . . . . . . . . . . 84

8.2 Proofs for the Central-Limit-Theorem Version of the Periodic Little’s

Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

8.2.1 Proof of Lemma 4.2 . . . . . . . . . . . . . . . . . . . . . . . 85

8.2.2 Proof of Theorem 4.3 . . . . . . . . . . . . . . . . . . . . . . . 87

8.2.3 Proof of Proposition 4.4 . . . . . . . . . . . . . . . . . . . . . 89

8.2.4 Proof of Corollary 4.5 . . . . . . . . . . . . . . . . . . . . . . 90

8.2.5 Proof of Corollary 4.6 . . . . . . . . . . . . . . . . . . . . . . 92

8.2.6 Proof of Theorem 5.1 . . . . . . . . . . . . . . . . . . . . . . . 93

8.2.7 Proof of Theorem 5.2 . . . . . . . . . . . . . . . . . . . . . . . 99

Bibliography 103

iii

List of Figures

1.1 Estimated arrival rate at an Emergency Department over a week. . . 6

1.2 A comparison of the estimated mean ED occupancy level from (i)

simulations of multiple replications of the model fit to the data to (ii)

direct estimates from the data. . . . . . . . . . . . . . . . . . . . . . 11

2.1 An example of a periodic queueing system with d = 5 and n = 4. The

vertical line is placed at time nd = 20. Area A corresponds to CQ(4)

in (2.16) below, while Area A∪B corresponds to CL(4) in (2.16). . . . 25

7.1 Predicted occupancy level one hour ahead compared to the true occu-

pancy level. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

iv

List of Tables

7.1 MSE for predicting the occupancy several hours ahead. . . . . . . . . 73

7.2 MSE for predicting the occupancy by the rolling averages of n weeks. 73

v

Acknowledgments

I would like to convey my profoundest gratitude to Professor Ward Whitt. As my

Ph.D. advisor, Professor Whitt has not only enlightened me on academic knowledge

but also given me a model of integrity, enthusiasm, and devotion. He is always

energetic and passionate in doing research as well as being very patient and skillful

in mentoring his students. He is a leading authority in queueing theory, but he still

keeps curiosity and open-mindedness to new areas and ideas, which deeply inspires

me. He trains his students in a very responsible way. He illustrates how to do research

systematically from coming up with ideas, exploring questions and narrowing down

the topic to recording and organizing research results, writing in an academic style

and responding to reviewers. As an international student from China, I appreciate

his emphasis on both writing and oral communication skills, which has been proved

to be valuable and crucial even beyond research. I am also grateful for his awareness

of cultural differences and his efforts to help us overcome it. It is impossible to have

such a wonderful Ph.D. life without his continued kind support. I would like to thank

Mrs. Joanna Whitt as well for her understanding and support such that Professor

Whitt could devote so much time to work.

I wish to express great appreciation to Professor Karl Sigman, Professor Mark

Brown, Professor Van-Anh Truong and Professor Henry Lam for their time serving

on my dissertation committee and providing invaluable advices. Special thanks go

to Professor Karl Sigman who provided generous help during my stressful time and

vi

Professor Mark Brown who gave me great support for my Ph.D. application.

Being a member of the Department of IEOR is a wonderful experience. The faculty

members and staffs are invariably supportive. I would like to thank Professor Ward

Whitt, Professor Donald Goldfarb, Professor David D. Yao, and Professor Clifford

Stein for their instruction in the Ph.D. core courses as well as Professor Jose H.

Blanchet, Professor Shipra Agrawal for the great seminar courses I attended. I also

thank Professor Daniel Bienstock, Professor Yuan Zhong, Professor David Yao, and

Professor Karl Sigman, with whom I worked as a teaching assistant and had good

cooperation. I appreciate the daily supports from Lizbeth Morales, Kristen Maynor,

Jaya Mohanty, Carmen Ng, and other staff team members. I am also grateful to

many faculty members and staffs in the Department of Statistics, including Professor

Mark Brown, Professor Victor H. de la Peña, Professor Michael E. Sobel, Professor

Tian Zheng, Professor Philip E. Protter, Professor Yang Feng, Professor Demissie

Alemayehu, and Ms. Dood Kalicharan. Without their help, I would not be able to

get my Master’s degree and pursue my Ph.D. degree.

My sincere thanks also go to all friends during my Ph.D. life, Yang Kang, Ji Xu,

Jing Wu, Wei You, Di Xiao, Ni Ma, Shuoguang Yang, Fengpei Li, Chaoxu Zhou, Fan

Zhang, Yanping Liu, Yunjie Sun, Yeqing Zhou, Xu Sun, Zhipeng Liu, Yanan Pei, Yin

Lu, Cun Mu, Chun Wang, Xinshang Wang, Lin Chen, Xiaowei Tan, Jingtong Zhao

and Xuan Zhang among others, who made this four-year life colorful.

Last but not least, I give my sincerest thanks to my wife Jing Ai, my parents

Jing Zhang and Ying Wang, and my other family members for their unconditional

and unfailing love in my life. They share my happiness and sadness. I can hardly go

through all these years without them.

vii

To

my loving family

and

all the people suffering in queues.

viii

CHAPTER 1. INTRODUCTION 1

Chapter 1

Introduction

The study of queueing systems involves mathematical modeling of a service system.

It includes those systems providing services from humans such as call centers, hos-

pitals, banks, supermarkets, etc. and those providing other kinds of services such as

computing, communication, transportation, and manufacturing systems. The service

providers are modeled as ”servers”, and the service receivers are called ”customers”

or ”tasks”. As a branch of operations research, this abstract yet versatile frame-

work is proved to be powerful for people to optimize the system according to desired

performance with limited resources by understanding and exploiting its stochastic

nature.

Queueing theory originates from the research of Danish mathematician Agner K.

Erlang when he built mathematical models for the Copenhagen telephone exchange

and published ”The Theory of Probabilities and Telephone Conversations” in 1909;

see Brockmeyer et al. (1948) and Chapter 1 of Gross et al. (2018). It experienced a

slow development until 1950-1970, when a great number of papers came out, among

them the famous Little’s law by Little (1961).

Regarded as one of the fundamental principles for queueing systems, Little’s law


relates three elemental quantities in a queueing system – the queue length (L), the

arrival rate (λ), and the sojourn time (W ) – by a concise formula:

L = λW. (1.1)

The queueing length may also be called the occupancy level or the number of cus-

tomers in the system and the sojourn time may also be called the service time. For

healthcare systems, as we will discuss soon as our motivation, the sojourn time of a

patient may be called the patient’s length of stay (LoS) as well. We will use them in-

terchangeably throughout this thesis. However, the practical meaning of those terms

can change, depending on the setting, even for a fixed model. For example, the sys-

tem can be one server or all the servers or the whole system including the waiting

space. Of course, there are mathematical complexities behind this simple formula.

For example, all three quantities are in the sense of long-run averages or expectations.

We will go into details in later sections. Little’s law in queueing theory acts as Ohm’s

law in circuitry, and when two of the three quantities for a system are specified, the

third must be determined by the law. However, we must understand that Little’s

law is a mathematical theorem. It is a tautology instead of a physical law where we

summarize the pattern from the experiment data, as emphasized by Little and Graves

(2008).

This thesis contributes to Little’s law by generalizing it to a periodic time-varying

case in a discrete-time framework. We allow the three quantities L, λ and W to be

periodic functions of time. Though Little’s law holds as well for periodic queueing

systems if we treat the three quantities as long-run averages or, equivalently, the

averages over a cycle, the new work provides a more accurate description of such

systems.


The rest of the introduction is divided into five sections. In §1.1, we briefly review

the history of the ordinary Little’s law and discuss it in a formal mathematical way.

The next section (§1.2) reviews two ways to extend Little’s law and introduces the

motivation of this work from a theoretical perspective. Then in §1.3, we introduce

our motivation from an applied view. We introduce the applications of the Periodic

Little’s law in §1.4. Our main contributions are summarized in §1.5.

1.1 Little’s Law

The first use of Little’s law is often attributed to Cobham (1954). He asserted that

the expected number of units in a queue equals to the product of the arrival rate

and the expected waiting time without any proof. Then, in Chapter 7 of Philip

M. Morse’s book ”Queues, Inventories and Maintenance” published in 1958 Morse

(1958), the well-known formula first time appeared in its explicit form ”L = λW”,

but was proved for a specific situation. It is John D. C. Little who made the first

effort in Little (1961) to show that this seemingly intuitive relationship holds in the

general case. He proved the formula by exploiting the ergodic theorems for strictly

stationary stochastic processes defined on (−∞,∞) based on Joseph L. Doob’s work

Doob (1953). He also drew a realized sample path to facilitate the proof and was

close to a sample-path proof, even though that was not accomplished in that paper.

Attempts were made in the following years as in Jewell (1967) and Eilon (1969) to

simplify the proof by assuming the processes to be regenerative. A milestone was

reached in 1974 when Shaler Stidham, Jr. gave a clean sample-path proof of Little’s

law with minimum assumptions in his paper ”A Last Word on L = λW” Stidham

(1974). However, far from ”a last word”, it is rather a beginning of sample-path

analysis; see El-Taha and Stidham Jr. (1999), Fiems and Bruneel (2002) and Wolff


and Yao (2014). Extensions and generalizations of Little’s law were also developed

afterward. We will review different forms of Little’s law in §1.2, so in this section, we

only focus on the ordinary sample-path version of Little’s law.

There are also useful reviews of Little’s law, e.g., Whitt (1991), Whitt (1992),

Little and Graves (2008) and Little (2011).

We remark that Little’s law can be regarded as a special case of a more general

relationship H = λG, where λ is the arrival rate and H and G are respectively time

and customer averages of some arbitrary queue statistics having a certain relationship

to each other; see Brumelle (1971), Brumelle (1972) and Heyman and Stidham (1980).

Since this generalization is out of the scope of this thesis, we shall not discuss it

further.

Here we state the ordinary sample-path version of Little’s law under the framework

of Stidham (1974) with slightly different notation (same as in Whitt (1991)), and we

will use this framework in our work. Suppose that customers arrive at a system at

time epochs Ak, where k = 1, 2, · · · is the index of the customer and Ak is a real-

number sequence satisfying 0 ≤ Ak ≤ Ak+1. Let Dk be the departure time epochs

respectively with Ak ≤ Dk so that Wk ≡ Dk −Ak ≥ 0 is the service time (also called

sojourn time or length of stay, LoS). We use ”≡” for ”equal by definition” throughout

this thesis. Let Ik(t) be the indicator function of customer k, i.e., Ik(t) = 1 if the kth

customer is in the system at time t and Ik(t) = 0 otherwise. Then Wk can also be

express as Wk =∫∞0Ik(t)dt. Let Q(t) ≡

∑∞k=0 Ik(t) be the number of customers in

the system at time t. Let A0 ≡ 0 and A(t) ≡ maxk ≥ 0 : Ak ≤ t count the number

of arrivals in [0, t].

Theorem 1.1 (sample-path version of Little’s law). If limt→∞

A(t)/t = λ where 0 ≤ λ ≤

∞ and limn→∞

∑nk=1Wk/n = W , then lim

t→∞

∫ t

0Q(s)ds/t = L and (1.1), i.e. L = λW ,


holds.

Remark 1.1 (asymmetry in Little’s law). In Theorem 1.1, the relation between the

three quantities is asymmetric. We allow W to be ∞, in which case L will also

be ∞. However, it is not true that the existence of any two of the three limits

implies the convergence of the third; see Example 1 of Glynn and Whitt (1986) for a

counterexample.

1.2 Extensions of Little’s Law

Like other good mathematical theorems, Little’s law has clean and strong conclusions

with mild assumptions. However, many kinds of extensions are also worth exploring.

Here we identify two directions:

1. from non-time-varying to time-varying, and

2. from strong-law-of-large-number (SLLN or LLN) type to central-limit-theorem

(CLT) type.

We will discuss these two extensions in the following two sections.

1.2.1 Time-Varying Little’s Law

It is usually good to start research from simple settings. Most stochastic processes

textbooks only cover the non-time-varying queueing models. However, there are also

many works on time-varying queues; see Whitt (2018) for an overview of time-varying

queues.

Little’s law is concerned with the long-run limits or expectations of queue length,

arrival rate and LoS. If the system is deterministic and has a constant arrival rate


and LoS, then Little’s law perfectly describes everything of the system. However,

except the speed of light in vacuum, few things in the real world are deterministic

and constant. Figure 1.1 taken from Whitt and Zhang (2017) shows the estimated

hourly patient arrival rates to an Emergency Department (ED) located in Haifa, Israel

using 25 weeks data collected from December 2004 to May 2005. We can see that

the arrival rate is obviously time-varying instead of being constant. If the arrival rate

and LoS distribution are changing over time, as long as the long-run limits or the

expectation under a proper stochastic framework exist, then Little’s law still holds,

but we lose the information about its time-varying nature and capture only the first

moment feature of the system by using the ordinary Little’s law. Hence it is natural

to seek generalizations of Little’s law that will be able to characterize the time-varying

nature of a queueing system.

05

1015

20

time of a week

patie

nts

per

hour

Sun Mon Tue Wed Thu Fri Sat

Figure 1.1: Estimated arrival rate at an Emergency Department over a week.

In 1997, Dimitris Bertsimas and Georgia Mourtzinou established a transient Lit-

tle’s law for non-stationary queueing models in Bertsimas and Mourtzinou (1997) as

a generalization of the classical stationary version of Little’s law by identifying the


structural relationships between the number of customers in the system and the delay

at time t. Let Λ(t) = E(A(t)) where A(t) is the arrival counting process, and assume

that Λ(t) =∫ t

0λ(u)du. The main result has the form

E(L(t)) =

∫ t

0

λ(u)P (Su > t− u)du, (1.2)

where L(t) is the number of customers in the system at time t and Su represents the

time spent in the system for a customer that arrived at time u. Despite the good

idea, the mathematics there is not entirely satisfying. Fralix and Riaño (2010) put it

on more solid ground with more applications using Palm measures.

In this thesis, we make extensions by assuming the system is periodic and in

discrete time framework. However, as can be seen later, rather than directly assuming

its periodicity, we make it implicit by assuming the existence of limit if we take

averages every d time periods. The periodic Little’s law states that, under appropriate

conditions,

Lk =∞∑j=0

λk−jFck−j,j, k = 0, 1, . . . , d− 1, (1.3)

where d is the number of time points within each periodic cycle, Lk is the long-run

average number of customers in system at time k, λk is the long-run average number

of arrivals at time k, and F ck,j, j ≥ 0, is the long-run proportion of arrivals at time

k that remain in the system for at least j time points, which can be viewed as the

complementary cumulative distribution function (ccdf) of the length of stay of an

arbitrary arrival. The long-run averages are over all indices of the form k + md,

m ≥ 0. These quantities λk, F ck,· and Lk are periodic functions of the time index k,

exploiting the extension of these periodic functions to all integers, negative as well

as positive. We can see that (1.3) can be viewed as a discrete-time analog to (1.2).


Chapter 2 is devoted to providing a sample-path version of the PLL and Chapter 3

provides a steady-state version of PLL. We remark that the PLL in this thesis has

already motivated further research related to it, such as Sigman and Whitt (2019),

where a general framework for stationary marked point processes in discrete time is

developed and used to provide a direct derivation of the periodic-stationary Little’s

law.

Our work is initially motivated by our analysis to the patient flow of an Emergency

Department in Whitt and Zhang (2017) rather than directly from a theoretical view.

We describe our motivation from an applied perspective in §1.3.

1.2.2 Central-Limit-Theorem version of Little’s Law

From non-time-varying to time-varying is one way to describe the system more accu-

rately. Another way is to not only characterize the long-run limits of those quantities,

but also consider the rate of convergence. We say a ”sample-path version” which

means that the proof depends on the analysis to an arbitrary sample path without

any probability characterization of the system, thus when we impose some stochastic

structure of the system, as long as the sample-path analysis holds, the result is true

with probability 1 (w.p. 1). Taking averages over time to approximate the mean w.p.

1 is of the same spirit of the strong law of large numbers (SLLN). In probability,

besides the SLLN which assures us the convergence of sample mean, we also (with

strong assumptions) have central limit theorem (CLT) which tells us the ”conver-

gence speed” of the sample mean. Further, we have functional central limit theorem

(FCLT) known as Donsker’s theorem as a functional extension of CLT. This inspires

people to do the same for Little’s law.

From 1986 to 1989, Peter W. Glynn and Ward Whitt wrote a series of papers


in this direction. In Glynn and Whitt (1986) they established a FCLT version of

Little’s law where they identified the key remainder terms (ends effect) that must

be asymptotically negligible such that the relationship holds. It involves the theory

of weak convergence of probability measure on the function space D[0,∞), which

includes all right continuous with left limits (also known as càdlàg for French: ”con-

tinue à droite, limite à gauche”) functions on [0,∞); see Billingsley (1999) and Whitt

(1980). The proof is essentially based on the continuous mapping theorem. Later in

Glynn and Whitt (1987), they provided sufficient conditions based on regenerative

structure such that the remainder terms to go to 0. The FCLT version of Little’s

law will imply the CLT version if the function is continuous at the point 1. A CLT

version of Little’s law with direct deduction (not as a corollary of a FCLT version)

was provided in Glynn and Whitt (1988b). The key is again identifying the remain-

der terms that need to vanish when taking limits and then find proper conditions

on the joint process of arrival process and LoSs such that it holds. They found that

stationarity of the input process was a sufficient but not necessary condition for the

CLT version of Little’s law. As an important and interesting statistical application of

the FCLT/CLT version of Little’s law, Glynn and Whitt (1989) discuss the issue of

indirectly estimating the occupancy level via Little’s law. They showed that the CLT

version of Little’s law could compare the accuracy of the occupancy level estimation

by the variance of the limit variables, thus provide a guide on choosing between direct

and indirect estimations.

Inspired by the above works, we also established a CLT version of the PLL and

provided sufficient conditions for it. It requires stronger conditions compared to the

sample-path version of the PLL but provides a characterization of the convergence

rates of related quantities. We will discuss this in Chapter 4 and Chapter 5. A

discussion of its statistical application is also included there.


We remark that there is also a law-of-iterated-logarithm version of Little’s law;

see Glynn and Whitt (1988a). It gives a more accurate description of the convergence

rate for Little’s law with proper conditions, just like the law of iterated logarithm for

independent, identically distributed random variables.

1.3 Applied Motivation

As mentioned in the previous section, the PLL was motivated by an applied pa-

per Whitt and Zhang (2017), where we built a data-driven stochastic model for the

Emergency Department of an Israeli hospital, using 25 weeks of data from the data

repository associated with the study by Armony et al. (2015). We found that both

the arrival rates (as shown in Figure 1.1) and the LoS distributions are time-varying

and can be viewed as periodic with circle length being 1 week. Given the sample size,

we assume the arrival rates and LoS distributions are constants within each hour.

This gives us a periodic discrete-time framework. We proposed a queueing model

for the system and then conducted simulations to test our model. We simulated the

arrivals and LoSs of each customer so that we could calculate the occupancy level.

We repeated the simulation and compared the mean curve to the empirical one, and

as shown in Figure 1.2 (from Whitt and Zhang (2017)), we observed that the two

curves coincided with each other almost perfectly. On the one hand, it means that our

proposed model fits the data well; on the other hand, we realized that there is some

Little’s-law-type theorem behind it that ensures the relationship between occupancy

level, arrival rates, and LoS distributions.

From a statistical perspective, if we have any data from a periodic system, it

is a very natural thought to take averages over periods as statistics for estimating

interested quantities. For a queueing system, we can estimate the occupancy level,


010

2030

4050

60

time of a week

num

ber

of p

atie

nts

Sun Mon Tue Wed Thu Fri Sat

Estimated mean from dataEstimated mean from simulation

Figure 1.2: A comparison of the estimated mean ED occupancy level from (i) sim-ulations of multiple replications of the model fit to the data to (ii) direct estimatesfrom the data.

the arrivals rates, and the LoS distributions in such a way from the data. Then

we can calculate the occupancy level from the estimated arrival rates and the LoS

distributions, and we call it the indirect estimation of the occupancy level. Then

the question is: under what condition, the indirect estimation of occupancy level is

consistent with the direct estimation? This is very similar to the question in Glynn

and Whitt (1989), but with a periodic setting.

We answered the question above by the periodic Little’s law in this thesis.

1.4 Applications of the Periodic Little’s Law

The PLL comes from the ED patients flow analysis. In the end, it contributes back

to our ED patients flow forecast problem. In Whitt and Zhang (2019b), we find

that based on good model and predicting method for the arrival rates and the LoS

distributions, the indirect estimation of future occupancy level via the PLL which


incorporates the real-time information beats the corresponding direct estimation. We

propose a real-time occupancy predictor which exploits the currently observed occu-

pancy level and the empirical hazard functions, given the elapsed LoS for each patient

in the system. That is a variant of the approach suggested in §6 of Whitt (1999); see

Ibrahim and Whitt (2011a,b) and references there for related work. The construction

of the real-time occupancy level as an indirect estimator heavily relies on our PLL.

We think it is an excellent example of the potential power of the PLL.

Due to limited resource and high cost, healthcare has become one of the major

application fields of operations research. Considerable research has been done to

forecast patient flow in the ED. Some of them, e.g. Tandberg and Qualls (1994), Jones

et al. (2002), Jones et al. (2008), Calegari et al. (2016) and Marcilio et al. (2013),

focus on the arrival pattern only; some, e.g. Hoot et al. (2008) and Schweigler et al.

(2009), focus on the occupancy level only; others, e.g. de Bruin et al. (2007), consider

queueing models but only with a stationary model. With the PLL, our forecasting

method connects the prediction of arrival rates and the occupancy level while keeps

the time-varying nature of the ED patient flow. Research has also been done on the

patient flow based on simulation methods, e.g., Hung et al. (2007), Medeiros et al.

(2008) and Zeltyn et al. (2011), however, none of them realized the fundamental

relationship between the occupancy level, the arrivals and the LoS distributions they

used, though seemingly trivial, could be and should be formalized and justified.

We will demonstrate this application of the PLL in Chapter 7.

1.5 Main Contributions

In summary, the main contributions of this thesis are:

1. We develop both a sample-path version and a stationary version of the periodic


Little’s law in discrete time. It generalizes the ordinary Little’s law.

2. We establish a CLT-version of the periodic Little’s law. It parallels the result

in Glynn and Whitt (1988b) which is for the ordinary Little’s law. We also

provide practical sufficient conditions for the CLT-version of the PLL. Potential

statistical applications are also discussed.

3. We give an example of the application of the PLL in forecasting the occupancy

level of an Emergency Department. It shows the potential use of our theorems

in applied problems.

14

Part I


CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 15

Chapter 2

Sample-Path Version of the


In this chapter, we develop the sample-path Periodic Little’s Law in discrete time.

This version is general in that (i) we do not directly make any stochastic assumptions

and (ii) we do not directly impose any periodic structure. Instead, we assume that

natural limits exist, which we take to be with probability 1 (w.p.1). It turns out that

the periodicity of the limit emerges automatically from the assumed existence of the

limits.

2.1 Notation and Definitions

We index the discrete time points by integers i, i ≥ 0. Because multiple events can

happen at these times, we need to specify the order of events as in the large literature

on discrete-time queues, e.g., Bruneel and Kim (1993). We assume that all arrivals at

one time occur before any departures. Moreover, we count the number of customers

in the system after the arrivals but before the departures. Thus, each arrival can


spend time j in the system for any j ≥ 0. Our convention yields a conservative

upper bound on the occupancy. We discuss other possible orderings of events and the

relation between continuous time and discrete time in §2.7.

With these conventions, we focus on a single sequence, X ≡ Xi,j : i ≥ 0; j ≥ 0,

with Xi,j denoting the number of arrivals at time i that have a LoS j. We also could

have customers at the beginning, but without loss of generality, we can view them as

a part of the arrivals at time 0. We define other quantities of interest in terms of X.

In particular, with ≡ denoting equality by definition, the key quantities are:

Yi,j ≡∑∞

s=j Xi,s: the number of arrivals at time i with a LoS greater or equal to j,

j ≥ 0,

Ai ≡ Yi,0 =∑∞

s=0Xi,s: the total number of total arrivals at time i,

Qi ≡∑i

j=0 Yi−j,j =∑i

j=0Ai−jYi−j,j

Ai−j

: the number of customers in system at time i,

all for i ≥ 0. In the last line, and throughout the paper, we understand 0/0 ≡ 0, so

that we properly treat times with 0 arrivals.

We do not directly make any periodic assumptions, but with the periodicity in

mind, we consider the following averages over n periods:

λk(n) ≡ 1

n

n∑m=1

Ak+(m−1)d,

Qk(n) ≡ 1

n

n∑m=1

Qk+(m−1)d =1

n

n∑m=1

k+(m−1)d∑j=0

Yk+(m−1)d−j,j

,

Yk,j(n) ≡ 1

n

n∑m=1

Yk+(m−1)d,j, j ≥ 0,

F ck,j(n) ≡ Yk,j(n)

λk(n)=

∑nm=1 Yk+(m−1)d,j∑nm=1Ak+(m−1)d

, j ≥ 0, and

Wk(n) ≡∞∑j=0

F ck,j(n), 0 ≤ k ≤ d− 1, (2.1)


where d is a positive integer.

Clearly, λk(n) is the average number of arrivals at time k, 0 ≤ k ≤ d − 1, over

the first n periods; Similarly, Qk(n) is the average number of customers in the system

at time k, while Yk,j(n) is the average number of customers that arrive at time k

that have a LoS greater or equal to j. Thus, F ck,j(n) is the empirical ccdf, which is

the natural estimator of the LoS distribution’s ccdf of an arrival at time k. Finally,

Wk(n) is the average LoS of customers that arrive at time k. We will let n→ ∞.

2.2 The Limit Theorem

With the framework introduced above, we can state one of our main theorems, the

sample-path version of the PLL. We first introduce our assumptions, which are just

as in the sample-path Little’s law, with one exception. In particular, we assume that

(A1) λk(n) → λk, w.p.1 as n→ ∞, 0 ≤ k ≤ d− 1,

(A2) F ck,j(n) → F c

k,j, w.p.1 as n→ ∞, 0 ≤ k ≤ d− 1, j ≥ 0, and

(A3) Wk(n) → Wk ≡∞∑j=0

F ck,j w.p.1 as n→ ∞, 0 ≤ k ≤ d− 1, (2.2)

where the limits are deterministic and finite. For the sample-path Little’s law, d = 1

and we do not need (A2).

The assumptions above only assume the existence of limits within the first period,

but the limits immediately extend to all k ≥ 0, showing that the limit functions must

be periodic functions. We then extend these periodic functions to the entire real line,

including the negative time indices. We give a proof of the following in §8.1.1

Lemma 2.1 (periodicity of the limits). If the three assumptions in (2.2) hold, then

the limits hold for all k ≥ 0, with the limit functions being periodic with period d.


We are now ready to state a sample-path version of PLL; we give the proof in

§8.1.2.

Theorem 2.2 (sample-path PLL). If the three assumptions (A1), (A2) and (A3) in

(2.2) hold, then Qk(n) defined in (2.1) converges w.p.1 as n → ∞ to a limit that we

call Lk. Moreover,

Lk =∞∑j=0

λk−jFck−j,j <∞, 0 ≤ k ≤ d− 1, (2.3)

where λk and F ck,j are the periodic limits in (A1) and (A2) extended to all integers,

negative as well as positive.

Remark 2.1 (the extension to negative indices). To have convenient notation, we

have extended the periodic limit functions to the negative indices, but we do not

consider the averages and their limits in Assumptions (A1), (A2) and (A3) for negative

indices.

2.3 The Assumptions in Theorem 2.2

When d = 1, the PLL reduces to the non-time-varying ordinary Little’s law. In that

case, k = 0 represents all time indices since it is non-time-varying. In Theorem 2.2,

L0 ≡ limn→∞

Q0(n) is the limiting time-average number of customers in the system while

limn→∞

λ0(n) = λ0 is the limiting average number of arrivals at each time and the right

hand side of (2.3) becomes

∞∑j=0

λk−jFck−j,j = λ0

∞∑j=0

F c0,j = λ0W0. (2.4)


And Theorem 2.2 claims that

L0 = λ0W0,

which is exactly the ordinary Little’s law. Of course, the ordinary Little’s law can be

applied to the time-varying case as well, but then we will lose the time structure and

get overall averages.

There is a difference between our assumptions in (2.2) and the assumptions in

the Little’s law. For the Little’s law, we let L be the limiting time-average number

of customers in system, λ be the limiting average arrival rate of customers, and W

be the limiting customer-average waiting time (time spent in the system or length of

stay). Then, if both λ and W exist and are finite, then L exists and is finite, and

L = λW . Our limit for λk(n) in (A1) is the natural extension; the only difference

is that now we require that λk(n) converges w.p.1 for each k, 0 ≤ k ≤ d − 1. The

third limit for Wk(n) in (A3) parallels the limit for the average waiting time, but

again we require that Wk(n) converges w.p.1 for each k, 0 ≤ k ≤ d − 1. However,

these two limits alone are not sufficient to determine the number of customers for the

periodic case. Now we need to require that the LoS distribution converges for each

k, 0 ≤ k ≤ d − 1, as stated in (A2). We show that this extra condition is needed in

the following example.

Example 2.1 (the need for convergent ccdfs). We now show that we need to assume

the limit F ck,j(n) → F c

k,j in (2.2). For simplicity, let d = 2, so that we have two

time points in each periodic cycle. Suppose we have two systems. In the first one,

we deterministically have 2 arrivals at the first time of each periodic cycle, (i.e., 2

arrivals at each even-indexed time), with one of them having a LoS of 0 and the other

having a LoS of 2. In the other system, we also deterministically have 2 arrivals in the

first time of each periodic cycle, but with both of them having a LoS of 1. Suppose


there is no arrival at the odd indexed time for both of the two systems. Now the two

systems have the same λk and Wk. (λ0 = 2, λ1 = 0, W0 = 2 and W1 = 0.) However,

if we count the number of customers in the system, we have limn→∞

Q0(n) = 3 for the

first system and limn→∞

Q0(n) = 2 for the second one.

2.4 Indirect Estimation of Lk via the PLL

The PLL in Theorem 2.2 provides an indirect way to estimate the long-run average

occupancy level Lk through the right-hand side of (2.3), as discussed in Glynn and

Whitt (1989) for the ordinary LL. Here we show that the indirect estimator for Lk is

consistent with the direct estimator.

Since we only have data going forward in time from time 0, we start by rewriting

(1.3) as

∞∑j=0

λk−jFck−j,j =

k∑i=0

λi

∞∑l=0

F ci,k−i+ld +

d−1∑i=k+1

λi

∞∑l=1

F ci,k−i+ld, 0 ≤ k ≤ d− 1. (2.5)

Guided by (2.5), we let our indirect estimator for Lk be

Lk(n) ≡k∑

i=0

λi(n)∞∑l=0

F ci,k−i+ld(n)+

d−1∑i=k+1

λi(n)∞∑l=1

F ci,k−i+ld(n), 0 ≤ k ≤ d−1, (2.6)

where λi(n) and F ci,j(n) are defined in (2.1). With data, it is likely that the infinite

sums in (2.6) would be truncated to finite sums, but at a level growing with n; we do

not address that truncation modification, which we regard as minor.

We now show that the estimator Lk(n) in (2.6) is asymptotically equivalent to

the direct estimator Qk(n) in (2.1); we will prove this result together with Theorem

2.2 in §8.1.2


Theorem 2.3 (indirect estimation through the PLL). Under the conditions of The-

orem 2.2,

limn→∞

Lk(n) = Lk w.p.1 for 0 ≤ k ≤ d− 1, (2.7)

where Lk(n) is defined in (2.6) and Lk is as in Theorem 2.2.

In applications, the LoS often can be considered to be bounded, i.e., for some

m > 1, Xi,j = 0 when j ≥ md. In that case, condition (A3) is directly implied by

condition (A2) and it is possible to bound the error between the direct and indirect

estimators for Lk, defined as

Ek(n) ≡ |Lk(n)− Qk(n)|, (2.8)

for Qk(n) in (2.1) and Lk(n) in (2.6), as we show now.

Corollary 2.4 (the bounded case). If, in addition to conditions (A1) and (A2) in

Theorem 2.2, there exists some mu > 0 such that Xi,j = 0 for i ≥ 0, j ≥ dmu, then

assumption (A3) is necessarily satisfied. If, in addition, there exists some λu > 0

such that Ai ≤ λu for i ≥ 0, then

Rn ≡ max0≤k≤d−1Ek(n) ≤ λud(mu + 2)2

2n, n ≥ mu, (2.9)

for Ek(n) in (2.8).

Proof: Here we show the proof of the first part of the corollary, i.e. if the LoS is

bounded, then assumption (A3) is implied from (A2), and we postpone the second

half of proof until §8.1.3, since it depends on part of the proof of Theorems 2.2 and

2.3.


If Xi,j = 0 for i ≥ 0, j > dmu, then Fk,j(n) = 0 for 0 ≤ k ≤ d − 1 and j ≥ dmu.

So

Wk(n) =dmu∑j=1

F ck,j(n), 0 ≤ k ≤ d− 1,

is a finite summation and F ck,j = 0 for 0 ≤ k ≤ d− 1 and j > dmu. Then

limn→∞

Wk(n) = limn→∞

dmu∑j=0

F ck,j(n) =

dmu∑j=0

limn→∞

F ck,j(n) =

dmu∑j=0

F ck,j = Wk,

which is assumption (A3).

Remark 2.2 (when does (A2) imply (A3)). In addition to the boundedness condition

presented in Corollary 2.4, there are other mathematical conditions under which (A2)

implies (A3), i.e., under which we can interchange the order of the limits. Uniform

integrability is a standard condition for this purpose; see p. 185 of Billingsley (1995)

and Section 2.6 of El-Taha and Stidham Jr. (1999). We prefer (A3) plus (A2) because

that makes our conditions easier to compare to the conditions in the ordinary LL.

2.5 Departure Processes

Besides relating the occupancy level with the arrival processes and the LoS as in the

Little’s law, we can also establish the relationship between the departure processes

and the other quantities. This will also be helpful to understand the error between

different ways of counting what happens at each time point, as we will explain in

§2.7.

Let Di ≡∑i

j=0Xi−j,j, i ≥ 0, be the number of departures at time i. Given the


arrivals occur before departures at each time, it is easy to see that

Di = Qi −Qi+1 + Ai+1, for i ≥ 0. (2.10)

Paralleling (2.1), we look at the averages

δk(n) ≡1

n

n∑m=1

Dk+(m−1)d, for 0 ≤ k ≤ d− 1. (2.11)

Corollary 2.5 (departure averages). Under the conditions of Theorem 2.2, δk(n)

defined in (2.11) converges w.p.1 as n → ∞ to a periodic limit that we call δk.

Moreover,

δk =∞∑j=0

λk−jfk−j,j ≡∞∑j=0

λk−j(Fck−j,j − F c

k−j,j+1) (2.12)

for 0 ≤ k ≤ d− 1, where λk and F ck,j are the same periodic limits as in Theorem 2.2

and fk,j ≡ F ck,j − F c

k,j+1 is the discrete time probability mass function of the LoS.

Proof. The proof is easy given we have Theorem 2.2 and equation (2.10). By

equations (2.10) and (2.1),

δk(n) =1

n

n∑m=1

Dk+(m−1)d =1

n

n∑m=1

(Qk+(m−1)d −Qk+1+(m−1)d + Ak+1+(m−1)d)

=

Qk(n)− Qk+1(n) + λk+1(n), 0 ≤ k < d− 1,

Qd−1(n)− Q0(n+ 1) + λ0(n+ 1) +1

nQ0 −

1

nA0, k = d− 1.

(2.13)


Since limn→∞

1

nQ0 = 0 and lim

n→∞

1

nA0 = 0, by Theorem 2.2 and (A1), we have

limn→∞

δk(n) = Lk − Lk+1 + λk+1 =∞∑j=0

λk−jFck−j,j −

∞∑j=0

λk+1−jFck+1−j,j + λk+1

=∞∑j=0

λk−j(Fck−j,j − f c

k−j,j+1)− λk+1 + λk+1 =∞∑j=0

λk−jfk−j,j. (2.14)

2.6 Connection to Cumulative Processes

The direct and indirect estimators of the occupancy level we introduced in (2.1) and

(2.6), respectively, can be related through the cumulative processes, as depicted in

Figure 2.1.

We focus on the two cumulative processes associated with the occupancy level

and total LoS, respectively, i.e.,

CQ(n) ≡n∑

m=1

d−1∑k=0

Qk+(m−1)d =nd−1∑i=0

Qi,

CL(n) ≡n∑

m=1

d−1∑k=0

∞∑j=0

(j + 1)Xk+(m−1)d,j =n∑

m=1

d−1∑k=0

∞∑j=0

Yk+(m−1)d,j

=nd−1∑i=0

∞∑j=0

Yi,j. (2.15)

The first, CQ(n), is the cumulative occupancy level up to time n, while the second,

CL(n), is the cumulative total LoS of customers that arrived up time n.

Figure 2.1 helps to understand the two cumulative quantities. In the figure, we

plot the time intervals that each of the first 35 arrivals spends in the system as

horizontal bars, each with a height 1 placed in order of the arrival times. The left

end point is the arrival time, while the right end point is the departure time, which

need not be in order of arrival. We can see that CQ(n) and CL(n) correspond to two


areas respectively.

Figure 2.1: An example of a periodic queueing system with d = 5 and n = 4. Thevertical line is placed at time nd = 20. Area A corresponds to CQ(4) in (2.16) below,while Area A∪B corresponds to CL(4) in (2.16).

We can further relate the two cumulative processes to the averages in (2.1) and

(2.6), as stated in the following proposition.

Proposition 2.6. The cumulative processes and the averages are related by

Area(A) = CQ(n) = n

d−1∑k=0

Qk(n),

Area(A ∪B) = CL(n) = nd−1∑k=0

λk(n)Wk(n) = n

d−1∑k=0

Lk(n). (2.16)


Proof: The proof follows directly from the definitions, especially for CQ(n). For

CL(n), observe that

CL(n) =n∑

m=1

d−1∑k=0

∞∑j=0

Yk+(m−1)d,j = nd−1∑k=0

∞∑j=0

Yk,j(n) = nd−1∑k=0

∞∑j=0

λk(n)Fck,j(n)

= nd−1∑k=0

λk(n)Wk(n). (2.17)

By (2.6), if we sum over 0 ≤ k ≤ d− 1 and adjust the order of summation, we have

d−1∑k=0

Lk(n) =d−1∑k=0

∞∑j=0

λk(n)Fck,j(n) =

d−1∑k=0

λk(n)Wk(n). (2.18)

In the context of Figure 2.1, Theorem 2.2 and Theorem 2.3 assert that

E(n) ≡d−1∑k=0

Ek(n) = Area(B)/n→ 0 as n→ ∞, (2.19)

where Ek(n) defined in (2.8).

We remark that Figure 2.1 is a variant of Figure 1 in Whitt (1991) and Figures

2 and 3 in Kim and Whitt (2013), as well as similar figures in earlier papers. The

figures in Kim and Whitt (2013) are different, because of the initial edge effect, which

we avoid by treating arrivals before time 0 in the system as arrivals at time 0.

2.7 Different Orders of Events in Discrete Time

We have assumed that all arrivals occur before all departures at each time, and that we

count the number of customers in system after the arrivals but before the departures.

That produces an upper bound on the system occupancy for all possible orderings.

Suppose instead that all departures occur before all arrivals at each time, and that we


count the number of customers in system after the departures but before the arrivals.

That obviously produces a lower bound. The consequence of any other ordering will

fall in between these two.

For a discrete-time system, the order may be given, but many applications start

with a continuous-time system. In that case, the modeler can choose which ordering

to use in the discrete-time version. We have chosen the conservative upper bound. In

this section we derive an expression for the alternative lower bound and the difference

between the upper bound and the lower bound. From that difference, we can see

that the difference between an initial continuous-time system and a discrete-time

“approximation” will become asymptotically negligible as we refine the discrete-time

version by increasing the number of discrete time points within a fixed continuous-

time cycle.

For the alternative departure-first (lower-bound) ordering, let QLi be the number

of customers in the system at time i and let

QLk (n) ≡

1

n

n∑m=1

QLk+(m−1)d. (2.20)

Assume that we keep the meaning of Xi,j and Yi,j.

Proposition 2.7 (the lower-bound occupancy). Under the conditions of Theorem

2.2, QLk (n) converges w.p.1 as n→ ∞ to a limit that we call LL

k , and we have

LLk = Lk − δk − 2λk + λkfk,0 ≤ Lk for 0 ≤ k ≤ d− 1, (2.21)

where Lk is in Theorem 2.2, λk is in (A1), δk and fk,0 are the same as in Corollary

2.5.


Proof. For the new departures-first ordering, the number of customers at time i is

QLi =

i∑j=1

Yi−j,j+1 − Yi,0 =i∑

j=1

Yi−j,j −i∑

j=1

Xi−j,j − Yi,0

=i∑

j=0

Yi−j,j −i∑

j=0

Xi−j,j − 2Ai +Xi,0 = Qi −Di − 2Ai +Xi,0, (2.22)

where we regard the summation as 0 if the lower bound is larger than the upper

bound. By equation (2.22) and (2.20), we have

QLk (n) =

1

n

n∑m=1

QLk+(m−1)d

=1

n

n∑m=1

(Qk+(m−1)d −Dk+(m−1)d − 2Ak+(m−1)d +Xk+(m−1)d,0)

= Qk(n)− δk(n)− 2λk(n) + (Yk,0(n)− Yk,1(n)). (2.23)

Next observe that (A1) and (A2) imply that limn→∞

Yk,j(n) = λkFck,j, so that

limn→∞

QLk (n) = Lk − δk − 2λk + λk(F

ck,0 − F c

k,1) = Lk − δk − 2λk + λkfk,0. (2.24)

We can apply Proposition 2.7 to deduce that the error in an incorrect ordering

of events is asymptotically negligible if we start with a continuous-time system and

choose sufficiently short time periods. For the arrival process A(t), we assume that

t−1A(t) → Λ(t) as t→ ∞, (2.25)

where

Λ(t) =

∫ t

0

λ(u) du and 0 < λL ≤ λ(u) ≤ λU <∞, (2.26)

for all non-negative u and t, with λ being a periodic function with period c. We make


a similar assumption for the departure process D(t) with the limiting rate being δ(t).

Corollary 2.8 (from continuous to discrete time). Suppose that we start with a

continuous-time periodic system with period c and construct a discrete-time system

by considering d evenly spaced time intervals within each periodic cycle. If the arrival

and departure processes satisfy (2.25) and (2.26), and if the conditions of Theorem

2.2 hold for each d, then the error in the discrete-time approximation due to the order

of events at each time point goes to 0 as d→ ∞.

Proof. From (2.21), we know that 0 ≤ Lk − LLk = δk + 2λk − λkfk,0. Hence, if we

consider a sequence of systems indexed by d, then the condition implies that λd,k → 0

and δd,k → 0 as d→ ∞.

We can use Corollary 2.8 to define what we mean by a continuous-time sample-

path PLL. We also exploit the basic properties of Riemann integrals; see §6 of Rudin

(1976).

Definition 2.1 (continuous-time sample-path PLL). Consider a continuous-time sys-

tem with arrival process having a periodic arrival rate and satisfying (2.25) and (2.26).

A continuous-time sample-path PLL holds, yielding relation (1.2), if (i) the discrete-

time PLL holds for each d when we form d time periods within each periodic cycle,

and (ii) the sequence of discrete-time PLLs in (1.3) can be regarded as converging

Riemann-sum approximations for the continuous-time relation in (1.2).

CHAPTER 3. STEADY-STATE STOCHASTIC VERSIONS OF THE PLL 30

Chapter 3

Steady-State Stochastic Version of

the Periodic Little’s Law

In this chapter, we discuss the stochastic analogs of Theorem 2.2. In §3.1 we define a

stationary framework in continuous time. In §3.2 we derive a steady-state stochastic

PLL by applying Theorem 2.2. In §3.3 we establish a version of PLL for theGt/GIt/∞

time-varying infinite-server model proposed in Whitt and Zhang (2017). Finally, in

§3.4 we establish a continuous-time stochastic PLL, which primarily follows from

Rolski (1989) and Fralix and Riaño (2010).

For this initial version, we aim for simplicity. Thus, we assume that the basic

stochastic process is both stationary and ergodic, so that steady-state means coincide

with long-run averages. The main idea is that we now interpret the key quantities

Lk and λk appearing in (1.3) as appropriate expected values of random variables

associated with the system in periodic steady state. As we see it, there are two main

issues:

1. What is meant by periodic steady state?


2. What is F ck,j or, equivalently, what is the probability distribution of the LoS of

an arbitrary arrival in period k?

3.1 Periodic Steady State

In order to construct periodic steady-state, we assume that the basic stochastic pro-

cess Yn : n ∈ Z with

Yn ≡ Ynd+k,j : 0 ≤ k ≤ d− 1; j ≥ 0 (3.1)

introduced in §2.1 is a stationary sequence of non-negative random elements indexed

by the integer n. For each integer n, the random element Yn takes values in the space

(Zd)∞ ≡ Zd×Zd×· · · . See Chapter 6 of Breiman (1968), Baccelli and Brémaud (1994)

and Sigman (1995) for background on stationary processes and their application to

queues. Just as for the time-varying LL, as discussed in Fralix and Riaño (2010), it

is important to apply the Palm transformation, but we avoid that issue by exploiting

the established limits for the averages.

Without loss of generality, we now regard our stochastic processes as stationary

processes on the integers Z, negative as well as positive; see Proposition 6.5 of Breiman

(1968). As usual, we mean strictly stationary; i.e., the finite-dimensional distributions

are independent of time shifts, which in turn means that, for each k and each k-tuple

(n1, . . . , nk) of integers in Z,

(Yn1 , . . . ,Ynk)

d= (Yn1+m, . . . ,Ynk+m) for all m ∈ Z, (3.2)

with d= denoting equality in distribution.

As a consequence of the stationarity assumed for Yn : n ∈ Z, we also have


stationarity for the associated stochastic process (And+k, Qnd+k) : n ∈ Z, where

And+k is the number of arrivals at time nd+ k and Qnd+k is the number of customers

in the system at time nd+k, both of which are defined in §2.1, but now that we have

stationarity on all integers, negative as well as positive. Thus, we have

And+k ≡ Ynd+k,0 and Qnd+k ≡∞∑j=0

Ynd+k−j,j. (3.3)

Hence, with some abuse of notation, we let (Yk,j : j ≥ 0, Ak, Qk) be a stationary

random element. In this stochastic setting, we have

λk ≡ E(Ak) = E(Yk,0) and Lk ≡ E(Qk) =∞∑j=0

E(Yk−j,j). (3.4)

3.2 The Stochastic PLL

We now come to the second issue: In the stochastic setting it remains to define

F ck,j ≡ P (Wk > j), where Wk is the time in system for an arbitrary arrival in period

k. It is natural to define F ck,j by requiring that it agree with the limit of the averages

F ck,j(n) in (2.1). That limit is well defined if we assume that the basic sequence is

ergodic as well as stationary, with 0 < E(Yk,0) <∞ for all k.

With the stationary framework on all the integers, positive and negative, the

stochastic PLL becomes very elementary, because there are no edge effects.

Theorem 3.1 (stochastic PLL). Suppose that Yn : n ∈ Z in (3.1) is stationary and

ergodic with 0 < λk ≡ E(Yk,0) <∞, 0 ≤ k ≤ d− 1. Then, for each k, 0 ≤ k ≤ d− 1,

and j ≥ 0,

F ck,j(n) → F c

k,j ≡E(Yk,j)

E(Yk,0)w.p.1 as n→ ∞ (3.5)


and

Lk ≡ E(Qk) =∞∑j=0

λk−jFck−j,j. (3.6)

Proof. First, the stationary and ergodic condition with the specified moment as-

sumptions implies that

Yk,j(n) → E(Yk,j) = λkFck,j as n→ ∞ w.p.1 (3.7)

for all k and j in view of (2.1), (3.4) and the definition of F ck,j in (3.5). Then (3.5)

follows immediately by continuity under the division operation by a quantity with a

strictly positive limit. If we multiply and divide by λk within the representation for

E(Qk) in (3.4), then we see that

E(Qk) =∞∑j=0

E(Yk−j,j) =∞∑j=0

λkE(Yk−j,j)

λk=

∞∑j=0

λkFck−j,j. (3.8)

In closing this section, we remark that the distribution Fk,j can be seen to corre-

spond to an underlying Palm measure, but we do not develop that framework here;

see Sigman and Whitt (2019).

3.3 The Discrete-Time Periodic Gt/GIt/∞ Model

The candidate model for the ED proposed in Whitt and Zhang (2017) was a special

case of the periodic Gt/GIt/∞ infinite-server model, in which the LoS variables are

mutually independent and independent of the arrival process, with a ccdf F ck,j ≡

P (Wk ≥ j) for a steady-state LoS in time period k that depends only on the time

period k within the periodic cycle (a week). This strong local condition provides a

sufficient condition for the steady-state stochastic PLL. That can be seen from the


following proposition.

Proposition 3.2 (theGt/GIt/∞ special case). For the stationary Gt/GIt/∞ infinite-

server model specified above, where Y ≡ Yn : n ∈ Z in (3.1) is strictly stationary

with finite mean values, then

E(Yk,j) = F ck,jE(Yk,0), (3.9)

consistent with formula (3.5).

Proof. Let Wk,i : i ≥ 1 be a sequence of i.i.d. LoS variables associated with

arrivals in time period k, which is also independent of the arrival process and the

other LoS variables. Under those conditions,

E(Yk,j) =∞∑

m=1

m∑i=1

P (Wk,i > j)P (Ak = m)

=∞∑

m=1

mF ck,jP (Ak = m) = F c

k,jE(Ak) = F ck,jE(Yk,0).

3.4 Exploiting the Palm Theory in Continuous Time

Finally, we observe that the Palm theory in Rolski (1989) and Fralix and Riaño (2010)

also provides a general steady-state stochastic PLL in the conventional continuous-

time setting. For this steady-state stochastic result, we now assume that the arrival

process is a simple point process (arrivals occur one a time) on the entire real line

with a well-defined arrival rate λ(t) at time t. This was satisfied in the ED application

(our applied motivation) because the arrival times have very detailed time stamps.

Thus, letting N(t)−N(s) be the number of arrivals in the interval [s, t], we assume


that

E(N(t))− E(N(s)) =

∫ t

s

λ(s) ds, (3.10)

where the arrival-rate function λ(t) is a periodic function with periodic cycle length

c, which is also right continuous with left limits.

As in section 2 of Fralix and Riaño (2010), we let W (t) be the waiting time of the

last arrival before time t. That convention yields a well-defined waiting-time process

W (t) : t ∈ R.

As a continuous-time analog of the periodic stationarity assumed in §3.1, we

assume that the queue-length (the number of customers in system) process Q ≡

Q(t) : t ∈ R has a distribution that is invariant under time shifts by c. The

arrival process is included by the upward jumps on Q. As a consequence of this

c-stationarity, the set of Palm measures Ns : s ∈ R associated with the arrival

process N is periodic with period c. The mean queue length is expressed in terms of

the tail probabilities

F ct,x ≡ Pt(W (t) > x) (3.11)

under the Palm measures Pt, which are periodic with period c.

Then, paralleling the remark after Theorem 3.1 of Fralix and Riaño (2010), The-

orem 3.1 of Fralix and Riaño (2010) implies the following continuous-analog of (1.3),

which was already given for the Mt/GI/1 special case in Rolski (1989).

Theorem 3.3 (continuous-time PLL following from Rolski (1989) and Fralix and

Riaño (2010)). Under the conditions above,

E(Q(t)) =

∫ ∞

0

F ct−s,sλ(t− s) ds (3.12)

where F ct−s,s is defined in (3.11).

CHAPTER 4. CLT VERSION OF THE PLL 36

Chapter 4

Central-Limit-Theorem Version of


In this chapter, we establish a CLT version of the PLL, paralleling the CLT versions

of the ordinary Little’s law in Glynn and Whitt (1986), Glynn and Whitt (1987),

Glynn and Whitt (1988b), and Whitt (2012). We look for the relationship between

the CLT-scaled arrival rates, LoS distributions and occupancy level in the periodic

setting, so we now require stronger assumptions than for the sample-path PLL in

Theorem 2.2, but we obtain a rate of convergence for the sample-path PLL. We will

show that linking the indirect estimator of occupancy level with the arrival rates

and LoS distributions is straightforward by the continuous mapping theorem, how-

ever, stronger conditions are needed to ensure that the CLT-scaled direct estimator

converges to the same limit.

One simple and practical case, where we assume that both the number of arrivals

at each time and the LoS of each arrival are bounded and the limits are Gaussian, is

provided first in §4.2. It applies to the ED in Whitt and Zhang (2017) and evidently

to most practical cases. Two more complicated sufficient conditions are stated in


Chapter 5. The CLT versions of PLL also have statistical applications, as in Glynn

and Whitt (1989) and Kim and Whitt (2013), which we discuss in §4.4.

4.1 More Notation and Definitions

In this section we assume the LoSs are bounded by J , i.e. Xi,j = 0 for j ≥ J and all

i. All the vectors are understood to be column vectors.

For 0 ≤ k ≤ d− 1, let

F ck (n) ≡ (F c

k,0(n), Fck,1(n), · · · , F c

k,J−1(n)) ∈ RJ ,

F ck ≡ (F c

k,0, Fck,1, · · · , F c

k,J−1) ∈ RJ . (4.1)

Let the law-of-large-numbers-scaled (LLN-scaled) averages be

λλλ(n) ≡ (λ0(n), λ1(n), · · · , λd−1(n)) ∈ Rd,

δδδ(n) ≡ (δ0(n), δ1(n), · · · , δd−1(n)) ∈ Rd,

F cF cF c(n) ≡ (F c0 (n)

T , F c1 (n)

T , · · · , F cd−1(n)

T ) ∈ Rd×J ,

WWW (n) ≡ (W0(n), W1(n), · · · , Wd−1(n)) ∈ Rd,

QQQ(n) ≡ (Q0(n), Q1(n), · · · , Qd−1(n)) ∈ Rd,

LLL(n) ≡ (L0(n), L1(n), · · · , Ld−1(n)) ∈ Rd, (4.2)


and the CLT-scaled averages be

λλλ(n) ≡√n(λλλ(n)− λλλ) ∈ Rd,

δδδ(n) ≡√n(δδδ(n)− δδδ) ∈ Rd,

F cF cF c(n) ≡√n(F cF cF c(n)−F cF cF c) ∈ Rd×J ,

WWW (n) ≡√n(WWW (n)−WWW ) ∈ Rd,

QQQ(n) ≡√n(QQQ(n)−LLL) ∈ Rd,

LLL(n) ≡√n(LLL(n)−LLL) ∈ Rd, (4.3)

where the deterministic centering constants are

λλλ ≡ (λ0, λ1, · · · , λd−1) ∈ Rd,

δδδ ≡ (δ0, δ1, · · · , δd−1) ∈ Rd,

F cF cF c ≡ ((F c0 )

T , (F c1 )

T , · · · , (F cd−1)

T ) ∈ Rd×J ,

WWW ≡ (W0,W1, · · · ,Wd−1) ∈ Rd,

LLL ≡ (L0, L1, · · · , Ld−1) ∈ Rd. (4.4)

Again, all the above constant vectors and matrices in (4.4) can be extended as peri-

odic functions with period d, but for convenience, throughout this thesis, we use the

modulo function, [x] = x mod d, to treat k beyond 0 ≤ k ≤ d− 1.

4.2 A Practical Version for Applications

We now state our first CLT version of the PLL. Because it is a special case of the

more general one stated later, the proof is not given separately. The statement is


proved after we introduce Theorem 4.3, Proposition 4.4 and the two corollaries right

after them. Let ⇒ denote convergence in distribution.

Theorem 4.1 (practical CLT version of the PLL). If the following conditions hold:

(E1) (λλλ(n), F cF cF c(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × Rd×J as n→ ∞,

(E2) the number of arrivals in a single discrete time period is bounded, and

(E3) Wk =J−1∑j=0

F ck,j and Lk =

J−1∑j=0

λ[k−j]Fc[k−j],j for 0 ≤ k ≤ d− 1, (4.5)

for (λλλ(n), F cF cF c(n)) in (4.3) and (Wk, Lk) in (4.4), where the limit (ΛΛΛ,ΓΓΓ) in (E1) is a

zero-mean Gaussian random vector, then

(λλλ(n), δδδ(n), F cF cF c(n), QQQ(n), LLL(n), WWW (n)

)⇒ (ΛΛΛ,∆∆∆,ΓΓΓ,ΥΥΥ,ΥΥΥ,ΩΩΩ)

in R2d × Rd×J × R3d, (4.6)

where ΩΩΩ = (Ω0,Ω1, · · · ,Ωd−1), ΥΥΥ = (Υ0,Υ1, · · · ,Υd−1) and ∆∆∆ = (∆0,∆1, · · · ,∆d−1)

are given by

Ωk =J−1∑j=0

Γk,j, Υk =J−1∑j=0

λ[k−j]Γ[k−j],j +J−1∑j=0

Λ[k−j]Fc[k−j],j and

∆k = Υ[k] −Υ[k+1] + Λ[k+1] for 0 ≤ k ≤ d− 1, (4.7)

and (ΛΛΛ,∆∆∆,ΓΓΓ,ΥΥΥ,ΥΥΥ,ΩΩΩ) is also jointly zero-mean Gaussian distributed.

Theorem 4.1 implies that, given the joint convergence of the CLT-scaled arrival

and LoS processes (λλλ(n), F cF cF c(n)) in (E1), with the associated regularity conditions,

we get the associated convergence for the CLT-scaled direct estimate of occupancy

QQQ(n) as well as the indirect estimate LLL(n), jointly with the other processes. We get


the consistency requirement in (E3) from the PLL in Theorem 2.2.

4.3 The General Version

In this section, we introduce the more general CLT version of the PLL, where we

allow both the number of arrivals and the LoS distributions to be unbounded. Thus

it involves countably-infinite-dimensional spaces. We show that the connection among

the CLT-scaled indirect estimator of occupancy level LLL(n), the arrival rates λλλ(n) and

the LoS distributions F cF cF c(n) can be established using the continuous mapping theorem.

More conditions are needed to ensure that the CLT-scaled direct estimator QQQ(n) has

the same limit as LLL(n).

Let Rd be the d-dimensional real space with usual topology; let R∞ be the space

of sequences x = (x0, x1, · · · ) of real numbers with the topology determined by the

convergence of all finite-dimensional projections (which is induced by the metric in

Example 1.2 of Billingsley (1999)); let ℓ1 ⊆ R∞ be the subspace of R∞ which contains

sequences with finite absolute sums; and let ℓ∞ ⊆ R∞ be the subspace of R∞ which

contains sequences with bounded values, i.e.

ℓ1 ≡ xxx = (x0, x1, · · · ) ∈ R∞ : ||xxx||1 ≡∞∑i=0

|xi| <∞,

ℓ∞ ≡ xxx = (x0, x1, · · · ) ∈ R∞ : ||xxx||∞ ≡ supi

|xi| <∞ (4.8)

We equip ℓ1 and ℓ∞ with the norms || · ||1 and || · ||∞ as above, under which both

are Banach spaces. (We remark that ℓ1 ⊆ ℓ∞ and the dual space of ℓ1 is ℓ∞.) Then

we can define (ℓ1)d as the d-fold product space of ℓ1 with the norm equal to the

maximum of the ℓ1-norms of each component, i.e. if yyy ≡ (yyy0, yyy1, · · · , yyyd−1) ∈ (ℓ1)d,

where yyyi ≡ (yi,0, yi,1, · · · ) ∈ ℓ1, then ||yyy||1,d = maxi=0,··· ,d−1||yyyi||1, and we use the


topology induced by this norm; see §6.1 for more discussion of this space.

All the quantities related to LoS distributions are now in those infinite dimensional

spaces. To be specific,

F ck (n) ≡ (F c

k,0(n), Fck,1(n), F

ck,2(n), · · · ) ∈ ℓ1, 0 ≤ k ≤ d− 1,

F ck ≡ (F c

k,0, Fck,1, F

ck,2, · · · ) ∈ R∞, 0 ≤ k ≤ d− 1,

F cF cF c(n) ≡ (F c0 (n), F

c1 (n), · · · , F c

d−1(n)) ∈ (ℓ1)d,

F cF cF c(n) ≡√n(F cF cF c(n)−F cF cF c) ∈ (R∞)d. (4.9)

Other quantities are still the same as we defined in (4.3) and (4.4).

We also define the LLN-scaled and CLT-scaled difference between the direct and

indirect occupancy estimators as

RRR(n) ≡ LLL(n)− QQQ(n) ∈ Rd,

RRR(n) ≡ LLL(n)− QQQ(n) ∈ Rd. (4.10)

Just as in Glynn and Whitt (1986), the continuous mapping theorem plays a

crucial role in the proof of the theorem. Hence, we start by introducing the key

mappings and show that they are continuous.

For 0 ≤ k ≤ d− 1, let xxx ∈ Rd and yyy ∈ (ℓ1)d and define hk : Rd × (ℓ1)

d → R as

hk(xxx,yyy) =∞∑j=0

x[k−j]y[k−j],j, (4.11)

where, again, [k] = k mod d is the modulo function. As a critical condition, in the

following lemma we show two functions that will be used as mapping functions are

continuous. The proof can be found in §8.2.1.


Lemma 4.2. For a given constant zzz ∈ (ℓ1)d, let fzzz : Rd × Rd × (ℓ1)

d → Rd and

g : (ℓ1)d → Rd be defined by

fzzz(xxx(1),xxx(2), yyy) ≡ (fzzz,0(xxx

(1),xxx(2), yyy), fzzz,1(xxx(1),xxx(2), yyy), · · · , fzzz,d−1(xxx

(1),xxx(2), yyy)),

g(yyy) ≡ (∞∑j=0

y0,j,

∞∑j=0

y1,j, · · · ,∞∑j=0

yd−1,j), (4.12)

where fzzz,k : Rd × Rd × (ℓ1)d → R is defined as

fzzz,k(xxx(1),xxx(2), yyy) ≡ hk(xxx

(1), yyy) + hk(xxx(2), zzz), 0 ≤ k ≤ d− 1,

with hk defined in (4.11). The functions fzzz(xxx(1),xxx(2), yyy) and g(yyy) in (4.12) are con-

tinuous.

The following is a counterexample to show the two functions above are not con-

tinuous if we replace ℓ1 by R∞. Let 000 = (0, 0, · · · , 0) be the zero vector in proper

spaces depending on the context.

Example 4.1 (discontinuity of f and g when yyy ∈ (R∞)d). It suffices to see that

g0(yyy0) ≡∑∞

j=0 y0,j from R∞ → R is not continuous in general. Let y(i)0,j ≡ Ii=j

for all i, j ≥ 0, so that yyy(i)0 ≡ (y(i)0,0, y

(i)0,1, · · · ) are all 0 except the ith component.

Under the metric of R∞ as in Example 1.2 of Billingsley (1999), i.e. ρ(xxx(1),xxx(2)) =∑∞i=1min(1, |x(1)i − x

(2)i |)/2i, where xxx(j) = (x1, x2, · · · ), j = 1, 2, yyy(i)0 → 000 as i → ∞,

however limi→∞

g0(yyy(i)0 ) = 1 = 0 = g0(000).

We now state our general CLT version of the PLL with indirect estimator of the

occupancy level. The proof appears in §8.2.2.

Theorem 4.3 (CLT version of the PLL with indirect estimator). If the following


conditions hold:

(C1) (λλλ(n), F cF cF c(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × (ℓ1)d as n→ ∞,

(C2) WWW = g(F cF cF c) and LLL = fF cF cF c(000,λλλ,000), (4.13)

for (λλλ(n), F cF cF c(n)) in (4.3) and (λλλ,F cF cF c,WWW,LLL) in (4.4), using Lemma 4.2, then

((λλλ(n), F cF cF c(n), LLL(n), WWW (n)

),(λλλ(n), F cF cF c(n), LLL(n), WWW (n)

))⇒

((λλλ,F cF cF c,LLL,WWW

),(ΛΛΛ,ΓΓΓ,ΥΥΥ,ΩΩΩ

))in (Rd × (ℓ1)

d × R2d)× (Rd × (ℓ1)d × R2d), (4.14)

where ΩΩΩ = g(ΓΓΓ) and ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ); in particular,

Ωk =∞∑j=0

Γk,j, Υk =∞∑j=0

λ[k−j]Γ[k−j],j +∞∑j=0

Λ[k−j]Fc[k−j],j. (4.15)

Note that condition (C1) requires that the random elements belong to the specified

space. Also the limit for the first five elements in (4.14) yield a weak LLN, consistent

with the PLL.

Just as in the ordinary Little’s law and the PLL, it is interesting to consider when

can we add the direct estimator of occupancy level QQQ(n) into the joint convergence.

Clearly, if we can show that RRR(n) defined in (4.10) goes to 000 in distribution, then

QQQ(n) converges to the same limit as LLL(n) in distribution as n→ ∞ and can be added

into the joint convergence in (4.14) by applying Theorem 3.1 of Billingsley (1999).

For clarity, we summarize that observation in the following proposition.

Proposition 4.4. If, in addition to condition (C1) and (C2), we have RRR(n) ⇒ 000,


then

((λλλ(n), δδδ(n), F cF cF c(n), QQQ(n), LLL(n), WWW (n)

), (4.16)(

λλλ(n), δδδ(n), F cF cF c(n), QQQ(n), LLL(n), WWW (n), RRR(n)))

⇒((λλλ,δδδ,F cF cF c,LLL,LLL,WWW

),(ΛΛΛ,∆∆∆,ΓΓΓ,ΥΥΥ,ΥΥΥ,ΩΩΩ,000

))in (R2d × (ℓ1)

d × R3d)× (R2d × (ℓ1)d × R4d), (4.17)

where δδδ is in (4.4), ∆∆∆ = (∆0,∆1, · · · ,∆d−1) is given by

∆k = Υ[k] −Υ[k+1] + Λ[k+1] for 0 ≤ k ≤ d− 1, (4.18)

and all the other variables have the same meaning as in Theorem 4.3.

The proof is given in §8.2.3.

The following corollary shows that if we have Gaussian limits in the conditions,

then the limits in the conclusion are also Gaussian. The proof is in §8.2.4.

Corollary 4.5 (Gaussian limits). If, in addition to the conditions of Theorem 4.3,

(ΛΛΛ,ΓΓΓ) has a zero-mean Gaussian distribution with covariance and cross-covariance

Cov(ΛΛΛ,ΛΛΛ) = ΣΛ, Cov(Γk,Γl) = ΣΓ:k,l and Cov(ΛΛΛ,Γk) = ΣΛ,Γ:k, 0 ≤ k, l ≤ d− 1, then,


(ΩΩΩ,ΥΥΥ) also has a zero-mean Gaussian distribution with

Cov(ΩΩΩ,ΩΩΩ)k,l =∞∑i=1

∞∑j=1

ΣΓ:k,li,j ,

Cov(ΥΥΥ,ΥΥΥ)k,l =∞∑i=0

∞∑j=0

λ[k−i]λ[l−j]ΣΓ:[k−i],[l−j]i+1,j+1 +

∞∑i=0

∞∑j=0

λ[k−i]Fc[l−j],jΣ

Λ,Γ:[k−j][l−j]+1,i+1

+∞∑i=0

∞∑j=0

F c[k−i],iλ[l−j]Σ

Λ,Γ:[l−j][k−i]+1,j+1 +

∞∑i=0

∞∑j=0

F c[k−i],iF

c[l−j],jΣ

Λ[k−i]+1,[l−j]+1,

Cov(ΩΩΩ,ΥΥΥ)k,l =∞∑i=1

∞∑j=0

λ[l−j]ΣΓ:k,[l−j]i,j+1 +

∞∑i=1

∞∑j=0

F c[l−j],jΣ

Λ,Γ:k[l−j]+1,i. (4.19)

The next corollary show that boundedness is a simple yet practical condition such

that RRR(n) ⇒ 000, so that Theorem 4.1 is covered by Theorem 4.3 and Proposition 4.4

as a special case. The proof is provided in §8.2.5.

Corollary 4.6 (bounded LoS). Suppose that the number of arrivals in a discrete time

period and the LoS are bounded (by J). Then condition (C1) reduces to convergence

in Rd × (RJ)d for some finite J . If, in addition, conditions (C1) and (C2) hold, then

RRR(n) ⇒ 000, so that the joint convergence (4.16) in Proposition 4.4 holds.

If we don’t want to strengthen the conditions with boundedness, two other reason-

able sufficient conditions are given in Chapter 5 such that RRR(n) ⇒ 000 and a full version

of CLT-PLL can be achieved with both CLS-scaled direct and indirect estimator of

occupancy level in the joint convergence as in Proposition 4.4. However, before we go

to that, we make a remark that connects the CLT-PLL to the ordinary CLT version

of Little’s law which is studied in Glynn and Whitt (1988b). We also give potential

statistical applications of our theorems.

Remark 4.1 (connection to the ordinary CLT version of Little’s law in Glynn and

Whitt (1988b)). When d = 1, Theorem 4.3 reduces to an ordinary CLT version of LL,


which can be compared to the earlier one in Glynn and Whitt (1988b). Specifically,

when d = 1, we have ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ) = λΩλΩλΩ +WΛWΛWΛ. All the discussion in Glynn and

Whitt (1988b) is in continuous time, but it is not difficult to translate everything into

discrete time. To avoid notation conflicts and make things clear, we add tildes on all

the variables in Glynn and Whitt (1988b). ΛΛΛ is the limit of the time-averaged arrival

rate, so it corresponds to the second term of (1.2) in Glynn and Whitt (1988b), i.e.

ΛΛΛ = −λ3/2U . ΩΩΩ is the limit of customer-averaged LoS, so in the notation of Glynn and

Whitt (1988b), assuming Theorem 1 of Glynn and Whitt (1988b) holds, we should

write

t1/2(

∑N(t)k=1 Wk

N(t)− w) =

t

N(t)t−1/2(

N(t)∑k=1

Wk − N(t)w)

=t

N(t)t−1/2(

N(t)∑k=1

Wk − λtw + λtw − N(t)w)

=t

N(t)

(t−1/2(

N(t)∑k=1

Wk − λtw)− t−1/2w(N(t)− λt))

⇒ λ−1(λ1/2(W − wU) + wλ3/2U) = ΩΩΩ,

so that λ1/2W = λΩΩΩ+λ1/2wU−wλ3/2U . Finally, ΥΥΥ as the limit of LLL(n) corresponds to

the sixth term in (1.2) of Glynn and Whitt (1988b), and if we further have RRR(n) ⇒ 000,

then ΥΥΥ as the limit of QQQ(n) is the time-averaged occupancy level of the system which

corresponds to the eighth term in (1.2) of Glynn and Whitt (1988b). Both the sixth

and the eighth term have the common limit

λ1/2(W − wU) = λΩΩΩ + λ1/2wU − wλ3/2U − λ1/2wU

= λΩΩΩ− wλ3/2U = λΩΩΩ + wΛΛΛ.


Note that λ and w being the arrival rate and mean LoS correspond to λλλ and WWW ,

respectively, in our notation, so the terms in the two theorems match perfectly.

In the notation of Glynn and Whitt (1988b), our Theorem 3.1 in the case d = 1

actually states that if

t−1/2(N(t)− λt,

N(t)∑k=1

Wk − N(t)w)⇒ (ΛΛΛ,ΩΩΩ),

and Theorem 2(f) of Glynn and Whitt (1988b) (which is exactly (C1)) holds, then

we have the joint convergence of the second, sixth and eighth terms in (1.2) of Glynn

and Whitt (1988b).

Unlike Theorem 1 of Glynn and Whitt (1988b), Theorem 4.3 and Proposition

4.4 do not require stationarity. That is explained by the extra convergence RRR(n) ⇒

000, which corresponds to Theorem 2(f) of Glynn and Whitt (1988b). That extra

convergence in Glynn and Whitt (1988b) is implied by the stationarity. Inspired by

that observation, we will obtain a similar (stronger) sufficient condition for RRR(n) ⇒ 000

involving stationarity in Theorem 5.2.

4.4 Statistical Applications

The CLT versions of the PLL have statistical applications. First, Theorem 4.3 sup-

ports using the indirect estimator of the occupancy level via the PLL. Moreover,

confidence areas for the occupancy-level estimators can be constructed. In particu-

lar, if Theorem 4.1 or Corollary 4.5 holds, then we can have confidence ellipsoids forLLL.

To be specific, LLL(n) ⇒ ΥΥΥ ∼ N(000,ΣΥ) in Rd where ΣΥ, the covariance matrix of ΥΥΥ, is

determined by (4.19). Then when n is large n(LLL−LLL(n))T (ΣΥ)−1(LLL−LLL(n)) has approx-

imately a standard normal distribution. Let qα be such that P (|N(0, 1)| ≤ qα) = 1−α,


where 0 ≤ α < 1, then

xxx ∈ Rd : n(xxx− LLL(n))T (ΣΥ)−1(xxx− LLL(n)) ≤ qα, (4.20)

which is an ellipsoid centered at LLL(n), is a confidence ellipsoid for LLL with approximate

confidence level 1−α. By (4.15) and (4.19), the more negatively related ΛΛΛ and ΓΓΓ are,

the more asymptotically efficient the indirect estimator of the number of customer in

the system becomes. However, unlike the case for the ordinary Little’s law discussed

in Glynn and Whitt (1989), there are many covariance terms in (4.19) even if we only

consider the variance of ΥΥΥk for a given k, so it is not straightforward to compare the

asymptotic efficiency of the estimator when we change the other two elements in the

PLL.

Secondly, Theorem 4.1 as well as the two more sufficient conditions we will intro-

duce in Chapter 5 tell us when will the indirect estimator LLL(n) and the natural direct

estimator QQQ(n) for the number of customers in the system have the same asymp-

totic efficiency, i.e. LLL(n) and QQQ(n) converge to the same random variable. Then the

confidence area analysis above in (4.20) also holds for QQQ(n).

To apply (4.20), we need to estimate ΣΥ. The expression is complicated, but

we will soon establish a useful sufficient condition. In particular, under the con-

ditions of Theorem 5.1, we can estimate ΣΥ by (5.2) using λk(n) to estimate λk,

(n− 1)−1∑n

m=1(AAAm − λλλ(n))(AAAm − λλλ(n))T to estimate ΣΛ and the empirical distribu-

tions to estimate Fk,j and F ck,j.

CHAPTER 5. SUFFICIENT CONDITIONS FOR THE CLT VERSION OF THEPLL 49

Chapter 5

Sufficient Conditions for the

Central-Limit-Theorem Version of


In this chapter we provide convenient sufficient conditions for the two conditions in

Theorem 4.3 and RRR(n) ⇒ 000 to be satisfied, so that we have Proposition 4.4 and the

joint convergence in (4.16).

The first sufficient condition is independence. To state it, for n = 0, 1, 2, · · · , let

AAAn be the vector of arrivals at the d times within period n, i.e.,

AAAn = (A0+nd, A1+nd, · · · , Ad−1+nd) ∈ Rd. (5.1)

For vectors, let > mean strict order for all components.

Theorem 5.1 (independent case). If the following conditions hold:

(I1) AAAn, n = 0, 1, 2, · · · , are i.i.d. with EAAA0 = λλλ > 0 and Var(AAA0) = ΣΛ,


(I2) the LoS are mutually independent and independent of the arrival process, having

a cumulative distribution fuction that depends only on the discrete time period

k, and

(I3) the LoS distribution satisfies F ck,j ∼ O(j−(3+δ)) for all k and some δ > 0,

then conditions (C1), (C2) and RRR(n) ⇒ 000 are satisfied, so that the joint convergence

in (4.16) holds. Further, (ΛΛΛ,ΓΓΓ) has a zero-mean Gaussian distribution in Rd × (ℓ1)d

with ΛΛΛ ∼ N(000,ΣΛ), Cov(Γk,j,Γk,s) = λkFk,jFck,s for 0 ≤ k ≤ d− 1 and 0 ≤ j ≤ s, and

ΛΛΛ,ΓΓΓ0,ΓΓΓ1, · · · ,ΓΓΓd−1 are independent. As a special case of Corollary 4.5, (ΩΩΩ,ΥΥΥ) also

has a zero-mean Gaussian distribution with, for 1 ≤ k ≤ l ≤ d− 1,

Cov(ΩΩΩ,ΩΩΩ)k,l =

λk

∑∞i=0

∑∞j=0 Fk,mini,jF

ck,maxi,j, for k = l,

0, for k = l,

Cov(ΥΥΥ,ΥΥΥ)k,l =k∑

s=0

λ3s(∞∑

m=0

∞∑n=0

Fs,mink−s+md,l−s+ndFcs,maxk−s+md,l−s+nd)

+l∑

s=k+1

λ3s(∞∑

m=1

∞∑n=0


+d−1∑

s=l+1

λ3s(∞∑

m=1

∞∑n=1


+d−1∑i=0

d−1∑j=0

ck,ick,jΣΛi,j,

Cov(ΩΩΩ,ΥΥΥ)k,l =∞∑i=1

∞∑m=0

λ2kFk,mini,l−k+mdFck,maxi,l−k+md. (5.2)

where

ck,j =

∑∞

m=0 Fcj,k−j+md for 0 ≤ j ≤ k,∑∞

m=1 Fcj,k−j+md for k + 1 ≤ j ≤ d− 1.

The proofs can be found in §8.2.6.


Remark 5.1 (necessity of condition (I3)). In condition (I3) we assume a 3 + δ rate

of decay of the tail distribution of the LoS, which is stronger than an ordinary CLT

requires. However, this is needed in the proof as can be seen in (8.39). More generally,

it remains to determine if condition (I3) is necessary, i.e., if it can be replaced by a

2 + δ rate of decay.

It is significant that Theorem 5.1 can be applied to the MT/GIt/∞ queueing

model we built for the ED patient flow analysis in Whitt and Zhang (2017), which

had mutually independent Gaussian daily totals for the arrivals, but dependence

among the hourly arrivals within each day. However, we conjecture that more data

would show dependence among the daily totals as well. Moreover, we want to treat

queueing models that are not infinite-server models. For infinite-server models, each

LoS (sojourn time) coincides with the service time, but that is not the case for other

service systems. Hence, we next show that Theorem 4.3 and Proposition 4.4 can

be applied to more general queueing models by replacing the i.i.d. condition with

stationarity plus appropriate mixing, as in Chapter 4 of Billingsley (1999).

We start with the framework used in Chapter 3, where we consider the composite

arrival+service input process stochastic process YYY ≡ YYY n : n ∈ N with

YYY n ≡ Ynd+k,j : 0 ≤ k ≤ d− 1; j ≥ 0. (5.3)

For simplicity, we assume the LoS distributions are bounded, i.e. Yk,j = 0, j ≥ J , for

some constant J > 0, then each YYY n becomes a finite dimensional vector, and we can


regard YYY n as a d× J random matrix

YnYnYn =

Ynd+0,0 Ynd+0,1 · · · Ynd+0,J−1

Ynd+1,0 Ynd+1,1 · · · Ynd+1,J−1

... ... ... ...

Ynd+d−1,0 Ynd+d−1,1 · · · Ynd+d−1,J−1

∈ Rd×J . (5.4)

We will start by assuming YYY n : n ∈ N is a stationary process. By adding a suitable

mixing condition, we can show that it satisfies a multivariate CLT, then we exploit

the relationship between YYY n : n ∈ N and (λλλ(n), F cF cF c(n)) and show that (C1) and

(C2) hold. Finally, we show that RRR(n) ⇒ 000 so that Theorem 4.3 and Proposition 4.4

hold.

Before we state the theorem, we make a few definitions. For convenience, we

directly use YYY n, n ∈ N in (5.3) as the example. We say the process YYY n, n ∈ N is

strictly stationary if

(YYY 0,YYY 1, · · · ,YYY m)d= (YYY 0+n,YYY 1+n, · · · ,YYY m+n), for all m ≥ 0 and n ≥ 0, (5.5)

where d= means equal in distribution. For m ≤ n, let Fn

m ≡ σ(YYY m,YYY m+1, · · · ,YYY n)

be the sigma-algebra generated by the family of random variables, where we allow

n = ∞. And we say YYY n, n ∈ N is strongly mixing (or α-mixing) if α(n) → 0, where

α(n) ≡ supm∈N,A∈Fm

0 ,B∈F∞m+n

|P (A ∩B)− P (A)P (B)|, (5.6)

is called the strong mixing coefficient, following Theorem 18.5.3 of Ibragimov and

Linnik (1971) and Theorem 0 of Bradley (1985).

Now we have the following theorem, whose proof is provided in §8.2.7.


Theorem 5.2 (stationary+bounded case). If the following conditions hold:

(S1) the composite process YYY n, n ∈ N in (5.3) is strictly stationary and strongly

mixing,

(S2) EAAA0 = λλλ > 0 and there exists some δ > 0, such that E||AAA0||2+δ∞ < ∞ and∑∞

n=1 α(n)δ/(2+δ) <∞, and

(S3) The LoS distributions are bounded (by J),

for YYY n in (5.3) and AAA0 in (5.1), then condtions (C1), (C2) and RRR(n) ⇒ 000 are satisfied,

so that we have the joint convergence in (4.16). Furthermore, (ΛΛΛ,ΓΓΓ) has a zero-mean

Gaussian distribution in Rd × (RJ)d, so that Corollary 4.5 is also satisfied.

We remark that there is a large literature on the CLT under weak dependence so

that many generalizations of Theorem 5.2 are possible. For example, in Chapter 4

of Billingsley (1999) CLTs under mixing conditions in §19 are reduced to CLTs for

martingale-difference sequences in §18.

Finally, other than the two conditions we discussed in this section, there could be

many other sufficient conditions such that Proposition 4.4 holds.

CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 54

Chapter 6

Weak Convergence of Random

Elements of (ℓ1)d(ℓ1)d(ℓ1)d

In this chapter, we provide background on the weak convergence of random elements

of (ℓ1)d. We used this space so that we could apply the continuous mapping theorem

to prove Theorem 4.3. In particular, condition (C1) in Theorem 4.3 requires the

convergence F cF cF c(n) ⇒ ΓΓΓ in the space (ℓ1)d.

Within functional analysis, (ℓ1)d is quite standard, as it is a finite product space

associated with the well known Banach space ℓ1. However, neither ℓ1 nor (ℓ1)d is

standard in weak convergence theory, especially in applications on queueing systems.

In fact, we know of no previous use in queueing theory. Of course, we have the well-

established general and powerful weak convergence theory in general metric spaces,

such as Billingsley (1999), but we would like to specialize the general theory to the

space (ℓ1)d. Besides, there is a substantial literature on Banach spaces and weak

convergence in Banach spaces, e.g., Araújo and Giné (1980), Ledoux and Talagrand

(1991), and Megginson (1998). We will draw on this developed theory.

We will give a practical criterion for checking weak convergence in the space (ℓ1)d.


The criterion is not used directly in our main theorems, but we hope it helps readers

better understand the meaning of weak convergence in (ℓ1)d and could be useful in

future research. We will also discuss the established CLT for Banach-space random

variables and specialize it to the space ℓ1, which, by giving this special case, helps to

understand when we can have such convergence as in (C1) and Gaussian limits as in

Corollary 4.5.

6.1 A Criterion for Weak Convergence in (ℓ1)d(ℓ1)d(ℓ1)d

The ℓ1 space is a well-defined separable Banach space. Throughout this paper, we

always use B∗ to represent the dual space of a Banach space B. As a special case

of a general metric space, the weak convergence of probability distributions on it is

well defined, just as in general metric spaces; see Billingsley (1999). For more on

random variables in Banach spaces and convergence of those random variables, see

Araújo and Giné (1980) and Ledoux and Talagrand (1991). Here we briefly state

some definitions and results about Banach spaces and probability measures defined

on them, which can be found in Megginson (1998) and Ledoux and Talagrand (1991).

Then we exploit the general results in the special case of (ℓ1)d space.

The (ℓ1)d space is the product (or in some literature called direct sum and denoted

as ℓ1 ⊕ ℓ1 ⊕ · · · ⊕ ℓ1) of ℓ1 spaces. We refer to Megginson (1998) for general Banach

spaces. Since ℓ1 is a separable Banach space, (ℓ1)d is also a separable Banach space

with the norm || · ||1,d we defined earlier in §4.1. The dual of a product of Banach

spaces is the product of the corresponding duals. Because (ℓ1)∗ = ℓ∞, thus the dual

of (ℓ1)d is (ℓ∞)d.

We want to have a good sufficient condition for weak convergence in (ℓ1)d. For

that purpose, let UUU (n) ≡ (UUU(n)0 , . . . ,UUU

(n)d−1) ∈ (ℓ1)

d, where UUU (n)k ≡ (UUU

(n)k,j : j ≥ 0) ∈ ℓ1,


and similarly for a prospective limit UUU . A result from Ledoux and Talagrand (1991)

states that UUU (n) converges weakly to UUU as soon as h(UUU (n)) converges weakly (as

a sequence of real valued random variables) to h(UUU) for every h in a weakly dense

subset of ((ℓ1)d)∗ = (ℓ∞)d, and UUU (n) is tight, i.e. for each ϵ > 0, there exists a

compact set K ⊆ (ℓ1)d such that P (UUU (n) ∈ K) ≥ 1 − ϵ for all n. Now we transform

the above conditions to more explicit ones in the case of (ℓ1)d space.

For n ≥ 1, denote

On ≡ (yyy ∈ (R∞)d : yk,j = 0, 0 ≤ k ≤ d− 1, j ≥ n− 1,

then O = ∪∞n=1On is the subset of (ℓ∞)d with only finite many non-zero components.

Lemma 6.1 (weakly dense set of (ℓ1)d). The subset O of (ℓ∞)d containing elements

with only finitely many non-zero components is a weakly dense subset of (ℓ∞)d.

Proof. It suffices to show that for any yyy ∈ (ℓ∞)d, there exists a sequence yyy(i) ⊆ O

which weakly converges to yyy. Let

y(i)k,j =

yk,j, if j < i,

0, otherwise,(6.1)

for all 0 ≤ k ≤ d−1 and i ≥ 1. Then yyy(i) ∈ O. Now we show that it weakly converges

to yyy.

Note that ((ℓ∞)d)∗ = (ℓ∞,s)d, where ℓ∞,s ≡ xxx = (x0, x1, · · · ) ∈ R∞ :

∑∞i=0 xi <

∞ is the set of all summable (but not necessarily absolutely summable) sequences.


For any xxx ∈ ((ℓ∞)d)∗ = (ℓ∞,s)d,

xxx(yyy(i)) =d−1∑k=0

∞∑j=0

xk,jy(i)k,j =

d−1∑k=0

i−1∑j=0

xk,jyk,j →d−1∑k=0

∞∑j=0

xk,jyk,j = xxx(yyy) as i→ ∞,

(6.2)

which indicates yyy(i) weakly converges to yyy, hence O is weakly dense in (ℓ∞)d.

Next, we describe the compact sets in (ℓ1)d space.

Lemma 6.2 (compact sets in (ℓ1)d space). A set K ⊆ (ℓ1)

d is compact if and only if

K is bounded and closed and for each ϵ > 0, there exists Jϵ such that for all yyy ∈ K,∑d−1k=0

∑∞j=Jϵ

|yk,j| < ϵ.

Proof. If K ⊆ (ℓ1)d is compact, then K must be closed and bounded. Further, it is

well known that a subset in metric space is compact if and only if it is complete and

totally bounded (see for example Theorem 45.1 of Munkres (2000)). Totally bounded

means that, for any ϵ > 0 given, we can find a finite set of yyy(i)Ni=1 ⊆ K such that

K is covered by the N ϵ-balls centered at those points. Because the set is finite, we

can find Jϵ large enough such that∑d−1

k=0

∑∞j=Jϵ

|y(i)k,j| < ϵ for all yyy(i) ∈ K. Then for

any yyy ∈ K, we can find yyy(i) such that ||yyy − yyy(i)||1,d < ϵ, so

d−1∑k=0

∞∑j=Jϵ

|yk,j| ≤d−1∑k=0

∞∑j=Jϵ

|yk,j − y(i)k,j|+

d−1∑k=0

∞∑j=Jϵ

|y(i)k,j|

≤d−1∑k=0

∞∑j=0

|yk,j − y(i)k,j|+

d−1∑k=0

∞∑j=Jϵ

|y(i)k,j|

≤ d||yyy − yyy(i)||1,d +d−1∑k=0

∞∑j=Jϵ

|y(i)k,j|

≤ dϵ+ ϵ. (6.3)


Conversely, assume K ⊆ (ℓ1)d is bounded and closed and for each ϵ > 0, there

exists Jϵ such that for all yyy ∈ K,∑d−1

k=0

∑∞j=Jϵ

|yk,j| < ϵ. Since (ℓ1)d is a Banach

space and K is closed, K is a complete subset. Assume K is bounded by C/d,

i.e. ||yyy||1,d ≤ C/d for all yyy ∈ K. For any ϵ > 0 fixed, let Jϵ be as above. Let

K0 ≡ xxx ∈ (RJϵ)d : ||xxx||1 ≤ C, which is bounded, thus totally bounded since it has

finite dimension. So there exists a finite set xxx(i)Ni=1 ⊆ (RJϵ)d such that the ϵ-balls

centered at those points cover K0. Assume yyy(i)Ni=1 ⊆ (ℓ1)d are the corresponding

points of xxx(i)Ni=1 when we naturally embed (RJϵ)d to (ℓ1)d, i.e. y(i)k,j = x

(i)k,j for j ≤ Jϵ

and 0 otherwise. For any yyy ∈ K, let xxx be the natural projection of yyy on (RJϵ)d.

Notice that ||xxx||1 ≤ d||yyy||1,d ≤ C, which means xxx ∈ K0, so there is xxx(i) such that

||xxx− xxx(i)||1 ≤ ϵ. Then,

||yyy − yyy(i)||1,d ≤d−1∑k=0

Jϵ−1∑j=0

|yk,j − y(i)k,j|+

d−1∑k=0

∞∑j=Jϵ

|yk,j − y(i)k,j|

≤ ||xxx− xxx(i)||1 +d−1∑k=0

∞∑j=Jϵ

|yk,j|

≤ ϵ+ ϵ, (6.4)

which implies that K is covered by the finite number of ϵ-balls in (ℓ1)d centered at

yyy(i)Ni=1. So K is totally bounded, thus compact.

Now we can give an easy-to-check sufficient condition for weak convergence of

random elements in (ℓ1)d, which can be used to establish (C1) when applying Theorem

4.3. For any m,C > 0, let

Km,C ≡ xxx ∈ (ℓ1)d : ||x||1,d ≤ C,

d−1∑k=0

∞∑j=mn

|xk,j| < n−1 for all n ≥ 1 (6.5)

which is obviously a compact set by Lemma 6.2.


Theorem 6.3 (criterion for convergence of random elements of (ℓ1)d)). Convergence

in distribution UUU (n) ⇒ UUU in (ℓ1)d as n→ ∞ holds if

(i) For all ϵ > 0, there exists m, C and corresponding Km,C, such that

P (UUU (n) ∈ Km,C) > 1− ϵ (6.6)

and

(ii) for all J , 0 ≤ J <∞,

(UUU(n)k,j : 0 ≤ k ≤ d− 1; 0 ≤ j ≤ J) ⇒ (UUUk,j : 0 ≤ k ≤ d− 1; 0 ≤ j ≤ J) in RdJ .

(6.7)

Condition (ii) holds if and only if (iii) for all J , 0 ≤ J < ∞, and for all sets of

real number ak,j : 1 ≤ k ≤ d− 1, 0 ≤ j ≤ J

d−1∑k=0

J∑J=0

ak,jUUU(n)k,j ⇒

d−1∑k=0

J∑j=0

ak,jUUUk,j. (6.8)

Proof. Condition (i) ensures that UUU (n) is tight. Condition (ii) is equivalent to

condition (iii) by the familiar Cramer-Wold device; see p. 382 of Billingsley (1995).

And condition (iii) mean yyy(UUU (n)) → yyy(UUU) in distribution for all yyy ∈ O, which by

Lemma 6.1 is a weakly dense set in ((ℓ1)d)∗. So that condition (i) with condition (iii)

ensures UUU (n) ⇒ UUU in (ℓ1)d.

6.2 Central Limit Theorem in ℓ1ℓ1ℓ1

In this section we specialize the CLT in general Banach spaces for i.i.d. random

elements to ℓ1. First, we emphasize that, even for i.i.d. random elements, the classical


CLT does not hold in all Banach spaces, but only for some “good” Banach spaces.

Fortunately, ℓ1 is such a “good” Banach space. To make this clear, we do a quick

review; more related theory can be found in Araújo and Giné (1980) and Ledoux and

Talagrand (1991).

We first introduce two concepts: cotype of a Banach space and pre-Gaussian

random variables. A separable Banach space B with norm || · || is said to be of cotype

q if there is a constant Cq such that for all finite sequences xi ∈ B,

(∑i

||xi||q)1/q ≤ Cq||∑i

ϵixi||, (6.9)

where ϵi is a Rademacher sequance, i.e., i.i.d. random variables with P (ϵi = 1) =

P (ϵi = −1) = 1/2. It is known that ℓ1 is of cotype 2; see page 274 in §9.2 of Ledoux

and Talagrand (1991).

We say that a random variable UUU in a Banach space B has a Gaussian distribution

if, for every h in the dual space B∗, h(UUU) has a one-dimensional Gaussian distribution.

A random variable UUU in B, with Eh(UUU) = 0 and Eh2(UUU) <∞ for every h in B∗ (i.e.

weakly centered and square integrable), is pre-Gaussian if its covariance is also the

covariance of a Gaussian Borel probability measure on B. A weakly centered and

square integrable random variable UUU = (U0, U1, · · · ) in ℓ1 is pre-Gaussian if and only

if∞∑k=0

(E|Uk|2)1/2 <∞; (6.10)

see page 261 of Ledoux and Talagrand (1991).

Theorem 6.4 (CLT for Banach-space random variables from Ledoux and Talagrand

(1991) Theorem 10.7). If UUU is pre-Gaussian with values in a separable cotype-2 Banach

space, then UUU satisfies the CLT, i.e., n−1/2∑n

i=1UUU(i) where UUU (i) are i.i.d. copies of


UUU , converges in distribution, where the limit is Gaussian.

We remark that the limit distribution must have a Gaussian distribution because

if UUU satisfies the CLT in B, then h(UUU) satisfies the ordinary CLT with a Gaussian

limit for h ∈ B∗. If we include (6.10), then we get the following corollary.

Corollary 6.5 (CLT for ℓ1-valued random variables). If UUU = (U0, U1, · · · ) ∈ ℓ1 is

pre-Gaussian, then UUU satisfies the CLT. The limit has a Gaussian distribution with

the same covariance structure as UUU .

We now discuss how this background theory is relevant here. We will be consid-

ering a discrete random variable Y taking values in the non-negative integers. Let

F cj ≡ P (Y ≥ j) for k = 0, 1, · · · be the ccdf of Y and Fj ≡ 1− F c

j . Let Y (i) be i.i.d.

random variables each distributed as Y and let

U(i)j ≡ IY (i)≥j − F c

j .

Then UUU (i) are i.i.d. random variables in ℓ1 distributed as UUU with Eh(UUU) = 0 for all

h ∈ ℓ∞ (i.e., for all h ∈ (ℓ1)∗). We want UUU to be pre-Gaussian, so we require that

(6.10) holds, i.e.,

∞∑k=0

(E(IY (1)≥j − F cj )

2)1/2 =∞∑k=0

(FjFcj )

1/2 <∞. (6.11)

Hence, a sufficient condition for (6.11) is

F cj ∼ O(j−(2+ϵ)) for some ϵ > 0, (6.12)

which is actually implied by condition (I3) in Theorem 5.1. Condition (6.12) is also

sufficient for Eh(UUU) = 0 and Eh2(UUU) < ∞ for every h in ℓ∞. Hence, UUU is pre-


Gaussian.

63

Part II

An Application of the Periodic

Little’s Law

CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 64

Chapter 7

Real-Time Predictor for the

Occupancy Level

In this chapter we propose a real-time procedure for predicting the hourly occupancy

level of an Emergency Department, given we have good predictions for daily arrival

totals. It is a forecast problem for a specific sample path. So instead of using the long-

run limit terms as in the PLL formula, we directly use the counts in our estimator.

The predictor is in the indirect estimation form of the occupancy level, inspired by

the PLL. A key observation is that predicting indirectly via the PLL can have a

better result than directly using the occupancy level history records to construct the

estimator. This application uses the same patient flow dataset in §1.3 as our applied

motivation of the PLL. Our goal is to predict the occupancy level of the Emergency

Department for the next few hours. For readers’ convenience, we will brief the dataset

and the problem settings in §7.1, then we introduce the real-time predictor based on

the PLL in §7.2.


7.1 The Data and the Problem

Armony et al. (2015) introduces a patient flow dataset of the Emergency Department

of Rambam Hospital located in Haifa, Isreal. The data is owned by the Service

Enterprise Engineering (SEE) Center at the Technion – Israel Institute of Technology.

They made the data publicly available so that researchers may conduct reproducible

studies based on it.

The data includes arrival and departure times of individual patients from January

2004 to October 2007, but it is also limited in that it does not contain a detailed

account of all the steps and processes that take place during a patient’s stay. In

Whitt and Zhang (2017) we proposed an aggregate time-varying stochastic queueing

model for the patient flow after doing a careful investigation to 25 weeks data. One key

observation is that both the arrival rates and the LoS distributions are periodic time-

varying instead of being constant. Then in Whitt and Zhang (2019b) we addressed

the prediction issue by involving the whole available data.

This chapter is based on section 5 of Whitt and Zhang (2019b), where we have

got good predictions for the daily arrival totals for the Emergency Department in the

previous sections of that paper by a complicated time-series model. We also have

estimations of the LoS distributions by using historical empirical LoS distributions.

We showed that in short term (within a few months), we can view the LoS distribution

for any given hour in a week as a constant. The idea is to forecast the future occupancy

level via arrival rates and the LoS distributions, which is exactly what the PLL enables

us to do. Moreover, we make it in real-time by using the most recent available data

up to the time epoch that we make predictions.

We divide the dataset into a training set and a test set. For simplicity, we use

the Lebanon war period as our split point because the war period is an outlier that


needs to discard, so the training set contains 923 days from Jan. 1, 2004 to July 11,

2006 and the test set has 443 days from Aug. 15, 2006 to Oct. 31, 2007. In Whitt

and Zhang (2019b), we use the training set to find the optimal model to forecast the

daily arrival totals, which turns out to be a seasonal autoregressive integrated moving

average with exogenous regressors model (SARIMAX), which incorporates the local

holiday indicators and weather variables. We use the test set to check our choice and

compare with other methods. We use mean squared error (MSE, by which we mean

the estimate) to measure the precision of our prediction, defined by

MSE =1

N

N∑i=1

(yi − yi)2, (7.1)

where N is the sample size, yi are the true values, and yi are the predicted values.

7.2 Real-Time Predictor for the Occupancy Level

We start by predicting the occupancy level for 1 hour ahead. Afterward, we show

that the method can be extended to more hours ahead, but losing predictive power

as the time interval increases. For this occupancy prediction, we exploit the elapsed

service times of the patients initially in the system. Thus, an important reference

point is the average LoS, which is about 4− 4.5 hours for our data. Clearly the value

of information about current patients will dissipate as the time interval increases to

4 hours and beyond.

7.2.1 Predicting Occupancy Level One Hour Ahead

Here we use the same discrete-time framework as we used in Whitt and Zhang (2017)

Whitt and Zhang (2019c) and Whitt and Zhang (2019a) since we assume both the


arrival rate and the LoS distribution are fixed within an hour. Let Yk,j denote the

number of patients that arrived in the kth discrete time period (in our case one time

period represents an hour) and had a LoS larger or equal to j hours, i.e., they left at

or after time period k + j. We assume all the arrivals occurred at the beginning of

each time period and departures occurred at the end of the time period and we count

the number of patients (occupancy level) in the middle of each time period. Under

this rule, the occupancy level in time period k is

Qk =∞∑j=0

Yk−j,j. (7.2)

Taking expectations of the above equation under proper assumptions gives us the

stochastic version of the PLL as in Theorem 3.1. Our goal is to approximate Qk,

assuming that we are given all the arrival and departure epochs for each patient that

occurred before time period k.

We built a stochastic model for the patients flow in Whitt and Zhang (2017),

which describes the arrival process and the time-varying LoS distributions quite well.

Given the daily arrival totals, hourly arrival rate curve and the LoS distributions for

each hour, we can easily calculate the occupancy level of the system. However, how

can we predict the occupancy level for the next hour given all the history up to now?

We propose a real-time predictor as follow.

To predict Qk, first we observed that we can use finite summation to approximate

the infinite sum, because in reality we can always view the LoS distributions as

bounded. Actually only 3.60% of all the patients had a LoS greater or equal to 13

hours. Hence we divided Qk into two parts and estimated them separately, letting

Qk =12∑j=0

Yk−j,j +∞∑

j=13

Yk−j,j ≡12∑j=0

Yk−j,j +Rk,13. (7.3)


Note that Yk,0 is the total number of arrivals in time period k. Since we showed

in Whitt and Zhang (2017) that the arrival process could be modeled as a non-

homogeneous Poisson process within a day given the daily arrival totals, given that

we have already got the predicted daily arrival totals, we only need to estimate the

arrival rate function within a day. We use the empirical hourly arrival rate curve

by combining the arrivals on the same day-of-week of the latest 10 weeks as our

estimated arrival rate for a day. We estimated the LoS distribution similarly by using

the empirical distribution of the same hour from the latest 10 weeks because the LoS

distributions also change slowly.

To conveniently express our estimator, we re-index our time period from k to

a three-element tuple (w, d, h), where w ∈ 0, 1, 2, · · · represents the week index,

d ∈ 0, 1, 2, 3, 4, 5, 6 is the day-of-week index and h ∈ 0, 1, 2, · · · , 23 is the hour

index. There is a one-to-one relation between these indices k and the tuples (w, d, h),

so that we will use them exchangeably in the following part of the thesis. We also

let (w, d, h) ± x denote add/minus x time periods (hours) to time period (w, d, h).

Let A(w,d) be the predicted daily arrival totals of week w, day-of-week d, and let

A(w,d) ≡∑23

h=0 Y(w,d,h),0 to be the true daily arrival totals. Then we let the estimator

for Y(w,d,h),0 be

Y(w,d,h),0 ≡ A(w,d) ∗λ(w,d,h)∑23h=0 λ(w,d,h)

= A(w,d) ∗∑10

i=1 Y(w−i,d,h),0∑10i=1A(w−i,d)

, (7.4)

where λ(w,d,h) ≡ (1/10)∑10

i=1 Y(w−i,d,h),0 is the estimated hourly arrival rate for time

period (w, d, h).

To estimate Yk−j,j for j = 1, 2, · · · , 12, note that we can observe Yk−j,j−1 for

j = 1, 2, · · · , 12 at time period k − 1, i.e. the number of patients that arrived at

time period k − j and still stayed in the system at time period k − 1. Let Fk be the


information filtration; i.e., Fk denotes all the observable arrival and LoS information

up to time k. Since we showed in Whitt and Zhang (2017) that the LoSs as i.i.d.

random variables given the arrival time, the conditional expectation of Yk−j,j can be

expressed as the product of Yk−j,j−1 and the corresponding survival probability of

each patient, which is

E(Yk−j,j|Fk−1) = Yk−j,j−1 ∗ Pk−j(W ≥ j|W ≥ j − 1) ≡ Yk−j,j−1 ∗ pk−j,j−1, (7.5)

where W is a random variable following the LoS distribution of customers that arrived

at time period k − j, and pk−j,j−1 is the probability of a patient that arrived at time

period k− j and did not leave the system up to time period (k− j) + (j− 1) = k− 1

will still be there at time period k. Again, we use the latest 10 weeks history data to

estimate that probability. We estimated p(w,d,h),j by

p(w,d,h),j ≡∑10

i=1 Y(w−i,d,h),j+1∑10i=1 Y(w−i,d,h),j

, j = 0, 1, · · · , 11. (7.6)

After combining these components, the estimator for Y(w,d,h)−j,j is

Y(w,d,h)−j,j = Y(w,d,h)−j,j−1 ∗ p(w,d,h)−j,j−1, j = 1, 2, · · · , 12. (7.7)

Finally, we need to estimate Rk,13, the number of patients in time period k that

had already been in the system for greater or equal to 13 hours. Instead of estimating

Rk,13 directly, we actually estimate rk,13 ≡ Rk,13/Qk = 1 − (∑12

j=0 Yk−j,j)/Qk by the

history data of the latest 10 weeks. Under the other indexing scheme, this can be

written as

r(w,d,h),13 ≡ 1−∑10

i=1

∑12j=0 Y(w−i,d,h)−j,j∑10

i=1Q(w−i,d,h)

. (7.8)


The reason we did like this is that we only used Yi,j and Qi for i ≤ k and j =

0, 1, · · · ,min12, k− i, which means we only need to keep a finite dimensional record

of the Y matrix. Of course the number we chose here (12) is somewhat arbitrary,

we could also keep record of Yk,j for j up to 11 or 13, say. However, in making this

choice, we note that smaller values will cause us to waste some information we have,

while larger values will likely produce inaccurate estimates of pk,j for large j’s.

Combining the estimation for each part in (7.4), (7.6), (7.7) and (7.8), we get the

estimator for Q(w,d,h) as

Q(w,d,h) ≡∑12

j=0 Y(w,d,h)−j,j

1− r(w,d,h),13

. (7.9)

We use the prediction results of daily arrival totals by SARIMAX(6, 1, 0)(0, 0, 2)7

as we introduced in §4.4 of Whitt and Zhang (2019b) for A(w,d) in (7.4). Since we

need 10 weeks data to estimate the arrival rates and the LoS distributions before we

start predicting the occupancy level, we make the hourly occupancy prediction from

12 p.m. Oct.25, 2006, which is 10 weeks after the start of test set, to 11 p.m. Oct.31,

2007, i.e. 8928 hours or equivalently 372 days.

The MSE of 1-hour-ahead real-time occupancy prediction is 14.65 (MAPE 10.59).

For comparison, we also apply two other simple prediction methods applied to the

same test period. Both are direct estimators which only use the occupancy level

history. The first alternative method is to directly use the current hour occupancy

level as the prediction for the next hour’s occupancy level (i.e., Q(w,d,h) = Q(w,d,h−1)),

and the MSE of this method is 23.04. The second method is to use the average

of 10 weeks’ observed occupancy levels at the same hour in a week before the one

we want to forecast as the predictor (i.e., Q(w,d,h) = 1/10∑10

i=1Q(w−i,d,h)). By this

second method, the MSE is 51.91. So we make a 30% improvement of the first

naive prediction method and even better to the second. Figure 7.1 plots part of the


prediction compared to the actual values, showing that the prediction curve is quite

close to the true curve.

2007-02-02 2007-02-03 2007-02-04 2007-02-05 2007-02-06 2007-02-07 2007-02-08 2007-02-09 2007-02-10 2007-02-11 2007-02-12time

10

20

30

40

50

60

70

occu

panc

y leve

l

truepredicted

Figure 7.1: Predicted occupancy level one hour ahead compared to the true occupancylevel.

7.2.2 Predicting Occupancy Level More Than One Hour Ahead

The real-time occupancy predictor for one hour ahead in the previous section can

easily be generalized to estimate the occupancy level two or more hours in the future.

However, as we predict farther into the future, we encounter two problems: (i) the

relevance of the elapsed times of current patients will decrease and (ii) the number of

estimators we must use will increase. For both reasons, we expect that the error will

increase. We take predicting 2 hours ahead as an example to show how to construct

the predictor; others can be done in the same way.

Recall that

Q(w,d,h) =12∑j=0

Y(w,d,h)−j,j +∞∑

j=13

Y(w,d,h)−j,j ≡12∑j=0

Y(w,d,h)−j,j +R(w,d,h),13


as in (7.3). Since we are predicting 2 hours ahead, we assume that we are now at

time (w, d, h)− 2. The estimator for Y(w,d,h)−0,0 is the same as in (7.4). However, we

can no longer apply (7.7) to estimate Y(w,d,h)−j,j because Y(w,d,h)−j,j−1 is not available

to us as we assume we are now at (w, d, h)−2. Analogous to (7.5), for j ≥ 2, we have

E(Y(w,d,h)−j,j|F(w,d,h)−2) = Y(w,d,h)−j,j−2 ∗ P(w,d,h)−j(W ≥ j|W ≥ j − 2)

≡ Y(w,d,h)−j,j−2 ∗ p(w,d,h)−j,j−2,2, (7.10)

where we use p(w,d,h),j,l to denote the probability that a patient arrived at time period

(w, d, h) will stay for at least another l hours given that the patient has been in the

system for j hours. Similar to (7.6), we estimate p(w,d,h),j,l by

p(w,d,h),j,l ≡∑10

i=1 Y(w−i,d,h),j+l∑10i=1 Y(w−i,d,h),j

. (7.11)

Hence, for Y(w,d,h)−j,j, j = 2, 3, · · · , 12, we use

Y(w,d,h)−j,j = Y(w,d,h)−j,j−2 ∗ pk−j,j−2,2, (7.12)

as the estimator. For Y(w,d,h)−1,1, since it is in the future, we use the product of the

predicted total arrivals and the corresponding survival probability,; i.e., we use (7.6)

for j = 1 and replace Y(w,d,h)−1,0 on the right hand side by Y(w,d,h)−1,0. Finally, we use

the same equation (7.9) to get the estimator for Q(w,h,d).

When making predictions for more than 1 hour ahead, we need to use predictions

for daily arrival totals for two or more days ahead. This can be seen from equation

(7.4). When we predict the hourly total arrivals, we need the forecast of daily arrival

totals for that day. However, we can make 1 day ahead daily volume prediction only

after 00:00 on that day. For example, assume that we are now at 23:00 on day k and


we want to predict the occupancy level for 00:00-01:00 on day k+1. According to our

real-time prediction procedure, we need to know both Ak and Ak+1. Note that here

Ak is the 1-day-ahead daily arrival total prediction while Ak+1 must be 2-days-ahead

daily arrival total prediction because we have not observed the arrivals in 23:00-24:00

on day k so that we don’t know Ak yet, and the last observed daily arrival total is

Ak−2.

Table 7.1 shows the MSE for predicting the occupancy level more than 1 hour

ahead for up to 6 hours. For comparison, Table 7.2 shows the corresponding MSE by

using rolling history occupancy average using n weeks history.

# of hours ahead 1 2 3 4 5 6MSE 14.65 25.21 35.05 66.33 113.59 160.16

Table 7.1: MSE for predicting the occupancy several hours ahead.

n 1 2 3 4 5MSE 82.24 64.29 57.43 55.74 54.92n 6 7 8 9 10

MSE 53.92 53.10 52.78 52.52 51.91

Table 7.2: MSE for predicting the occupancy by the rolling averages of n weeks.

Table 7.1 shows that beyond 4 hours ahead, the error is larger than the MSE 51.91

obtained by using rolling history occupancy average using n = 10 weeks history. For

another comparison, the MSE is only 58.83 if we predict the occupancy level by

the observed occupancy level two hour ago. Hence, we conclude that our real-time

occupancy predictor outperforms the rolling average predictor for forecasting from

1 to 3 hours in the future, but not longer. That is roughly what we expect, given

that the mean LoS for all patients is about 4 hours. Beyond that time interval, the

current state information will not provide much information. At the same time, we are

required to make too many estimations in the real-time estimation procedure, which


will make it perform worse than the rolling average estimator, which only removes

noise but does not use the recent system state.

75

Part III

Proofs

CHAPTER 8. PROOFS 76

Chapter 8

Proofs

We now provide all the postponed proofs.

8.1 Proofs for the Sample-Path Version of the Pe-

riodic Little’s Law

In this section all the limits are in the sense of almost sure convergence. Hence, we

can focus on one sample path where all the limits exist.

8.1.1 Proof of Lemma 2.1

We will show that under the Assumptions (A1), (A2) and (A3) in (2.2), we have

limn→∞

λk+ld(n), limn→∞

F ck+ld,j(n) and lim

n→∞Wk+ld(n) exist for all 0 ≤ k ≤ d − 1, l ≥ 0,


j ≥ 0, and

limn→∞

λk+ld(n) ≡ λk+ld = λk,

limn→∞

F ck+ld,j(n) ≡ F c

k+ld,j = F ck,j, and

limn→∞

Wk+ld(n) ≡ Wk+ld = Wk, (8.1)

where λk, F ck,j and Wk are the same constants as in (2.2).

By the definition of λk,

λk = limn→∞

λk(n) = limn→∞

1

n

n∑m=1

Ak+(m−1)d

= limn→∞

1

n(Ak +

n−1∑m=1

A(k+d)+(m−1)d) = limn→∞

n− 1

n

1

n− 1

n−1∑m=1

A(k+d)+(m−1)d

= λk+d. (8.2)

Next, by (2.1), we have Yk,j(n) = λk(n)Fck,j(n). By Assumptions (A1), (A2) and

(A3) in (2.2), we know that limn→∞

Yk,j(n) = λkFck,j exists for all 0 ≤ k ≤ d − 1 and

j ≥ 0. Using the same argument as for λk, we know that limn→∞

Yk+d,j(n) = limn→∞

Yk,j(n)

for all 0 ≤ k ≤ d− 1, j ≥ 0. Then,

F ck+d,j = lim

n→∞F ck+d,j(n) =

limn→∞

Yk+d,j(n)

limn→∞

λk+d(n)

=λkF

ck,j

λk= F c

k,j, for all 0 ≤ k ≤ d− 1, j ≥ 0. (8.3)

Similarly, for Wk+d we have

Wk+d =∞∑j=0

F ck+d,j =

∞∑j=0

F ck,j = Wk for all 0 ≤ k ≤ d− 1. (8.4)


By induction, we proved (8.1).

8.1.2 Proof of Theorems 2.2 and 2.3

The proof is done in two steps. In step 1, we show that limn→∞

Lk(n) =∑∞

j=0 λk−jFck−j,j

for k = 0, 1, · · · , d−1, and then in step 2, we show that L ≡ limn→∞

Qk(n) = limn→∞

Lk(n),

thus completing the proof of Theorem 2.2 and 2.3 together.

Step 1: Given a fixed ϵ > 0, by assumption (A1), there exists N1, such that for

any n > N1, we have sup0≤k≤d−1

|λk(n)− λk| < ϵ. Given assumption (A2), by the series

form of Scheffé’s lemma (p.215 of Billingsley (1995)), we know that assumption (A3)

is equivalent to

limn→∞

∞∑j=0

|F ck,j(n)− F c

k,j| = 0. (8.5)


So there exists N2, such that for any n > N2, we have sup0≤k≤d−1

∑∞j=0 |F c

k,j(n)−F ck,j| < ϵ.

Let N3 = maxN1, N2, then when n > N3,

|Lk(n)−∞∑j=0

λk−jFck−j,j|

= |k∑

i=0

λi(n)∞∑l=0

F ci,k−i+ld(n) +

d−1∑i=k+1

λi(n)∞∑l=1

F ci,k−i+ld(n)

−k∑

i=0

λi

∞∑l=0

F ci,k−i+ld −

d−1∑i=k+1

λi

∞∑l=1

F ci,k−i+ld|

= |k∑

j=0

λk−j(n)Fck−j,j(n) +

∞∑m=1

d∑j=1

λd−j(n)Fcd−j,(m−1)d+j+k(n)

−k∑

j=0


∞∑m=1

d∑j=1

λd−jFcd−j,(m−1)d+j+k|

≤ |k∑

j=0

λk−jFck−j,j(n) +

∞∑m=1

d∑j=1

λd−jFcd−j,(m−1)d+j+k(n)

−k∑

j=0


∞∑m=1

d∑j=1

λd−jFcd−j,(m−1)d+j+k|

+ ϵ( k∑

j=0

F ck−j,j(n) +

∞∑m=1

d∑j=1

F cd−j,(m−1)d+j+k(n)

)≤ ( max

0≤k≤d−1λk)

( k∑j=0

|F ck−j,j(n)− F c

k−j,j|

+∞∑

m=1

d∑j=1

|F cd−j,(m−1)d+j+k(n)− F c

d−j,(m−1)d+j+k|)

+ ϵ( k∑

j=0

F ck−j,j(n) +

∞∑m=1

d∑j=1

F cd−j,(m−1)d+j+k(n)

)≤ ( max

0≤k≤d−1λk)dϵ+ ϵd

(max

0≤k≤d−1Wk + ϵ

). (8.6)

Hence limn→∞

|Lk(n)−∑∞

j=0 λk−jFck−j,j| = 0 for all 0 ≤ k ≤ d− 1.


Step 2: Next, we show that Lk ≡ limn→∞

Qk(n) = limn→∞

Lk(n). To do so, actually we

will prove that

E(n) ≡d−1∑k=0

Ek(n) ≡d−1∑k=0

|Lk(n)− Qk(n)| → 0 as n→ ∞. (8.7)

We further divide this step into 2 substeps. In the first substep, we compute the

expression of E(n), and then in the second substep we show it goes to 0 as n goes to

infinity.

Step 2.1: By some transformation, we know that

Qk(n) ≡1

n

n∑m=1

Qk+(m−1)d =1

n

n∑m=1

k+(m−1)d∑j=0

Yk+(m−1)d−j,j

=

1

n

n∑m=1

k∑j=0

Yk−j+(m−1)d,j +1

n

n∑m=2

k+(m−1)d∑j=k+1

Yk−j+(m−1)d,j

=1

n

n∑m=1

k∑j=0

Yk−j+(m−1)d,j +1

n

n−1∑m=1

md∑j=1

Yd−j+(m−1)d,j+k

=1

n

n∑m=1

k∑j=0

Yk−j+(m−1)d,j +1

n

n−1∑m=1

d∑j=1

Yd−j+(m−1)d,j+k

+1

n

n−2∑m=1

md∑j=1

Yd−j+(m−1)d,j+k+d

=1

n

n∑m=1

k∑j=0

Yk−j+(m−1)d,j +1

n

n−1∑s=1

n−s∑m=1

d∑j=1

Yd−j+(m−1)d,j+k+(s−1)d, (8.8)


and

Lk(n) =k∑

j=0

Yk−j,j(n) +∞∑s=1

d∑j=1

Yd−j,j+k+(s−1)d(n)

=1

n

n∑m=1

k∑j=0

Yk−j+(m−1)d,j +1

n

∞∑s=1

n∑m=1

d∑j=1

Yd−j+(m−1)d,j+k+(s−1)d. (8.9)

We may now study the absolute difference between Lk(n) and Qk(n). Here

Ek(n) ≡ |Lk(n)− Qk(n)|

=1

n

n−1∑s=1

n∑m=n−s+1

d∑j=1

Yd−j+(m−1)d,j+k+(s−1)d

+1

n

∞∑s=n

n∑m=1

d∑j=1


=1

n

n∑m=1

d∑j=1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d, (8.10)

and summing over k = 0, 1, · · · , d− 1 further gives

E(n) ≡d−1∑k=0

Ek(n) =d−1∑k=0

1

n

n∑m=1

d∑j=1

∞∑s=n−m+1


=1

n

n∑m=1

d∑j=1

∞∑s=(n−m)d

Yd−j+(m−1)d,j+s. (8.11)

Step 2.2: Now it suffices to show that E(n) → 0 as n→ ∞. For that purpose,

Let N1, N2 and N3 be the same as in the beginning of the proof, depending on given


ϵ. Then, when n > N3, we have

|∞∑j=0

Yk,j(n)−∞∑j=0

λkFck,j| = |

∞∑j=0

λk(n)Fck,j(n)−

∞∑j=0

λkFck,j|

≤ |∞∑j=0

λkFck,j(n)−

∞∑j=0

λkFck,j|+ ϵ

∞∑j=0

F ck,j(n)

≤ λk

∞∑j=0

|F ck,j(n)− F c

k,j|+ ϵ (Wk + ϵ)

≤ λkϵ+ ϵ (Wk + ϵ) . (8.12)

Assumptions (A1) and (A2) indicate that limn→∞

Yk,j(n) = λkFck,j. Again, by Scheffé’s

lemma, (8.12) is equivalent to

limn→∞

∞∑j=0

|Yk,j(n)− λkFck,j| = 0. (8.13)

For any ϵ > 0, since∑∞

j=0 λkFck,j converges for each k = 0, 1, · · · , d− 1, we know

that there exists J , such that

d−1∑k=0

∞∑j=J

λkFck,j < ϵ.

Let N4 ≡ ⌈J/d⌉ where ⌈x⌉ means the smallest integer greater than x, then when

n ≥ N4, we have nd > J . By (8.13), there exists N5 such that when n > N5,

d−1∑k=0

∞∑j=0

|Yk,j(n)− λkFck,j| < ϵ.


And let N6 ≡ maxN4, N5, then when n ≥ N6,

d−1∑k=0

∞∑j=N6d

Yk,j(n) ≤ 2ϵ. (8.14)

And let N7 ≡ ⌈N6/ϵ⌉, then when n > N7, we have (n−N6)/n > 1− ϵ. Finally, when

n > maxN7, 2N6,

E(n) =1

n

n∑m=1

d∑j=1

∞∑s=(n−m)d

Yd−j+(m−1)d,j+s

=d∑

j=1

1

n

n−N6∑m=1

∞∑s=(n−m)d

Yd−j+(m−1)d,j+s

+d∑

j=1

1

n

n∑m=n−N6+1

∞∑s=(n−m)d

Yd−j+(m−1)d,j+s

≤d∑

j=1

1

n

n−N6∑m=1

∞∑s=N6d

Yd−j+(m−1)d,j+s +d∑

j=1

1

n

n∑m=n−N6+1

∞∑s=0

Yd−j+(m−1)d,s

≤d∑

j=1

1

n

n∑m=1

∞∑s=N6d

Yd−j+(m−1)d,j+s +d∑

j=1

1

n

n∑m=n−N6+1

∞∑s=0

Yd−j+(m−1)d,s

=d∑

j=1

∞∑s=N6d

Yd−j,j+s(n) +d∑

j=1

∞∑s=0

(Yd−j,s(n)−n−N6

nYd−j,s(n−N6))

≤ 2ϵ+ 2ϵ+ ϵ(d−1∑k=0

∞∑j=0

λkFck,j + ϵ). (8.15)

The inequality in the third line is just relaxing the index s, the next inequality

is relaxing the index m for the first term, and the last inequality is given by the

definition of N6 and N7, where the first term is bounded by 2ϵ by using (8.14), and


the second term is bounded by

d∑j=1

∞∑s=0

(Yd−j,s(n)−n−N6

nYd−j,s(n−N6))

≤d∑

j=1

∞∑s=0

(Yd−j,s(n)− (1− ϵ)Yd−j,s(n−N6))

=d∑

j=1

∞∑s=0

(Yd−j,s(n)− Yd−j,s(n−N6)) + ϵYd−j,s(n−N6)

≤2ϵ+ ϵ(d−1∑k=0

∞∑j=0

λkFck,j + ϵ). (8.16)

So we have proved that Lk ≡ limn→∞

Qk(n) = limn→∞

Lk(n), which completes the proof

of Theorem 2.2 and 2.3.

8.1.3 Proof of the Second Half of Corollary 2.4

Proof: Here we give the second half of the proof of Corollary 2.4, i.e., the explicit

bound in (2.9).

From equation (8.10), we know that

Ek(n) =1

n

n∑m=1

d∑j=1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d.

By the condition, we know that Yi,j = 0 for i ≥ 0 and j ≥ dmu. Hence, when n ≥ mu,

Ek(n) =1

n

d∑j=1

n∑m=n−mu

mu+1∑s=n−m+1


≤ 1

n

d∑j=1

n∑m=n−mu

mu+1∑s=n−m+1

Ad−j+(m−1)d ≤1

n

d∑j=1

n∑m=n−mu

mu+1∑s=n−m+1

λu

=1

nλud

(mu + 1)(mu + 2)

2≤ λud(mu + 2)2

2n. (8.17)


So we have proved Corollary 2.4.

8.2 Proofs for the Central-Limit-Theorem Version

of the Periodic Little’s Law

In this section we provide the proofs of theorems, corollaries that related to the CLT

version of the PLL. We use ⇒ for convergence in distribution.

8.2.1 Proof of Lemma 4.2

To prove fzzz(xxx(1),xxx(2), yyy) is continuous from Rd × Rd × (ℓ1)

d to Rd, it suffices to

show that fzzz,k(xxx(1),xxx(2), yyy) is continuous from Rd × Rd × (ℓ1)d to R, where xxx(i) =

(x(i)0 , x

(i)1 , · · · , x

(i)d−1) ∈ Rd for i = 1, 2, and yyy = (yyy0, yyy1, · · · , yyyd−1) ∈ (ℓ1)

d are variables

and zzz = (zzz0, zzz1, · · · , zzzd−1) ∈ (ℓ1)d is a constant. Further, it suffices to show that

hk(xxx,yyy) is continuous from Rd × (ℓ1)d to R, where xxx = (x0, x1, · · · , xd−1) ∈ Rd. For

convenience, we use the maximum norm in Rd, i.e. ||xxx||∞ ≡ max0≤k≤d−1

|xk|, which is an

equivalent norm to the usual Euclidean distance, and for the space Rd × Rd × (ℓ1)d

and Rd × (ℓ1)d, we use the metric induced by the norm ||(xxx(1),xxx(2), yyy)||Rd×Rd×(ℓ1)d ≡

||xxx(1)||∞+||xxx(2)||∞+||yyy||1,d and ||(xxx,yyy)||Rd×(ℓ1)d ≡ ||xxx||∞+||yyy||1,d respectively. Because

the notation of infinity norm is the same in Rd and ℓ∞, it should not cause confusion.

First we observe that hk can be written as a inner product of two projection

functions:

hk(xxx,yyy) = ΠΠΠ1k(xxx) ·ΠΠΠ2

k(yyy), (8.18)


where

ΠΠΠ1k(xxx) ≡ (xk, xk−1, · · · , x0, xd−1, xd−2, · · · , x0, xd−1, xd−2, · · · ) and

ΠΠΠ2k(yyy) ≡ (yk,0, yk−1,1, · · · , y0,k, yd−1,k+1, yd−2,k+1, · · · , y0,k+d, yd−1,k+d+1, · · · ). (8.19)

Note that ||ΠΠΠ1k(xxx)||∞ = ||xxx||∞ < ∞, and ||ΠΠΠ2

k(yyy)||1 ≤∑d−1

k=0 ||yyyk||1 < ∞. So obviously

ΠΠΠ1k(xxx) is continuous from (Rd, || · ||∞) to (ℓ∞, || · ||∞) and ΠΠΠ2

k(yyy) is continuous from

((ℓ1)d, || · ||1,d) to (ℓ1, || · ||1). We also have that ΠΠΠ1

k(xxx) − ΠΠΠ1k(xxx) = ΠΠΠ1

k(xxx − xxx) and

Π2kΠ2kΠ2k(yyy) −ΠΠΠ2

k(yyy) = ΠΠΠ2k(yyy − yyy) for xxx, xxx ∈ Rd and yyy, yyy ∈ (ℓ1)

d. For any δ > 0 and (xxx,yyy)

fixed, choose (xxx, yyy) such that

||(xxx,yyy)− (xxx, yyy)||Rd×(ℓ1)d = ||xxx− xxx||∞ +d−1∑k=0

||yyyk − yyyk||1 < δ,

then

|hk(xxx,yyy)− hk(xxx, yyy)| = |ΠΠΠ1k(xxx) ···ΠΠΠ2

k(yyy)−ΠΠΠ1k(xxx) ···ΠΠΠ2

k(yyy)|

≤ |ΠΠΠ1k(xxx) ···ΠΠΠ2


k(yyy)|+ |ΠΠΠ1k(xxx) ···ΠΠΠ2


k(yyy)|

≤ ||ΠΠΠ1k(xxx)||∞||ΠΠΠ2

k(yyy − yyy)||1 + ||ΠΠΠ1k(xxx− xxx)||∞||ΠΠΠ2

k(yyy)||1

≤ ||xxx||∞δ + δ

d−1∑k=0

||yyyk||1 ≤ ||xxx||∞δ + δ

d−1∑k=0

(||yyyk||1 + ||yyyk − yyyk||1)

≤ ||xxx||∞δ + δ

d−1∑k=0

||yyyk||1 + δ2 → 0 as δ → 0. (8.20)

Given that each function fzzz,k is continuous from (Rd×Rd×(ℓ1)d, ||·||Rd×Rd×(ℓ1)d) to

(R, | · |), we can conclude that fzzz is continuous from (Rd×Rd×(ℓ1)d, || · ||Rd×Rd×(ℓ1)d) to

(Rd, ||·||∞). Finally, note that we can write g(yyy) = f000(eee,000, yyy), where eee = (1, 1, · · · , 1) ∈

Rd, so that g is also continuous from ((ℓ1)d, || · ||1,d) to (Rd, || · ||∞) and the lemma is


proved.

8.2.2 Proof of Theorem 4.3

Condition (C1) implies that F cF cF c(n) ∈ (ℓ1)d as well as F cF cF c ∈ (ℓ1)

d, which indicates that

the limiting distributions all have finite means. Moreover, (C1) implies that

(λλλ(n), F cF cF c(n)) ⇒ (λλλ,F cF cF c) in Rd × (ℓ1)d. (8.21)

Note that WWW (n) = g(F cF cF c(n)) and WWW = g(F cF cF c), so by continuous mapping theorem,

(λλλ(n), WWW (n), F cF cF c(n)) ⇒ (λλλ,WWW,F cF cF c) in R2d × (ℓ1)d. (8.22)

By the definition of LLL(n) in (4.2) and (2.6), LLL(n) is a function of (λλλ(n), F cF cF c(n)),

i.e. LLL(n) = f000(λλλ(n),000, FcF cF c(n)) where f is defined in (4.12). By Lemma 4.2, f is a

continuous function, hence

(λλλ(n), WWW (n), LLL(n), F cF cF c(n)) ⇒ (λλλ,WWW,LLL,F cF cF c) in R3d × (ℓ1)d. (8.23)

For the CLT-scaled terms, notice that WWW (n) = g(F cF cF c(n)), so by the continuous

mapping theorem,

(λλλ(n), WWW (n), F cF cF c(n)) ⇒ (ΛΛΛ,ΩΩΩ,ΓΓΓ) in R2d × (ℓ1)d, (8.24)

where ΩΩΩ = g(ΓΓΓ).


Combining (8.23) and (8.24), by Theorem 3.9 of Billingsley (1999), we have

(λλλ(n), WWW (n), LLL(n), λλλ(n), WWW (n), F cF cF c(n), F cF cF c(n)) ⇒ (λλλ,WWW,LLL,ΛΛΛ,ΩΩΩ,F cF cF c,ΓΓΓ)

in R5d × (ℓ1)2d. (8.25)

Now we turn to LLL(n). Note that for k = 0, 1, · · · , d − 1, we can write the kth

component of LLL(n) as

√n(Lk(n)− Lk) =

√n( ∞∑

j=0

λ[k−j](n)Fc[k−j],j(n)−

∞∑j=0

λ[k−j]Fc[k−j],j

)=

√n

∞∑j=0

(λ[k−j](n)F

c[k−j],j(n)− λ[k−j](n)F

c[k−j],j

+ λ[k−j](n)Fc[k−j],j − λ[k−j]F

c[k−j],j

)= hk(λλλ(n), F

cF cF c(n)) + hk(λλλ(n), FcF cF c)

= fF cF cF c,k(λλλ(n), λλλ(n), FcF cF c(n)), (8.26)

so

LLL(n) = fF cF cF c

(λλλ(n), λλλ(n), F cF cF c(n)

). (8.27)

By Lemma 4.2, fF cF cF c is a continuous function. In addition, ΓΓΓ ∈ (ℓ1)d w.p.1, so we

can apply the continuous mapping theorem again to get

(λλλ(n), WWW (n), LLL(n), λλλ(n), WWW (n), LLL(n), F cF cF c(n), F cF cF c(n)) ⇒ (λλλ,WWW,LLL,ΛΛΛ,ΩΩΩ,ΥΥΥ,F cF cF c,ΓΓΓ)

in R6d × (ℓ1)2d, (8.28)

where ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ), which is what we want as in (4.14).


8.2.3 Proof of Proposition 4.4

If RRR(n) ⇒ 0, then both RRR(n) = LLL(n)−QQQ(n) → 000 and LLL(n)−QQQ(n) → 0 in probability.

By applying Theorem 3.1 of Billingsley (1999) based on (8.28), we have

(λλλ(n), WWW (n), QQQ(n), LLL(n), λλλ(n), WWW (n), QQQ(n), LLL(n), F cF cF c(n), F cF cF c(n), RRR(n))

⇒ (λλλ,WWW,LLL,LLL,ΛΛΛ,ΩΩΩ,ΥΥΥ,ΥΥΥ,F cF cF c,ΓΓΓ,000) in R9d × (ℓ1)2d. (8.29)

Now consider the departure processes δδδ(n) and δδδ(n). Note that

δk(n) =1

n

n∑m=1

Dk+(m−1)d =1

n

n∑m=1

(Qk+(m−1)d −Qk+1+(m−1)d + Ak+1+(m−1)d)

=

Qk(n)− Qk+1(n) + λk+1(n), 0 ≤ k < d− 1,

Qd−1(n)− Q0(n+ 1) + λ0(n+ 1) +1

nQ0 −

1

nA0, k = d− 1,

(8.30)

and

δk(n) =

√n(Qk(n)− Qk+1(n) + λk+1(n)− Lk + Lk+1 − λk+1)

√n(Qd−1(n)− Q0(n+ 1) + λ0(n+ 1)− Lk + Lk+1 − λk+1 +

1

nQ0 −

1

nA0)

=

Qk(n)− Qk+1 + λk+1(n), 0 ≤ k < d− 1,

Qd−1(n)− Q0 + λ0(n) +1√nQ0 −

1√nA0, k = d− 1.

(8.31)

Once again by continuous mapping theorem, we get the joint convergence (4.16) in

Proposition 4.4, where δδδ is given by (4.4) and ∆k = Υ[k] − Υ[k+1] + Λ[k+1], 0 ≤ k ≤

d− 1.


8.2.4 Proof of Corollary 4.5

To make the proof clear, we first establish two lemmas.

Lemma 8.1. Assume Xn ∼ N(µn, σ2n), n = 1, 2, · · · , and Xn → X almost surely as

n→ ∞. Then X ∼ N(µ, σ2) where µ = limn→∞

µn, σ2 = limn→∞

σ2n and Xn → X in L2 as

n→ ∞.

Proof. Let ϕXn(t) = EeitXn = exp(iµnt − 2−1σ2nt

2) be the characteristic function

of Xn. Since Xn → X almost surely, by dominated convergence theorem, we know

that limn→∞

EeitXn = EeitX for each t. So we must have limn→∞

µi = µ, limn→∞

σ2i = σ2

for some µ and σ ≥ 0. (Note that we cannot have limn→∞

µn = ∞ or limn→∞

σ2n = ∞,

which would contradict Lévy’s continuity theorem.) Then we know that EeitX =

exp(iµt− 2−1σ2t2), which shows that X ∼ N(µ, σ2).

Note that limn→∞

µn = µ and limn→∞

σ2n = σ2 also implies that sup

nEX4

n = supn(µ4

n +

6µ2nσ

2n+3σ4

n) <∞. So X2n is uniformly integrable. Hence Xn → X in L2 as n→ ∞.

Lemma 8.2. Assume NNN ∈ ℓ1 is a zero-mean normally distributed random variable

with Cov(NNN,NNN) = ΣN and aaa, bbb ∈ ℓ0 are constants. Let X ≡∑∞

i=1 aiNi = aaaT ···NNN and

Y ≡∑∞

i=1 biNi = bbbT ···NNN , then (X,Y ) is jointly zero-mean normal distributed with

Cov(X,X) = Var(X) =∞∑i=1

∞∑j=1

aiajΣNi,j <∞,

Cov(Y, Y ) = Var(Y ) =∞∑i=1

∞∑j=1

bibjΣNi,j <∞ and

Cov(X,Y ) = E(XY ) =∞∑i=1

∞∑j=1

aibjΣNi,j <∞.


Proof. By the definition of normal distribution in ℓ1 space, X and Y are normally

distributed random variables on R. We only need to show that they are also jointly

normally distributed, i.e. for any given α, β ∈ R, αX + βY is a zero-mean normal

distributed random variable.

Denote Xn =∑n

i=1 aiNi and Yn =∑n

i=1 biNi. Since X and Y are well defined

almost surely, we know that αXn + βYn → αX + βY almost surely. Note that

αXn + βYn is normal distributed with mean zero and variance∑n

i=1

∑nj=1(αai +

βbi)(αaj+βbj)ΣNi,j. And we can apply Lemma 8.1 and know that αX+βY has normal

distribution with mean zero and variance∑∞

i=1

∑∞j=1(αai + βbi)(αaj + βbj)Σ

Ni,j. So

(X,Y ) is jointly normally distributed.

Now we derive E(XY ). By Lemma 8.1 we know that Xn → X and Yn → Y

in L2 as well, so XnYn → XY in L1 and we have E(XY ) = limn→∞

E(XnYn) =

limn→∞

∑ni=1

∑nj=1 aibjΣ

Ni,j =

∑∞i=1

∑∞j=1 aibjΣ

Ni,j <∞. Because X and Y are zero-mean

normal distributed, we have Cov(X,Y ) = E(XY ), where Cov(X,X) and Cov(Y, Y )

are special cases with X = Y .

Now we go back to the proof of 4.5

Proof Lemma 8.2 can be easily generalized to NNN ∈ (ℓ1)d only with some tedious

steps. Since ΓΓΓ ∈ (ℓ1)d almost surely, ΩΩΩ = g(ΓΓΓ) and ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ) are well defined

a.s. In addition, with the Gaussian assumption, we may apply the generalized version

of Lemma 8.2 and know that (ΩΩΩ,ΥΥΥ) has a zero-mean jointly normal distribution with


each element being well defined a.s. and in L2, where

(Cov(ΩΩΩ,ΩΩΩ))k,l = Cov(∞∑j=0

Γk,j,

∞∑j=0

Γl,j) =∞∑i=1

∞∑j=1

ΣΓ:k,li,j ,

(Cov(ΥΥΥ,ΥΥΥ))k,l = Cov(fF cF cF c,k(λλλ,ΛΛΛ,ΓΓΓ), fF cF cF c,l(λλλ,ΛΛΛ,ΓΓΓ))

= Cov(∞∑j=0

λ[k−j]Γ[k−j],j +∞∑j=0

F c[k−j],jΛ[k−j],

∞∑j=0

λ[l−j]Γ[l−j],j +∞∑j=0

F c[l−j],jΛ[l−j])

=∞∑i=0

∞∑j=0

λ[k−i]λ[l−j]ΣΓ:[k−i],[l−j]i+1,j+1 +

∞∑i=0

∞∑j=0

λ[k−i]Fc[l−j],jΣ

Λ,Γ:[k−j][l−j]+1,i+1

+∞∑i=0

∞∑j=0

F c[k−i],iλ[l−j]Σ

Λ,Γ:[l−j][k−i]+1,j+1

+∞∑i=0

∑j=0

F c[k−i],iF

c[l−j],jΣ

Λ[k−i]+1,[l−j]+1,

(Cov(ΩΩΩ,ΥΥΥ))k,l = Cov(∞∑j=0

Γk,j,∞∑j=0

λ[l−j]Γ[l−j],j +∞∑j=0

F c[l−j,j]Λ[l−j])

=∞∑i=1

∞∑j=0

λ[l−j]ΣΓ:k,[l−j]i+1,j+1 +

∞∑i=1

∞∑j=0

F c[l−j],jΣ

Λ,Γ:k[l−j]+1,i+1.

8.2.5 Proof of Corollary 4.6

If the LoS is bounded by J , then F ck,j(n) = 0 for all 0 ≤ k ≤ d− 1 and n when j > J .

So if we have (C1), then F ck,j = 0 must holds for all k and j > J and F cF cF c(n) ∈ (ℓ1)

d

w.p. 1. Hence Γk,j = 0 for all k and j > J , and ΓΓΓ ∈ (ℓ1)d w.p. 1.

To see that RRR(n) ⇒ 0 holds, by (4.3), RRR(n) = LLL(n)− QQQ(n) =√n(LLL(n)− QQQ(n)),

so it suffices to show that√nE(n) → 0, where E(n) ≡ ||LLL(n)− QQQ(n)||1. We take M

such that (M − 1)d < J ≤Md. When n > M , because Yk,j = 0 for j > Md ≥ J , we


have

√nE(n) =

√n1

n

n∑m=1

d∑j=1

∞∑s=(n−m)d

Yd−j+(m−1)d,j+s

= n−1/2

n∑m=n−M+1

d∑j=1

Md∑s=(n−m)d

Yd−j+(m−1)d,j+s

≤d∑

j=1

n−1/2

n∑m=n−M+1

MdAd−j+(m−1)d

≤ n−1/2d2M2C → 0 as n→ ∞, (8.32)

where C is the upper bound for the number of arrivals within a discrete time period.

This establishes RRR(n) ⇒ 0.


To prove Theorem 5.1, we need a lemma to establish (C1).

Lemma 8.3. Suppose Wk, k = 0, 1, · · · , d−1, are non-negative integer-valued random

variables with F ck,j ≡ 1−Fk,j ≡ P (Wk ≥ j) ∼ O(j−(3+δ)), j ≥ 0, for some δ > 0. W (i)

k

are i.i.d. samples of Wk, denote III(i)k ≡ (IW

(i)k ≥0

, IW

(i)k ≥1

, · · · ), FFF ck ≡ (F c

k,0, Fck,1, · · · ),

and XXX(i)k ≡ III

(i)k − FFF c

k. Assume YYY (i) are i.i.d. non-negative integer-valued random

variables in Rd with EYYY (i) = µYµYµY ≡ (µY,0, µY,1, · · · , µY,d−1) > 000 and Var(YYY (i)) = ΣY ,

where all the W (i)k and YYY (j) are independent for k = 0, 1, · · · , d− 1, and i, j ≥ 1. Let

SSS(n) ≡ (S0(n), S1(n), · · · , Sd−1(n)) ≡n∑

i=1

YYY (i)

and

GGG(n) ≡ (GGG0(n),GGG1(n), · · · ,GGGd−1(n)) ≡ (

S0(n)∑i=1

XXX(i)0 ,

S1(n)∑i=1

XXX(i)1 , · · · ,

Sd−1(n)∑i=1

XXX(i)d−1).


We claim that

n−1/2(SSS(n)− nµYµYµY ,GGG(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × (ℓ1)d, (8.33)

where ΓΓΓ = (ΓΓΓ0,ΓΓΓ1, · · · ,ΓΓΓd−1) and (ΛΛΛ,ΓΓΓ) has a zero-mean Gaussian distribution in

Rd × (ℓ1)d with ΛΛΛ ∼ N(000,ΣY ), Cov(Γk,j,Γk,s) = µY,kFk,jF

ck,s for 0 ≤ k ≤ d − 1 and

0 ≤ j ≤ s, and ΛΛΛ,ΓΓΓ0,ΓΓΓ1, · · · ,ΓΓΓd−1 are independent.

Proof. The classical multivariate CLT implies that

n−1/2(SSS(n)− nµYµYµY ) ⇒ ΛΛΛ ∼ N(000,ΣY ). (8.34)

Let ZZZk ∈ ℓ1 for 0 ≤ k ≤ d − 1 be zero-mean Gaussian distributed random variables

with Cov(Zk,j, Zk,l) = Fk,jFck,l. By Theorem 1.1 of Gut (2009),

(Sk(n))−1/2

Sk(n)∑i=1

XXX(i)k ⇒ ZZZk in ℓ1 for 0 ≤ k ≤ d− 1, (8.35)

then after applying Slutsky’s theorem, we have

n−1/2

Sk(n)∑i=1

XXX(i)k ⇒ ΓΓΓk = µ

1/2Y,kZZZk in ℓ1 for 0 ≤ k ≤ d− 1, (8.36)

so that ΓΓΓk has a zero-mean Gaussian distribution with Cov(Γk,j,Γk,s) = µY,kFk,jFck,s

for j ≤ s. Unfortunately we cannot get the joint convergence directly since they are

not independent of each other, however, we can use independent copies of YYY (i) as a

bridge to prove it.

Assume that YYY (i) are i.i.d. copies of YYY (i) and are also independent of other


variables. Let

SSS(n) ≡ (S0(n), S1(n), · · · , Sd−1(n)) ≡n∑

i=1

YYY(i).

Because n−1/2(SSS(n)−nµYµYµY ) is independent of n−1/2GGG0(n) ≡ n−1/2∑S0(n)

i=1 XXX(i)0 , we can

apply Theorem 11.4.4 of Whitt (2002) to obtain

n−1/2(SSS(n)− nµYµYµY , GGG0(n)) ⇒ (ΛΛΛ,ΓΓΓ0) in Rd × ℓ1, (8.37)

where we can make ΛΛΛ be independent of ΓΓΓ0. For what we want, we need to show that

n−1/2||GGG0(n) − GGG0(n)||1 → 0 in probability as n → ∞ and then apply Theorem 3.1

from Billingsley (1999). For any ϵ > 0,

P (n−1/2||GGG0(n)− GGG0(n)||1 > ϵ) = P (||S0(n)∑i=1

XXX(i)0 −

S0(n)∑i=1

XXX(i)0 ||1 > n1/2ϵ)

≤P (||S0(n)∑i=1

XXX(i)0 −

S0(n)∑i=1

XXX(i)0 ||1 > n1/2ϵ, |S0(n)− S0(n)| ≤ n3/4)

+ P (|S0(n)− S0(n)| > n3/4)

≤2P ( maxI=1,2,··· ,⌊n3/4⌋

||I∑

i=1

XXX(i)0 ||1 > n1/2ϵ) + P (|S0(n)− S0(n)| > n3/4)

=2P ( maxI=1,2,··· ,⌊n3/4⌋

∞∑j=0

|I∑

i=1

X(i)0,j| > n1/2ϵ) + P (|S0(n)− S0(n)| > n3/4)

≤2P (∞∑j=0

maxI=1,2,··· ,⌊n3/4⌋

|I∑

i=1

X(i)0,j| > n1/2ϵ) + P (|S0(n)− S0(n)| > n3/4). (8.38)

For the first part, let δ1 ≤ δ/2 and C ≡ (∑∞

j=0 j−(1+δ1))−1 be constants, note that


Var(X(i)0,j) = F0,jF

c0,j, so

P (∞∑j=0

maxI=1,2,··· ,⌊n3/4⌋

|I∑

i=1

X(i)0,j| > n1/2ϵ)

≤∞∑j=0

P ( maxI=1,2,··· ,⌊n3/4⌋

|I∑

i=1

X(i)0,j| > Cn1/2ϵj−(1+δ1))

≤3∞∑j=0

maxI=1,2,··· ,⌊n3/4⌋

P (|I∑

i=1

X(i)0,j| > Cn1/2ϵj−(1+δ1)/3)

≤27∞∑j=0

maxI=1,2,··· ,⌊n3/4⌋

Var(∑I

i=1X(i)0,j)

C2nϵ2j−(2+2δ1)

≤27∞∑j=0

C−2n−1ϵ−2j2+2δ1⌊n3/4⌋F0,jFc0,j

≤27C−2ϵ−2n−1⌊n3/4⌋∞∑j=0

j2+2δ1F c0,j → 0 as n→ ∞, (8.39)

because as a consequence of F ck,j ∼ O(j−(3+δ)),

∑∞j=0 j

2+2δ1F c0,j ≤ ∞, and the second

and third inequalities follow from Etemadi’s inequality (see page 256 of Billingsley

(1999)) and Chebyshev’s inequality respectively.

As for the second part, again using Chebyshev’s inequality, we have

P (|S0(n)− S0(n)| > n3/4) ≤ Var(S0(n)− S0(n))

n3/2≤

2nΣY1,1

n3/2→ 0 as n→ ∞.

(8.40)

So (8.38) goes to 0 as n→ ∞, hence

n−1/2(SSS(n)− nµYµYµY ,GGG0(n)) ⇒ (ΛΛΛ,ΓΓΓ0) in Rd × ℓ1. (8.41)

Using the same argument (making new i.i.d. copies of YYY (i)), we can add

GGG1(n),GGG2(n), · · · ,GGGd−1(n) one by one and in the end get (8.33).

Now we go back the proof of Theorem 5.1


Proof. Take λλλ = E(A0+(m−1)d, A1+(m−1)d, · · · , Ad−1+(m−1)d) and F cF cF c to be the com-

plementary distribution functions of the LoS. Let WWW and LLL be as in (C2), in which

case WWW is the vector of mean LoS for a period.

We can use Lemma 8.3 to establish (C1) by letting YYY (i) be the AAAi and W(i)k be

the LoS of ith customer that arrived at discrete time period k + (m − 1)d for all

m = 1, 2, · · · . (So EW(i)k = Wk and σ2

W,k ≡ Var(W (i)k ) < ∞.) The conclusion of

Lemma 8.3 is exactly (C1).

Then we need to establish RRR(n) ⇒ 0. Note that (I1) and (I2) imply that the

SLLN holds, i.e. equation (2.2) hold, so that we have Theorem 2.2. Since RRR(n) =

LLL(n) − QQQ(n) = n1/2(LLL(n) − QQQ(n)), it suffices to show that for each 0 ≤ k ≤ d − 1,

n1/2(Lk(n)− Qk(n)) → 0 in probability as n→ ∞.

From the proof of Theorem 2.2 in Whitt and Zhang (2019c), we know that

Lk(n)− Qk(n) = n−1

n∑m=1

d∑j=1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d.

So for any ϵ > 0,

P (n1/2(Lk(n)− Qk(n)) > ϵ)

=P (n−1/2

n∑m=1

d∑j=1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d > ϵ)

≤d∑

j=1

P (n∑

m=1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)

≤d∑

j=1

(P (

n−⌊n1/4⌋∑m=1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)

+ P (n∑

m=n−⌊n1/4⌋+1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)). (8.42)


Then we only need to prove that each of the two probabilities go to zero as n→ ∞.

For the first part, we have

P (

n−⌊n1/4⌋∑m=1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)

≤P (n−⌊n1/4⌋∑

m=1

∞∑s=⌊n1/4⌋+1

Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)

≤P (n∑

m=1

∞∑s=⌊n1/4⌋+1

Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)

≤∑n

m=1

∑∞s=⌊n1/4⌋+1EYd−j+(m−1)d,j+k+(s−1)d

n1/2d−1ϵ

≤n1/2dϵ−1

∞∑s=⌊n1/4⌋+1

λd−jFcd−j,j+k+(s−1)d

≤n1/2dλd−jϵ−1

∞∑s=⌊n1/4⌋+1

F cd−j,s ≤ n1/2Cdλd−jϵ

−1

∫ ∞

⌊n1/4⌋s−(3+δ)ds

=n1/2Cdλd−j(2 + δ)−1ϵ−1⌊n1/4⌋−(2+δ) → 0 as n→ ∞. (8.43)

For the second part, let Sk(n) =∑n

m=1Ak+(m−1)d. Then

P (n∑

m=n−⌊n1/4⌋+1

∞∑s=n−m+1

Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)

≤P (n∑

m=n−⌊n1/4⌋+1

∞∑s=0

Yd−j+(m−1)d,s > n1/2d−1ϵ)

=P (

Sd−j(⌊n1/4⌋)∑i=1

W(i)d−j > n1/2d−1ϵ)

≤Var(

∑Sd−j(⌊n1/4⌋)i=1 W

(i)d−j) + (E(

∑Sd−j(⌊n1/4⌋)i=1 W

(i)d−j))

2

nd−2ϵ2

=n−1d2ϵ−2(⌊n1/4⌋ΣΛd−j+1,d−j+1Wd−j + ⌊n1/4⌋λd−jσ

2W,d−j + ⌊n1/4⌋2λ2d−jW

2d−j)

→ 0 as n→ ∞. (8.44)


Hence, we have proved that n1/2(Lk(n) − Qk(n)) → 0 in probability as n → ∞,

i.e., RRR(n) ⇒ 0. Since we have established all three conditions in Theorem 4.3 and

Proposition 4.4, so their conclusions follow. As a special case of Corollary 4.5, the

covariance matrix of (ΩΩΩ,ΥΥΥ) has the same form as in (4.19) with ΣΛ:k,l = 0 for k = l,

ΣΛ,Γ:k = 0 for all k and ΣΓ:k,ki+1,j+1 = λkFk,iF

ck,j for 0 ≤ i ≤ j.


As we discussed before Theorem 5.2, we will first use a CLT for stationary processes

with mixing conditions to YYY n : n ∈ N (Step 1). Such type of CLT was established

by Ibragimov; see Theorem 18.5.3 in Ibragimov and Linnik (1971) or Theorem 0 in

Bradley (1985). We apply it in RdJ by utilizing Cramér-Wold device; see Theorem

29.4 in Billingsley (1995). Then, in Step 2, we show that (C1) holds. Finally, in Step

3, we show RRR(n) ⇒ 000 to complete the proof.

Step 1: Firstly, by the definition of Yi,j and Ai, we observe that E||AAA0||2+δ∞ < ∞

in (S2) implies that E|Yk+nd,j|2+δ < ∞ for all n ≥ 0, 0 ≤ k ≤ d − 1 and 0 ≤ j ≤

J − 1. Let F ck,j ≡ EYk,j

EYk,0=

EYk,jλk

for 0 ≤ k ≤ d − 1 and 0 ≤ j ≤ J − 1, where

λλλ = (λ0, λ1, · · · , λd−1) is in (S2). Now the FFF we defined in (4.4) can also be reduced

to finite dimensional space Rd×J , i.e.

F cF cF c =

F0,0 F0,1 · · · F0,J−1

F1,0 F1,1 · · · F1,J−1

... ... ... ...

Fd−1,0 Fd−1,1 · · · Fd−1,J−1

. (8.45)

We want to show that YYY n : n ∈ N satisfies a CLT. By Cramér-Wold device, we

only need to show that∑d−1

k=0

∑J−1j=0 θk,jYk+nd,j satisfies the corresponding CLT for


each θθθ = (θk,j)d×J ∈ Rd×J . Let

Zn(θθθ) ≡d−1∑k=0

J−1∑j=0

θk,j(Yk+nd,j − λkFck,j), n ∈ N, (8.46)

be the centralized strictly stationary process, and we need to show that it satis-

fies Theorem 0 in Bradley (1985), i.e. for some δ > 0, E|Zn(θθθ)|2+δ < ∞ and∑∞n=1 αZθθθ

(n)δ/(2+δ), where αZθθθ(n) is the strong mixing coefficient for Zn like we

introduced in (5.6). We take the δ as in (S2). Note that

E|Zn(θθθ)|2+δ = E|d−1∑k=0

J−1∑j=0

θk,j(Yk+nd,j − λkFck,j)|2+δ

≤d−1∑k=0

J−1∑j=0

|θk,j|2+δE|Yk+nd,j − λkFck,j|2+δ

≤d−1∑k=0

J−1∑j=0

|θk,j|2+δ((E|Yk+nd,j|2+δ)1/(2+δ) + λkFck,j)

2+δ <∞, (8.47)

where the first inequality uses the linearity of expectation and the second is by

Minkowski’s inequality. Since Zn(θθθ) is a linear combination of the elements of YYY n, the

corresponding sigma-algebra generated by it is smaller than the one generated by YYY n,

so that by the definition of strong mixing coefficient we know that αZθθθ(n) ≤ α(n).

Hence, given∑∞

n=1 α(n)δ/(2+δ) <∞ as in (S2), we have

∑∞n=1 αZθθθ

(n)δ/(2+δ) <∞. By

Theorem 0 in Bradley (1985), we know that σ2θθθ = EZ0(θθθ)

2 + 2∑∞

i=1E(Z0(θθθ)Zi(θθθ))

exists with the sum being absolutely convergent and n−1/2∑n−1

i=0 Zi(θθθ) ⇒ N(0, σ2θθθ).

Let

YYY (n) ≡√n(YYY (n)− diag(λλλ)F cF cF c) ≡

√n(

1

n

n−1∑i=0

YYY i − diag(λλλ)F cF cF c), (8.48)


where diag(λλλ) ∈ Rd×d is the diagonal matrix with λλλ on the diagonal. By Cramér-Wold

device, we know that

YYY (n) ⇒ ψψψ ≡

ψ0,0 ψ0,1 · · · ψ0,J−1

ψ1,0 ψ1,1 · · · ψ1,J−1

... ... ... ...

ψd−1,0 ψd−1,1 · · · ψd−1,J−1

as n→ ∞, (8.49)

where ψψψ is normal distributed with mean 0 and covariance

Cov(ψk,j, ψl,s) =E((Yk,j − λk, Fck,j)(Yl,s − λlF

cl,s))

+ 2∞∑i=1

E((yk,j − λkFck,j)(Yl+id,j − λlF

cl,s)). (8.50)

Step 2: Now we show that (8.49) implies conditon (C1) holds. Actually it suffices

to show that (λλλ(n), F cF cF c(n)) is actually a continuous function of YYY (n). It is trivial to

see that λλλ(n) = YYY (n)eee1, where eee1 = (1, 0, 0, · · · , 0)T ∈ RJ is a continuous map from

Rd×J to Rd. For F cF cF c(n) =√n(F cF cF c(n)−F cF cF c), note that it is, by (S3), a finite dimensional

matrix, so it suffices to show that each element of it is a continuous function of YYY (n).

To see it, observe that

F ck,j(n) =

√n(F c

k,j(n)− F ck,j) =

√n(Yk,j(n)

Yk,0(n)− F c

k,j)

=√n(Yk,j(n)− λkF

ck,j)− F c

k,j(Yk,0(n)− λk)

Yk,0(n)

=Yk,j(n)− F c

k,jYk,0(n)

Yk,0(n). (8.51)


In addition, (8.49) implies that Yk,0(n) → λk in probability, so by continuous mapping

theorem, Slutsky’s theorem and Cramér-Wold device, we know that

(λλλ(n), F cF cF c(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × Rd×J , (8.52)

where ΛΛΛ = (ψ0,0, ψ1,0, · · · , ψd−1,0) and Γk,j =ψk,j − F c

k,jψk,0

λk, 0 ≤ k ≤ d− 1, 0 ≤ j ≤

J − 1.

Step 3: Finally, since the LoSs are bounded, we only need to show RRR(n) ⇒ 000

to establish Theorem 4.3 and Proposition 4.4. Moreover, if (ΛΛΛ,ΓΓΓ) has a Gaussian

distribution, Theorem 4.5 holds as well.

Analogously to the proof of Corollary 4.6, it suffices to show that√nE(n) → 0 in

probability. To see that, for any ϵ > 0, take the same M as in the proof of Corollary

4.6, and based on (8.32), we have

P (√nE(n) > ϵ) ≤ P (n−1/2Md

d∑j=1

n∑m=n−M+1

Ad−j+(m−1)d > ϵ)

≤ P (n−1/2Md2n∑

m=n−M+1

||AAA(m−1)||∞ > ϵ)

≤ M2d2E||AAA0||∞ϵn1/2

→ 0 as n→ ∞, (8.53)

where in the last inequality we apply Markov inequality and exploit the stationarity of

AAAn. So we know RRR(n) ⇒ 000 and together with Step 2, we have proved the theorem.

BIBLIOGRAPHY 103

Bibliography

Araújo, A. and Giné, E. (1980). The Central Limit Theorem for Real and BanachValued Random Variables. Wiley series in probability and mathematical statistics.John Wiley & Sons.

Armony, M., Israelit, S., Mandelbaum, A., Marmor, Y. N., Tseytlin, Y., and Yom-Tov, G. B. (2015). On patient flow in hospitals: A data-based queueing-scienceperspective. Stochastic Systems, 5(1):146–194.

Baccelli, F. and Brémaud, P. (1994). Elements of Queueing Theory: Palm-MartingaleCalculus and Stochastic Recurrences, volume 26 of Applications of mathematics.Springer-Verlag.

Bertsimas, D. and Mourtzinou, G. (1997). Transient laws of non-stationary queueingsystems and their applications. Queueing Systems, 25(1–4):115–155.

Billingsley, P. (1995). Probability and Measure, volume 245 of Wiley Series in Prob-ability and Statistics. Wiley, 3rd edition.

Billingsley, P. (1999). Convergence of Probability Measures. Wiley Series in Proba-bility and Statistics. Wiley-Interscience, 2nd edition.

Bradley, R. C. (1985). On the central limit question under absolute regularity. TheAnnals of Probability, 13(4):1314–1325.

Breiman, L. (1968). Probability, volume 7 of Classics in Applied Mathematics. SIAM,1st edition.

Brockmeyer, E., Halstrøm, H. L., and Jensen, A. (1948). The Life and Works of A.k.Erlang. Akademiet for de Tekniske Videnskaber.

Brumelle, S. L. (1971). On the relation between customer and time averages in queues.Journal of Applied Probability, 8(3):508–520.

BIBLIOGRAPHY 104

Brumelle, S. L. (1972). A generalization of L = λW to moments of queue length andwaiting times. Operations Research, 20(6):1127–1136.

Bruneel, H. and Kim, B. G. (1993). Discrete-Time Models for Communication Sys-tems Including ATM, volume 205 of The Springer International Series in Engi-neering and Computer Science. Springer Science+Business Media, New York, 1stedition.

Calegari, R., Fogliatto, F. S., Lucini, F. R., Neyeloff, J., Kuchenbecker, R. S., andSchaan, B. D. (2016). Forecasting daily volume and acuity of patients in theemergency department. Computational and Mathematical Methods in Medicine,2016(2):1–8.

Cobham, A. (1954). Priority assignment in waiting line problems. Journal of theOperations Research Society of America, 2(1):70–76.

de Bruin, A. M., van Rossum, A. C., Visser, M. C., and Koole, G. M. (2007). Modelingthe emergency cardiac in-patient flow: an application of queuing theory. HealthCare Management Science, 10(2):125–137.

Doob, J. L. (1953). Stochastic Processes. John Wiley & Sons.

Eilon, S. (1969). A simpler proof of L = λW . Operations Research, 17(5):915–917.

El-Taha, M. and Stidham Jr., S. (1999). Sample-Path Analysis of Queueing Systems,volume 11 of International Series in Operations Research & Management Science.Springer Science+Business Media, New York, 1st edition.

Fiems, D. and Bruneel, H. (2002). A note on the discretization of Little’s result.Operations Research Letters, 30(1):17–18.

Fralix, B. H. and Riaño, G. (2010). A new look at transient versions of Little’s law, andM/G/1 preemptive last-come-first-served queues. Journal of Applied Probability,47(2):459–473.

Glynn, P. W. and Whitt, W. (1986). A central-limit-theorem version of L = λW .Queueing Systems, 1(2):191–215. (See Correction Note on L = λW , QueueingSystems, 12 (4), 1992, 431-432. The results are correct; minor but important changeneeded in proofs.).

Glynn, P. W. and Whitt, W. (1987). Sufficient conditions for functional limit theoremversions of L = λW . Queueing Systems, 1(3):279–287.

BIBLIOGRAPHY 105

Glynn, P. W. and Whitt, W. (1988a). An lil version of L = λW . Mathematics ofOperations Research, 13(4):693–710.

Glynn, P. W. and Whitt, W. (1988b). Ordinary CLT and WLLN versions of L = λW .Mathematics of Operations Research, 13(4):674–692.

Glynn, P. W. and Whitt, W. (1989). Indirect estimation via L = λW . OperationsResearch, 37(1):82–103.

Gross, D., Shortle, J. F., Thompson, J. M., and Harris, C. M. (2018). Foundamentalsof Queueing Theory. Wiley Series in Probability and Statistics. John Wiley & Sons,5th edition.

Gut, A. (2009). Stopped Random Walks: Limit Theorems and Applications. SpringerSeries in Operations Research and Financial Engineering. Springer, 2 edition.

Heyman, D. P. and Stidham, Jr., S. (1980). The relation between customer and timeaverages in queues. Operations Research, 28(4):983–994.

Hoot, N. R., LeBlanc, L. J., Jones, I., Levin, S. R., Zhou, C., Gadd, C. S., andAronsky, D. (2008). Forecasting emergency department crowding: a discrete eventsimulation. Annals of Emergency Medicine, 52(2):116–125.

Hung, G. R., Whitehouse, S. R., O’Neill, C. B., Gray, A. P., and Kissoon, N. (2007).Computer modeling of patient flow in a pediatric emergency department usingdiscrete event simulation. Pediatric Emergency Care, 23(1):5–10.

Ibragimov, I. A. and Linnik, I. V. (1971). Independent and Stationary Sequences ofRandom Variables. Groningen: Wolters-Noordhoff.

Ibrahim, R. and Whitt, W. (2011a). Real-time delay estimation based on delayhistory in many-server service systems with time-varying arrivals. Production andOperations Management, 20(5):654–667.

Ibrahim, R. and Whitt, W. (2011b). Wait-time predictors for customer service sys-tems with time-varying demand and capacity. Operations research, 59(5):1106–1118.

Jewell, W. S. (1967). A simple proof of: L = λW . Operations Research, 16(6):1109–1116.

BIBLIOGRAPHY 106

Jones, S. A., Joy, M. P., and Pearson, J. (2002). Forecasting demand of emergencycare. Health Care Management Science, 5(4):297–305.

Jones, S. S., Thomas, A., Scott, E. R., Welch, S. J., Haug, P. J., and Snow, G. L.(2008). Forecasting daily patient volumes in the emergency department. AcademicEmergency Medicine, 15(2):159–170.

Kim, S.-H. and Whitt, W. (2013). Statistical analysis with Little’s law. OperationsResearch, 61(4):1030–1045.

Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces: Isoperimetryand Processes. Classics in Mathematics. Springer-Verlag Berlin Heidelberg, 1stedition.

Little, J. D. and Graves, S. C. (2008). Little’s law. In Chhajed, D. and Lowe,T. J., editors, Building Intuition: Insights From Basic Operations ManagementModels and Principles, volume 115 of International Series in Operations Research& Management Science, chapter 5, pages 81–100. Springer.

Little, J. D. C. (1961). A proof of the queueing formula: L = λW . OperationsResearch, 9(3):383–387.

Little, J. D. C. (2011). Little’s law as viewed on its 50th anniversary. OperationsResearch, 59(3):536–549.

Marcilio, I., Hajat, S., and Gouveia, N. (2013). Forecasting daily emergency depart-ment visits using calendar variables and ambient temperature readings. AcademicEmergency Medicine, 20(8):769–777.

Medeiros, D. J., Swenson, E., and DeFlitch, C. (2008). Improving patient flow ina hospital emergency department. In 2008 Winter Simulation Conference, pages1526–1531. IEEE.

Megginson, R. E. (1998). An Introduction to Banach Space Theory, volume 183 ofGraduate Texts in Mathematics. Springer-Verlag New York, 1st edition.

Morse, P. M. (1958). Queues, Inventories and Maintenance: The Analysis of Oper-ational Systems with Variable Demand and Supply. John Wiley & Sons.

Munkres, J. R. (2000). Topology. Featured Titles for Topology Series. Prentice Hall,2nd edition.

BIBLIOGRAPHY 107

Rolski, T. (1989). Relationships between characteristics in periodic Poisson queues.Queueing Systems, 4(1):17–26.

Rudin, W. (1976). Principles of Mathematical Analysis. International Series in Pureand Applied Mathematics. McGraw-Hill, 3rd edition.

Schweigler, L. M., Desmond, J. S., McCarthy, M. L., Bukowski, K. J., Ionides, E. L.,and Younger, J. G. (2009). Forecasting models of emergency department crowding.Academic Emergency Medicine, 16(4):301–308.

Sigman, K. (1995). Stationary Marked Point Processes: An Intuitive Approach, vol-ume 2 of Stochastic Modeling Series. Chapman & Hall/CRC.

Sigman, K. and Whitt, W. (2019). Marked point processes in discrete time. QueueingSystems. Forthcoming.

Stidham, Jr., S. (1974). A last word on L = λW . Operations Research, 22(2):417–421.

Tandberg, D. and Qualls, C. (1994). Time series forecasts of emergency depart-ment patient volume, length of stay, and acuity. Annals of Emergency Medicine,23(2):299–306.

Whitt, W. (1980). Some useful functions for functional limit theorems. Mathematicsof Operations Research, 5(1):67–85.

Whitt, W. (1991). A review of L = λW and extensions. Queueing Systems, 9(3):235–268.

Whitt, W. (1992). Correction note on L = λW . Queueing Systems, 12(3–4):431–432.(The results in the previous papers are correct, but minor important changes areneeded in some proofs.).

Whitt, W. (1999). Predicting queueing delays. Management Science, 45(6):870–888.

Whitt, W. (2002). Stochastic-Process Limits. Springer, New York.

Whitt, W. (2012). Extending the FCLT version of L = λW . Operations ResearchLetters, 40(4):230–234.

Whitt, W. (2018). Time-verying queues. Queueing Models and Servic Management,1(2):79–164.

BIBLIOGRAPHY 108

Whitt, W. and Zhang, X. (2017). A data-driven model of an emergency department.Operations Research for Health Care, 12:1–15.

Whitt, W. and Zhang, X. (2019a). A central-limit-theorem version of the periodicLittle’s law. Queueing Systems, 91(1–2):15–47.

Whitt, W. and Zhang, X. (2019b). Forecasting arrivals and occupancy levels in anemergency department. Operations Research for Health Care, 21:1–18.

Whitt, W. and Zhang, X. (2019c). Periodic Little’s law. Operations Research,67(1):267–280.

Wolff, R. W. and Yao, Y.-C. (2014). Little’s law when the average waiting time isinfinite. Queueing Systems, 76(3):267–281.

Zeltyn, S., Marmor, Y. N., Mandelbaum, A., Carmeli, B., Greenshpan, O., Mesika,Y., Wasserkrug, S., Vortman, P., Shtub, A., Lauterman, T., Schwartz, D.,Moskovtch, K., Tzafrir, S., and Basis, F. (2011). Simulation-based models of emer-gency departments: Operational, tactical, and strategic staffing. ACM Transactionson Modeling and Computer Simulation, 21(4):1–25.

Periodic Little's Law - Columbia University

Documents

Transcript of Periodic Little's Law - Columbia University