Periodic Little's Law - Columbia University
Transcript of Periodic Little's Law - Columbia University
Periodic Little’s Law
Xiaopei Zhang
Submitted in partial fulfillment of the
requirements for the degree
of Doctor of Philosophy
in the Graduate School of Arts and Sciences
COLUMBIA UNIVERSITY
2019
©2019
Xiaopei Zhang
All Rights Reserved
ABSTRACT
Periodic Little’s Law
Xiaopei Zhang
In this dissertation, we develop the theory of the periodic Little’s law (PLL) as well
as discussing one of its applications. As extensions of the famous Little’s law, the
PLL applies to the queueing systems where the underlying processes are strictly or
asymptotically periodic. We give a sample-path version, a steady-state stochastic
version and a central-limit-theorem version of the PLL in the first part. We also
discuss closely related issues such as sufficient conditions for the central-limit-theorem
version of the PLL and the weak convergence in countably infinite dimensional vector
space which is unconventional in queueing theory.
The PLL provides a way to estimate the occupancy level indirectly. We show how
to construct a real-time predictor for the occupancy level inspired by the PLL as an
example of its applications, which has better forecasting performance than the direct
estimators.
Table of Contents
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
1 Introduction 1
1.1 Little’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Extensions of Little’s Law . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Time-Varying Little’s Law . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Central-Limit-Theorem version of Little’s Law . . . . . . . . . 8
1.3 Applied Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Applications of the Periodic Little’s Law . . . . . . . . . . . . . . . . 11
1.5 Main Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
I Periodic Little’s Law 14
2 Sample-Path Version of the PLL 15
2.1 Notation and Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 The Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3 The Assumptions in Theorem 2.2 . . . . . . . . . . . . . . . . . . . . 18
2.4 Indirect Estimation of Lk via the PLL . . . . . . . . . . . . . . . . . 20
2.5 Departure Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
i
2.6 Connection to Cumulative Processes . . . . . . . . . . . . . . . . . . 24
2.7 Different Orders of Events in Discrete Time . . . . . . . . . . . . . . 26
3 Steady-State Stochastic Versions of the PLL 30
3.1 Periodic Steady State . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 The Stochastic PLL . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 The Discrete-Time Periodic Gt/GIt/∞ Model . . . . . . . . . . . . . 33
3.4 Exploiting the Palm Theory in Continuous Time . . . . . . . . . . . 34
4 CLT Version of the PLL 36
4.1 More Notation and Definitions . . . . . . . . . . . . . . . . . . . . . . 37
4.2 A Practical Version for Applications . . . . . . . . . . . . . . . . . . 38
4.3 The General Version . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.4 Statistical Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5 Sufficient Conditions for the CLT Version of the PLL 49
6 Weak Convergence of Random Elements of (ℓ1)d 54
6.1 A Criterion for Weak Convergence in (ℓ1)d . . . . . . . . . . . . . . . 55
6.2 Central Limit Theorem in ℓ1 . . . . . . . . . . . . . . . . . . . . . . . 59
II An Application of the Periodic Little’s Law 63
7 Real-Time Predictor for the Occupancy Level 64
7.1 The Data and the Problem . . . . . . . . . . . . . . . . . . . . . . . 65
7.2 Real-Time Predictor for the Occupancy Level . . . . . . . . . . . . . 66
7.2.1 Predicting Occupancy Level One Hour Ahead . . . . . . . . . 66
7.2.2 Predicting Occupancy Level More Than One Hour Ahead . . 71
ii
III Proofs 75
8 Proofs 76
8.1 Proofs for the Sample-Path Version of the Periodic Little’s Law . . . 76
8.1.1 Proof of Lemma 2.1 . . . . . . . . . . . . . . . . . . . . . . . 76
8.1.2 Proof of Theorems 2.2 and 2.3 . . . . . . . . . . . . . . . . . . 78
8.1.3 Proof of the Second Half of Corollary 2.4 . . . . . . . . . . . . 84
8.2 Proofs for the Central-Limit-Theorem Version of the Periodic Little’s
Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
8.2.1 Proof of Lemma 4.2 . . . . . . . . . . . . . . . . . . . . . . . 85
8.2.2 Proof of Theorem 4.3 . . . . . . . . . . . . . . . . . . . . . . . 87
8.2.3 Proof of Proposition 4.4 . . . . . . . . . . . . . . . . . . . . . 89
8.2.4 Proof of Corollary 4.5 . . . . . . . . . . . . . . . . . . . . . . 90
8.2.5 Proof of Corollary 4.6 . . . . . . . . . . . . . . . . . . . . . . 92
8.2.6 Proof of Theorem 5.1 . . . . . . . . . . . . . . . . . . . . . . . 93
8.2.7 Proof of Theorem 5.2 . . . . . . . . . . . . . . . . . . . . . . . 99
Bibliography 103
iii
List of Figures
1.1 Estimated arrival rate at an Emergency Department over a week. . . 6
1.2 A comparison of the estimated mean ED occupancy level from (i)
simulations of multiple replications of the model fit to the data to (ii)
direct estimates from the data. . . . . . . . . . . . . . . . . . . . . . 11
2.1 An example of a periodic queueing system with d = 5 and n = 4. The
vertical line is placed at time nd = 20. Area A corresponds to CQ(4)
in (2.16) below, while Area A∪B corresponds to CL(4) in (2.16). . . . 25
7.1 Predicted occupancy level one hour ahead compared to the true occu-
pancy level. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
iv
List of Tables
7.1 MSE for predicting the occupancy several hours ahead. . . . . . . . . 73
7.2 MSE for predicting the occupancy by the rolling averages of n weeks. 73
v
Acknowledgments
I would like to convey my profoundest gratitude to Professor Ward Whitt. As my
Ph.D. advisor, Professor Whitt has not only enlightened me on academic knowledge
but also given me a model of integrity, enthusiasm, and devotion. He is always
energetic and passionate in doing research as well as being very patient and skillful
in mentoring his students. He is a leading authority in queueing theory, but he still
keeps curiosity and open-mindedness to new areas and ideas, which deeply inspires
me. He trains his students in a very responsible way. He illustrates how to do research
systematically from coming up with ideas, exploring questions and narrowing down
the topic to recording and organizing research results, writing in an academic style
and responding to reviewers. As an international student from China, I appreciate
his emphasis on both writing and oral communication skills, which has been proved
to be valuable and crucial even beyond research. I am also grateful for his awareness
of cultural differences and his efforts to help us overcome it. It is impossible to have
such a wonderful Ph.D. life without his continued kind support. I would like to thank
Mrs. Joanna Whitt as well for her understanding and support such that Professor
Whitt could devote so much time to work.
I wish to express great appreciation to Professor Karl Sigman, Professor Mark
Brown, Professor Van-Anh Truong and Professor Henry Lam for their time serving
on my dissertation committee and providing invaluable advices. Special thanks go
to Professor Karl Sigman who provided generous help during my stressful time and
vi
Professor Mark Brown who gave me great support for my Ph.D. application.
Being a member of the Department of IEOR is a wonderful experience. The faculty
members and staffs are invariably supportive. I would like to thank Professor Ward
Whitt, Professor Donald Goldfarb, Professor David D. Yao, and Professor Clifford
Stein for their instruction in the Ph.D. core courses as well as Professor Jose H.
Blanchet, Professor Shipra Agrawal for the great seminar courses I attended. I also
thank Professor Daniel Bienstock, Professor Yuan Zhong, Professor David Yao, and
Professor Karl Sigman, with whom I worked as a teaching assistant and had good
cooperation. I appreciate the daily supports from Lizbeth Morales, Kristen Maynor,
Jaya Mohanty, Carmen Ng, and other staff team members. I am also grateful to
many faculty members and staffs in the Department of Statistics, including Professor
Mark Brown, Professor Victor H. de la Peña, Professor Michael E. Sobel, Professor
Tian Zheng, Professor Philip E. Protter, Professor Yang Feng, Professor Demissie
Alemayehu, and Ms. Dood Kalicharan. Without their help, I would not be able to
get my Master’s degree and pursue my Ph.D. degree.
My sincere thanks also go to all friends during my Ph.D. life, Yang Kang, Ji Xu,
Jing Wu, Wei You, Di Xiao, Ni Ma, Shuoguang Yang, Fengpei Li, Chaoxu Zhou, Fan
Zhang, Yanping Liu, Yunjie Sun, Yeqing Zhou, Xu Sun, Zhipeng Liu, Yanan Pei, Yin
Lu, Cun Mu, Chun Wang, Xinshang Wang, Lin Chen, Xiaowei Tan, Jingtong Zhao
and Xuan Zhang among others, who made this four-year life colorful.
Last but not least, I give my sincerest thanks to my wife Jing Ai, my parents
Jing Zhang and Ying Wang, and my other family members for their unconditional
and unfailing love in my life. They share my happiness and sadness. I can hardly go
through all these years without them.
vii
To
my loving family
and
all the people suffering in queues.
viii
CHAPTER 1. INTRODUCTION 1
Chapter 1
Introduction
The study of queueing systems involves mathematical modeling of a service system.
It includes those systems providing services from humans such as call centers, hos-
pitals, banks, supermarkets, etc. and those providing other kinds of services such as
computing, communication, transportation, and manufacturing systems. The service
providers are modeled as ”servers”, and the service receivers are called ”customers”
or ”tasks”. As a branch of operations research, this abstract yet versatile frame-
work is proved to be powerful for people to optimize the system according to desired
performance with limited resources by understanding and exploiting its stochastic
nature.
Queueing theory originates from the research of Danish mathematician Agner K.
Erlang when he built mathematical models for the Copenhagen telephone exchange
and published ”The Theory of Probabilities and Telephone Conversations” in 1909;
see Brockmeyer et al. (1948) and Chapter 1 of Gross et al. (2018). It experienced a
slow development until 1950-1970, when a great number of papers came out, among
them the famous Little’s law by Little (1961).
Regarded as one of the fundamental principles for queueing systems, Little’s law
CHAPTER 1. INTRODUCTION 2
relates three elemental quantities in a queueing system – the queue length (L), the
arrival rate (λ), and the sojourn time (W ) – by a concise formula:
L = λW. (1.1)
The queueing length may also be called the occupancy level or the number of cus-
tomers in the system and the sojourn time may also be called the service time. For
healthcare systems, as we will discuss soon as our motivation, the sojourn time of a
patient may be called the patient’s length of stay (LoS) as well. We will use them in-
terchangeably throughout this thesis. However, the practical meaning of those terms
can change, depending on the setting, even for a fixed model. For example, the sys-
tem can be one server or all the servers or the whole system including the waiting
space. Of course, there are mathematical complexities behind this simple formula.
For example, all three quantities are in the sense of long-run averages or expectations.
We will go into details in later sections. Little’s law in queueing theory acts as Ohm’s
law in circuitry, and when two of the three quantities for a system are specified, the
third must be determined by the law. However, we must understand that Little’s
law is a mathematical theorem. It is a tautology instead of a physical law where we
summarize the pattern from the experiment data, as emphasized by Little and Graves
(2008).
This thesis contributes to Little’s law by generalizing it to a periodic time-varying
case in a discrete-time framework. We allow the three quantities L, λ and W to be
periodic functions of time. Though Little’s law holds as well for periodic queueing
systems if we treat the three quantities as long-run averages or, equivalently, the
averages over a cycle, the new work provides a more accurate description of such
systems.
CHAPTER 1. INTRODUCTION 3
The rest of the introduction is divided into five sections. In §1.1, we briefly review
the history of the ordinary Little’s law and discuss it in a formal mathematical way.
The next section (§1.2) reviews two ways to extend Little’s law and introduces the
motivation of this work from a theoretical perspective. Then in §1.3, we introduce
our motivation from an applied view. We introduce the applications of the Periodic
Little’s law in §1.4. Our main contributions are summarized in §1.5.
1.1 Little’s Law
The first use of Little’s law is often attributed to Cobham (1954). He asserted that
the expected number of units in a queue equals to the product of the arrival rate
and the expected waiting time without any proof. Then, in Chapter 7 of Philip
M. Morse’s book ”Queues, Inventories and Maintenance” published in 1958 Morse
(1958), the well-known formula first time appeared in its explicit form ”L = λW”,
but was proved for a specific situation. It is John D. C. Little who made the first
effort in Little (1961) to show that this seemingly intuitive relationship holds in the
general case. He proved the formula by exploiting the ergodic theorems for strictly
stationary stochastic processes defined on (−∞,∞) based on Joseph L. Doob’s work
Doob (1953). He also drew a realized sample path to facilitate the proof and was
close to a sample-path proof, even though that was not accomplished in that paper.
Attempts were made in the following years as in Jewell (1967) and Eilon (1969) to
simplify the proof by assuming the processes to be regenerative. A milestone was
reached in 1974 when Shaler Stidham, Jr. gave a clean sample-path proof of Little’s
law with minimum assumptions in his paper ”A Last Word on L = λW” Stidham
(1974). However, far from ”a last word”, it is rather a beginning of sample-path
analysis; see El-Taha and Stidham Jr. (1999), Fiems and Bruneel (2002) and Wolff
CHAPTER 1. INTRODUCTION 4
and Yao (2014). Extensions and generalizations of Little’s law were also developed
afterward. We will review different forms of Little’s law in §1.2, so in this section, we
only focus on the ordinary sample-path version of Little’s law.
There are also useful reviews of Little’s law, e.g., Whitt (1991), Whitt (1992),
Little and Graves (2008) and Little (2011).
We remark that Little’s law can be regarded as a special case of a more general
relationship H = λG, where λ is the arrival rate and H and G are respectively time
and customer averages of some arbitrary queue statistics having a certain relationship
to each other; see Brumelle (1971), Brumelle (1972) and Heyman and Stidham (1980).
Since this generalization is out of the scope of this thesis, we shall not discuss it
further.
Here we state the ordinary sample-path version of Little’s law under the framework
of Stidham (1974) with slightly different notation (same as in Whitt (1991)), and we
will use this framework in our work. Suppose that customers arrive at a system at
time epochs Ak, where k = 1, 2, · · · is the index of the customer and Ak is a real-
number sequence satisfying 0 ≤ Ak ≤ Ak+1. Let Dk be the departure time epochs
respectively with Ak ≤ Dk so that Wk ≡ Dk −Ak ≥ 0 is the service time (also called
sojourn time or length of stay, LoS). We use ”≡” for ”equal by definition” throughout
this thesis. Let Ik(t) be the indicator function of customer k, i.e., Ik(t) = 1 if the kth
customer is in the system at time t and Ik(t) = 0 otherwise. Then Wk can also be
express as Wk =∫∞0Ik(t)dt. Let Q(t) ≡
∑∞k=0 Ik(t) be the number of customers in
the system at time t. Let A0 ≡ 0 and A(t) ≡ maxk ≥ 0 : Ak ≤ t count the number
of arrivals in [0, t].
Theorem 1.1 (sample-path version of Little’s law). If limt→∞
A(t)/t = λ where 0 ≤ λ ≤
∞ and limn→∞
∑nk=1Wk/n = W , then lim
t→∞
∫ t
0Q(s)ds/t = L and (1.1), i.e. L = λW ,
CHAPTER 1. INTRODUCTION 5
holds.
Remark 1.1 (asymmetry in Little’s law). In Theorem 1.1, the relation between the
three quantities is asymmetric. We allow W to be ∞, in which case L will also
be ∞. However, it is not true that the existence of any two of the three limits
implies the convergence of the third; see Example 1 of Glynn and Whitt (1986) for a
counterexample.
1.2 Extensions of Little’s Law
Like other good mathematical theorems, Little’s law has clean and strong conclusions
with mild assumptions. However, many kinds of extensions are also worth exploring.
Here we identify two directions:
1. from non-time-varying to time-varying, and
2. from strong-law-of-large-number (SLLN or LLN) type to central-limit-theorem
(CLT) type.
We will discuss these two extensions in the following two sections.
1.2.1 Time-Varying Little’s Law
It is usually good to start research from simple settings. Most stochastic processes
textbooks only cover the non-time-varying queueing models. However, there are also
many works on time-varying queues; see Whitt (2018) for an overview of time-varying
queues.
Little’s law is concerned with the long-run limits or expectations of queue length,
arrival rate and LoS. If the system is deterministic and has a constant arrival rate
CHAPTER 1. INTRODUCTION 6
and LoS, then Little’s law perfectly describes everything of the system. However,
except the speed of light in vacuum, few things in the real world are deterministic
and constant. Figure 1.1 taken from Whitt and Zhang (2017) shows the estimated
hourly patient arrival rates to an Emergency Department (ED) located in Haifa, Israel
using 25 weeks data collected from December 2004 to May 2005. We can see that
the arrival rate is obviously time-varying instead of being constant. If the arrival rate
and LoS distribution are changing over time, as long as the long-run limits or the
expectation under a proper stochastic framework exist, then Little’s law still holds,
but we lose the information about its time-varying nature and capture only the first
moment feature of the system by using the ordinary Little’s law. Hence it is natural
to seek generalizations of Little’s law that will be able to characterize the time-varying
nature of a queueing system.
05
1015
20
time of a week
patie
nts
per
hour
Sun Mon Tue Wed Thu Fri Sat
Figure 1.1: Estimated arrival rate at an Emergency Department over a week.
In 1997, Dimitris Bertsimas and Georgia Mourtzinou established a transient Lit-
tle’s law for non-stationary queueing models in Bertsimas and Mourtzinou (1997) as
a generalization of the classical stationary version of Little’s law by identifying the
CHAPTER 1. INTRODUCTION 7
structural relationships between the number of customers in the system and the delay
at time t. Let Λ(t) = E(A(t)) where A(t) is the arrival counting process, and assume
that Λ(t) =∫ t
0λ(u)du. The main result has the form
E(L(t)) =
∫ t
0
λ(u)P (Su > t− u)du, (1.2)
where L(t) is the number of customers in the system at time t and Su represents the
time spent in the system for a customer that arrived at time u. Despite the good
idea, the mathematics there is not entirely satisfying. Fralix and Riaño (2010) put it
on more solid ground with more applications using Palm measures.
In this thesis, we make extensions by assuming the system is periodic and in
discrete time framework. However, as can be seen later, rather than directly assuming
its periodicity, we make it implicit by assuming the existence of limit if we take
averages every d time periods. The periodic Little’s law states that, under appropriate
conditions,
Lk =∞∑j=0
λk−jFck−j,j, k = 0, 1, . . . , d− 1, (1.3)
where d is the number of time points within each periodic cycle, Lk is the long-run
average number of customers in system at time k, λk is the long-run average number
of arrivals at time k, and F ck,j, j ≥ 0, is the long-run proportion of arrivals at time
k that remain in the system for at least j time points, which can be viewed as the
complementary cumulative distribution function (ccdf) of the length of stay of an
arbitrary arrival. The long-run averages are over all indices of the form k + md,
m ≥ 0. These quantities λk, F ck,· and Lk are periodic functions of the time index k,
exploiting the extension of these periodic functions to all integers, negative as well
as positive. We can see that (1.3) can be viewed as a discrete-time analog to (1.2).
CHAPTER 1. INTRODUCTION 8
Chapter 2 is devoted to providing a sample-path version of the PLL and Chapter 3
provides a steady-state version of PLL. We remark that the PLL in this thesis has
already motivated further research related to it, such as Sigman and Whitt (2019),
where a general framework for stationary marked point processes in discrete time is
developed and used to provide a direct derivation of the periodic-stationary Little’s
law.
Our work is initially motivated by our analysis to the patient flow of an Emergency
Department in Whitt and Zhang (2017) rather than directly from a theoretical view.
We describe our motivation from an applied perspective in §1.3.
1.2.2 Central-Limit-Theorem version of Little’s Law
From non-time-varying to time-varying is one way to describe the system more accu-
rately. Another way is to not only characterize the long-run limits of those quantities,
but also consider the rate of convergence. We say a ”sample-path version” which
means that the proof depends on the analysis to an arbitrary sample path without
any probability characterization of the system, thus when we impose some stochastic
structure of the system, as long as the sample-path analysis holds, the result is true
with probability 1 (w.p. 1). Taking averages over time to approximate the mean w.p.
1 is of the same spirit of the strong law of large numbers (SLLN). In probability,
besides the SLLN which assures us the convergence of sample mean, we also (with
strong assumptions) have central limit theorem (CLT) which tells us the ”conver-
gence speed” of the sample mean. Further, we have functional central limit theorem
(FCLT) known as Donsker’s theorem as a functional extension of CLT. This inspires
people to do the same for Little’s law.
From 1986 to 1989, Peter W. Glynn and Ward Whitt wrote a series of papers
CHAPTER 1. INTRODUCTION 9
in this direction. In Glynn and Whitt (1986) they established a FCLT version of
Little’s law where they identified the key remainder terms (ends effect) that must
be asymptotically negligible such that the relationship holds. It involves the theory
of weak convergence of probability measure on the function space D[0,∞), which
includes all right continuous with left limits (also known as càdlàg for French: ”con-
tinue à droite, limite à gauche”) functions on [0,∞); see Billingsley (1999) and Whitt
(1980). The proof is essentially based on the continuous mapping theorem. Later in
Glynn and Whitt (1987), they provided sufficient conditions based on regenerative
structure such that the remainder terms to go to 0. The FCLT version of Little’s
law will imply the CLT version if the function is continuous at the point 1. A CLT
version of Little’s law with direct deduction (not as a corollary of a FCLT version)
was provided in Glynn and Whitt (1988b). The key is again identifying the remain-
der terms that need to vanish when taking limits and then find proper conditions
on the joint process of arrival process and LoSs such that it holds. They found that
stationarity of the input process was a sufficient but not necessary condition for the
CLT version of Little’s law. As an important and interesting statistical application of
the FCLT/CLT version of Little’s law, Glynn and Whitt (1989) discuss the issue of
indirectly estimating the occupancy level via Little’s law. They showed that the CLT
version of Little’s law could compare the accuracy of the occupancy level estimation
by the variance of the limit variables, thus provide a guide on choosing between direct
and indirect estimations.
Inspired by the above works, we also established a CLT version of the PLL and
provided sufficient conditions for it. It requires stronger conditions compared to the
sample-path version of the PLL but provides a characterization of the convergence
rates of related quantities. We will discuss this in Chapter 4 and Chapter 5. A
discussion of its statistical application is also included there.
CHAPTER 1. INTRODUCTION 10
We remark that there is also a law-of-iterated-logarithm version of Little’s law;
see Glynn and Whitt (1988a). It gives a more accurate description of the convergence
rate for Little’s law with proper conditions, just like the law of iterated logarithm for
independent, identically distributed random variables.
1.3 Applied Motivation
As mentioned in the previous section, the PLL was motivated by an applied pa-
per Whitt and Zhang (2017), where we built a data-driven stochastic model for the
Emergency Department of an Israeli hospital, using 25 weeks of data from the data
repository associated with the study by Armony et al. (2015). We found that both
the arrival rates (as shown in Figure 1.1) and the LoS distributions are time-varying
and can be viewed as periodic with circle length being 1 week. Given the sample size,
we assume the arrival rates and LoS distributions are constants within each hour.
This gives us a periodic discrete-time framework. We proposed a queueing model
for the system and then conducted simulations to test our model. We simulated the
arrivals and LoSs of each customer so that we could calculate the occupancy level.
We repeated the simulation and compared the mean curve to the empirical one, and
as shown in Figure 1.2 (from Whitt and Zhang (2017)), we observed that the two
curves coincided with each other almost perfectly. On the one hand, it means that our
proposed model fits the data well; on the other hand, we realized that there is some
Little’s-law-type theorem behind it that ensures the relationship between occupancy
level, arrival rates, and LoS distributions.
From a statistical perspective, if we have any data from a periodic system, it
is a very natural thought to take averages over periods as statistics for estimating
interested quantities. For a queueing system, we can estimate the occupancy level,
CHAPTER 1. INTRODUCTION 11
010
2030
4050
60
time of a week
num
ber
of p
atie
nts
Sun Mon Tue Wed Thu Fri Sat
Estimated mean from dataEstimated mean from simulation
Figure 1.2: A comparison of the estimated mean ED occupancy level from (i) sim-ulations of multiple replications of the model fit to the data to (ii) direct estimatesfrom the data.
the arrivals rates, and the LoS distributions in such a way from the data. Then
we can calculate the occupancy level from the estimated arrival rates and the LoS
distributions, and we call it the indirect estimation of the occupancy level. Then
the question is: under what condition, the indirect estimation of occupancy level is
consistent with the direct estimation? This is very similar to the question in Glynn
and Whitt (1989), but with a periodic setting.
We answered the question above by the periodic Little’s law in this thesis.
1.4 Applications of the Periodic Little’s Law
The PLL comes from the ED patients flow analysis. In the end, it contributes back
to our ED patients flow forecast problem. In Whitt and Zhang (2019b), we find
that based on good model and predicting method for the arrival rates and the LoS
distributions, the indirect estimation of future occupancy level via the PLL which
CHAPTER 1. INTRODUCTION 12
incorporates the real-time information beats the corresponding direct estimation. We
propose a real-time occupancy predictor which exploits the currently observed occu-
pancy level and the empirical hazard functions, given the elapsed LoS for each patient
in the system. That is a variant of the approach suggested in §6 of Whitt (1999); see
Ibrahim and Whitt (2011a,b) and references there for related work. The construction
of the real-time occupancy level as an indirect estimator heavily relies on our PLL.
We think it is an excellent example of the potential power of the PLL.
Due to limited resource and high cost, healthcare has become one of the major
application fields of operations research. Considerable research has been done to
forecast patient flow in the ED. Some of them, e.g. Tandberg and Qualls (1994), Jones
et al. (2002), Jones et al. (2008), Calegari et al. (2016) and Marcilio et al. (2013),
focus on the arrival pattern only; some, e.g. Hoot et al. (2008) and Schweigler et al.
(2009), focus on the occupancy level only; others, e.g. de Bruin et al. (2007), consider
queueing models but only with a stationary model. With the PLL, our forecasting
method connects the prediction of arrival rates and the occupancy level while keeps
the time-varying nature of the ED patient flow. Research has also been done on the
patient flow based on simulation methods, e.g., Hung et al. (2007), Medeiros et al.
(2008) and Zeltyn et al. (2011), however, none of them realized the fundamental
relationship between the occupancy level, the arrivals and the LoS distributions they
used, though seemingly trivial, could be and should be formalized and justified.
We will demonstrate this application of the PLL in Chapter 7.
1.5 Main Contributions
In summary, the main contributions of this thesis are:
1. We develop both a sample-path version and a stationary version of the periodic
CHAPTER 1. INTRODUCTION 13
Little’s law in discrete time. It generalizes the ordinary Little’s law.
2. We establish a CLT-version of the periodic Little’s law. It parallels the result
in Glynn and Whitt (1988b) which is for the ordinary Little’s law. We also
provide practical sufficient conditions for the CLT-version of the PLL. Potential
statistical applications are also discussed.
3. We give an example of the application of the PLL in forecasting the occupancy
level of an Emergency Department. It shows the potential use of our theorems
in applied problems.
14
Part I
Periodic Little’s Law
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 15
Chapter 2
Sample-Path Version of the
Periodic Little’s Law
In this chapter, we develop the sample-path Periodic Little’s Law in discrete time.
This version is general in that (i) we do not directly make any stochastic assumptions
and (ii) we do not directly impose any periodic structure. Instead, we assume that
natural limits exist, which we take to be with probability 1 (w.p.1). It turns out that
the periodicity of the limit emerges automatically from the assumed existence of the
limits.
2.1 Notation and Definitions
We index the discrete time points by integers i, i ≥ 0. Because multiple events can
happen at these times, we need to specify the order of events as in the large literature
on discrete-time queues, e.g., Bruneel and Kim (1993). We assume that all arrivals at
one time occur before any departures. Moreover, we count the number of customers
in the system after the arrivals but before the departures. Thus, each arrival can
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 16
spend time j in the system for any j ≥ 0. Our convention yields a conservative
upper bound on the occupancy. We discuss other possible orderings of events and the
relation between continuous time and discrete time in §2.7.
With these conventions, we focus on a single sequence, X ≡ Xi,j : i ≥ 0; j ≥ 0,
with Xi,j denoting the number of arrivals at time i that have a LoS j. We also could
have customers at the beginning, but without loss of generality, we can view them as
a part of the arrivals at time 0. We define other quantities of interest in terms of X.
In particular, with ≡ denoting equality by definition, the key quantities are:
Yi,j ≡∑∞
s=j Xi,s: the number of arrivals at time i with a LoS greater or equal to j,
j ≥ 0,
Ai ≡ Yi,0 =∑∞
s=0Xi,s: the total number of total arrivals at time i,
Qi ≡∑i
j=0 Yi−j,j =∑i
j=0Ai−jYi−j,j
Ai−j
: the number of customers in system at time i,
all for i ≥ 0. In the last line, and throughout the paper, we understand 0/0 ≡ 0, so
that we properly treat times with 0 arrivals.
We do not directly make any periodic assumptions, but with the periodicity in
mind, we consider the following averages over n periods:
λk(n) ≡ 1
n
n∑m=1
Ak+(m−1)d,
Qk(n) ≡ 1
n
n∑m=1
Qk+(m−1)d =1
n
n∑m=1
k+(m−1)d∑j=0
Yk+(m−1)d−j,j
,
Yk,j(n) ≡ 1
n
n∑m=1
Yk+(m−1)d,j, j ≥ 0,
F ck,j(n) ≡ Yk,j(n)
λk(n)=
∑nm=1 Yk+(m−1)d,j∑nm=1Ak+(m−1)d
, j ≥ 0, and
Wk(n) ≡∞∑j=0
F ck,j(n), 0 ≤ k ≤ d− 1, (2.1)
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 17
where d is a positive integer.
Clearly, λk(n) is the average number of arrivals at time k, 0 ≤ k ≤ d − 1, over
the first n periods; Similarly, Qk(n) is the average number of customers in the system
at time k, while Yk,j(n) is the average number of customers that arrive at time k
that have a LoS greater or equal to j. Thus, F ck,j(n) is the empirical ccdf, which is
the natural estimator of the LoS distribution’s ccdf of an arrival at time k. Finally,
Wk(n) is the average LoS of customers that arrive at time k. We will let n→ ∞.
2.2 The Limit Theorem
With the framework introduced above, we can state one of our main theorems, the
sample-path version of the PLL. We first introduce our assumptions, which are just
as in the sample-path Little’s law, with one exception. In particular, we assume that
(A1) λk(n) → λk, w.p.1 as n→ ∞, 0 ≤ k ≤ d− 1,
(A2) F ck,j(n) → F c
k,j, w.p.1 as n→ ∞, 0 ≤ k ≤ d− 1, j ≥ 0, and
(A3) Wk(n) → Wk ≡∞∑j=0
F ck,j w.p.1 as n→ ∞, 0 ≤ k ≤ d− 1, (2.2)
where the limits are deterministic and finite. For the sample-path Little’s law, d = 1
and we do not need (A2).
The assumptions above only assume the existence of limits within the first period,
but the limits immediately extend to all k ≥ 0, showing that the limit functions must
be periodic functions. We then extend these periodic functions to the entire real line,
including the negative time indices. We give a proof of the following in §8.1.1
Lemma 2.1 (periodicity of the limits). If the three assumptions in (2.2) hold, then
the limits hold for all k ≥ 0, with the limit functions being periodic with period d.
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 18
We are now ready to state a sample-path version of PLL; we give the proof in
§8.1.2.
Theorem 2.2 (sample-path PLL). If the three assumptions (A1), (A2) and (A3) in
(2.2) hold, then Qk(n) defined in (2.1) converges w.p.1 as n → ∞ to a limit that we
call Lk. Moreover,
Lk =∞∑j=0
λk−jFck−j,j <∞, 0 ≤ k ≤ d− 1, (2.3)
where λk and F ck,j are the periodic limits in (A1) and (A2) extended to all integers,
negative as well as positive.
Remark 2.1 (the extension to negative indices). To have convenient notation, we
have extended the periodic limit functions to the negative indices, but we do not
consider the averages and their limits in Assumptions (A1), (A2) and (A3) for negative
indices.
2.3 The Assumptions in Theorem 2.2
When d = 1, the PLL reduces to the non-time-varying ordinary Little’s law. In that
case, k = 0 represents all time indices since it is non-time-varying. In Theorem 2.2,
L0 ≡ limn→∞
Q0(n) is the limiting time-average number of customers in the system while
limn→∞
λ0(n) = λ0 is the limiting average number of arrivals at each time and the right
hand side of (2.3) becomes
∞∑j=0
λk−jFck−j,j = λ0
∞∑j=0
F c0,j = λ0W0. (2.4)
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 19
And Theorem 2.2 claims that
L0 = λ0W0,
which is exactly the ordinary Little’s law. Of course, the ordinary Little’s law can be
applied to the time-varying case as well, but then we will lose the time structure and
get overall averages.
There is a difference between our assumptions in (2.2) and the assumptions in
the Little’s law. For the Little’s law, we let L be the limiting time-average number
of customers in system, λ be the limiting average arrival rate of customers, and W
be the limiting customer-average waiting time (time spent in the system or length of
stay). Then, if both λ and W exist and are finite, then L exists and is finite, and
L = λW . Our limit for λk(n) in (A1) is the natural extension; the only difference
is that now we require that λk(n) converges w.p.1 for each k, 0 ≤ k ≤ d − 1. The
third limit for Wk(n) in (A3) parallels the limit for the average waiting time, but
again we require that Wk(n) converges w.p.1 for each k, 0 ≤ k ≤ d − 1. However,
these two limits alone are not sufficient to determine the number of customers for the
periodic case. Now we need to require that the LoS distribution converges for each
k, 0 ≤ k ≤ d − 1, as stated in (A2). We show that this extra condition is needed in
the following example.
Example 2.1 (the need for convergent ccdfs). We now show that we need to assume
the limit F ck,j(n) → F c
k,j in (2.2). For simplicity, let d = 2, so that we have two
time points in each periodic cycle. Suppose we have two systems. In the first one,
we deterministically have 2 arrivals at the first time of each periodic cycle, (i.e., 2
arrivals at each even-indexed time), with one of them having a LoS of 0 and the other
having a LoS of 2. In the other system, we also deterministically have 2 arrivals in the
first time of each periodic cycle, but with both of them having a LoS of 1. Suppose
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 20
there is no arrival at the odd indexed time for both of the two systems. Now the two
systems have the same λk and Wk. (λ0 = 2, λ1 = 0, W0 = 2 and W1 = 0.) However,
if we count the number of customers in the system, we have limn→∞
Q0(n) = 3 for the
first system and limn→∞
Q0(n) = 2 for the second one.
2.4 Indirect Estimation of Lk via the PLL
The PLL in Theorem 2.2 provides an indirect way to estimate the long-run average
occupancy level Lk through the right-hand side of (2.3), as discussed in Glynn and
Whitt (1989) for the ordinary LL. Here we show that the indirect estimator for Lk is
consistent with the direct estimator.
Since we only have data going forward in time from time 0, we start by rewriting
(1.3) as
∞∑j=0
λk−jFck−j,j =
k∑i=0
λi
∞∑l=0
F ci,k−i+ld +
d−1∑i=k+1
λi
∞∑l=1
F ci,k−i+ld, 0 ≤ k ≤ d− 1. (2.5)
Guided by (2.5), we let our indirect estimator for Lk be
Lk(n) ≡k∑
i=0
λi(n)∞∑l=0
F ci,k−i+ld(n)+
d−1∑i=k+1
λi(n)∞∑l=1
F ci,k−i+ld(n), 0 ≤ k ≤ d−1, (2.6)
where λi(n) and F ci,j(n) are defined in (2.1). With data, it is likely that the infinite
sums in (2.6) would be truncated to finite sums, but at a level growing with n; we do
not address that truncation modification, which we regard as minor.
We now show that the estimator Lk(n) in (2.6) is asymptotically equivalent to
the direct estimator Qk(n) in (2.1); we will prove this result together with Theorem
2.2 in §8.1.2
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 21
Theorem 2.3 (indirect estimation through the PLL). Under the conditions of The-
orem 2.2,
limn→∞
Lk(n) = Lk w.p.1 for 0 ≤ k ≤ d− 1, (2.7)
where Lk(n) is defined in (2.6) and Lk is as in Theorem 2.2.
In applications, the LoS often can be considered to be bounded, i.e., for some
m > 1, Xi,j = 0 when j ≥ md. In that case, condition (A3) is directly implied by
condition (A2) and it is possible to bound the error between the direct and indirect
estimators for Lk, defined as
Ek(n) ≡ |Lk(n)− Qk(n)|, (2.8)
for Qk(n) in (2.1) and Lk(n) in (2.6), as we show now.
Corollary 2.4 (the bounded case). If, in addition to conditions (A1) and (A2) in
Theorem 2.2, there exists some mu > 0 such that Xi,j = 0 for i ≥ 0, j ≥ dmu, then
assumption (A3) is necessarily satisfied. If, in addition, there exists some λu > 0
such that Ai ≤ λu for i ≥ 0, then
Rn ≡ max0≤k≤d−1Ek(n) ≤ λud(mu + 2)2
2n, n ≥ mu, (2.9)
for Ek(n) in (2.8).
Proof: Here we show the proof of the first part of the corollary, i.e. if the LoS is
bounded, then assumption (A3) is implied from (A2), and we postpone the second
half of proof until §8.1.3, since it depends on part of the proof of Theorems 2.2 and
2.3.
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 22
If Xi,j = 0 for i ≥ 0, j > dmu, then Fk,j(n) = 0 for 0 ≤ k ≤ d − 1 and j ≥ dmu.
So
Wk(n) =dmu∑j=1
F ck,j(n), 0 ≤ k ≤ d− 1,
is a finite summation and F ck,j = 0 for 0 ≤ k ≤ d− 1 and j > dmu. Then
limn→∞
Wk(n) = limn→∞
dmu∑j=0
F ck,j(n) =
dmu∑j=0
limn→∞
F ck,j(n) =
dmu∑j=0
F ck,j = Wk,
which is assumption (A3).
Remark 2.2 (when does (A2) imply (A3)). In addition to the boundedness condition
presented in Corollary 2.4, there are other mathematical conditions under which (A2)
implies (A3), i.e., under which we can interchange the order of the limits. Uniform
integrability is a standard condition for this purpose; see p. 185 of Billingsley (1995)
and Section 2.6 of El-Taha and Stidham Jr. (1999). We prefer (A3) plus (A2) because
that makes our conditions easier to compare to the conditions in the ordinary LL.
2.5 Departure Processes
Besides relating the occupancy level with the arrival processes and the LoS as in the
Little’s law, we can also establish the relationship between the departure processes
and the other quantities. This will also be helpful to understand the error between
different ways of counting what happens at each time point, as we will explain in
§2.7.
Let Di ≡∑i
j=0Xi−j,j, i ≥ 0, be the number of departures at time i. Given the
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 23
arrivals occur before departures at each time, it is easy to see that
Di = Qi −Qi+1 + Ai+1, for i ≥ 0. (2.10)
Paralleling (2.1), we look at the averages
δk(n) ≡1
n
n∑m=1
Dk+(m−1)d, for 0 ≤ k ≤ d− 1. (2.11)
Corollary 2.5 (departure averages). Under the conditions of Theorem 2.2, δk(n)
defined in (2.11) converges w.p.1 as n → ∞ to a periodic limit that we call δk.
Moreover,
δk =∞∑j=0
λk−jfk−j,j ≡∞∑j=0
λk−j(Fck−j,j − F c
k−j,j+1) (2.12)
for 0 ≤ k ≤ d− 1, where λk and F ck,j are the same periodic limits as in Theorem 2.2
and fk,j ≡ F ck,j − F c
k,j+1 is the discrete time probability mass function of the LoS.
Proof. The proof is easy given we have Theorem 2.2 and equation (2.10). By
equations (2.10) and (2.1),
δk(n) =1
n
n∑m=1
Dk+(m−1)d =1
n
n∑m=1
(Qk+(m−1)d −Qk+1+(m−1)d + Ak+1+(m−1)d)
=
Qk(n)− Qk+1(n) + λk+1(n), 0 ≤ k < d− 1,
Qd−1(n)− Q0(n+ 1) + λ0(n+ 1) +1
nQ0 −
1
nA0, k = d− 1.
(2.13)
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 24
Since limn→∞
1
nQ0 = 0 and lim
n→∞
1
nA0 = 0, by Theorem 2.2 and (A1), we have
limn→∞
δk(n) = Lk − Lk+1 + λk+1 =∞∑j=0
λk−jFck−j,j −
∞∑j=0
λk+1−jFck+1−j,j + λk+1
=∞∑j=0
λk−j(Fck−j,j − f c
k−j,j+1)− λk+1 + λk+1 =∞∑j=0
λk−jfk−j,j. (2.14)
2.6 Connection to Cumulative Processes
The direct and indirect estimators of the occupancy level we introduced in (2.1) and
(2.6), respectively, can be related through the cumulative processes, as depicted in
Figure 2.1.
We focus on the two cumulative processes associated with the occupancy level
and total LoS, respectively, i.e.,
CQ(n) ≡n∑
m=1
d−1∑k=0
Qk+(m−1)d =nd−1∑i=0
Qi,
CL(n) ≡n∑
m=1
d−1∑k=0
∞∑j=0
(j + 1)Xk+(m−1)d,j =n∑
m=1
d−1∑k=0
∞∑j=0
Yk+(m−1)d,j
=nd−1∑i=0
∞∑j=0
Yi,j. (2.15)
The first, CQ(n), is the cumulative occupancy level up to time n, while the second,
CL(n), is the cumulative total LoS of customers that arrived up time n.
Figure 2.1 helps to understand the two cumulative quantities. In the figure, we
plot the time intervals that each of the first 35 arrivals spends in the system as
horizontal bars, each with a height 1 placed in order of the arrival times. The left
end point is the arrival time, while the right end point is the departure time, which
need not be in order of arrival. We can see that CQ(n) and CL(n) correspond to two
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 25
areas respectively.
Figure 2.1: An example of a periodic queueing system with d = 5 and n = 4. Thevertical line is placed at time nd = 20. Area A corresponds to CQ(4) in (2.16) below,while Area A∪B corresponds to CL(4) in (2.16).
We can further relate the two cumulative processes to the averages in (2.1) and
(2.6), as stated in the following proposition.
Proposition 2.6. The cumulative processes and the averages are related by
Area(A) = CQ(n) = n
d−1∑k=0
Qk(n),
Area(A ∪B) = CL(n) = nd−1∑k=0
λk(n)Wk(n) = n
d−1∑k=0
Lk(n). (2.16)
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 26
Proof: The proof follows directly from the definitions, especially for CQ(n). For
CL(n), observe that
CL(n) =n∑
m=1
d−1∑k=0
∞∑j=0
Yk+(m−1)d,j = nd−1∑k=0
∞∑j=0
Yk,j(n) = nd−1∑k=0
∞∑j=0
λk(n)Fck,j(n)
= nd−1∑k=0
λk(n)Wk(n). (2.17)
By (2.6), if we sum over 0 ≤ k ≤ d− 1 and adjust the order of summation, we have
d−1∑k=0
Lk(n) =d−1∑k=0
∞∑j=0
λk(n)Fck,j(n) =
d−1∑k=0
λk(n)Wk(n). (2.18)
In the context of Figure 2.1, Theorem 2.2 and Theorem 2.3 assert that
E(n) ≡d−1∑k=0
Ek(n) = Area(B)/n→ 0 as n→ ∞, (2.19)
where Ek(n) defined in (2.8).
We remark that Figure 2.1 is a variant of Figure 1 in Whitt (1991) and Figures
2 and 3 in Kim and Whitt (2013), as well as similar figures in earlier papers. The
figures in Kim and Whitt (2013) are different, because of the initial edge effect, which
we avoid by treating arrivals before time 0 in the system as arrivals at time 0.
2.7 Different Orders of Events in Discrete Time
We have assumed that all arrivals occur before all departures at each time, and that we
count the number of customers in system after the arrivals but before the departures.
That produces an upper bound on the system occupancy for all possible orderings.
Suppose instead that all departures occur before all arrivals at each time, and that we
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 27
count the number of customers in system after the departures but before the arrivals.
That obviously produces a lower bound. The consequence of any other ordering will
fall in between these two.
For a discrete-time system, the order may be given, but many applications start
with a continuous-time system. In that case, the modeler can choose which ordering
to use in the discrete-time version. We have chosen the conservative upper bound. In
this section we derive an expression for the alternative lower bound and the difference
between the upper bound and the lower bound. From that difference, we can see
that the difference between an initial continuous-time system and a discrete-time
“approximation” will become asymptotically negligible as we refine the discrete-time
version by increasing the number of discrete time points within a fixed continuous-
time cycle.
For the alternative departure-first (lower-bound) ordering, let QLi be the number
of customers in the system at time i and let
QLk (n) ≡
1
n
n∑m=1
QLk+(m−1)d. (2.20)
Assume that we keep the meaning of Xi,j and Yi,j.
Proposition 2.7 (the lower-bound occupancy). Under the conditions of Theorem
2.2, QLk (n) converges w.p.1 as n→ ∞ to a limit that we call LL
k , and we have
LLk = Lk − δk − 2λk + λkfk,0 ≤ Lk for 0 ≤ k ≤ d− 1, (2.21)
where Lk is in Theorem 2.2, λk is in (A1), δk and fk,0 are the same as in Corollary
2.5.
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 28
Proof. For the new departures-first ordering, the number of customers at time i is
QLi =
i∑j=1
Yi−j,j+1 − Yi,0 =i∑
j=1
Yi−j,j −i∑
j=1
Xi−j,j − Yi,0
=i∑
j=0
Yi−j,j −i∑
j=0
Xi−j,j − 2Ai +Xi,0 = Qi −Di − 2Ai +Xi,0, (2.22)
where we regard the summation as 0 if the lower bound is larger than the upper
bound. By equation (2.22) and (2.20), we have
QLk (n) =
1
n
n∑m=1
QLk+(m−1)d
=1
n
n∑m=1
(Qk+(m−1)d −Dk+(m−1)d − 2Ak+(m−1)d +Xk+(m−1)d,0)
= Qk(n)− δk(n)− 2λk(n) + (Yk,0(n)− Yk,1(n)). (2.23)
Next observe that (A1) and (A2) imply that limn→∞
Yk,j(n) = λkFck,j, so that
limn→∞
QLk (n) = Lk − δk − 2λk + λk(F
ck,0 − F c
k,1) = Lk − δk − 2λk + λkfk,0. (2.24)
We can apply Proposition 2.7 to deduce that the error in an incorrect ordering
of events is asymptotically negligible if we start with a continuous-time system and
choose sufficiently short time periods. For the arrival process A(t), we assume that
t−1A(t) → Λ(t) as t→ ∞, (2.25)
where
Λ(t) =
∫ t
0
λ(u) du and 0 < λL ≤ λ(u) ≤ λU <∞, (2.26)
for all non-negative u and t, with λ being a periodic function with period c. We make
CHAPTER 2. SAMPLE-PATH VERSION OF THE PLL 29
a similar assumption for the departure process D(t) with the limiting rate being δ(t).
Corollary 2.8 (from continuous to discrete time). Suppose that we start with a
continuous-time periodic system with period c and construct a discrete-time system
by considering d evenly spaced time intervals within each periodic cycle. If the arrival
and departure processes satisfy (2.25) and (2.26), and if the conditions of Theorem
2.2 hold for each d, then the error in the discrete-time approximation due to the order
of events at each time point goes to 0 as d→ ∞.
Proof. From (2.21), we know that 0 ≤ Lk − LLk = δk + 2λk − λkfk,0. Hence, if we
consider a sequence of systems indexed by d, then the condition implies that λd,k → 0
and δd,k → 0 as d→ ∞.
We can use Corollary 2.8 to define what we mean by a continuous-time sample-
path PLL. We also exploit the basic properties of Riemann integrals; see §6 of Rudin
(1976).
Definition 2.1 (continuous-time sample-path PLL). Consider a continuous-time sys-
tem with arrival process having a periodic arrival rate and satisfying (2.25) and (2.26).
A continuous-time sample-path PLL holds, yielding relation (1.2), if (i) the discrete-
time PLL holds for each d when we form d time periods within each periodic cycle,
and (ii) the sequence of discrete-time PLLs in (1.3) can be regarded as converging
Riemann-sum approximations for the continuous-time relation in (1.2).
CHAPTER 3. STEADY-STATE STOCHASTIC VERSIONS OF THE PLL 30
Chapter 3
Steady-State Stochastic Version of
the Periodic Little’s Law
In this chapter, we discuss the stochastic analogs of Theorem 2.2. In §3.1 we define a
stationary framework in continuous time. In §3.2 we derive a steady-state stochastic
PLL by applying Theorem 2.2. In §3.3 we establish a version of PLL for theGt/GIt/∞
time-varying infinite-server model proposed in Whitt and Zhang (2017). Finally, in
§3.4 we establish a continuous-time stochastic PLL, which primarily follows from
Rolski (1989) and Fralix and Riaño (2010).
For this initial version, we aim for simplicity. Thus, we assume that the basic
stochastic process is both stationary and ergodic, so that steady-state means coincide
with long-run averages. The main idea is that we now interpret the key quantities
Lk and λk appearing in (1.3) as appropriate expected values of random variables
associated with the system in periodic steady state. As we see it, there are two main
issues:
1. What is meant by periodic steady state?
CHAPTER 3. STEADY-STATE STOCHASTIC VERSIONS OF THE PLL 31
2. What is F ck,j or, equivalently, what is the probability distribution of the LoS of
an arbitrary arrival in period k?
3.1 Periodic Steady State
In order to construct periodic steady-state, we assume that the basic stochastic pro-
cess Yn : n ∈ Z with
Yn ≡ Ynd+k,j : 0 ≤ k ≤ d− 1; j ≥ 0 (3.1)
introduced in §2.1 is a stationary sequence of non-negative random elements indexed
by the integer n. For each integer n, the random element Yn takes values in the space
(Zd)∞ ≡ Zd×Zd×· · · . See Chapter 6 of Breiman (1968), Baccelli and Brémaud (1994)
and Sigman (1995) for background on stationary processes and their application to
queues. Just as for the time-varying LL, as discussed in Fralix and Riaño (2010), it
is important to apply the Palm transformation, but we avoid that issue by exploiting
the established limits for the averages.
Without loss of generality, we now regard our stochastic processes as stationary
processes on the integers Z, negative as well as positive; see Proposition 6.5 of Breiman
(1968). As usual, we mean strictly stationary; i.e., the finite-dimensional distributions
are independent of time shifts, which in turn means that, for each k and each k-tuple
(n1, . . . , nk) of integers in Z,
(Yn1 , . . . ,Ynk)
d= (Yn1+m, . . . ,Ynk+m) for all m ∈ Z, (3.2)
with d= denoting equality in distribution.
As a consequence of the stationarity assumed for Yn : n ∈ Z, we also have
CHAPTER 3. STEADY-STATE STOCHASTIC VERSIONS OF THE PLL 32
stationarity for the associated stochastic process (And+k, Qnd+k) : n ∈ Z, where
And+k is the number of arrivals at time nd+ k and Qnd+k is the number of customers
in the system at time nd+k, both of which are defined in §2.1, but now that we have
stationarity on all integers, negative as well as positive. Thus, we have
And+k ≡ Ynd+k,0 and Qnd+k ≡∞∑j=0
Ynd+k−j,j. (3.3)
Hence, with some abuse of notation, we let (Yk,j : j ≥ 0, Ak, Qk) be a stationary
random element. In this stochastic setting, we have
λk ≡ E(Ak) = E(Yk,0) and Lk ≡ E(Qk) =∞∑j=0
E(Yk−j,j). (3.4)
3.2 The Stochastic PLL
We now come to the second issue: In the stochastic setting it remains to define
F ck,j ≡ P (Wk > j), where Wk is the time in system for an arbitrary arrival in period
k. It is natural to define F ck,j by requiring that it agree with the limit of the averages
F ck,j(n) in (2.1). That limit is well defined if we assume that the basic sequence is
ergodic as well as stationary, with 0 < E(Yk,0) <∞ for all k.
With the stationary framework on all the integers, positive and negative, the
stochastic PLL becomes very elementary, because there are no edge effects.
Theorem 3.1 (stochastic PLL). Suppose that Yn : n ∈ Z in (3.1) is stationary and
ergodic with 0 < λk ≡ E(Yk,0) <∞, 0 ≤ k ≤ d− 1. Then, for each k, 0 ≤ k ≤ d− 1,
and j ≥ 0,
F ck,j(n) → F c
k,j ≡E(Yk,j)
E(Yk,0)w.p.1 as n→ ∞ (3.5)
CHAPTER 3. STEADY-STATE STOCHASTIC VERSIONS OF THE PLL 33
and
Lk ≡ E(Qk) =∞∑j=0
λk−jFck−j,j. (3.6)
Proof. First, the stationary and ergodic condition with the specified moment as-
sumptions implies that
Yk,j(n) → E(Yk,j) = λkFck,j as n→ ∞ w.p.1 (3.7)
for all k and j in view of (2.1), (3.4) and the definition of F ck,j in (3.5). Then (3.5)
follows immediately by continuity under the division operation by a quantity with a
strictly positive limit. If we multiply and divide by λk within the representation for
E(Qk) in (3.4), then we see that
E(Qk) =∞∑j=0
E(Yk−j,j) =∞∑j=0
λkE(Yk−j,j)
λk=
∞∑j=0
λkFck−j,j. (3.8)
In closing this section, we remark that the distribution Fk,j can be seen to corre-
spond to an underlying Palm measure, but we do not develop that framework here;
see Sigman and Whitt (2019).
3.3 The Discrete-Time Periodic Gt/GIt/∞ Model
The candidate model for the ED proposed in Whitt and Zhang (2017) was a special
case of the periodic Gt/GIt/∞ infinite-server model, in which the LoS variables are
mutually independent and independent of the arrival process, with a ccdf F ck,j ≡
P (Wk ≥ j) for a steady-state LoS in time period k that depends only on the time
period k within the periodic cycle (a week). This strong local condition provides a
sufficient condition for the steady-state stochastic PLL. That can be seen from the
CHAPTER 3. STEADY-STATE STOCHASTIC VERSIONS OF THE PLL 34
following proposition.
Proposition 3.2 (theGt/GIt/∞ special case). For the stationary Gt/GIt/∞ infinite-
server model specified above, where Y ≡ Yn : n ∈ Z in (3.1) is strictly stationary
with finite mean values, then
E(Yk,j) = F ck,jE(Yk,0), (3.9)
consistent with formula (3.5).
Proof. Let Wk,i : i ≥ 1 be a sequence of i.i.d. LoS variables associated with
arrivals in time period k, which is also independent of the arrival process and the
other LoS variables. Under those conditions,
E(Yk,j) =∞∑
m=1
m∑i=1
P (Wk,i > j)P (Ak = m)
=∞∑
m=1
mF ck,jP (Ak = m) = F c
k,jE(Ak) = F ck,jE(Yk,0).
3.4 Exploiting the Palm Theory in Continuous Time
Finally, we observe that the Palm theory in Rolski (1989) and Fralix and Riaño (2010)
also provides a general steady-state stochastic PLL in the conventional continuous-
time setting. For this steady-state stochastic result, we now assume that the arrival
process is a simple point process (arrivals occur one a time) on the entire real line
with a well-defined arrival rate λ(t) at time t. This was satisfied in the ED application
(our applied motivation) because the arrival times have very detailed time stamps.
Thus, letting N(t)−N(s) be the number of arrivals in the interval [s, t], we assume
CHAPTER 3. STEADY-STATE STOCHASTIC VERSIONS OF THE PLL 35
that
E(N(t))− E(N(s)) =
∫ t
s
λ(s) ds, (3.10)
where the arrival-rate function λ(t) is a periodic function with periodic cycle length
c, which is also right continuous with left limits.
As in section 2 of Fralix and Riaño (2010), we let W (t) be the waiting time of the
last arrival before time t. That convention yields a well-defined waiting-time process
W (t) : t ∈ R.
As a continuous-time analog of the periodic stationarity assumed in §3.1, we
assume that the queue-length (the number of customers in system) process Q ≡
Q(t) : t ∈ R has a distribution that is invariant under time shifts by c. The
arrival process is included by the upward jumps on Q. As a consequence of this
c-stationarity, the set of Palm measures Ns : s ∈ R associated with the arrival
process N is periodic with period c. The mean queue length is expressed in terms of
the tail probabilities
F ct,x ≡ Pt(W (t) > x) (3.11)
under the Palm measures Pt, which are periodic with period c.
Then, paralleling the remark after Theorem 3.1 of Fralix and Riaño (2010), The-
orem 3.1 of Fralix and Riaño (2010) implies the following continuous-analog of (1.3),
which was already given for the Mt/GI/1 special case in Rolski (1989).
Theorem 3.3 (continuous-time PLL following from Rolski (1989) and Fralix and
Riaño (2010)). Under the conditions above,
E(Q(t)) =
∫ ∞
0
F ct−s,sλ(t− s) ds (3.12)
where F ct−s,s is defined in (3.11).
CHAPTER 4. CLT VERSION OF THE PLL 36
Chapter 4
Central-Limit-Theorem Version of
the Periodic Little’s Law
In this chapter, we establish a CLT version of the PLL, paralleling the CLT versions
of the ordinary Little’s law in Glynn and Whitt (1986), Glynn and Whitt (1987),
Glynn and Whitt (1988b), and Whitt (2012). We look for the relationship between
the CLT-scaled arrival rates, LoS distributions and occupancy level in the periodic
setting, so we now require stronger assumptions than for the sample-path PLL in
Theorem 2.2, but we obtain a rate of convergence for the sample-path PLL. We will
show that linking the indirect estimator of occupancy level with the arrival rates
and LoS distributions is straightforward by the continuous mapping theorem, how-
ever, stronger conditions are needed to ensure that the CLT-scaled direct estimator
converges to the same limit.
One simple and practical case, where we assume that both the number of arrivals
at each time and the LoS of each arrival are bounded and the limits are Gaussian, is
provided first in §4.2. It applies to the ED in Whitt and Zhang (2017) and evidently
to most practical cases. Two more complicated sufficient conditions are stated in
CHAPTER 4. CLT VERSION OF THE PLL 37
Chapter 5. The CLT versions of PLL also have statistical applications, as in Glynn
and Whitt (1989) and Kim and Whitt (2013), which we discuss in §4.4.
4.1 More Notation and Definitions
In this section we assume the LoSs are bounded by J , i.e. Xi,j = 0 for j ≥ J and all
i. All the vectors are understood to be column vectors.
For 0 ≤ k ≤ d− 1, let
F ck (n) ≡ (F c
k,0(n), Fck,1(n), · · · , F c
k,J−1(n)) ∈ RJ ,
F ck ≡ (F c
k,0, Fck,1, · · · , F c
k,J−1) ∈ RJ . (4.1)
Let the law-of-large-numbers-scaled (LLN-scaled) averages be
λλλ(n) ≡ (λ0(n), λ1(n), · · · , λd−1(n)) ∈ Rd,
δδδ(n) ≡ (δ0(n), δ1(n), · · · , δd−1(n)) ∈ Rd,
F cF cF c(n) ≡ (F c0 (n)
T , F c1 (n)
T , · · · , F cd−1(n)
T ) ∈ Rd×J ,
WWW (n) ≡ (W0(n), W1(n), · · · , Wd−1(n)) ∈ Rd,
QQQ(n) ≡ (Q0(n), Q1(n), · · · , Qd−1(n)) ∈ Rd,
LLL(n) ≡ (L0(n), L1(n), · · · , Ld−1(n)) ∈ Rd, (4.2)
CHAPTER 4. CLT VERSION OF THE PLL 38
and the CLT-scaled averages be
λλλ(n) ≡√n(λλλ(n)− λλλ) ∈ Rd,
δδδ(n) ≡√n(δδδ(n)− δδδ) ∈ Rd,
F cF cF c(n) ≡√n(F cF cF c(n)−F cF cF c) ∈ Rd×J ,
WWW (n) ≡√n(WWW (n)−WWW ) ∈ Rd,
QQQ(n) ≡√n(QQQ(n)−LLL) ∈ Rd,
LLL(n) ≡√n(LLL(n)−LLL) ∈ Rd, (4.3)
where the deterministic centering constants are
λλλ ≡ (λ0, λ1, · · · , λd−1) ∈ Rd,
δδδ ≡ (δ0, δ1, · · · , δd−1) ∈ Rd,
F cF cF c ≡ ((F c0 )
T , (F c1 )
T , · · · , (F cd−1)
T ) ∈ Rd×J ,
WWW ≡ (W0,W1, · · · ,Wd−1) ∈ Rd,
LLL ≡ (L0, L1, · · · , Ld−1) ∈ Rd. (4.4)
Again, all the above constant vectors and matrices in (4.4) can be extended as peri-
odic functions with period d, but for convenience, throughout this thesis, we use the
modulo function, [x] = x mod d, to treat k beyond 0 ≤ k ≤ d− 1.
4.2 A Practical Version for Applications
We now state our first CLT version of the PLL. Because it is a special case of the
more general one stated later, the proof is not given separately. The statement is
CHAPTER 4. CLT VERSION OF THE PLL 39
proved after we introduce Theorem 4.3, Proposition 4.4 and the two corollaries right
after them. Let ⇒ denote convergence in distribution.
Theorem 4.1 (practical CLT version of the PLL). If the following conditions hold:
(E1) (λλλ(n), F cF cF c(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × Rd×J as n→ ∞,
(E2) the number of arrivals in a single discrete time period is bounded, and
(E3) Wk =J−1∑j=0
F ck,j and Lk =
J−1∑j=0
λ[k−j]Fc[k−j],j for 0 ≤ k ≤ d− 1, (4.5)
for (λλλ(n), F cF cF c(n)) in (4.3) and (Wk, Lk) in (4.4), where the limit (ΛΛΛ,ΓΓΓ) in (E1) is a
zero-mean Gaussian random vector, then
(λλλ(n), δδδ(n), F cF cF c(n), QQQ(n), LLL(n), WWW (n)
)⇒ (ΛΛΛ,∆∆∆,ΓΓΓ,ΥΥΥ,ΥΥΥ,ΩΩΩ)
in R2d × Rd×J × R3d, (4.6)
where ΩΩΩ = (Ω0,Ω1, · · · ,Ωd−1), ΥΥΥ = (Υ0,Υ1, · · · ,Υd−1) and ∆∆∆ = (∆0,∆1, · · · ,∆d−1)
are given by
Ωk =J−1∑j=0
Γk,j, Υk =J−1∑j=0
λ[k−j]Γ[k−j],j +J−1∑j=0
Λ[k−j]Fc[k−j],j and
∆k = Υ[k] −Υ[k+1] + Λ[k+1] for 0 ≤ k ≤ d− 1, (4.7)
and (ΛΛΛ,∆∆∆,ΓΓΓ,ΥΥΥ,ΥΥΥ,ΩΩΩ) is also jointly zero-mean Gaussian distributed.
Theorem 4.1 implies that, given the joint convergence of the CLT-scaled arrival
and LoS processes (λλλ(n), F cF cF c(n)) in (E1), with the associated regularity conditions,
we get the associated convergence for the CLT-scaled direct estimate of occupancy
QQQ(n) as well as the indirect estimate LLL(n), jointly with the other processes. We get
CHAPTER 4. CLT VERSION OF THE PLL 40
the consistency requirement in (E3) from the PLL in Theorem 2.2.
4.3 The General Version
In this section, we introduce the more general CLT version of the PLL, where we
allow both the number of arrivals and the LoS distributions to be unbounded. Thus
it involves countably-infinite-dimensional spaces. We show that the connection among
the CLT-scaled indirect estimator of occupancy level LLL(n), the arrival rates λλλ(n) and
the LoS distributions F cF cF c(n) can be established using the continuous mapping theorem.
More conditions are needed to ensure that the CLT-scaled direct estimator QQQ(n) has
the same limit as LLL(n).
Let Rd be the d-dimensional real space with usual topology; let R∞ be the space
of sequences x = (x0, x1, · · · ) of real numbers with the topology determined by the
convergence of all finite-dimensional projections (which is induced by the metric in
Example 1.2 of Billingsley (1999)); let ℓ1 ⊆ R∞ be the subspace of R∞ which contains
sequences with finite absolute sums; and let ℓ∞ ⊆ R∞ be the subspace of R∞ which
contains sequences with bounded values, i.e.
ℓ1 ≡ xxx = (x0, x1, · · · ) ∈ R∞ : ||xxx||1 ≡∞∑i=0
|xi| <∞,
ℓ∞ ≡ xxx = (x0, x1, · · · ) ∈ R∞ : ||xxx||∞ ≡ supi
|xi| <∞ (4.8)
We equip ℓ1 and ℓ∞ with the norms || · ||1 and || · ||∞ as above, under which both
are Banach spaces. (We remark that ℓ1 ⊆ ℓ∞ and the dual space of ℓ1 is ℓ∞.) Then
we can define (ℓ1)d as the d-fold product space of ℓ1 with the norm equal to the
maximum of the ℓ1-norms of each component, i.e. if yyy ≡ (yyy0, yyy1, · · · , yyyd−1) ∈ (ℓ1)d,
where yyyi ≡ (yi,0, yi,1, · · · ) ∈ ℓ1, then ||yyy||1,d = maxi=0,··· ,d−1||yyyi||1, and we use the
CHAPTER 4. CLT VERSION OF THE PLL 41
topology induced by this norm; see §6.1 for more discussion of this space.
All the quantities related to LoS distributions are now in those infinite dimensional
spaces. To be specific,
F ck (n) ≡ (F c
k,0(n), Fck,1(n), F
ck,2(n), · · · ) ∈ ℓ1, 0 ≤ k ≤ d− 1,
F ck ≡ (F c
k,0, Fck,1, F
ck,2, · · · ) ∈ R∞, 0 ≤ k ≤ d− 1,
F cF cF c(n) ≡ (F c0 (n), F
c1 (n), · · · , F c
d−1(n)) ∈ (ℓ1)d,
F cF cF c(n) ≡√n(F cF cF c(n)−F cF cF c) ∈ (R∞)d. (4.9)
Other quantities are still the same as we defined in (4.3) and (4.4).
We also define the LLN-scaled and CLT-scaled difference between the direct and
indirect occupancy estimators as
RRR(n) ≡ LLL(n)− QQQ(n) ∈ Rd,
RRR(n) ≡ LLL(n)− QQQ(n) ∈ Rd. (4.10)
Just as in Glynn and Whitt (1986), the continuous mapping theorem plays a
crucial role in the proof of the theorem. Hence, we start by introducing the key
mappings and show that they are continuous.
For 0 ≤ k ≤ d− 1, let xxx ∈ Rd and yyy ∈ (ℓ1)d and define hk : Rd × (ℓ1)
d → R as
hk(xxx,yyy) =∞∑j=0
x[k−j]y[k−j],j, (4.11)
where, again, [k] = k mod d is the modulo function. As a critical condition, in the
following lemma we show two functions that will be used as mapping functions are
continuous. The proof can be found in §8.2.1.
CHAPTER 4. CLT VERSION OF THE PLL 42
Lemma 4.2. For a given constant zzz ∈ (ℓ1)d, let fzzz : Rd × Rd × (ℓ1)
d → Rd and
g : (ℓ1)d → Rd be defined by
fzzz(xxx(1),xxx(2), yyy) ≡ (fzzz,0(xxx
(1),xxx(2), yyy), fzzz,1(xxx(1),xxx(2), yyy), · · · , fzzz,d−1(xxx
(1),xxx(2), yyy)),
g(yyy) ≡ (∞∑j=0
y0,j,
∞∑j=0
y1,j, · · · ,∞∑j=0
yd−1,j), (4.12)
where fzzz,k : Rd × Rd × (ℓ1)d → R is defined as
fzzz,k(xxx(1),xxx(2), yyy) ≡ hk(xxx
(1), yyy) + hk(xxx(2), zzz), 0 ≤ k ≤ d− 1,
with hk defined in (4.11). The functions fzzz(xxx(1),xxx(2), yyy) and g(yyy) in (4.12) are con-
tinuous.
The following is a counterexample to show the two functions above are not con-
tinuous if we replace ℓ1 by R∞. Let 000 = (0, 0, · · · , 0) be the zero vector in proper
spaces depending on the context.
Example 4.1 (discontinuity of f and g when yyy ∈ (R∞)d). It suffices to see that
g0(yyy0) ≡∑∞
j=0 y0,j from R∞ → R is not continuous in general. Let y(i)0,j ≡ Ii=j
for all i, j ≥ 0, so that yyy(i)0 ≡ (y(i)0,0, y
(i)0,1, · · · ) are all 0 except the ith component.
Under the metric of R∞ as in Example 1.2 of Billingsley (1999), i.e. ρ(xxx(1),xxx(2)) =∑∞i=1min(1, |x(1)i − x
(2)i |)/2i, where xxx(j) = (x1, x2, · · · ), j = 1, 2, yyy(i)0 → 000 as i → ∞,
however limi→∞
g0(yyy(i)0 ) = 1 = 0 = g0(000).
We now state our general CLT version of the PLL with indirect estimator of the
occupancy level. The proof appears in §8.2.2.
Theorem 4.3 (CLT version of the PLL with indirect estimator). If the following
CHAPTER 4. CLT VERSION OF THE PLL 43
conditions hold:
(C1) (λλλ(n), F cF cF c(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × (ℓ1)d as n→ ∞,
(C2) WWW = g(F cF cF c) and LLL = fF cF cF c(000,λλλ,000), (4.13)
for (λλλ(n), F cF cF c(n)) in (4.3) and (λλλ,F cF cF c,WWW,LLL) in (4.4), using Lemma 4.2, then
((λλλ(n), F cF cF c(n), LLL(n), WWW (n)
),(λλλ(n), F cF cF c(n), LLL(n), WWW (n)
))⇒
((λλλ,F cF cF c,LLL,WWW
),(ΛΛΛ,ΓΓΓ,ΥΥΥ,ΩΩΩ
))in (Rd × (ℓ1)
d × R2d)× (Rd × (ℓ1)d × R2d), (4.14)
where ΩΩΩ = g(ΓΓΓ) and ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ); in particular,
Ωk =∞∑j=0
Γk,j, Υk =∞∑j=0
λ[k−j]Γ[k−j],j +∞∑j=0
Λ[k−j]Fc[k−j],j. (4.15)
Note that condition (C1) requires that the random elements belong to the specified
space. Also the limit for the first five elements in (4.14) yield a weak LLN, consistent
with the PLL.
Just as in the ordinary Little’s law and the PLL, it is interesting to consider when
can we add the direct estimator of occupancy level QQQ(n) into the joint convergence.
Clearly, if we can show that RRR(n) defined in (4.10) goes to 000 in distribution, then
QQQ(n) converges to the same limit as LLL(n) in distribution as n→ ∞ and can be added
into the joint convergence in (4.14) by applying Theorem 3.1 of Billingsley (1999).
For clarity, we summarize that observation in the following proposition.
Proposition 4.4. If, in addition to condition (C1) and (C2), we have RRR(n) ⇒ 000,
CHAPTER 4. CLT VERSION OF THE PLL 44
then
((λλλ(n), δδδ(n), F cF cF c(n), QQQ(n), LLL(n), WWW (n)
), (4.16)(
λλλ(n), δδδ(n), F cF cF c(n), QQQ(n), LLL(n), WWW (n), RRR(n)))
⇒((λλλ,δδδ,F cF cF c,LLL,LLL,WWW
),(ΛΛΛ,∆∆∆,ΓΓΓ,ΥΥΥ,ΥΥΥ,ΩΩΩ,000
))in (R2d × (ℓ1)
d × R3d)× (R2d × (ℓ1)d × R4d), (4.17)
where δδδ is in (4.4), ∆∆∆ = (∆0,∆1, · · · ,∆d−1) is given by
∆k = Υ[k] −Υ[k+1] + Λ[k+1] for 0 ≤ k ≤ d− 1, (4.18)
and all the other variables have the same meaning as in Theorem 4.3.
The proof is given in §8.2.3.
The following corollary shows that if we have Gaussian limits in the conditions,
then the limits in the conclusion are also Gaussian. The proof is in §8.2.4.
Corollary 4.5 (Gaussian limits). If, in addition to the conditions of Theorem 4.3,
(ΛΛΛ,ΓΓΓ) has a zero-mean Gaussian distribution with covariance and cross-covariance
Cov(ΛΛΛ,ΛΛΛ) = ΣΛ, Cov(Γk,Γl) = ΣΓ:k,l and Cov(ΛΛΛ,Γk) = ΣΛ,Γ:k, 0 ≤ k, l ≤ d− 1, then,
CHAPTER 4. CLT VERSION OF THE PLL 45
(ΩΩΩ,ΥΥΥ) also has a zero-mean Gaussian distribution with
Cov(ΩΩΩ,ΩΩΩ)k,l =∞∑i=1
∞∑j=1
ΣΓ:k,li,j ,
Cov(ΥΥΥ,ΥΥΥ)k,l =∞∑i=0
∞∑j=0
λ[k−i]λ[l−j]ΣΓ:[k−i],[l−j]i+1,j+1 +
∞∑i=0
∞∑j=0
λ[k−i]Fc[l−j],jΣ
Λ,Γ:[k−j][l−j]+1,i+1
+∞∑i=0
∞∑j=0
F c[k−i],iλ[l−j]Σ
Λ,Γ:[l−j][k−i]+1,j+1 +
∞∑i=0
∞∑j=0
F c[k−i],iF
c[l−j],jΣ
Λ[k−i]+1,[l−j]+1,
Cov(ΩΩΩ,ΥΥΥ)k,l =∞∑i=1
∞∑j=0
λ[l−j]ΣΓ:k,[l−j]i,j+1 +
∞∑i=1
∞∑j=0
F c[l−j],jΣ
Λ,Γ:k[l−j]+1,i. (4.19)
The next corollary show that boundedness is a simple yet practical condition such
that RRR(n) ⇒ 000, so that Theorem 4.1 is covered by Theorem 4.3 and Proposition 4.4
as a special case. The proof is provided in §8.2.5.
Corollary 4.6 (bounded LoS). Suppose that the number of arrivals in a discrete time
period and the LoS are bounded (by J). Then condition (C1) reduces to convergence
in Rd × (RJ)d for some finite J . If, in addition, conditions (C1) and (C2) hold, then
RRR(n) ⇒ 000, so that the joint convergence (4.16) in Proposition 4.4 holds.
If we don’t want to strengthen the conditions with boundedness, two other reason-
able sufficient conditions are given in Chapter 5 such that RRR(n) ⇒ 000 and a full version
of CLT-PLL can be achieved with both CLS-scaled direct and indirect estimator of
occupancy level in the joint convergence as in Proposition 4.4. However, before we go
to that, we make a remark that connects the CLT-PLL to the ordinary CLT version
of Little’s law which is studied in Glynn and Whitt (1988b). We also give potential
statistical applications of our theorems.
Remark 4.1 (connection to the ordinary CLT version of Little’s law in Glynn and
Whitt (1988b)). When d = 1, Theorem 4.3 reduces to an ordinary CLT version of LL,
CHAPTER 4. CLT VERSION OF THE PLL 46
which can be compared to the earlier one in Glynn and Whitt (1988b). Specifically,
when d = 1, we have ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ) = λΩλΩλΩ +WΛWΛWΛ. All the discussion in Glynn and
Whitt (1988b) is in continuous time, but it is not difficult to translate everything into
discrete time. To avoid notation conflicts and make things clear, we add tildes on all
the variables in Glynn and Whitt (1988b). ΛΛΛ is the limit of the time-averaged arrival
rate, so it corresponds to the second term of (1.2) in Glynn and Whitt (1988b), i.e.
ΛΛΛ = −λ3/2U . ΩΩΩ is the limit of customer-averaged LoS, so in the notation of Glynn and
Whitt (1988b), assuming Theorem 1 of Glynn and Whitt (1988b) holds, we should
write
t1/2(
∑N(t)k=1 Wk
N(t)− w) =
t
N(t)t−1/2(
N(t)∑k=1
Wk − N(t)w)
=t
N(t)t−1/2(
N(t)∑k=1
Wk − λtw + λtw − N(t)w)
=t
N(t)
(t−1/2(
N(t)∑k=1
Wk − λtw)− t−1/2w(N(t)− λt))
⇒ λ−1(λ1/2(W − wU) + wλ3/2U) = ΩΩΩ,
so that λ1/2W = λΩΩΩ+λ1/2wU−wλ3/2U . Finally, ΥΥΥ as the limit of LLL(n) corresponds to
the sixth term in (1.2) of Glynn and Whitt (1988b), and if we further have RRR(n) ⇒ 000,
then ΥΥΥ as the limit of QQQ(n) is the time-averaged occupancy level of the system which
corresponds to the eighth term in (1.2) of Glynn and Whitt (1988b). Both the sixth
and the eighth term have the common limit
λ1/2(W − wU) = λΩΩΩ + λ1/2wU − wλ3/2U − λ1/2wU
= λΩΩΩ− wλ3/2U = λΩΩΩ + wΛΛΛ.
CHAPTER 4. CLT VERSION OF THE PLL 47
Note that λ and w being the arrival rate and mean LoS correspond to λλλ and WWW ,
respectively, in our notation, so the terms in the two theorems match perfectly.
In the notation of Glynn and Whitt (1988b), our Theorem 3.1 in the case d = 1
actually states that if
t−1/2(N(t)− λt,
N(t)∑k=1
Wk − N(t)w)⇒ (ΛΛΛ,ΩΩΩ),
and Theorem 2(f) of Glynn and Whitt (1988b) (which is exactly (C1)) holds, then
we have the joint convergence of the second, sixth and eighth terms in (1.2) of Glynn
and Whitt (1988b).
Unlike Theorem 1 of Glynn and Whitt (1988b), Theorem 4.3 and Proposition
4.4 do not require stationarity. That is explained by the extra convergence RRR(n) ⇒
000, which corresponds to Theorem 2(f) of Glynn and Whitt (1988b). That extra
convergence in Glynn and Whitt (1988b) is implied by the stationarity. Inspired by
that observation, we will obtain a similar (stronger) sufficient condition for RRR(n) ⇒ 000
involving stationarity in Theorem 5.2.
4.4 Statistical Applications
The CLT versions of the PLL have statistical applications. First, Theorem 4.3 sup-
ports using the indirect estimator of the occupancy level via the PLL. Moreover,
confidence areas for the occupancy-level estimators can be constructed. In particu-
lar, if Theorem 4.1 or Corollary 4.5 holds, then we can have confidence ellipsoids forLLL.
To be specific, LLL(n) ⇒ ΥΥΥ ∼ N(000,ΣΥ) in Rd where ΣΥ, the covariance matrix of ΥΥΥ, is
determined by (4.19). Then when n is large n(LLL−LLL(n))T (ΣΥ)−1(LLL−LLL(n)) has approx-
imately a standard normal distribution. Let qα be such that P (|N(0, 1)| ≤ qα) = 1−α,
CHAPTER 4. CLT VERSION OF THE PLL 48
where 0 ≤ α < 1, then
xxx ∈ Rd : n(xxx− LLL(n))T (ΣΥ)−1(xxx− LLL(n)) ≤ qα, (4.20)
which is an ellipsoid centered at LLL(n), is a confidence ellipsoid for LLL with approximate
confidence level 1−α. By (4.15) and (4.19), the more negatively related ΛΛΛ and ΓΓΓ are,
the more asymptotically efficient the indirect estimator of the number of customer in
the system becomes. However, unlike the case for the ordinary Little’s law discussed
in Glynn and Whitt (1989), there are many covariance terms in (4.19) even if we only
consider the variance of ΥΥΥk for a given k, so it is not straightforward to compare the
asymptotic efficiency of the estimator when we change the other two elements in the
PLL.
Secondly, Theorem 4.1 as well as the two more sufficient conditions we will intro-
duce in Chapter 5 tell us when will the indirect estimator LLL(n) and the natural direct
estimator QQQ(n) for the number of customers in the system have the same asymp-
totic efficiency, i.e. LLL(n) and QQQ(n) converge to the same random variable. Then the
confidence area analysis above in (4.20) also holds for QQQ(n).
To apply (4.20), we need to estimate ΣΥ. The expression is complicated, but
we will soon establish a useful sufficient condition. In particular, under the con-
ditions of Theorem 5.1, we can estimate ΣΥ by (5.2) using λk(n) to estimate λk,
(n− 1)−1∑n
m=1(AAAm − λλλ(n))(AAAm − λλλ(n))T to estimate ΣΛ and the empirical distribu-
tions to estimate Fk,j and F ck,j.
CHAPTER 5. SUFFICIENT CONDITIONS FOR THE CLT VERSION OF THEPLL 49
Chapter 5
Sufficient Conditions for the
Central-Limit-Theorem Version of
the Periodic Little’s Law
In this chapter we provide convenient sufficient conditions for the two conditions in
Theorem 4.3 and RRR(n) ⇒ 000 to be satisfied, so that we have Proposition 4.4 and the
joint convergence in (4.16).
The first sufficient condition is independence. To state it, for n = 0, 1, 2, · · · , let
AAAn be the vector of arrivals at the d times within period n, i.e.,
AAAn = (A0+nd, A1+nd, · · · , Ad−1+nd) ∈ Rd. (5.1)
For vectors, let > mean strict order for all components.
Theorem 5.1 (independent case). If the following conditions hold:
(I1) AAAn, n = 0, 1, 2, · · · , are i.i.d. with EAAA0 = λλλ > 0 and Var(AAA0) = ΣΛ,
CHAPTER 5. SUFFICIENT CONDITIONS FOR THE CLT VERSION OF THEPLL 50
(I2) the LoS are mutually independent and independent of the arrival process, having
a cumulative distribution fuction that depends only on the discrete time period
k, and
(I3) the LoS distribution satisfies F ck,j ∼ O(j−(3+δ)) for all k and some δ > 0,
then conditions (C1), (C2) and RRR(n) ⇒ 000 are satisfied, so that the joint convergence
in (4.16) holds. Further, (ΛΛΛ,ΓΓΓ) has a zero-mean Gaussian distribution in Rd × (ℓ1)d
with ΛΛΛ ∼ N(000,ΣΛ), Cov(Γk,j,Γk,s) = λkFk,jFck,s for 0 ≤ k ≤ d− 1 and 0 ≤ j ≤ s, and
ΛΛΛ,ΓΓΓ0,ΓΓΓ1, · · · ,ΓΓΓd−1 are independent. As a special case of Corollary 4.5, (ΩΩΩ,ΥΥΥ) also
has a zero-mean Gaussian distribution with, for 1 ≤ k ≤ l ≤ d− 1,
Cov(ΩΩΩ,ΩΩΩ)k,l =
λk
∑∞i=0
∑∞j=0 Fk,mini,jF
ck,maxi,j, for k = l,
0, for k = l,
Cov(ΥΥΥ,ΥΥΥ)k,l =k∑
s=0
λ3s(∞∑
m=0
∞∑n=0
Fs,mink−s+md,l−s+ndFcs,maxk−s+md,l−s+nd)
+l∑
s=k+1
λ3s(∞∑
m=1
∞∑n=0
Fs,mink−s+md,l−s+ndFcs,maxk−s+md,l−s+nd)
+d−1∑
s=l+1
λ3s(∞∑
m=1
∞∑n=1
Fs,mink−s+md,l−s+ndFcs,maxk−s+md,l−s+nd)
+d−1∑i=0
d−1∑j=0
ck,ick,jΣΛi,j,
Cov(ΩΩΩ,ΥΥΥ)k,l =∞∑i=1
∞∑m=0
λ2kFk,mini,l−k+mdFck,maxi,l−k+md. (5.2)
where
ck,j =
∑∞
m=0 Fcj,k−j+md for 0 ≤ j ≤ k,∑∞
m=1 Fcj,k−j+md for k + 1 ≤ j ≤ d− 1.
The proofs can be found in §8.2.6.
CHAPTER 5. SUFFICIENT CONDITIONS FOR THE CLT VERSION OF THEPLL 51
Remark 5.1 (necessity of condition (I3)). In condition (I3) we assume a 3 + δ rate
of decay of the tail distribution of the LoS, which is stronger than an ordinary CLT
requires. However, this is needed in the proof as can be seen in (8.39). More generally,
it remains to determine if condition (I3) is necessary, i.e., if it can be replaced by a
2 + δ rate of decay.
It is significant that Theorem 5.1 can be applied to the MT/GIt/∞ queueing
model we built for the ED patient flow analysis in Whitt and Zhang (2017), which
had mutually independent Gaussian daily totals for the arrivals, but dependence
among the hourly arrivals within each day. However, we conjecture that more data
would show dependence among the daily totals as well. Moreover, we want to treat
queueing models that are not infinite-server models. For infinite-server models, each
LoS (sojourn time) coincides with the service time, but that is not the case for other
service systems. Hence, we next show that Theorem 4.3 and Proposition 4.4 can
be applied to more general queueing models by replacing the i.i.d. condition with
stationarity plus appropriate mixing, as in Chapter 4 of Billingsley (1999).
We start with the framework used in Chapter 3, where we consider the composite
arrival+service input process stochastic process YYY ≡ YYY n : n ∈ N with
YYY n ≡ Ynd+k,j : 0 ≤ k ≤ d− 1; j ≥ 0. (5.3)
For simplicity, we assume the LoS distributions are bounded, i.e. Yk,j = 0, j ≥ J , for
some constant J > 0, then each YYY n becomes a finite dimensional vector, and we can
CHAPTER 5. SUFFICIENT CONDITIONS FOR THE CLT VERSION OF THEPLL 52
regard YYY n as a d× J random matrix
YnYnYn =
Ynd+0,0 Ynd+0,1 · · · Ynd+0,J−1
Ynd+1,0 Ynd+1,1 · · · Ynd+1,J−1
... ... ... ...
Ynd+d−1,0 Ynd+d−1,1 · · · Ynd+d−1,J−1
∈ Rd×J . (5.4)
We will start by assuming YYY n : n ∈ N is a stationary process. By adding a suitable
mixing condition, we can show that it satisfies a multivariate CLT, then we exploit
the relationship between YYY n : n ∈ N and (λλλ(n), F cF cF c(n)) and show that (C1) and
(C2) hold. Finally, we show that RRR(n) ⇒ 000 so that Theorem 4.3 and Proposition 4.4
hold.
Before we state the theorem, we make a few definitions. For convenience, we
directly use YYY n, n ∈ N in (5.3) as the example. We say the process YYY n, n ∈ N is
strictly stationary if
(YYY 0,YYY 1, · · · ,YYY m)d= (YYY 0+n,YYY 1+n, · · · ,YYY m+n), for all m ≥ 0 and n ≥ 0, (5.5)
where d= means equal in distribution. For m ≤ n, let Fn
m ≡ σ(YYY m,YYY m+1, · · · ,YYY n)
be the sigma-algebra generated by the family of random variables, where we allow
n = ∞. And we say YYY n, n ∈ N is strongly mixing (or α-mixing) if α(n) → 0, where
α(n) ≡ supm∈N,A∈Fm
0 ,B∈F∞m+n
|P (A ∩B)− P (A)P (B)|, (5.6)
is called the strong mixing coefficient, following Theorem 18.5.3 of Ibragimov and
Linnik (1971) and Theorem 0 of Bradley (1985).
Now we have the following theorem, whose proof is provided in §8.2.7.
CHAPTER 5. SUFFICIENT CONDITIONS FOR THE CLT VERSION OF THEPLL 53
Theorem 5.2 (stationary+bounded case). If the following conditions hold:
(S1) the composite process YYY n, n ∈ N in (5.3) is strictly stationary and strongly
mixing,
(S2) EAAA0 = λλλ > 0 and there exists some δ > 0, such that E||AAA0||2+δ∞ < ∞ and∑∞
n=1 α(n)δ/(2+δ) <∞, and
(S3) The LoS distributions are bounded (by J),
for YYY n in (5.3) and AAA0 in (5.1), then condtions (C1), (C2) and RRR(n) ⇒ 000 are satisfied,
so that we have the joint convergence in (4.16). Furthermore, (ΛΛΛ,ΓΓΓ) has a zero-mean
Gaussian distribution in Rd × (RJ)d, so that Corollary 4.5 is also satisfied.
We remark that there is a large literature on the CLT under weak dependence so
that many generalizations of Theorem 5.2 are possible. For example, in Chapter 4
of Billingsley (1999) CLTs under mixing conditions in §19 are reduced to CLTs for
martingale-difference sequences in §18.
Finally, other than the two conditions we discussed in this section, there could be
many other sufficient conditions such that Proposition 4.4 holds.
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 54
Chapter 6
Weak Convergence of Random
Elements of (ℓ1)d(ℓ1)d(ℓ1)d
In this chapter, we provide background on the weak convergence of random elements
of (ℓ1)d. We used this space so that we could apply the continuous mapping theorem
to prove Theorem 4.3. In particular, condition (C1) in Theorem 4.3 requires the
convergence F cF cF c(n) ⇒ ΓΓΓ in the space (ℓ1)d.
Within functional analysis, (ℓ1)d is quite standard, as it is a finite product space
associated with the well known Banach space ℓ1. However, neither ℓ1 nor (ℓ1)d is
standard in weak convergence theory, especially in applications on queueing systems.
In fact, we know of no previous use in queueing theory. Of course, we have the well-
established general and powerful weak convergence theory in general metric spaces,
such as Billingsley (1999), but we would like to specialize the general theory to the
space (ℓ1)d. Besides, there is a substantial literature on Banach spaces and weak
convergence in Banach spaces, e.g., Araújo and Giné (1980), Ledoux and Talagrand
(1991), and Megginson (1998). We will draw on this developed theory.
We will give a practical criterion for checking weak convergence in the space (ℓ1)d.
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 55
The criterion is not used directly in our main theorems, but we hope it helps readers
better understand the meaning of weak convergence in (ℓ1)d and could be useful in
future research. We will also discuss the established CLT for Banach-space random
variables and specialize it to the space ℓ1, which, by giving this special case, helps to
understand when we can have such convergence as in (C1) and Gaussian limits as in
Corollary 4.5.
6.1 A Criterion for Weak Convergence in (ℓ1)d(ℓ1)d(ℓ1)d
The ℓ1 space is a well-defined separable Banach space. Throughout this paper, we
always use B∗ to represent the dual space of a Banach space B. As a special case
of a general metric space, the weak convergence of probability distributions on it is
well defined, just as in general metric spaces; see Billingsley (1999). For more on
random variables in Banach spaces and convergence of those random variables, see
Araújo and Giné (1980) and Ledoux and Talagrand (1991). Here we briefly state
some definitions and results about Banach spaces and probability measures defined
on them, which can be found in Megginson (1998) and Ledoux and Talagrand (1991).
Then we exploit the general results in the special case of (ℓ1)d space.
The (ℓ1)d space is the product (or in some literature called direct sum and denoted
as ℓ1 ⊕ ℓ1 ⊕ · · · ⊕ ℓ1) of ℓ1 spaces. We refer to Megginson (1998) for general Banach
spaces. Since ℓ1 is a separable Banach space, (ℓ1)d is also a separable Banach space
with the norm || · ||1,d we defined earlier in §4.1. The dual of a product of Banach
spaces is the product of the corresponding duals. Because (ℓ1)∗ = ℓ∞, thus the dual
of (ℓ1)d is (ℓ∞)d.
We want to have a good sufficient condition for weak convergence in (ℓ1)d. For
that purpose, let UUU (n) ≡ (UUU(n)0 , . . . ,UUU
(n)d−1) ∈ (ℓ1)
d, where UUU (n)k ≡ (UUU
(n)k,j : j ≥ 0) ∈ ℓ1,
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 56
and similarly for a prospective limit UUU . A result from Ledoux and Talagrand (1991)
states that UUU (n) converges weakly to UUU as soon as h(UUU (n)) converges weakly (as
a sequence of real valued random variables) to h(UUU) for every h in a weakly dense
subset of ((ℓ1)d)∗ = (ℓ∞)d, and UUU (n) is tight, i.e. for each ϵ > 0, there exists a
compact set K ⊆ (ℓ1)d such that P (UUU (n) ∈ K) ≥ 1 − ϵ for all n. Now we transform
the above conditions to more explicit ones in the case of (ℓ1)d space.
For n ≥ 1, denote
On ≡ (yyy ∈ (R∞)d : yk,j = 0, 0 ≤ k ≤ d− 1, j ≥ n− 1,
then O = ∪∞n=1On is the subset of (ℓ∞)d with only finite many non-zero components.
Lemma 6.1 (weakly dense set of (ℓ1)d). The subset O of (ℓ∞)d containing elements
with only finitely many non-zero components is a weakly dense subset of (ℓ∞)d.
Proof. It suffices to show that for any yyy ∈ (ℓ∞)d, there exists a sequence yyy(i) ⊆ O
which weakly converges to yyy. Let
y(i)k,j =
yk,j, if j < i,
0, otherwise,(6.1)
for all 0 ≤ k ≤ d−1 and i ≥ 1. Then yyy(i) ∈ O. Now we show that it weakly converges
to yyy.
Note that ((ℓ∞)d)∗ = (ℓ∞,s)d, where ℓ∞,s ≡ xxx = (x0, x1, · · · ) ∈ R∞ :
∑∞i=0 xi <
∞ is the set of all summable (but not necessarily absolutely summable) sequences.
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 57
For any xxx ∈ ((ℓ∞)d)∗ = (ℓ∞,s)d,
xxx(yyy(i)) =d−1∑k=0
∞∑j=0
xk,jy(i)k,j =
d−1∑k=0
i−1∑j=0
xk,jyk,j →d−1∑k=0
∞∑j=0
xk,jyk,j = xxx(yyy) as i→ ∞,
(6.2)
which indicates yyy(i) weakly converges to yyy, hence O is weakly dense in (ℓ∞)d.
Next, we describe the compact sets in (ℓ1)d space.
Lemma 6.2 (compact sets in (ℓ1)d space). A set K ⊆ (ℓ1)
d is compact if and only if
K is bounded and closed and for each ϵ > 0, there exists Jϵ such that for all yyy ∈ K,∑d−1k=0
∑∞j=Jϵ
|yk,j| < ϵ.
Proof. If K ⊆ (ℓ1)d is compact, then K must be closed and bounded. Further, it is
well known that a subset in metric space is compact if and only if it is complete and
totally bounded (see for example Theorem 45.1 of Munkres (2000)). Totally bounded
means that, for any ϵ > 0 given, we can find a finite set of yyy(i)Ni=1 ⊆ K such that
K is covered by the N ϵ-balls centered at those points. Because the set is finite, we
can find Jϵ large enough such that∑d−1
k=0
∑∞j=Jϵ
|y(i)k,j| < ϵ for all yyy(i) ∈ K. Then for
any yyy ∈ K, we can find yyy(i) such that ||yyy − yyy(i)||1,d < ϵ, so
d−1∑k=0
∞∑j=Jϵ
|yk,j| ≤d−1∑k=0
∞∑j=Jϵ
|yk,j − y(i)k,j|+
d−1∑k=0
∞∑j=Jϵ
|y(i)k,j|
≤d−1∑k=0
∞∑j=0
|yk,j − y(i)k,j|+
d−1∑k=0
∞∑j=Jϵ
|y(i)k,j|
≤ d||yyy − yyy(i)||1,d +d−1∑k=0
∞∑j=Jϵ
|y(i)k,j|
≤ dϵ+ ϵ. (6.3)
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 58
Conversely, assume K ⊆ (ℓ1)d is bounded and closed and for each ϵ > 0, there
exists Jϵ such that for all yyy ∈ K,∑d−1
k=0
∑∞j=Jϵ
|yk,j| < ϵ. Since (ℓ1)d is a Banach
space and K is closed, K is a complete subset. Assume K is bounded by C/d,
i.e. ||yyy||1,d ≤ C/d for all yyy ∈ K. For any ϵ > 0 fixed, let Jϵ be as above. Let
K0 ≡ xxx ∈ (RJϵ)d : ||xxx||1 ≤ C, which is bounded, thus totally bounded since it has
finite dimension. So there exists a finite set xxx(i)Ni=1 ⊆ (RJϵ)d such that the ϵ-balls
centered at those points cover K0. Assume yyy(i)Ni=1 ⊆ (ℓ1)d are the corresponding
points of xxx(i)Ni=1 when we naturally embed (RJϵ)d to (ℓ1)d, i.e. y(i)k,j = x
(i)k,j for j ≤ Jϵ
and 0 otherwise. For any yyy ∈ K, let xxx be the natural projection of yyy on (RJϵ)d.
Notice that ||xxx||1 ≤ d||yyy||1,d ≤ C, which means xxx ∈ K0, so there is xxx(i) such that
||xxx− xxx(i)||1 ≤ ϵ. Then,
||yyy − yyy(i)||1,d ≤d−1∑k=0
Jϵ−1∑j=0
|yk,j − y(i)k,j|+
d−1∑k=0
∞∑j=Jϵ
|yk,j − y(i)k,j|
≤ ||xxx− xxx(i)||1 +d−1∑k=0
∞∑j=Jϵ
|yk,j|
≤ ϵ+ ϵ, (6.4)
which implies that K is covered by the finite number of ϵ-balls in (ℓ1)d centered at
yyy(i)Ni=1. So K is totally bounded, thus compact.
Now we can give an easy-to-check sufficient condition for weak convergence of
random elements in (ℓ1)d, which can be used to establish (C1) when applying Theorem
4.3. For any m,C > 0, let
Km,C ≡ xxx ∈ (ℓ1)d : ||x||1,d ≤ C,
d−1∑k=0
∞∑j=mn
|xk,j| < n−1 for all n ≥ 1 (6.5)
which is obviously a compact set by Lemma 6.2.
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 59
Theorem 6.3 (criterion for convergence of random elements of (ℓ1)d)). Convergence
in distribution UUU (n) ⇒ UUU in (ℓ1)d as n→ ∞ holds if
(i) For all ϵ > 0, there exists m, C and corresponding Km,C, such that
P (UUU (n) ∈ Km,C) > 1− ϵ (6.6)
and
(ii) for all J , 0 ≤ J <∞,
(UUU(n)k,j : 0 ≤ k ≤ d− 1; 0 ≤ j ≤ J) ⇒ (UUUk,j : 0 ≤ k ≤ d− 1; 0 ≤ j ≤ J) in RdJ .
(6.7)
Condition (ii) holds if and only if (iii) for all J , 0 ≤ J < ∞, and for all sets of
real number ak,j : 1 ≤ k ≤ d− 1, 0 ≤ j ≤ J
d−1∑k=0
J∑J=0
ak,jUUU(n)k,j ⇒
d−1∑k=0
J∑j=0
ak,jUUUk,j. (6.8)
Proof. Condition (i) ensures that UUU (n) is tight. Condition (ii) is equivalent to
condition (iii) by the familiar Cramer-Wold device; see p. 382 of Billingsley (1995).
And condition (iii) mean yyy(UUU (n)) → yyy(UUU) in distribution for all yyy ∈ O, which by
Lemma 6.1 is a weakly dense set in ((ℓ1)d)∗. So that condition (i) with condition (iii)
ensures UUU (n) ⇒ UUU in (ℓ1)d.
6.2 Central Limit Theorem in ℓ1ℓ1ℓ1
In this section we specialize the CLT in general Banach spaces for i.i.d. random
elements to ℓ1. First, we emphasize that, even for i.i.d. random elements, the classical
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 60
CLT does not hold in all Banach spaces, but only for some “good” Banach spaces.
Fortunately, ℓ1 is such a “good” Banach space. To make this clear, we do a quick
review; more related theory can be found in Araújo and Giné (1980) and Ledoux and
Talagrand (1991).
We first introduce two concepts: cotype of a Banach space and pre-Gaussian
random variables. A separable Banach space B with norm || · || is said to be of cotype
q if there is a constant Cq such that for all finite sequences xi ∈ B,
(∑i
||xi||q)1/q ≤ Cq||∑i
ϵixi||, (6.9)
where ϵi is a Rademacher sequance, i.e., i.i.d. random variables with P (ϵi = 1) =
P (ϵi = −1) = 1/2. It is known that ℓ1 is of cotype 2; see page 274 in §9.2 of Ledoux
and Talagrand (1991).
We say that a random variable UUU in a Banach space B has a Gaussian distribution
if, for every h in the dual space B∗, h(UUU) has a one-dimensional Gaussian distribution.
A random variable UUU in B, with Eh(UUU) = 0 and Eh2(UUU) <∞ for every h in B∗ (i.e.
weakly centered and square integrable), is pre-Gaussian if its covariance is also the
covariance of a Gaussian Borel probability measure on B. A weakly centered and
square integrable random variable UUU = (U0, U1, · · · ) in ℓ1 is pre-Gaussian if and only
if∞∑k=0
(E|Uk|2)1/2 <∞; (6.10)
see page 261 of Ledoux and Talagrand (1991).
Theorem 6.4 (CLT for Banach-space random variables from Ledoux and Talagrand
(1991) Theorem 10.7). If UUU is pre-Gaussian with values in a separable cotype-2 Banach
space, then UUU satisfies the CLT, i.e., n−1/2∑n
i=1UUU(i) where UUU (i) are i.i.d. copies of
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 61
UUU , converges in distribution, where the limit is Gaussian.
We remark that the limit distribution must have a Gaussian distribution because
if UUU satisfies the CLT in B, then h(UUU) satisfies the ordinary CLT with a Gaussian
limit for h ∈ B∗. If we include (6.10), then we get the following corollary.
Corollary 6.5 (CLT for ℓ1-valued random variables). If UUU = (U0, U1, · · · ) ∈ ℓ1 is
pre-Gaussian, then UUU satisfies the CLT. The limit has a Gaussian distribution with
the same covariance structure as UUU .
We now discuss how this background theory is relevant here. We will be consid-
ering a discrete random variable Y taking values in the non-negative integers. Let
F cj ≡ P (Y ≥ j) for k = 0, 1, · · · be the ccdf of Y and Fj ≡ 1− F c
j . Let Y (i) be i.i.d.
random variables each distributed as Y and let
U(i)j ≡ IY (i)≥j − F c
j .
Then UUU (i) are i.i.d. random variables in ℓ1 distributed as UUU with Eh(UUU) = 0 for all
h ∈ ℓ∞ (i.e., for all h ∈ (ℓ1)∗). We want UUU to be pre-Gaussian, so we require that
(6.10) holds, i.e.,
∞∑k=0
(E(IY (1)≥j − F cj )
2)1/2 =∞∑k=0
(FjFcj )
1/2 <∞. (6.11)
Hence, a sufficient condition for (6.11) is
F cj ∼ O(j−(2+ϵ)) for some ϵ > 0, (6.12)
which is actually implied by condition (I3) in Theorem 5.1. Condition (6.12) is also
sufficient for Eh(UUU) = 0 and Eh2(UUU) < ∞ for every h in ℓ∞. Hence, UUU is pre-
CHAPTER 6. WEAK CONVERGENCE OF RANDOM ELEMENTS OF (ℓ1)d 62
Gaussian.
63
Part II
An Application of the Periodic
Little’s Law
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 64
Chapter 7
Real-Time Predictor for the
Occupancy Level
In this chapter we propose a real-time procedure for predicting the hourly occupancy
level of an Emergency Department, given we have good predictions for daily arrival
totals. It is a forecast problem for a specific sample path. So instead of using the long-
run limit terms as in the PLL formula, we directly use the counts in our estimator.
The predictor is in the indirect estimation form of the occupancy level, inspired by
the PLL. A key observation is that predicting indirectly via the PLL can have a
better result than directly using the occupancy level history records to construct the
estimator. This application uses the same patient flow dataset in §1.3 as our applied
motivation of the PLL. Our goal is to predict the occupancy level of the Emergency
Department for the next few hours. For readers’ convenience, we will brief the dataset
and the problem settings in §7.1, then we introduce the real-time predictor based on
the PLL in §7.2.
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 65
7.1 The Data and the Problem
Armony et al. (2015) introduces a patient flow dataset of the Emergency Department
of Rambam Hospital located in Haifa, Isreal. The data is owned by the Service
Enterprise Engineering (SEE) Center at the Technion – Israel Institute of Technology.
They made the data publicly available so that researchers may conduct reproducible
studies based on it.
The data includes arrival and departure times of individual patients from January
2004 to October 2007, but it is also limited in that it does not contain a detailed
account of all the steps and processes that take place during a patient’s stay. In
Whitt and Zhang (2017) we proposed an aggregate time-varying stochastic queueing
model for the patient flow after doing a careful investigation to 25 weeks data. One key
observation is that both the arrival rates and the LoS distributions are periodic time-
varying instead of being constant. Then in Whitt and Zhang (2019b) we addressed
the prediction issue by involving the whole available data.
This chapter is based on section 5 of Whitt and Zhang (2019b), where we have
got good predictions for the daily arrival totals for the Emergency Department in the
previous sections of that paper by a complicated time-series model. We also have
estimations of the LoS distributions by using historical empirical LoS distributions.
We showed that in short term (within a few months), we can view the LoS distribution
for any given hour in a week as a constant. The idea is to forecast the future occupancy
level via arrival rates and the LoS distributions, which is exactly what the PLL enables
us to do. Moreover, we make it in real-time by using the most recent available data
up to the time epoch that we make predictions.
We divide the dataset into a training set and a test set. For simplicity, we use
the Lebanon war period as our split point because the war period is an outlier that
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 66
needs to discard, so the training set contains 923 days from Jan. 1, 2004 to July 11,
2006 and the test set has 443 days from Aug. 15, 2006 to Oct. 31, 2007. In Whitt
and Zhang (2019b), we use the training set to find the optimal model to forecast the
daily arrival totals, which turns out to be a seasonal autoregressive integrated moving
average with exogenous regressors model (SARIMAX), which incorporates the local
holiday indicators and weather variables. We use the test set to check our choice and
compare with other methods. We use mean squared error (MSE, by which we mean
the estimate) to measure the precision of our prediction, defined by
MSE =1
N
N∑i=1
(yi − yi)2, (7.1)
where N is the sample size, yi are the true values, and yi are the predicted values.
7.2 Real-Time Predictor for the Occupancy Level
We start by predicting the occupancy level for 1 hour ahead. Afterward, we show
that the method can be extended to more hours ahead, but losing predictive power
as the time interval increases. For this occupancy prediction, we exploit the elapsed
service times of the patients initially in the system. Thus, an important reference
point is the average LoS, which is about 4− 4.5 hours for our data. Clearly the value
of information about current patients will dissipate as the time interval increases to
4 hours and beyond.
7.2.1 Predicting Occupancy Level One Hour Ahead
Here we use the same discrete-time framework as we used in Whitt and Zhang (2017)
Whitt and Zhang (2019c) and Whitt and Zhang (2019a) since we assume both the
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 67
arrival rate and the LoS distribution are fixed within an hour. Let Yk,j denote the
number of patients that arrived in the kth discrete time period (in our case one time
period represents an hour) and had a LoS larger or equal to j hours, i.e., they left at
or after time period k + j. We assume all the arrivals occurred at the beginning of
each time period and departures occurred at the end of the time period and we count
the number of patients (occupancy level) in the middle of each time period. Under
this rule, the occupancy level in time period k is
Qk =∞∑j=0
Yk−j,j. (7.2)
Taking expectations of the above equation under proper assumptions gives us the
stochastic version of the PLL as in Theorem 3.1. Our goal is to approximate Qk,
assuming that we are given all the arrival and departure epochs for each patient that
occurred before time period k.
We built a stochastic model for the patients flow in Whitt and Zhang (2017),
which describes the arrival process and the time-varying LoS distributions quite well.
Given the daily arrival totals, hourly arrival rate curve and the LoS distributions for
each hour, we can easily calculate the occupancy level of the system. However, how
can we predict the occupancy level for the next hour given all the history up to now?
We propose a real-time predictor as follow.
To predict Qk, first we observed that we can use finite summation to approximate
the infinite sum, because in reality we can always view the LoS distributions as
bounded. Actually only 3.60% of all the patients had a LoS greater or equal to 13
hours. Hence we divided Qk into two parts and estimated them separately, letting
Qk =12∑j=0
Yk−j,j +∞∑
j=13
Yk−j,j ≡12∑j=0
Yk−j,j +Rk,13. (7.3)
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 68
Note that Yk,0 is the total number of arrivals in time period k. Since we showed
in Whitt and Zhang (2017) that the arrival process could be modeled as a non-
homogeneous Poisson process within a day given the daily arrival totals, given that
we have already got the predicted daily arrival totals, we only need to estimate the
arrival rate function within a day. We use the empirical hourly arrival rate curve
by combining the arrivals on the same day-of-week of the latest 10 weeks as our
estimated arrival rate for a day. We estimated the LoS distribution similarly by using
the empirical distribution of the same hour from the latest 10 weeks because the LoS
distributions also change slowly.
To conveniently express our estimator, we re-index our time period from k to
a three-element tuple (w, d, h), where w ∈ 0, 1, 2, · · · represents the week index,
d ∈ 0, 1, 2, 3, 4, 5, 6 is the day-of-week index and h ∈ 0, 1, 2, · · · , 23 is the hour
index. There is a one-to-one relation between these indices k and the tuples (w, d, h),
so that we will use them exchangeably in the following part of the thesis. We also
let (w, d, h) ± x denote add/minus x time periods (hours) to time period (w, d, h).
Let A(w,d) be the predicted daily arrival totals of week w, day-of-week d, and let
A(w,d) ≡∑23
h=0 Y(w,d,h),0 to be the true daily arrival totals. Then we let the estimator
for Y(w,d,h),0 be
Y(w,d,h),0 ≡ A(w,d) ∗λ(w,d,h)∑23h=0 λ(w,d,h)
= A(w,d) ∗∑10
i=1 Y(w−i,d,h),0∑10i=1A(w−i,d)
, (7.4)
where λ(w,d,h) ≡ (1/10)∑10
i=1 Y(w−i,d,h),0 is the estimated hourly arrival rate for time
period (w, d, h).
To estimate Yk−j,j for j = 1, 2, · · · , 12, note that we can observe Yk−j,j−1 for
j = 1, 2, · · · , 12 at time period k − 1, i.e. the number of patients that arrived at
time period k − j and still stayed in the system at time period k − 1. Let Fk be the
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 69
information filtration; i.e., Fk denotes all the observable arrival and LoS information
up to time k. Since we showed in Whitt and Zhang (2017) that the LoSs as i.i.d.
random variables given the arrival time, the conditional expectation of Yk−j,j can be
expressed as the product of Yk−j,j−1 and the corresponding survival probability of
each patient, which is
E(Yk−j,j|Fk−1) = Yk−j,j−1 ∗ Pk−j(W ≥ j|W ≥ j − 1) ≡ Yk−j,j−1 ∗ pk−j,j−1, (7.5)
where W is a random variable following the LoS distribution of customers that arrived
at time period k − j, and pk−j,j−1 is the probability of a patient that arrived at time
period k− j and did not leave the system up to time period (k− j) + (j− 1) = k− 1
will still be there at time period k. Again, we use the latest 10 weeks history data to
estimate that probability. We estimated p(w,d,h),j by
p(w,d,h),j ≡∑10
i=1 Y(w−i,d,h),j+1∑10i=1 Y(w−i,d,h),j
, j = 0, 1, · · · , 11. (7.6)
After combining these components, the estimator for Y(w,d,h)−j,j is
Y(w,d,h)−j,j = Y(w,d,h)−j,j−1 ∗ p(w,d,h)−j,j−1, j = 1, 2, · · · , 12. (7.7)
Finally, we need to estimate Rk,13, the number of patients in time period k that
had already been in the system for greater or equal to 13 hours. Instead of estimating
Rk,13 directly, we actually estimate rk,13 ≡ Rk,13/Qk = 1 − (∑12
j=0 Yk−j,j)/Qk by the
history data of the latest 10 weeks. Under the other indexing scheme, this can be
written as
r(w,d,h),13 ≡ 1−∑10
i=1
∑12j=0 Y(w−i,d,h)−j,j∑10
i=1Q(w−i,d,h)
. (7.8)
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 70
The reason we did like this is that we only used Yi,j and Qi for i ≤ k and j =
0, 1, · · · ,min12, k− i, which means we only need to keep a finite dimensional record
of the Y matrix. Of course the number we chose here (12) is somewhat arbitrary,
we could also keep record of Yk,j for j up to 11 or 13, say. However, in making this
choice, we note that smaller values will cause us to waste some information we have,
while larger values will likely produce inaccurate estimates of pk,j for large j’s.
Combining the estimation for each part in (7.4), (7.6), (7.7) and (7.8), we get the
estimator for Q(w,d,h) as
Q(w,d,h) ≡∑12
j=0 Y(w,d,h)−j,j
1− r(w,d,h),13
. (7.9)
We use the prediction results of daily arrival totals by SARIMAX(6, 1, 0)(0, 0, 2)7
as we introduced in §4.4 of Whitt and Zhang (2019b) for A(w,d) in (7.4). Since we
need 10 weeks data to estimate the arrival rates and the LoS distributions before we
start predicting the occupancy level, we make the hourly occupancy prediction from
12 p.m. Oct.25, 2006, which is 10 weeks after the start of test set, to 11 p.m. Oct.31,
2007, i.e. 8928 hours or equivalently 372 days.
The MSE of 1-hour-ahead real-time occupancy prediction is 14.65 (MAPE 10.59).
For comparison, we also apply two other simple prediction methods applied to the
same test period. Both are direct estimators which only use the occupancy level
history. The first alternative method is to directly use the current hour occupancy
level as the prediction for the next hour’s occupancy level (i.e., Q(w,d,h) = Q(w,d,h−1)),
and the MSE of this method is 23.04. The second method is to use the average
of 10 weeks’ observed occupancy levels at the same hour in a week before the one
we want to forecast as the predictor (i.e., Q(w,d,h) = 1/10∑10
i=1Q(w−i,d,h)). By this
second method, the MSE is 51.91. So we make a 30% improvement of the first
naive prediction method and even better to the second. Figure 7.1 plots part of the
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 71
prediction compared to the actual values, showing that the prediction curve is quite
close to the true curve.
2007-02-02 2007-02-03 2007-02-04 2007-02-05 2007-02-06 2007-02-07 2007-02-08 2007-02-09 2007-02-10 2007-02-11 2007-02-12time
10
20
30
40
50
60
70
occu
panc
y leve
l
truepredicted
Figure 7.1: Predicted occupancy level one hour ahead compared to the true occupancylevel.
7.2.2 Predicting Occupancy Level More Than One Hour Ahead
The real-time occupancy predictor for one hour ahead in the previous section can
easily be generalized to estimate the occupancy level two or more hours in the future.
However, as we predict farther into the future, we encounter two problems: (i) the
relevance of the elapsed times of current patients will decrease and (ii) the number of
estimators we must use will increase. For both reasons, we expect that the error will
increase. We take predicting 2 hours ahead as an example to show how to construct
the predictor; others can be done in the same way.
Recall that
Q(w,d,h) =12∑j=0
Y(w,d,h)−j,j +∞∑
j=13
Y(w,d,h)−j,j ≡12∑j=0
Y(w,d,h)−j,j +R(w,d,h),13
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 72
as in (7.3). Since we are predicting 2 hours ahead, we assume that we are now at
time (w, d, h)− 2. The estimator for Y(w,d,h)−0,0 is the same as in (7.4). However, we
can no longer apply (7.7) to estimate Y(w,d,h)−j,j because Y(w,d,h)−j,j−1 is not available
to us as we assume we are now at (w, d, h)−2. Analogous to (7.5), for j ≥ 2, we have
E(Y(w,d,h)−j,j|F(w,d,h)−2) = Y(w,d,h)−j,j−2 ∗ P(w,d,h)−j(W ≥ j|W ≥ j − 2)
≡ Y(w,d,h)−j,j−2 ∗ p(w,d,h)−j,j−2,2, (7.10)
where we use p(w,d,h),j,l to denote the probability that a patient arrived at time period
(w, d, h) will stay for at least another l hours given that the patient has been in the
system for j hours. Similar to (7.6), we estimate p(w,d,h),j,l by
p(w,d,h),j,l ≡∑10
i=1 Y(w−i,d,h),j+l∑10i=1 Y(w−i,d,h),j
. (7.11)
Hence, for Y(w,d,h)−j,j, j = 2, 3, · · · , 12, we use
Y(w,d,h)−j,j = Y(w,d,h)−j,j−2 ∗ pk−j,j−2,2, (7.12)
as the estimator. For Y(w,d,h)−1,1, since it is in the future, we use the product of the
predicted total arrivals and the corresponding survival probability,; i.e., we use (7.6)
for j = 1 and replace Y(w,d,h)−1,0 on the right hand side by Y(w,d,h)−1,0. Finally, we use
the same equation (7.9) to get the estimator for Q(w,h,d).
When making predictions for more than 1 hour ahead, we need to use predictions
for daily arrival totals for two or more days ahead. This can be seen from equation
(7.4). When we predict the hourly total arrivals, we need the forecast of daily arrival
totals for that day. However, we can make 1 day ahead daily volume prediction only
after 00:00 on that day. For example, assume that we are now at 23:00 on day k and
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 73
we want to predict the occupancy level for 00:00-01:00 on day k+1. According to our
real-time prediction procedure, we need to know both Ak and Ak+1. Note that here
Ak is the 1-day-ahead daily arrival total prediction while Ak+1 must be 2-days-ahead
daily arrival total prediction because we have not observed the arrivals in 23:00-24:00
on day k so that we don’t know Ak yet, and the last observed daily arrival total is
Ak−2.
Table 7.1 shows the MSE for predicting the occupancy level more than 1 hour
ahead for up to 6 hours. For comparison, Table 7.2 shows the corresponding MSE by
using rolling history occupancy average using n weeks history.
# of hours ahead 1 2 3 4 5 6MSE 14.65 25.21 35.05 66.33 113.59 160.16
Table 7.1: MSE for predicting the occupancy several hours ahead.
n 1 2 3 4 5MSE 82.24 64.29 57.43 55.74 54.92n 6 7 8 9 10
MSE 53.92 53.10 52.78 52.52 51.91
Table 7.2: MSE for predicting the occupancy by the rolling averages of n weeks.
Table 7.1 shows that beyond 4 hours ahead, the error is larger than the MSE 51.91
obtained by using rolling history occupancy average using n = 10 weeks history. For
another comparison, the MSE is only 58.83 if we predict the occupancy level by
the observed occupancy level two hour ago. Hence, we conclude that our real-time
occupancy predictor outperforms the rolling average predictor for forecasting from
1 to 3 hours in the future, but not longer. That is roughly what we expect, given
that the mean LoS for all patients is about 4 hours. Beyond that time interval, the
current state information will not provide much information. At the same time, we are
required to make too many estimations in the real-time estimation procedure, which
CHAPTER 7. REAL-TIME PREDICTOR FOR THE OCCUPANCY LEVEL 74
will make it perform worse than the rolling average estimator, which only removes
noise but does not use the recent system state.
75
Part III
Proofs
CHAPTER 8. PROOFS 76
Chapter 8
Proofs
We now provide all the postponed proofs.
8.1 Proofs for the Sample-Path Version of the Pe-
riodic Little’s Law
In this section all the limits are in the sense of almost sure convergence. Hence, we
can focus on one sample path where all the limits exist.
8.1.1 Proof of Lemma 2.1
We will show that under the Assumptions (A1), (A2) and (A3) in (2.2), we have
limn→∞
λk+ld(n), limn→∞
F ck+ld,j(n) and lim
n→∞Wk+ld(n) exist for all 0 ≤ k ≤ d − 1, l ≥ 0,
CHAPTER 8. PROOFS 77
j ≥ 0, and
limn→∞
λk+ld(n) ≡ λk+ld = λk,
limn→∞
F ck+ld,j(n) ≡ F c
k+ld,j = F ck,j, and
limn→∞
Wk+ld(n) ≡ Wk+ld = Wk, (8.1)
where λk, F ck,j and Wk are the same constants as in (2.2).
By the definition of λk,
λk = limn→∞
λk(n) = limn→∞
1
n
n∑m=1
Ak+(m−1)d
= limn→∞
1
n(Ak +
n−1∑m=1
A(k+d)+(m−1)d) = limn→∞
n− 1
n
1
n− 1
n−1∑m=1
A(k+d)+(m−1)d
= λk+d. (8.2)
Next, by (2.1), we have Yk,j(n) = λk(n)Fck,j(n). By Assumptions (A1), (A2) and
(A3) in (2.2), we know that limn→∞
Yk,j(n) = λkFck,j exists for all 0 ≤ k ≤ d − 1 and
j ≥ 0. Using the same argument as for λk, we know that limn→∞
Yk+d,j(n) = limn→∞
Yk,j(n)
for all 0 ≤ k ≤ d− 1, j ≥ 0. Then,
F ck+d,j = lim
n→∞F ck+d,j(n) =
limn→∞
Yk+d,j(n)
limn→∞
λk+d(n)
=λkF
ck,j
λk= F c
k,j, for all 0 ≤ k ≤ d− 1, j ≥ 0. (8.3)
Similarly, for Wk+d we have
Wk+d =∞∑j=0
F ck+d,j =
∞∑j=0
F ck,j = Wk for all 0 ≤ k ≤ d− 1. (8.4)
CHAPTER 8. PROOFS 78
By induction, we proved (8.1).
8.1.2 Proof of Theorems 2.2 and 2.3
The proof is done in two steps. In step 1, we show that limn→∞
Lk(n) =∑∞
j=0 λk−jFck−j,j
for k = 0, 1, · · · , d−1, and then in step 2, we show that L ≡ limn→∞
Qk(n) = limn→∞
Lk(n),
thus completing the proof of Theorem 2.2 and 2.3 together.
Step 1: Given a fixed ϵ > 0, by assumption (A1), there exists N1, such that for
any n > N1, we have sup0≤k≤d−1
|λk(n)− λk| < ϵ. Given assumption (A2), by the series
form of Scheffé’s lemma (p.215 of Billingsley (1995)), we know that assumption (A3)
is equivalent to
limn→∞
∞∑j=0
|F ck,j(n)− F c
k,j| = 0. (8.5)
CHAPTER 8. PROOFS 79
So there exists N2, such that for any n > N2, we have sup0≤k≤d−1
∑∞j=0 |F c
k,j(n)−F ck,j| < ϵ.
Let N3 = maxN1, N2, then when n > N3,
|Lk(n)−∞∑j=0
λk−jFck−j,j|
= |k∑
i=0
λi(n)∞∑l=0
F ci,k−i+ld(n) +
d−1∑i=k+1
λi(n)∞∑l=1
F ci,k−i+ld(n)
−k∑
i=0
λi
∞∑l=0
F ci,k−i+ld −
d−1∑i=k+1
λi
∞∑l=1
F ci,k−i+ld|
= |k∑
j=0
λk−j(n)Fck−j,j(n) +
∞∑m=1
d∑j=1
λd−j(n)Fcd−j,(m−1)d+j+k(n)
−k∑
j=0
λk−jFck−j,j −
∞∑m=1
d∑j=1
λd−jFcd−j,(m−1)d+j+k|
≤ |k∑
j=0
λk−jFck−j,j(n) +
∞∑m=1
d∑j=1
λd−jFcd−j,(m−1)d+j+k(n)
−k∑
j=0
λk−jFck−j,j −
∞∑m=1
d∑j=1
λd−jFcd−j,(m−1)d+j+k|
+ ϵ( k∑
j=0
F ck−j,j(n) +
∞∑m=1
d∑j=1
F cd−j,(m−1)d+j+k(n)
)≤ ( max
0≤k≤d−1λk)
( k∑j=0
|F ck−j,j(n)− F c
k−j,j|
+∞∑
m=1
d∑j=1
|F cd−j,(m−1)d+j+k(n)− F c
d−j,(m−1)d+j+k|)
+ ϵ( k∑
j=0
F ck−j,j(n) +
∞∑m=1
d∑j=1
F cd−j,(m−1)d+j+k(n)
)≤ ( max
0≤k≤d−1λk)dϵ+ ϵd
(max
0≤k≤d−1Wk + ϵ
). (8.6)
Hence limn→∞
|Lk(n)−∑∞
j=0 λk−jFck−j,j| = 0 for all 0 ≤ k ≤ d− 1.
CHAPTER 8. PROOFS 80
Step 2: Next, we show that Lk ≡ limn→∞
Qk(n) = limn→∞
Lk(n). To do so, actually we
will prove that
E(n) ≡d−1∑k=0
Ek(n) ≡d−1∑k=0
|Lk(n)− Qk(n)| → 0 as n→ ∞. (8.7)
We further divide this step into 2 substeps. In the first substep, we compute the
expression of E(n), and then in the second substep we show it goes to 0 as n goes to
infinity.
Step 2.1: By some transformation, we know that
Qk(n) ≡1
n
n∑m=1
Qk+(m−1)d =1
n
n∑m=1
k+(m−1)d∑j=0
Yk+(m−1)d−j,j
=
1
n
n∑m=1
k∑j=0
Yk−j+(m−1)d,j +1
n
n∑m=2
k+(m−1)d∑j=k+1
Yk−j+(m−1)d,j
=1
n
n∑m=1
k∑j=0
Yk−j+(m−1)d,j +1
n
n−1∑m=1
md∑j=1
Yd−j+(m−1)d,j+k
=1
n
n∑m=1
k∑j=0
Yk−j+(m−1)d,j +1
n
n−1∑m=1
d∑j=1
Yd−j+(m−1)d,j+k
+1
n
n−2∑m=1
md∑j=1
Yd−j+(m−1)d,j+k+d
=1
n
n∑m=1
k∑j=0
Yk−j+(m−1)d,j +1
n
n−1∑s=1
n−s∑m=1
d∑j=1
Yd−j+(m−1)d,j+k+(s−1)d, (8.8)
CHAPTER 8. PROOFS 81
and
Lk(n) =k∑
j=0
Yk−j,j(n) +∞∑s=1
d∑j=1
Yd−j,j+k+(s−1)d(n)
=1
n
n∑m=1
k∑j=0
Yk−j+(m−1)d,j +1
n
∞∑s=1
n∑m=1
d∑j=1
Yd−j+(m−1)d,j+k+(s−1)d. (8.9)
We may now study the absolute difference between Lk(n) and Qk(n). Here
Ek(n) ≡ |Lk(n)− Qk(n)|
=1
n
n−1∑s=1
n∑m=n−s+1
d∑j=1
Yd−j+(m−1)d,j+k+(s−1)d
+1
n
∞∑s=n
n∑m=1
d∑j=1
Yd−j+(m−1)d,j+k+(s−1)d
=1
n
n∑m=1
d∑j=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d, (8.10)
and summing over k = 0, 1, · · · , d− 1 further gives
E(n) ≡d−1∑k=0
Ek(n) =d−1∑k=0
1
n
n∑m=1
d∑j=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d
=1
n
n∑m=1
d∑j=1
∞∑s=(n−m)d
Yd−j+(m−1)d,j+s. (8.11)
Step 2.2: Now it suffices to show that E(n) → 0 as n→ ∞. For that purpose,
Let N1, N2 and N3 be the same as in the beginning of the proof, depending on given
CHAPTER 8. PROOFS 82
ϵ. Then, when n > N3, we have
|∞∑j=0
Yk,j(n)−∞∑j=0
λkFck,j| = |
∞∑j=0
λk(n)Fck,j(n)−
∞∑j=0
λkFck,j|
≤ |∞∑j=0
λkFck,j(n)−
∞∑j=0
λkFck,j|+ ϵ
∞∑j=0
F ck,j(n)
≤ λk
∞∑j=0
|F ck,j(n)− F c
k,j|+ ϵ (Wk + ϵ)
≤ λkϵ+ ϵ (Wk + ϵ) . (8.12)
Assumptions (A1) and (A2) indicate that limn→∞
Yk,j(n) = λkFck,j. Again, by Scheffé’s
lemma, (8.12) is equivalent to
limn→∞
∞∑j=0
|Yk,j(n)− λkFck,j| = 0. (8.13)
For any ϵ > 0, since∑∞
j=0 λkFck,j converges for each k = 0, 1, · · · , d− 1, we know
that there exists J , such that
d−1∑k=0
∞∑j=J
λkFck,j < ϵ.
Let N4 ≡ ⌈J/d⌉ where ⌈x⌉ means the smallest integer greater than x, then when
n ≥ N4, we have nd > J . By (8.13), there exists N5 such that when n > N5,
d−1∑k=0
∞∑j=0
|Yk,j(n)− λkFck,j| < ϵ.
CHAPTER 8. PROOFS 83
And let N6 ≡ maxN4, N5, then when n ≥ N6,
d−1∑k=0
∞∑j=N6d
Yk,j(n) ≤ 2ϵ. (8.14)
And let N7 ≡ ⌈N6/ϵ⌉, then when n > N7, we have (n−N6)/n > 1− ϵ. Finally, when
n > maxN7, 2N6,
E(n) =1
n
n∑m=1
d∑j=1
∞∑s=(n−m)d
Yd−j+(m−1)d,j+s
=d∑
j=1
1
n
n−N6∑m=1
∞∑s=(n−m)d
Yd−j+(m−1)d,j+s
+d∑
j=1
1
n
n∑m=n−N6+1
∞∑s=(n−m)d
Yd−j+(m−1)d,j+s
≤d∑
j=1
1
n
n−N6∑m=1
∞∑s=N6d
Yd−j+(m−1)d,j+s +d∑
j=1
1
n
n∑m=n−N6+1
∞∑s=0
Yd−j+(m−1)d,s
≤d∑
j=1
1
n
n∑m=1
∞∑s=N6d
Yd−j+(m−1)d,j+s +d∑
j=1
1
n
n∑m=n−N6+1
∞∑s=0
Yd−j+(m−1)d,s
=d∑
j=1
∞∑s=N6d
Yd−j,j+s(n) +d∑
j=1
∞∑s=0
(Yd−j,s(n)−n−N6
nYd−j,s(n−N6))
≤ 2ϵ+ 2ϵ+ ϵ(d−1∑k=0
∞∑j=0
λkFck,j + ϵ). (8.15)
The inequality in the third line is just relaxing the index s, the next inequality
is relaxing the index m for the first term, and the last inequality is given by the
definition of N6 and N7, where the first term is bounded by 2ϵ by using (8.14), and
CHAPTER 8. PROOFS 84
the second term is bounded by
d∑j=1
∞∑s=0
(Yd−j,s(n)−n−N6
nYd−j,s(n−N6))
≤d∑
j=1
∞∑s=0
(Yd−j,s(n)− (1− ϵ)Yd−j,s(n−N6))
=d∑
j=1
∞∑s=0
(Yd−j,s(n)− Yd−j,s(n−N6)) + ϵYd−j,s(n−N6)
≤2ϵ+ ϵ(d−1∑k=0
∞∑j=0
λkFck,j + ϵ). (8.16)
So we have proved that Lk ≡ limn→∞
Qk(n) = limn→∞
Lk(n), which completes the proof
of Theorem 2.2 and 2.3.
8.1.3 Proof of the Second Half of Corollary 2.4
Proof: Here we give the second half of the proof of Corollary 2.4, i.e., the explicit
bound in (2.9).
From equation (8.10), we know that
Ek(n) =1
n
n∑m=1
d∑j=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d.
By the condition, we know that Yi,j = 0 for i ≥ 0 and j ≥ dmu. Hence, when n ≥ mu,
Ek(n) =1
n
d∑j=1
n∑m=n−mu
mu+1∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d
≤ 1
n
d∑j=1
n∑m=n−mu
mu+1∑s=n−m+1
Ad−j+(m−1)d ≤1
n
d∑j=1
n∑m=n−mu
mu+1∑s=n−m+1
λu
=1
nλud
(mu + 1)(mu + 2)
2≤ λud(mu + 2)2
2n. (8.17)
CHAPTER 8. PROOFS 85
So we have proved Corollary 2.4.
8.2 Proofs for the Central-Limit-Theorem Version
of the Periodic Little’s Law
In this section we provide the proofs of theorems, corollaries that related to the CLT
version of the PLL. We use ⇒ for convergence in distribution.
8.2.1 Proof of Lemma 4.2
To prove fzzz(xxx(1),xxx(2), yyy) is continuous from Rd × Rd × (ℓ1)
d to Rd, it suffices to
show that fzzz,k(xxx(1),xxx(2), yyy) is continuous from Rd × Rd × (ℓ1)d to R, where xxx(i) =
(x(i)0 , x
(i)1 , · · · , x
(i)d−1) ∈ Rd for i = 1, 2, and yyy = (yyy0, yyy1, · · · , yyyd−1) ∈ (ℓ1)
d are variables
and zzz = (zzz0, zzz1, · · · , zzzd−1) ∈ (ℓ1)d is a constant. Further, it suffices to show that
hk(xxx,yyy) is continuous from Rd × (ℓ1)d to R, where xxx = (x0, x1, · · · , xd−1) ∈ Rd. For
convenience, we use the maximum norm in Rd, i.e. ||xxx||∞ ≡ max0≤k≤d−1
|xk|, which is an
equivalent norm to the usual Euclidean distance, and for the space Rd × Rd × (ℓ1)d
and Rd × (ℓ1)d, we use the metric induced by the norm ||(xxx(1),xxx(2), yyy)||Rd×Rd×(ℓ1)d ≡
||xxx(1)||∞+||xxx(2)||∞+||yyy||1,d and ||(xxx,yyy)||Rd×(ℓ1)d ≡ ||xxx||∞+||yyy||1,d respectively. Because
the notation of infinity norm is the same in Rd and ℓ∞, it should not cause confusion.
First we observe that hk can be written as a inner product of two projection
functions:
hk(xxx,yyy) = ΠΠΠ1k(xxx) ·ΠΠΠ2
k(yyy), (8.18)
CHAPTER 8. PROOFS 86
where
ΠΠΠ1k(xxx) ≡ (xk, xk−1, · · · , x0, xd−1, xd−2, · · · , x0, xd−1, xd−2, · · · ) and
ΠΠΠ2k(yyy) ≡ (yk,0, yk−1,1, · · · , y0,k, yd−1,k+1, yd−2,k+1, · · · , y0,k+d, yd−1,k+d+1, · · · ). (8.19)
Note that ||ΠΠΠ1k(xxx)||∞ = ||xxx||∞ < ∞, and ||ΠΠΠ2
k(yyy)||1 ≤∑d−1
k=0 ||yyyk||1 < ∞. So obviously
ΠΠΠ1k(xxx) is continuous from (Rd, || · ||∞) to (ℓ∞, || · ||∞) and ΠΠΠ2
k(yyy) is continuous from
((ℓ1)d, || · ||1,d) to (ℓ1, || · ||1). We also have that ΠΠΠ1
k(xxx) − ΠΠΠ1k(xxx) = ΠΠΠ1
k(xxx − xxx) and
Π2kΠ2kΠ2k(yyy) −ΠΠΠ2
k(yyy) = ΠΠΠ2k(yyy − yyy) for xxx, xxx ∈ Rd and yyy, yyy ∈ (ℓ1)
d. For any δ > 0 and (xxx,yyy)
fixed, choose (xxx, yyy) such that
||(xxx,yyy)− (xxx, yyy)||Rd×(ℓ1)d = ||xxx− xxx||∞ +d−1∑k=0
||yyyk − yyyk||1 < δ,
then
|hk(xxx,yyy)− hk(xxx, yyy)| = |ΠΠΠ1k(xxx) ···ΠΠΠ2
k(yyy)−ΠΠΠ1k(xxx) ···ΠΠΠ2
k(yyy)|
≤ |ΠΠΠ1k(xxx) ···ΠΠΠ2
k(yyy)−ΠΠΠ1k(xxx) ···ΠΠΠ2
k(yyy)|+ |ΠΠΠ1k(xxx) ···ΠΠΠ2
k(yyy)−ΠΠΠ1k(xxx) ···ΠΠΠ2
k(yyy)|
≤ ||ΠΠΠ1k(xxx)||∞||ΠΠΠ2
k(yyy − yyy)||1 + ||ΠΠΠ1k(xxx− xxx)||∞||ΠΠΠ2
k(yyy)||1
≤ ||xxx||∞δ + δ
d−1∑k=0
||yyyk||1 ≤ ||xxx||∞δ + δ
d−1∑k=0
(||yyyk||1 + ||yyyk − yyyk||1)
≤ ||xxx||∞δ + δ
d−1∑k=0
||yyyk||1 + δ2 → 0 as δ → 0. (8.20)
Given that each function fzzz,k is continuous from (Rd×Rd×(ℓ1)d, ||·||Rd×Rd×(ℓ1)d) to
(R, | · |), we can conclude that fzzz is continuous from (Rd×Rd×(ℓ1)d, || · ||Rd×Rd×(ℓ1)d) to
(Rd, ||·||∞). Finally, note that we can write g(yyy) = f000(eee,000, yyy), where eee = (1, 1, · · · , 1) ∈
Rd, so that g is also continuous from ((ℓ1)d, || · ||1,d) to (Rd, || · ||∞) and the lemma is
CHAPTER 8. PROOFS 87
proved.
8.2.2 Proof of Theorem 4.3
Condition (C1) implies that F cF cF c(n) ∈ (ℓ1)d as well as F cF cF c ∈ (ℓ1)
d, which indicates that
the limiting distributions all have finite means. Moreover, (C1) implies that
(λλλ(n), F cF cF c(n)) ⇒ (λλλ,F cF cF c) in Rd × (ℓ1)d. (8.21)
Note that WWW (n) = g(F cF cF c(n)) and WWW = g(F cF cF c), so by continuous mapping theorem,
(λλλ(n), WWW (n), F cF cF c(n)) ⇒ (λλλ,WWW,F cF cF c) in R2d × (ℓ1)d. (8.22)
By the definition of LLL(n) in (4.2) and (2.6), LLL(n) is a function of (λλλ(n), F cF cF c(n)),
i.e. LLL(n) = f000(λλλ(n),000, FcF cF c(n)) where f is defined in (4.12). By Lemma 4.2, f is a
continuous function, hence
(λλλ(n), WWW (n), LLL(n), F cF cF c(n)) ⇒ (λλλ,WWW,LLL,F cF cF c) in R3d × (ℓ1)d. (8.23)
For the CLT-scaled terms, notice that WWW (n) = g(F cF cF c(n)), so by the continuous
mapping theorem,
(λλλ(n), WWW (n), F cF cF c(n)) ⇒ (ΛΛΛ,ΩΩΩ,ΓΓΓ) in R2d × (ℓ1)d, (8.24)
where ΩΩΩ = g(ΓΓΓ).
CHAPTER 8. PROOFS 88
Combining (8.23) and (8.24), by Theorem 3.9 of Billingsley (1999), we have
(λλλ(n), WWW (n), LLL(n), λλλ(n), WWW (n), F cF cF c(n), F cF cF c(n)) ⇒ (λλλ,WWW,LLL,ΛΛΛ,ΩΩΩ,F cF cF c,ΓΓΓ)
in R5d × (ℓ1)2d. (8.25)
Now we turn to LLL(n). Note that for k = 0, 1, · · · , d − 1, we can write the kth
component of LLL(n) as
√n(Lk(n)− Lk) =
√n( ∞∑
j=0
λ[k−j](n)Fc[k−j],j(n)−
∞∑j=0
λ[k−j]Fc[k−j],j
)=
√n
∞∑j=0
(λ[k−j](n)F
c[k−j],j(n)− λ[k−j](n)F
c[k−j],j
+ λ[k−j](n)Fc[k−j],j − λ[k−j]F
c[k−j],j
)= hk(λλλ(n), F
cF cF c(n)) + hk(λλλ(n), FcF cF c)
= fF cF cF c,k(λλλ(n), λλλ(n), FcF cF c(n)), (8.26)
so
LLL(n) = fF cF cF c
(λλλ(n), λλλ(n), F cF cF c(n)
). (8.27)
By Lemma 4.2, fF cF cF c is a continuous function. In addition, ΓΓΓ ∈ (ℓ1)d w.p.1, so we
can apply the continuous mapping theorem again to get
(λλλ(n), WWW (n), LLL(n), λλλ(n), WWW (n), LLL(n), F cF cF c(n), F cF cF c(n)) ⇒ (λλλ,WWW,LLL,ΛΛΛ,ΩΩΩ,ΥΥΥ,F cF cF c,ΓΓΓ)
in R6d × (ℓ1)2d, (8.28)
where ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ), which is what we want as in (4.14).
CHAPTER 8. PROOFS 89
8.2.3 Proof of Proposition 4.4
If RRR(n) ⇒ 0, then both RRR(n) = LLL(n)−QQQ(n) → 000 and LLL(n)−QQQ(n) → 0 in probability.
By applying Theorem 3.1 of Billingsley (1999) based on (8.28), we have
(λλλ(n), WWW (n), QQQ(n), LLL(n), λλλ(n), WWW (n), QQQ(n), LLL(n), F cF cF c(n), F cF cF c(n), RRR(n))
⇒ (λλλ,WWW,LLL,LLL,ΛΛΛ,ΩΩΩ,ΥΥΥ,ΥΥΥ,F cF cF c,ΓΓΓ,000) in R9d × (ℓ1)2d. (8.29)
Now consider the departure processes δδδ(n) and δδδ(n). Note that
δk(n) =1
n
n∑m=1
Dk+(m−1)d =1
n
n∑m=1
(Qk+(m−1)d −Qk+1+(m−1)d + Ak+1+(m−1)d)
=
Qk(n)− Qk+1(n) + λk+1(n), 0 ≤ k < d− 1,
Qd−1(n)− Q0(n+ 1) + λ0(n+ 1) +1
nQ0 −
1
nA0, k = d− 1,
(8.30)
and
δk(n) =
√n(Qk(n)− Qk+1(n) + λk+1(n)− Lk + Lk+1 − λk+1)
√n(Qd−1(n)− Q0(n+ 1) + λ0(n+ 1)− Lk + Lk+1 − λk+1 +
1
nQ0 −
1
nA0)
=
Qk(n)− Qk+1 + λk+1(n), 0 ≤ k < d− 1,
Qd−1(n)− Q0 + λ0(n) +1√nQ0 −
1√nA0, k = d− 1.
(8.31)
Once again by continuous mapping theorem, we get the joint convergence (4.16) in
Proposition 4.4, where δδδ is given by (4.4) and ∆k = Υ[k] − Υ[k+1] + Λ[k+1], 0 ≤ k ≤
d− 1.
CHAPTER 8. PROOFS 90
8.2.4 Proof of Corollary 4.5
To make the proof clear, we first establish two lemmas.
Lemma 8.1. Assume Xn ∼ N(µn, σ2n), n = 1, 2, · · · , and Xn → X almost surely as
n→ ∞. Then X ∼ N(µ, σ2) where µ = limn→∞
µn, σ2 = limn→∞
σ2n and Xn → X in L2 as
n→ ∞.
Proof. Let ϕXn(t) = EeitXn = exp(iµnt − 2−1σ2nt
2) be the characteristic function
of Xn. Since Xn → X almost surely, by dominated convergence theorem, we know
that limn→∞
EeitXn = EeitX for each t. So we must have limn→∞
µi = µ, limn→∞
σ2i = σ2
for some µ and σ ≥ 0. (Note that we cannot have limn→∞
µn = ∞ or limn→∞
σ2n = ∞,
which would contradict Lévy’s continuity theorem.) Then we know that EeitX =
exp(iµt− 2−1σ2t2), which shows that X ∼ N(µ, σ2).
Note that limn→∞
µn = µ and limn→∞
σ2n = σ2 also implies that sup
nEX4
n = supn(µ4
n +
6µ2nσ
2n+3σ4
n) <∞. So X2n is uniformly integrable. Hence Xn → X in L2 as n→ ∞.
Lemma 8.2. Assume NNN ∈ ℓ1 is a zero-mean normally distributed random variable
with Cov(NNN,NNN) = ΣN and aaa, bbb ∈ ℓ0 are constants. Let X ≡∑∞
i=1 aiNi = aaaT ···NNN and
Y ≡∑∞
i=1 biNi = bbbT ···NNN , then (X,Y ) is jointly zero-mean normal distributed with
Cov(X,X) = Var(X) =∞∑i=1
∞∑j=1
aiajΣNi,j <∞,
Cov(Y, Y ) = Var(Y ) =∞∑i=1
∞∑j=1
bibjΣNi,j <∞ and
Cov(X,Y ) = E(XY ) =∞∑i=1
∞∑j=1
aibjΣNi,j <∞.
CHAPTER 8. PROOFS 91
Proof. By the definition of normal distribution in ℓ1 space, X and Y are normally
distributed random variables on R. We only need to show that they are also jointly
normally distributed, i.e. for any given α, β ∈ R, αX + βY is a zero-mean normal
distributed random variable.
Denote Xn =∑n
i=1 aiNi and Yn =∑n
i=1 biNi. Since X and Y are well defined
almost surely, we know that αXn + βYn → αX + βY almost surely. Note that
αXn + βYn is normal distributed with mean zero and variance∑n
i=1
∑nj=1(αai +
βbi)(αaj+βbj)ΣNi,j. And we can apply Lemma 8.1 and know that αX+βY has normal
distribution with mean zero and variance∑∞
i=1
∑∞j=1(αai + βbi)(αaj + βbj)Σ
Ni,j. So
(X,Y ) is jointly normally distributed.
Now we derive E(XY ). By Lemma 8.1 we know that Xn → X and Yn → Y
in L2 as well, so XnYn → XY in L1 and we have E(XY ) = limn→∞
E(XnYn) =
limn→∞
∑ni=1
∑nj=1 aibjΣ
Ni,j =
∑∞i=1
∑∞j=1 aibjΣ
Ni,j <∞. Because X and Y are zero-mean
normal distributed, we have Cov(X,Y ) = E(XY ), where Cov(X,X) and Cov(Y, Y )
are special cases with X = Y .
Now we go back to the proof of 4.5
Proof Lemma 8.2 can be easily generalized to NNN ∈ (ℓ1)d only with some tedious
steps. Since ΓΓΓ ∈ (ℓ1)d almost surely, ΩΩΩ = g(ΓΓΓ) and ΥΥΥ = fF cF cF c(λλλ,ΛΛΛ,ΓΓΓ) are well defined
a.s. In addition, with the Gaussian assumption, we may apply the generalized version
of Lemma 8.2 and know that (ΩΩΩ,ΥΥΥ) has a zero-mean jointly normal distribution with
CHAPTER 8. PROOFS 92
each element being well defined a.s. and in L2, where
(Cov(ΩΩΩ,ΩΩΩ))k,l = Cov(∞∑j=0
Γk,j,
∞∑j=0
Γl,j) =∞∑i=1
∞∑j=1
ΣΓ:k,li,j ,
(Cov(ΥΥΥ,ΥΥΥ))k,l = Cov(fF cF cF c,k(λλλ,ΛΛΛ,ΓΓΓ), fF cF cF c,l(λλλ,ΛΛΛ,ΓΓΓ))
= Cov(∞∑j=0
λ[k−j]Γ[k−j],j +∞∑j=0
F c[k−j],jΛ[k−j],
∞∑j=0
λ[l−j]Γ[l−j],j +∞∑j=0
F c[l−j],jΛ[l−j])
=∞∑i=0
∞∑j=0
λ[k−i]λ[l−j]ΣΓ:[k−i],[l−j]i+1,j+1 +
∞∑i=0
∞∑j=0
λ[k−i]Fc[l−j],jΣ
Λ,Γ:[k−j][l−j]+1,i+1
+∞∑i=0
∞∑j=0
F c[k−i],iλ[l−j]Σ
Λ,Γ:[l−j][k−i]+1,j+1
+∞∑i=0
∑j=0
F c[k−i],iF
c[l−j],jΣ
Λ[k−i]+1,[l−j]+1,
(Cov(ΩΩΩ,ΥΥΥ))k,l = Cov(∞∑j=0
Γk,j,∞∑j=0
λ[l−j]Γ[l−j],j +∞∑j=0
F c[l−j,j]Λ[l−j])
=∞∑i=1
∞∑j=0
λ[l−j]ΣΓ:k,[l−j]i+1,j+1 +
∞∑i=1
∞∑j=0
F c[l−j],jΣ
Λ,Γ:k[l−j]+1,i+1.
8.2.5 Proof of Corollary 4.6
If the LoS is bounded by J , then F ck,j(n) = 0 for all 0 ≤ k ≤ d− 1 and n when j > J .
So if we have (C1), then F ck,j = 0 must holds for all k and j > J and F cF cF c(n) ∈ (ℓ1)
d
w.p. 1. Hence Γk,j = 0 for all k and j > J , and ΓΓΓ ∈ (ℓ1)d w.p. 1.
To see that RRR(n) ⇒ 0 holds, by (4.3), RRR(n) = LLL(n)− QQQ(n) =√n(LLL(n)− QQQ(n)),
so it suffices to show that√nE(n) → 0, where E(n) ≡ ||LLL(n)− QQQ(n)||1. We take M
such that (M − 1)d < J ≤Md. When n > M , because Yk,j = 0 for j > Md ≥ J , we
CHAPTER 8. PROOFS 93
have
√nE(n) =
√n1
n
n∑m=1
d∑j=1
∞∑s=(n−m)d
Yd−j+(m−1)d,j+s
= n−1/2
n∑m=n−M+1
d∑j=1
Md∑s=(n−m)d
Yd−j+(m−1)d,j+s
≤d∑
j=1
n−1/2
n∑m=n−M+1
MdAd−j+(m−1)d
≤ n−1/2d2M2C → 0 as n→ ∞, (8.32)
where C is the upper bound for the number of arrivals within a discrete time period.
This establishes RRR(n) ⇒ 0.
8.2.6 Proof of Theorem 5.1
To prove Theorem 5.1, we need a lemma to establish (C1).
Lemma 8.3. Suppose Wk, k = 0, 1, · · · , d−1, are non-negative integer-valued random
variables with F ck,j ≡ 1−Fk,j ≡ P (Wk ≥ j) ∼ O(j−(3+δ)), j ≥ 0, for some δ > 0. W (i)
k
are i.i.d. samples of Wk, denote III(i)k ≡ (IW
(i)k ≥0
, IW
(i)k ≥1
, · · · ), FFF ck ≡ (F c
k,0, Fck,1, · · · ),
and XXX(i)k ≡ III
(i)k − FFF c
k. Assume YYY (i) are i.i.d. non-negative integer-valued random
variables in Rd with EYYY (i) = µYµYµY ≡ (µY,0, µY,1, · · · , µY,d−1) > 000 and Var(YYY (i)) = ΣY ,
where all the W (i)k and YYY (j) are independent for k = 0, 1, · · · , d− 1, and i, j ≥ 1. Let
SSS(n) ≡ (S0(n), S1(n), · · · , Sd−1(n)) ≡n∑
i=1
YYY (i)
and
GGG(n) ≡ (GGG0(n),GGG1(n), · · · ,GGGd−1(n)) ≡ (
S0(n)∑i=1
XXX(i)0 ,
S1(n)∑i=1
XXX(i)1 , · · · ,
Sd−1(n)∑i=1
XXX(i)d−1).
CHAPTER 8. PROOFS 94
We claim that
n−1/2(SSS(n)− nµYµYµY ,GGG(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × (ℓ1)d, (8.33)
where ΓΓΓ = (ΓΓΓ0,ΓΓΓ1, · · · ,ΓΓΓd−1) and (ΛΛΛ,ΓΓΓ) has a zero-mean Gaussian distribution in
Rd × (ℓ1)d with ΛΛΛ ∼ N(000,ΣY ), Cov(Γk,j,Γk,s) = µY,kFk,jF
ck,s for 0 ≤ k ≤ d − 1 and
0 ≤ j ≤ s, and ΛΛΛ,ΓΓΓ0,ΓΓΓ1, · · · ,ΓΓΓd−1 are independent.
Proof. The classical multivariate CLT implies that
n−1/2(SSS(n)− nµYµYµY ) ⇒ ΛΛΛ ∼ N(000,ΣY ). (8.34)
Let ZZZk ∈ ℓ1 for 0 ≤ k ≤ d − 1 be zero-mean Gaussian distributed random variables
with Cov(Zk,j, Zk,l) = Fk,jFck,l. By Theorem 1.1 of Gut (2009),
(Sk(n))−1/2
Sk(n)∑i=1
XXX(i)k ⇒ ZZZk in ℓ1 for 0 ≤ k ≤ d− 1, (8.35)
then after applying Slutsky’s theorem, we have
n−1/2
Sk(n)∑i=1
XXX(i)k ⇒ ΓΓΓk = µ
1/2Y,kZZZk in ℓ1 for 0 ≤ k ≤ d− 1, (8.36)
so that ΓΓΓk has a zero-mean Gaussian distribution with Cov(Γk,j,Γk,s) = µY,kFk,jFck,s
for j ≤ s. Unfortunately we cannot get the joint convergence directly since they are
not independent of each other, however, we can use independent copies of YYY (i) as a
bridge to prove it.
Assume that YYY (i) are i.i.d. copies of YYY (i) and are also independent of other
CHAPTER 8. PROOFS 95
variables. Let
SSS(n) ≡ (S0(n), S1(n), · · · , Sd−1(n)) ≡n∑
i=1
YYY(i).
Because n−1/2(SSS(n)−nµYµYµY ) is independent of n−1/2GGG0(n) ≡ n−1/2∑S0(n)
i=1 XXX(i)0 , we can
apply Theorem 11.4.4 of Whitt (2002) to obtain
n−1/2(SSS(n)− nµYµYµY , GGG0(n)) ⇒ (ΛΛΛ,ΓΓΓ0) in Rd × ℓ1, (8.37)
where we can make ΛΛΛ be independent of ΓΓΓ0. For what we want, we need to show that
n−1/2||GGG0(n) − GGG0(n)||1 → 0 in probability as n → ∞ and then apply Theorem 3.1
from Billingsley (1999). For any ϵ > 0,
P (n−1/2||GGG0(n)− GGG0(n)||1 > ϵ) = P (||S0(n)∑i=1
XXX(i)0 −
S0(n)∑i=1
XXX(i)0 ||1 > n1/2ϵ)
≤P (||S0(n)∑i=1
XXX(i)0 −
S0(n)∑i=1
XXX(i)0 ||1 > n1/2ϵ, |S0(n)− S0(n)| ≤ n3/4)
+ P (|S0(n)− S0(n)| > n3/4)
≤2P ( maxI=1,2,··· ,⌊n3/4⌋
||I∑
i=1
XXX(i)0 ||1 > n1/2ϵ) + P (|S0(n)− S0(n)| > n3/4)
=2P ( maxI=1,2,··· ,⌊n3/4⌋
∞∑j=0
|I∑
i=1
X(i)0,j| > n1/2ϵ) + P (|S0(n)− S0(n)| > n3/4)
≤2P (∞∑j=0
maxI=1,2,··· ,⌊n3/4⌋
|I∑
i=1
X(i)0,j| > n1/2ϵ) + P (|S0(n)− S0(n)| > n3/4). (8.38)
For the first part, let δ1 ≤ δ/2 and C ≡ (∑∞
j=0 j−(1+δ1))−1 be constants, note that
CHAPTER 8. PROOFS 96
Var(X(i)0,j) = F0,jF
c0,j, so
P (∞∑j=0
maxI=1,2,··· ,⌊n3/4⌋
|I∑
i=1
X(i)0,j| > n1/2ϵ)
≤∞∑j=0
P ( maxI=1,2,··· ,⌊n3/4⌋
|I∑
i=1
X(i)0,j| > Cn1/2ϵj−(1+δ1))
≤3∞∑j=0
maxI=1,2,··· ,⌊n3/4⌋
P (|I∑
i=1
X(i)0,j| > Cn1/2ϵj−(1+δ1)/3)
≤27∞∑j=0
maxI=1,2,··· ,⌊n3/4⌋
Var(∑I
i=1X(i)0,j)
C2nϵ2j−(2+2δ1)
≤27∞∑j=0
C−2n−1ϵ−2j2+2δ1⌊n3/4⌋F0,jFc0,j
≤27C−2ϵ−2n−1⌊n3/4⌋∞∑j=0
j2+2δ1F c0,j → 0 as n→ ∞, (8.39)
because as a consequence of F ck,j ∼ O(j−(3+δ)),
∑∞j=0 j
2+2δ1F c0,j ≤ ∞, and the second
and third inequalities follow from Etemadi’s inequality (see page 256 of Billingsley
(1999)) and Chebyshev’s inequality respectively.
As for the second part, again using Chebyshev’s inequality, we have
P (|S0(n)− S0(n)| > n3/4) ≤ Var(S0(n)− S0(n))
n3/2≤
2nΣY1,1
n3/2→ 0 as n→ ∞.
(8.40)
So (8.38) goes to 0 as n→ ∞, hence
n−1/2(SSS(n)− nµYµYµY ,GGG0(n)) ⇒ (ΛΛΛ,ΓΓΓ0) in Rd × ℓ1. (8.41)
Using the same argument (making new i.i.d. copies of YYY (i)), we can add
GGG1(n),GGG2(n), · · · ,GGGd−1(n) one by one and in the end get (8.33).
Now we go back the proof of Theorem 5.1
CHAPTER 8. PROOFS 97
Proof. Take λλλ = E(A0+(m−1)d, A1+(m−1)d, · · · , Ad−1+(m−1)d) and F cF cF c to be the com-
plementary distribution functions of the LoS. Let WWW and LLL be as in (C2), in which
case WWW is the vector of mean LoS for a period.
We can use Lemma 8.3 to establish (C1) by letting YYY (i) be the AAAi and W(i)k be
the LoS of ith customer that arrived at discrete time period k + (m − 1)d for all
m = 1, 2, · · · . (So EW(i)k = Wk and σ2
W,k ≡ Var(W (i)k ) < ∞.) The conclusion of
Lemma 8.3 is exactly (C1).
Then we need to establish RRR(n) ⇒ 0. Note that (I1) and (I2) imply that the
SLLN holds, i.e. equation (2.2) hold, so that we have Theorem 2.2. Since RRR(n) =
LLL(n) − QQQ(n) = n1/2(LLL(n) − QQQ(n)), it suffices to show that for each 0 ≤ k ≤ d − 1,
n1/2(Lk(n)− Qk(n)) → 0 in probability as n→ ∞.
From the proof of Theorem 2.2 in Whitt and Zhang (2019c), we know that
Lk(n)− Qk(n) = n−1
n∑m=1
d∑j=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d.
So for any ϵ > 0,
P (n1/2(Lk(n)− Qk(n)) > ϵ)
=P (n−1/2
n∑m=1
d∑j=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d > ϵ)
≤d∑
j=1
P (n∑
m=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)
≤d∑
j=1
(P (
n−⌊n1/4⌋∑m=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)
+ P (n∑
m=n−⌊n1/4⌋+1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)). (8.42)
CHAPTER 8. PROOFS 98
Then we only need to prove that each of the two probabilities go to zero as n→ ∞.
For the first part, we have
P (
n−⌊n1/4⌋∑m=1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)
≤P (n−⌊n1/4⌋∑
m=1
∞∑s=⌊n1/4⌋+1
Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)
≤P (n∑
m=1
∞∑s=⌊n1/4⌋+1
Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)
≤∑n
m=1
∑∞s=⌊n1/4⌋+1EYd−j+(m−1)d,j+k+(s−1)d
n1/2d−1ϵ
≤n1/2dϵ−1
∞∑s=⌊n1/4⌋+1
λd−jFcd−j,j+k+(s−1)d
≤n1/2dλd−jϵ−1
∞∑s=⌊n1/4⌋+1
F cd−j,s ≤ n1/2Cdλd−jϵ
−1
∫ ∞
⌊n1/4⌋s−(3+δ)ds
=n1/2Cdλd−j(2 + δ)−1ϵ−1⌊n1/4⌋−(2+δ) → 0 as n→ ∞. (8.43)
For the second part, let Sk(n) =∑n
m=1Ak+(m−1)d. Then
P (n∑
m=n−⌊n1/4⌋+1
∞∑s=n−m+1
Yd−j+(m−1)d,j+k+(s−1)d > n1/2d−1ϵ)
≤P (n∑
m=n−⌊n1/4⌋+1
∞∑s=0
Yd−j+(m−1)d,s > n1/2d−1ϵ)
=P (
Sd−j(⌊n1/4⌋)∑i=1
W(i)d−j > n1/2d−1ϵ)
≤Var(
∑Sd−j(⌊n1/4⌋)i=1 W
(i)d−j) + (E(
∑Sd−j(⌊n1/4⌋)i=1 W
(i)d−j))
2
nd−2ϵ2
=n−1d2ϵ−2(⌊n1/4⌋ΣΛd−j+1,d−j+1Wd−j + ⌊n1/4⌋λd−jσ
2W,d−j + ⌊n1/4⌋2λ2d−jW
2d−j)
→ 0 as n→ ∞. (8.44)
CHAPTER 8. PROOFS 99
Hence, we have proved that n1/2(Lk(n) − Qk(n)) → 0 in probability as n → ∞,
i.e., RRR(n) ⇒ 0. Since we have established all three conditions in Theorem 4.3 and
Proposition 4.4, so their conclusions follow. As a special case of Corollary 4.5, the
covariance matrix of (ΩΩΩ,ΥΥΥ) has the same form as in (4.19) with ΣΛ:k,l = 0 for k = l,
ΣΛ,Γ:k = 0 for all k and ΣΓ:k,ki+1,j+1 = λkFk,iF
ck,j for 0 ≤ i ≤ j.
8.2.7 Proof of Theorem 5.2
As we discussed before Theorem 5.2, we will first use a CLT for stationary processes
with mixing conditions to YYY n : n ∈ N (Step 1). Such type of CLT was established
by Ibragimov; see Theorem 18.5.3 in Ibragimov and Linnik (1971) or Theorem 0 in
Bradley (1985). We apply it in RdJ by utilizing Cramér-Wold device; see Theorem
29.4 in Billingsley (1995). Then, in Step 2, we show that (C1) holds. Finally, in Step
3, we show RRR(n) ⇒ 000 to complete the proof.
Step 1: Firstly, by the definition of Yi,j and Ai, we observe that E||AAA0||2+δ∞ < ∞
in (S2) implies that E|Yk+nd,j|2+δ < ∞ for all n ≥ 0, 0 ≤ k ≤ d − 1 and 0 ≤ j ≤
J − 1. Let F ck,j ≡ EYk,j
EYk,0=
EYk,jλk
for 0 ≤ k ≤ d − 1 and 0 ≤ j ≤ J − 1, where
λλλ = (λ0, λ1, · · · , λd−1) is in (S2). Now the FFF we defined in (4.4) can also be reduced
to finite dimensional space Rd×J , i.e.
F cF cF c =
F0,0 F0,1 · · · F0,J−1
F1,0 F1,1 · · · F1,J−1
... ... ... ...
Fd−1,0 Fd−1,1 · · · Fd−1,J−1
. (8.45)
We want to show that YYY n : n ∈ N satisfies a CLT. By Cramér-Wold device, we
only need to show that∑d−1
k=0
∑J−1j=0 θk,jYk+nd,j satisfies the corresponding CLT for
CHAPTER 8. PROOFS 100
each θθθ = (θk,j)d×J ∈ Rd×J . Let
Zn(θθθ) ≡d−1∑k=0
J−1∑j=0
θk,j(Yk+nd,j − λkFck,j), n ∈ N, (8.46)
be the centralized strictly stationary process, and we need to show that it satis-
fies Theorem 0 in Bradley (1985), i.e. for some δ > 0, E|Zn(θθθ)|2+δ < ∞ and∑∞n=1 αZθθθ
(n)δ/(2+δ), where αZθθθ(n) is the strong mixing coefficient for Zn like we
introduced in (5.6). We take the δ as in (S2). Note that
E|Zn(θθθ)|2+δ = E|d−1∑k=0
J−1∑j=0
θk,j(Yk+nd,j − λkFck,j)|2+δ
≤d−1∑k=0
J−1∑j=0
|θk,j|2+δE|Yk+nd,j − λkFck,j|2+δ
≤d−1∑k=0
J−1∑j=0
|θk,j|2+δ((E|Yk+nd,j|2+δ)1/(2+δ) + λkFck,j)
2+δ <∞, (8.47)
where the first inequality uses the linearity of expectation and the second is by
Minkowski’s inequality. Since Zn(θθθ) is a linear combination of the elements of YYY n, the
corresponding sigma-algebra generated by it is smaller than the one generated by YYY n,
so that by the definition of strong mixing coefficient we know that αZθθθ(n) ≤ α(n).
Hence, given∑∞
n=1 α(n)δ/(2+δ) <∞ as in (S2), we have
∑∞n=1 αZθθθ
(n)δ/(2+δ) <∞. By
Theorem 0 in Bradley (1985), we know that σ2θθθ = EZ0(θθθ)
2 + 2∑∞
i=1E(Z0(θθθ)Zi(θθθ))
exists with the sum being absolutely convergent and n−1/2∑n−1
i=0 Zi(θθθ) ⇒ N(0, σ2θθθ).
Let
YYY (n) ≡√n(YYY (n)− diag(λλλ)F cF cF c) ≡
√n(
1
n
n−1∑i=0
YYY i − diag(λλλ)F cF cF c), (8.48)
CHAPTER 8. PROOFS 101
where diag(λλλ) ∈ Rd×d is the diagonal matrix with λλλ on the diagonal. By Cramér-Wold
device, we know that
YYY (n) ⇒ ψψψ ≡
ψ0,0 ψ0,1 · · · ψ0,J−1
ψ1,0 ψ1,1 · · · ψ1,J−1
... ... ... ...
ψd−1,0 ψd−1,1 · · · ψd−1,J−1
as n→ ∞, (8.49)
where ψψψ is normal distributed with mean 0 and covariance
Cov(ψk,j, ψl,s) =E((Yk,j − λk, Fck,j)(Yl,s − λlF
cl,s))
+ 2∞∑i=1
E((yk,j − λkFck,j)(Yl+id,j − λlF
cl,s)). (8.50)
Step 2: Now we show that (8.49) implies conditon (C1) holds. Actually it suffices
to show that (λλλ(n), F cF cF c(n)) is actually a continuous function of YYY (n). It is trivial to
see that λλλ(n) = YYY (n)eee1, where eee1 = (1, 0, 0, · · · , 0)T ∈ RJ is a continuous map from
Rd×J to Rd. For F cF cF c(n) =√n(F cF cF c(n)−F cF cF c), note that it is, by (S3), a finite dimensional
matrix, so it suffices to show that each element of it is a continuous function of YYY (n).
To see it, observe that
F ck,j(n) =
√n(F c
k,j(n)− F ck,j) =
√n(Yk,j(n)
Yk,0(n)− F c
k,j)
=√n(Yk,j(n)− λkF
ck,j)− F c
k,j(Yk,0(n)− λk)
Yk,0(n)
=Yk,j(n)− F c
k,jYk,0(n)
Yk,0(n). (8.51)
CHAPTER 8. PROOFS 102
In addition, (8.49) implies that Yk,0(n) → λk in probability, so by continuous mapping
theorem, Slutsky’s theorem and Cramér-Wold device, we know that
(λλλ(n), F cF cF c(n)) ⇒ (ΛΛΛ,ΓΓΓ) in Rd × Rd×J , (8.52)
where ΛΛΛ = (ψ0,0, ψ1,0, · · · , ψd−1,0) and Γk,j =ψk,j − F c
k,jψk,0
λk, 0 ≤ k ≤ d− 1, 0 ≤ j ≤
J − 1.
Step 3: Finally, since the LoSs are bounded, we only need to show RRR(n) ⇒ 000
to establish Theorem 4.3 and Proposition 4.4. Moreover, if (ΛΛΛ,ΓΓΓ) has a Gaussian
distribution, Theorem 4.5 holds as well.
Analogously to the proof of Corollary 4.6, it suffices to show that√nE(n) → 0 in
probability. To see that, for any ϵ > 0, take the same M as in the proof of Corollary
4.6, and based on (8.32), we have
P (√nE(n) > ϵ) ≤ P (n−1/2Md
d∑j=1
n∑m=n−M+1
Ad−j+(m−1)d > ϵ)
≤ P (n−1/2Md2n∑
m=n−M+1
||AAA(m−1)||∞ > ϵ)
≤ M2d2E||AAA0||∞ϵn1/2
→ 0 as n→ ∞, (8.53)
where in the last inequality we apply Markov inequality and exploit the stationarity of
AAAn. So we know RRR(n) ⇒ 000 and together with Step 2, we have proved the theorem.
BIBLIOGRAPHY 103
Bibliography
Araújo, A. and Giné, E. (1980). The Central Limit Theorem for Real and BanachValued Random Variables. Wiley series in probability and mathematical statistics.John Wiley & Sons.
Armony, M., Israelit, S., Mandelbaum, A., Marmor, Y. N., Tseytlin, Y., and Yom-Tov, G. B. (2015). On patient flow in hospitals: A data-based queueing-scienceperspective. Stochastic Systems, 5(1):146–194.
Baccelli, F. and Brémaud, P. (1994). Elements of Queueing Theory: Palm-MartingaleCalculus and Stochastic Recurrences, volume 26 of Applications of mathematics.Springer-Verlag.
Bertsimas, D. and Mourtzinou, G. (1997). Transient laws of non-stationary queueingsystems and their applications. Queueing Systems, 25(1–4):115–155.
Billingsley, P. (1995). Probability and Measure, volume 245 of Wiley Series in Prob-ability and Statistics. Wiley, 3rd edition.
Billingsley, P. (1999). Convergence of Probability Measures. Wiley Series in Proba-bility and Statistics. Wiley-Interscience, 2nd edition.
Bradley, R. C. (1985). On the central limit question under absolute regularity. TheAnnals of Probability, 13(4):1314–1325.
Breiman, L. (1968). Probability, volume 7 of Classics in Applied Mathematics. SIAM,1st edition.
Brockmeyer, E., Halstrøm, H. L., and Jensen, A. (1948). The Life and Works of A.k.Erlang. Akademiet for de Tekniske Videnskaber.
Brumelle, S. L. (1971). On the relation between customer and time averages in queues.Journal of Applied Probability, 8(3):508–520.
BIBLIOGRAPHY 104
Brumelle, S. L. (1972). A generalization of L = λW to moments of queue length andwaiting times. Operations Research, 20(6):1127–1136.
Bruneel, H. and Kim, B. G. (1993). Discrete-Time Models for Communication Sys-tems Including ATM, volume 205 of The Springer International Series in Engi-neering and Computer Science. Springer Science+Business Media, New York, 1stedition.
Calegari, R., Fogliatto, F. S., Lucini, F. R., Neyeloff, J., Kuchenbecker, R. S., andSchaan, B. D. (2016). Forecasting daily volume and acuity of patients in theemergency department. Computational and Mathematical Methods in Medicine,2016(2):1–8.
Cobham, A. (1954). Priority assignment in waiting line problems. Journal of theOperations Research Society of America, 2(1):70–76.
de Bruin, A. M., van Rossum, A. C., Visser, M. C., and Koole, G. M. (2007). Modelingthe emergency cardiac in-patient flow: an application of queuing theory. HealthCare Management Science, 10(2):125–137.
Doob, J. L. (1953). Stochastic Processes. John Wiley & Sons.
Eilon, S. (1969). A simpler proof of L = λW . Operations Research, 17(5):915–917.
El-Taha, M. and Stidham Jr., S. (1999). Sample-Path Analysis of Queueing Systems,volume 11 of International Series in Operations Research & Management Science.Springer Science+Business Media, New York, 1st edition.
Fiems, D. and Bruneel, H. (2002). A note on the discretization of Little’s result.Operations Research Letters, 30(1):17–18.
Fralix, B. H. and Riaño, G. (2010). A new look at transient versions of Little’s law, andM/G/1 preemptive last-come-first-served queues. Journal of Applied Probability,47(2):459–473.
Glynn, P. W. and Whitt, W. (1986). A central-limit-theorem version of L = λW .Queueing Systems, 1(2):191–215. (See Correction Note on L = λW , QueueingSystems, 12 (4), 1992, 431-432. The results are correct; minor but important changeneeded in proofs.).
Glynn, P. W. and Whitt, W. (1987). Sufficient conditions for functional limit theoremversions of L = λW . Queueing Systems, 1(3):279–287.
BIBLIOGRAPHY 105
Glynn, P. W. and Whitt, W. (1988a). An lil version of L = λW . Mathematics ofOperations Research, 13(4):693–710.
Glynn, P. W. and Whitt, W. (1988b). Ordinary CLT and WLLN versions of L = λW .Mathematics of Operations Research, 13(4):674–692.
Glynn, P. W. and Whitt, W. (1989). Indirect estimation via L = λW . OperationsResearch, 37(1):82–103.
Gross, D., Shortle, J. F., Thompson, J. M., and Harris, C. M. (2018). Foundamentalsof Queueing Theory. Wiley Series in Probability and Statistics. John Wiley & Sons,5th edition.
Gut, A. (2009). Stopped Random Walks: Limit Theorems and Applications. SpringerSeries in Operations Research and Financial Engineering. Springer, 2 edition.
Heyman, D. P. and Stidham, Jr., S. (1980). The relation between customer and timeaverages in queues. Operations Research, 28(4):983–994.
Hoot, N. R., LeBlanc, L. J., Jones, I., Levin, S. R., Zhou, C., Gadd, C. S., andAronsky, D. (2008). Forecasting emergency department crowding: a discrete eventsimulation. Annals of Emergency Medicine, 52(2):116–125.
Hung, G. R., Whitehouse, S. R., O’Neill, C. B., Gray, A. P., and Kissoon, N. (2007).Computer modeling of patient flow in a pediatric emergency department usingdiscrete event simulation. Pediatric Emergency Care, 23(1):5–10.
Ibragimov, I. A. and Linnik, I. V. (1971). Independent and Stationary Sequences ofRandom Variables. Groningen: Wolters-Noordhoff.
Ibrahim, R. and Whitt, W. (2011a). Real-time delay estimation based on delayhistory in many-server service systems with time-varying arrivals. Production andOperations Management, 20(5):654–667.
Ibrahim, R. and Whitt, W. (2011b). Wait-time predictors for customer service sys-tems with time-varying demand and capacity. Operations research, 59(5):1106–1118.
Jewell, W. S. (1967). A simple proof of: L = λW . Operations Research, 16(6):1109–1116.
BIBLIOGRAPHY 106
Jones, S. A., Joy, M. P., and Pearson, J. (2002). Forecasting demand of emergencycare. Health Care Management Science, 5(4):297–305.
Jones, S. S., Thomas, A., Scott, E. R., Welch, S. J., Haug, P. J., and Snow, G. L.(2008). Forecasting daily patient volumes in the emergency department. AcademicEmergency Medicine, 15(2):159–170.
Kim, S.-H. and Whitt, W. (2013). Statistical analysis with Little’s law. OperationsResearch, 61(4):1030–1045.
Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces: Isoperimetryand Processes. Classics in Mathematics. Springer-Verlag Berlin Heidelberg, 1stedition.
Little, J. D. and Graves, S. C. (2008). Little’s law. In Chhajed, D. and Lowe,T. J., editors, Building Intuition: Insights From Basic Operations ManagementModels and Principles, volume 115 of International Series in Operations Research& Management Science, chapter 5, pages 81–100. Springer.
Little, J. D. C. (1961). A proof of the queueing formula: L = λW . OperationsResearch, 9(3):383–387.
Little, J. D. C. (2011). Little’s law as viewed on its 50th anniversary. OperationsResearch, 59(3):536–549.
Marcilio, I., Hajat, S., and Gouveia, N. (2013). Forecasting daily emergency depart-ment visits using calendar variables and ambient temperature readings. AcademicEmergency Medicine, 20(8):769–777.
Medeiros, D. J., Swenson, E., and DeFlitch, C. (2008). Improving patient flow ina hospital emergency department. In 2008 Winter Simulation Conference, pages1526–1531. IEEE.
Megginson, R. E. (1998). An Introduction to Banach Space Theory, volume 183 ofGraduate Texts in Mathematics. Springer-Verlag New York, 1st edition.
Morse, P. M. (1958). Queues, Inventories and Maintenance: The Analysis of Oper-ational Systems with Variable Demand and Supply. John Wiley & Sons.
Munkres, J. R. (2000). Topology. Featured Titles for Topology Series. Prentice Hall,2nd edition.
BIBLIOGRAPHY 107
Rolski, T. (1989). Relationships between characteristics in periodic Poisson queues.Queueing Systems, 4(1):17–26.
Rudin, W. (1976). Principles of Mathematical Analysis. International Series in Pureand Applied Mathematics. McGraw-Hill, 3rd edition.
Schweigler, L. M., Desmond, J. S., McCarthy, M. L., Bukowski, K. J., Ionides, E. L.,and Younger, J. G. (2009). Forecasting models of emergency department crowding.Academic Emergency Medicine, 16(4):301–308.
Sigman, K. (1995). Stationary Marked Point Processes: An Intuitive Approach, vol-ume 2 of Stochastic Modeling Series. Chapman & Hall/CRC.
Sigman, K. and Whitt, W. (2019). Marked point processes in discrete time. QueueingSystems. Forthcoming.
Stidham, Jr., S. (1974). A last word on L = λW . Operations Research, 22(2):417–421.
Tandberg, D. and Qualls, C. (1994). Time series forecasts of emergency depart-ment patient volume, length of stay, and acuity. Annals of Emergency Medicine,23(2):299–306.
Whitt, W. (1980). Some useful functions for functional limit theorems. Mathematicsof Operations Research, 5(1):67–85.
Whitt, W. (1991). A review of L = λW and extensions. Queueing Systems, 9(3):235–268.
Whitt, W. (1992). Correction note on L = λW . Queueing Systems, 12(3–4):431–432.(The results in the previous papers are correct, but minor important changes areneeded in some proofs.).
Whitt, W. (1999). Predicting queueing delays. Management Science, 45(6):870–888.
Whitt, W. (2002). Stochastic-Process Limits. Springer, New York.
Whitt, W. (2012). Extending the FCLT version of L = λW . Operations ResearchLetters, 40(4):230–234.
Whitt, W. (2018). Time-verying queues. Queueing Models and Servic Management,1(2):79–164.
BIBLIOGRAPHY 108
Whitt, W. and Zhang, X. (2017). A data-driven model of an emergency department.Operations Research for Health Care, 12:1–15.
Whitt, W. and Zhang, X. (2019a). A central-limit-theorem version of the periodicLittle’s law. Queueing Systems, 91(1–2):15–47.
Whitt, W. and Zhang, X. (2019b). Forecasting arrivals and occupancy levels in anemergency department. Operations Research for Health Care, 21:1–18.
Whitt, W. and Zhang, X. (2019c). Periodic Little’s law. Operations Research,67(1):267–280.
Wolff, R. W. and Yao, Y.-C. (2014). Little’s law when the average waiting time isinfinite. Queueing Systems, 76(3):267–281.
Zeltyn, S., Marmor, Y. N., Mandelbaum, A., Carmeli, B., Greenshpan, O., Mesika,Y., Wasserkrug, S., Vortman, P., Shtub, A., Lauterman, T., Schwartz, D.,Moskovtch, K., Tzafrir, S., and Basis, F. (2011). Simulation-based models of emer-gency departments: Operational, tactical, and strategic staffing. ACM Transactionson Modeling and Computer Simulation, 21(4):1–25.