
University of Illinois at Urbana-Champaign
Department of Mathematics
ASRM 409 Stochastic Processes for Finance and Insurance, Spring 2020

Chapter 5 Continuous-Time Markov Chains

by Renming Song

5.1 Continuous-Time Markov Chains

5.1.1 Suppose {Xt : t ≥ 0} is a continuous-time process taking values in the set of non-negative integers. We say that {Xt : t ≥ 0} is a continuous-time Markov chain if for all t, s ≥ 0 and non-negative integers i, j,

P(Xt+s = j | Xs = i, Xu, 0 ≤ u < s) = P(Xt+s = j | Xs = i).

    In other words, a continuous-time Markov chain is a continuous-time process with the

    Markov property that the conditional distribution of the future Xt+s given the present Xs

    and the past Xu, 0 ≤ u < s, depends only on the present and is independent of the past.

    5.1.2 If, in addition,

    P(Xt+s = j|Xs = i)

    does not depend on s, then the continuous-time Markov chain is said to have homogeneous

    transition probabilities. In this chapter, we will only deal with continuous-time Markov

    chains with homogeneous transition probabilities. As in the case of discrete-time Markov

chains, we will use Pi(·) to denote the conditional probability P(·|X0 = i).

5.1.3 Suppose that a continuous-time Markov chain is in state i at some time, say time 0, and suppose that the Markov chain has not left state i by time s. What is the probability that the Markov chain will not leave state i during the time interval (s, s + t]? Let Ti be the amount of time that the Markov chain, starting in state i, stays in state i before making a transition to a different state. Then we are looking for

Pi(Ti > s + t | Ti > s).

By the Markov property, the probability that it will remain in state i during the time interval (s, s + t] is just the probability that the Markov chain will stay in state i during the time interval (0, t] when it starts in state i at time 0. Thus

Pi(Ti > s + t | Ti > s) = Pi(Ti > t)


for all s, t ≥ 0. Hence the random variable Ti is memoryless and thus must be an exponential random variable.

    5.1.4 The explanation in the paragraph above gives another way of defining a continuous-time

    Markov chain. Namely, it is a stochastic process with the properties that each time it

    enters a state i

(i) the amount of time it spends there before making a transition to a different state is exponentially distributed with parameter, say, vi, and

(ii) when the process leaves state i, it next enters state j with some probability, say, pij.

Of course, pij must satisfy

pii = 0 for all i, and Σ_j pij = 1 for all i.

    In other words, a continuous-time Markov chain is a process that moves from state to

    state according to a discrete-time Markov chain, but is such that the amount of time it

    spends in each state, before transitioning to a different state, is exponentially distributed.

    In addition, the amount of time the process spends in state i and the next state it will visit,

    must be independent random variables, otherwise, it will violate the Markov property.

5.1.5 A homogeneous Poisson process {Nt : t ≥ 0} with rate λ is an example of a continuous-time Markov chain. Each time it enters a state i, it will stay there an exponential(λ) amount of time before transitioning to state i + 1. Thus, in this case, pi,i+1 = 1 and pij = 0 for all j ≠ i + 1.
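The two-step description in 5.1.4 (exponential holding times with rates vi, jumps according to an embedded chain pij) translates directly into a simulation scheme. A minimal sketch in Python; the function name `simulate_ctmc` and the dict-based encoding of the rates and jump probabilities are illustrative choices, not from the notes:

```python
import random

def simulate_ctmc(v, p, start, t_end, rng=random):
    """Simulate a continuous-time Markov chain path up to time t_end.

    v: dict mapping state -> holding rate v_i (v_i == 0 means absorbing)
    p: dict mapping state -> dict of jump probabilities p_ij (with p_ii = 0)
    Returns the list of (time, state) pairs at which the chain jumps.
    """
    t, state = 0.0, start
    path = [(0.0, state)]
    while v.get(state, 0) > 0:
        t += rng.expovariate(v[state])           # exponential(v_i) holding time
        if t >= t_end:
            break
        states, probs = zip(*p[state].items())    # embedded-chain transition
        state = rng.choices(states, weights=probs)[0]
        path.append((t, state))
    return path

# Example: the rate-lam Poisson process as a CTMC (p_{i,i+1} = 1)
lam = 2.0
v = {i: lam for i in range(1000)}
p = {i: {i + 1: 1.0} for i in range(1000)}
path = simulate_ctmc(v, p, start=0, t_end=5.0)
```

For the Poisson example the simulated path should visit 0, 1, 2, ... in order, at strictly increasing jump times.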

Example 1: Consider a shoe-shine establishment consisting of two chairs, chair 1 and chair 2. A customer upon arrival goes initially to chair 1, where his shoes are cleaned and polish is

    applied. After this is done the customer moves to chair 2 where the polish is buffed. The

    service times at the two chairs are assumed to be independent exponential random variables

    with respective parameters µ1 and µ2. Suppose that potential customers arrive according to a

    Poisson process with rate λ, and that a potential customer will enter the system only if both

    chairs are empty.

    Solution: This model can be analyzed as a continuous-time Markov chain. First, let’s decide

    the appropriate state space. Since a potential customer will enter the system only if there are

    no other customers present, it follows that there will always be either 0 or 1 customers in the

    system. However, if there is 1 customer in the system, then we would also need to know which

chair he is in. Hence, an appropriate state space might consist of the three states 0, 1, and


2, where the states have the following interpretation:

State   Interpretation
0       system is empty
1       a customer is in chair 1
2       a customer is in chair 2

Then v0 = λ, v1 = µ1, v2 = µ2, and p01 = p12 = p20 = 1.

    5.2 Birth and Death Processes

    5.2.1 Consider a system whose state at any time is represented by the number of people in the

    system at that time. Suppose that whenever there are n people in the system, then (i) new

    arrivals enter the system at an exponential rate λn; and (ii) people leave the system at an

    exponential rate µn. Such a system is called a birth and death process. The parameters

{λn}∞n=0 and {µn}∞n=1 are called, respectively, the arrival (or birth) and departure (or death) rates.

5.2.2 A birth and death process is a continuous-time Markov chain with state space {0, 1, 2, . . . } for which transitions from state n may only go to either state n − 1 or state n + 1. The relationship between the birth/death rates and the transition parameters is

v0 = λ0
vi = λi + µi, i > 0
p01 = 1
pi,i+1 = λi/(λi + µi), i > 0
pi,i−1 = µi/(λi + µi), i > 0.

    5.2.3 Consider a birth and death process for which

    µn = 0, for all n ≥ 0

    λn = λ, for all n ≥ 0.

    This is a process in which departure (death) never occurs, and the time between successive

    arrivals is exponential with parameter λ. Hence, this is just a Poisson process with rate λ.

    5.2.4 A birth and death process for which µn = 0 for all n is called a pure birth process.

    5.2.5 Consider a population whose members can give birth to new members but cannot die.


If each member acts independently of the others and takes an exponentially distributed amount of time, with parameter λ, to give birth (to one new member), and if Xt is the population size at time t, then {Xt : t ≥ 0} is a pure birth process with λn = nλ, n ≥ 0. This pure birth process is also known as a Yule process, after G. Yule.

    5.2.6 A model in which

    µn = nµ, n ≥ 1

    λn = nλ+ θ, n ≥ 0

    is called a linear growth process with immigration. Such processes occur naturally in the

    study of biological reproduction and population growth. Each individual in the population

    is assumed to give birth at an exponential rate λ; in addition, there is an exponential rate

    of increase θ of the population due to immigration. Hence the total birth rate when there

    are n persons in the system is nλ+ θ. Deaths are assumed to occur at an exponential rate

    µ for each member of the population, and so µn = nµ.

    Let Xt denote the population size at time t. Suppose that X0 = i and let

    M(t) = Ei[Xt].

    We will determine M(t) by deriving and then solving a differential equation that it satisfies.

    We start by deriving an equation for M(t+ h) by conditioning on Xt. This yields

    M(t+ h) = Ei[Xt+h] = Ei[Ei[Xt+h|Xt]].

Now, given the size of the population at time t, then, ignoring events whose probability is o(h), the population at time t + h will either increase by 1 if a birth or an immigration occurs in (t, t + h), decrease by 1 if a death occurs in this interval, or remain the same if neither of these two possibilities occurs. That is, given Xt,

Xt+h = Xt + 1, with probability [θ + Xt λ]h + o(h)
Xt+h = Xt − 1, with probability Xt µ h + o(h)
Xt+h = Xt,     with probability 1 − [θ + Xt λ + Xt µ]h + o(h).

Therefore

Ei[Xt+h|Xt] = Xt + [θ + Xt λ − Xt µ]h + o(h).

    Taking expectation yields

    M(t+ h) = M(t) + (λ− µ)M(t)h+ θh+ o(h),


or equivalently

[M(t + h) − M(t)]/h = (λ − µ)M(t) + θ + o(h)/h.

Letting h → 0 yields

M′(t) = (λ − µ)M(t) + θ. (1)

If we define a function h(t) by

h(t) = (λ − µ)M(t) + θ,

then

h′(t) = (λ − µ)M′(t).

Therefore (1) can be rewritten as

h′(t)/(λ − µ) = h(t)

or

h′(t)/h(t) = λ − µ.

Integration yields

log[h(t)] = (λ − µ)t + c

or

h(t) = K e^((λ−µ)t).

Putting this back in terms of M(t) gives

(λ − µ)M(t) + θ = K e^((λ−µ)t).

To determine the value of the constant K, we use the fact that M(0) = i and evaluate the preceding at t = 0. This gives

(λ − µ)i + θ = K.

Substituting this back into the preceding equation for M(t) yields

M(t) = [θ/(λ − µ)][e^((λ−µ)t) − 1] + i e^((λ−µ)t).

Note that we have implicitly assumed that λ ≠ µ. If λ = µ, then (1) reduces to

M′(t) = θ. (2)

Integrating (2) and using M(0) = i gives

M(t) = θt + i.
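As a sanity check on the closed form for M(t), one can integrate the differential equation (1) numerically and compare. A rough sketch; the function names are mine:

```python
import math

def mean_population(t, i, lam, mu, theta):
    """Closed-form M(t) = E_i[X_t] for the linear growth process with immigration."""
    if lam == mu:
        return theta * t + i          # the lam == mu case: M(t) = theta*t + i
    r = lam - mu
    return (theta / r) * (math.exp(r * t) - 1.0) + i * math.exp(r * t)

def mean_population_euler(t, i, lam, mu, theta, steps=100000):
    """Euler integration of M'(t) = (lam - mu) M(t) + theta with M(0) = i."""
    h = t / steps
    m = float(i)
    for _ in range(steps):
        m += h * ((lam - mu) * m + theta)
    return m

closed = mean_population(2.0, 10, lam=0.3, mu=0.1, theta=0.5)
euler = mean_population_euler(2.0, 10, lam=0.3, mu=0.1, theta=0.5)
```

The two values should agree to within the Euler discretization error.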

    5.2.7 Example 2: The Queueing System M/M/1 Suppose that customers arrive at a single-


server service station according to a Poisson process with rate λ. Upon arrival, each

    customer goes directly into service if the server is free; if not, the customer joins the queue.

    When the server finishes serving a customer, the customer leaves the system and the next

    customer in line, if there are any waiting, enters the service. The successive service times

    are assumed to be independent exponential random variables with parameter µ. This

model is known as an M/M/1 queueing system. The first M refers to the fact that the inter-arrival process is Markovian (since it is Poisson) and the second M refers to the fact that the service distribution is exponential (and hence Markovian). The 1 refers to the fact that there is a single server.

    Solution: If we let Xt denote the number of customers in the system at time t, then

    {Xt : t ≥ 0} is a birth and death process with

    µn = µ, n ≥ 1

    λn = λ, n ≥ 0.

    5.2.8 Example 3: A Multi-server Exponential Queueing System Consider an exponential

queueing system in which there are s servers available, each serving at rate µ. Suppose that customers arrive at the station according to a Poisson process with rate λ. An entering customer waits in line if all servers are busy, and then goes to the first server that becomes free.

    Solution: If we let Xt denote the number of customers in the system at time t, then

    {Xt : t ≥ 0} is a birth and death process with parameters

µn = nµ, 1 ≤ n ≤ s
µn = sµ, n > s
λn = λ, n ≥ 0.

The reason is as follows. If there are n customers in the system, n ≤ s, then n servers will be busy. Since each of these servers works at rate µ, the total departure rate will be nµ.

    On the other hand, if there are n customers in the system, n > s, then all s servers will

be busy and thus the total departure rate will be sµ. This is known as an M/M/s queueing system.

5.2.9 Consider now a general birth and death process with birth rates {λn} and death rates {µn}, where µ0 = 0. Let Ui denote the time, starting from state i, that it takes for the process to enter state i + 1, i ≥ 0. We will recursively compute Ei[Ui], i ≥ 0, starting with i = 0. Since U0 is exponential with rate λ0, we have

E0[U0] = 1/λ0.


For i > 0, we condition on whether the first transition takes the process to state i − 1 or state i + 1. That is, we let

Ii = 1, if the first transition from i is to i + 1,
Ii = 0, if the first transition from i is to i − 1,

and note that

Ei[Ui|Ii = 1] = 1/(λi + µi) (3)
Ei[Ui|Ii = 0] = 1/(λi + µi) + Ei−1[Ui−1] + Ei[Ui]. (4)

This follows since, independent of whether the first transition is a birth or a death, the time until it occurs is exponential with parameter λi + µi; if this first transition is a birth, then the population size is at i + 1, so no additional time is needed; whereas if this first transition is a death, then the population size becomes i − 1 and the additional time to reach i + 1 is equal to the time it takes to return to state i (equal to Ei−1[Ui−1]) plus the additional time it takes to then reach i + 1 (equal to Ei[Ui]). Hence, since the probability that the first transition is a birth is λi/(λi + µi), we have

Ei[Ui] = 1/(λi + µi) + [µi/(λi + µi)](Ei−1[Ui−1] + Ei[Ui]),

or equivalently

Ei[Ui] = 1/λi + (µi/λi) Ei−1[Ui−1], i ≥ 1.

Starting with E0[U0] = 1/λ0, the preceding provides an efficient way to successively find E1[U1], E2[U2], . . . .
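The recursion Ei[Ui] = 1/λi + (µi/λi) Ei−1[Ui−1] is straightforward to implement. A sketch; the list-based encoding of the rates is my own convention:

```python
def expected_passage_times(lam, mu):
    """E_i[U_i], the mean time to go from state i to i+1, for i = 0..n-1.

    lam: list of birth rates [lam_0, ..., lam_{n-1}]
    mu:  list of death rates [mu_0, ..., mu_{n-1}] with mu[0] == 0
    Implements E_i[U_i] = 1/lam_i + (mu_i/lam_i) * E_{i-1}[U_{i-1}].
    """
    e = []
    for i in range(len(lam)):
        prev = e[i - 1] if i > 0 else 0.0
        e.append(1.0 / lam[i] + (mu[i] / lam[i]) * prev)
    return e

# Pure birth process (mu_i = 0): E_i[U_i] = 1/lam_i
e = expected_passage_times([1.0, 2.0, 4.0], [0.0, 0.0, 0.0])
```

For constant rates λ = µ the recursion gives Ei[Ui] = (i + 1)/λ, in agreement with Example 4 below.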

5.2.10 Suppose that we want to determine the expected time to go from state i to state j, where i < j. This can be accomplished using the preceding by noting that this quantity is equal to Ei[Ui] + Ei+1[Ui+1] + · · · + Ej−1[Uj−1].

5.2.11 Example 4: For the birth and death process with parameters λi = λ, µi = µ,

Ei[Ui] = 1/λ + (µ/λ) Ei−1[Ui−1] = (1/λ)(1 + µ Ei−1[Ui−1]).

Starting with E0[U0] = 1/λ, we get

E1[U1] = (1/λ)(1 + µ/λ)
E2[U2] = (1/λ)[1 + µ/λ + (µ/λ)²],


and in general,

Ei[Ui] = (1/λ)[1 + µ/λ + (µ/λ)² + · · · + (µ/λ)^i]
       = [1 − (µ/λ)^(i+1)]/(λ − µ),  λ ≠ µ
       = (i + 1)/λ,                  λ = µ.

The expected time to reach state j, starting at state k, k < j, is

Ek[time to go from k to j] = Σ_{i=k}^{j−1} Ei[Ui]
  = (j − k)/(λ − µ) − [(µ/λ)^(k+1)/(λ − µ)] · [1 − (µ/λ)^(j−k)]/(1 − µ/λ),  λ ≠ µ
  = [j(j + 1) − k(k + 1)]/(2λ),  λ = µ.
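A quick numeric check that the closed form for the expected hitting time agrees with the term-by-term sum of Ei[Ui]. Both function names are illustrative:

```python
def expected_hitting_time(k, j, lam, mu):
    """Closed-form E_k[time to go from state k to state j], k < j, constant rates."""
    if lam == mu:
        return (j * (j + 1) - k * (k + 1)) / (2.0 * lam)
    rho = mu / lam
    return ((j - k) / (lam - mu)
            - (rho ** (k + 1) / (lam - mu)) * (1 - rho ** (j - k)) / (1 - rho))

def expected_hitting_time_sum(k, j, lam, mu):
    """The same quantity as the sum of E_i[U_i] over i = k, ..., j-1."""
    if lam == mu:
        return sum((i + 1) / lam for i in range(k, j))
    rho = mu / lam
    return sum((1 - rho ** (i + 1)) / (lam - mu) for i in range(k, j))

a = expected_hitting_time(2, 7, lam=2.0, mu=1.0)
b = expected_hitting_time_sum(2, 7, lam=2.0, mu=1.0)
```

The two evaluations should agree up to floating-point rounding, in both the λ ≠ µ and λ = µ cases.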

We can also compute the variance of the time to go from 0 to i + 1 by using the conditional variance formula. First note that (3)-(4) can be rewritten as

Ei[Ui|Ii] = 1/(λi + µi) + (1 − Ii)(Ei−1[Ui−1] + Ei[Ui]).

Thus

Var(Ei[Ui|Ii]) = (Ei−1[Ui−1] + Ei[Ui])² Var(Ii)
             = (Ei−1[Ui−1] + Ei[Ui])² µiλi/(µi + λi)² (5)

since Ii is a Bernoulli random variable with parameter p = λi/(λi + µi). Also note that if we let Yi denote the time until the transition from i occurs, then

Var(Ui|Ii = 1) = Var(Yi|Ii = 1) = Var(Yi) = 1/(λi + µi)² (6)

where we used the fact that the time until the transition is independent of the next state visited. Also,

Var(Ui|Ii = 0) = Var(Yi + time to get back to i + time to then reach i + 1)
             = Var(Yi) + Var(Ui−1) + Var(Ui) (7)

where we used the fact that the three random variables are independent. We can rewrite (6)-(7) as

Var(Ui|Ii) = Var(Yi) + (1 − Ii)[Var(Ui−1) + Var(Ui)],

so

Ei[Var(Ui|Ii)] = 1/(λi + µi)² + [µi/(µi + λi)][Var(Ui−1) + Var(Ui)]. (8)


Hence, using the conditional variance formula, which says that Var(Ui) is the sum of (8) and (5), we get

Var(Ui) = 1/(λi + µi)² + [µi/(µi + λi)][Var(Ui−1) + Var(Ui)] + [µiλi/(µi + λi)²](Ei−1[Ui−1] + Ei[Ui])²

or, equivalently,

Var(Ui) = 1/[λi(λi + µi)] + (µi/λi) Var(Ui−1) + [µi/(λi + µi)](Ei−1[Ui−1] + Ei[Ui])².

Starting with Var(U0) = 1/λ0² and using the recursion above to obtain the expectations, we can recursively compute Var(Ui).

In addition, if we want to find the variance of the time to reach state j, starting from state k, k < j, then this can be expressed as the time to go from k to k + 1 plus the time to go from k + 1 to k + 2, and so on. Since, by the Markov property, these successive random variables are independent, we have

Var(time to go from k to j) = Σ_{i=k}^{j−1} Var(Ui).
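The mean and variance recursions can be run together in one pass. A sketch, using the same list-based rate encoding as before (my own convention, not the notes'):

```python
def passage_time_variances(lam, mu):
    """Var(U_i) for i = 0..n-1 via the variance recursion (requires mu[0] == 0)."""
    e, var = [], []
    for i in range(len(lam)):
        prev_e = e[i - 1] if i > 0 else 0.0
        e.append(1.0 / lam[i] + (mu[i] / lam[i]) * prev_e)   # mean recursion
        prev_v = var[i - 1] if i > 0 else 0.0
        total = lam[i] + mu[i]
        var.append(1.0 / (lam[i] * total)                    # 1/[lam_i (lam_i + mu_i)]
                   + (mu[i] / lam[i]) * prev_v
                   + (mu[i] / total) * (prev_e + e[i]) ** 2)
    return var

# Pure birth process: Var(U_i) = 1/lam_i^2
v_pure = passage_time_variances([1.0, 2.0, 4.0], [0.0, 0.0, 0.0])
# Constant rates lam = mu = 1: Var(U_1) = 1/2 + Var(U_0) + (1/2)(E_0+E_1)^2 = 6
v_bd = passage_time_variances([1.0, 1.0], [0.0, 1.0])
```

With µi = 0 the formula collapses to Var(Ui) = 1/λi², the variance of an exponential(λi) holding time, as it should.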

    5.3 Transition Probabilities

    5.3.1 In this section, we discuss the transition probabilities of continuous-time Markov chains.

    Suppose that {Xt : t ≥ 0} is a continuous-time Markov chain. Let

Pij(t) = P(Xt+s = j|Xs = i)

be the probability that the continuous-time Markov chain will be in state j a time t later, given that it is in state i right now. The {Pij(t)} are called the transition probabilities of the continuous-time Markov chain {Xt : t ≥ 0}.

    5.3.2 We can explicitly determine the transition probabilities Pij(t) for a pure birth process with

    distinct birth rates. In order to do this, we first find the density of the sum of independent

    exponential random variables with distinct parameters.

5.3.3 Let Ui, i = 1, . . . , n, be independent exponential random variables with respective parameters λi, i = 1, . . . , n, and suppose that λi ≠ λj for i ≠ j. The random variable Σ_{i=1}^n Ui is said to be a hypoexponential random variable. To compute the density of a general


hypoexponential random variable, we start with the case n = 2. Now

f_{U1+U2}(t) = ∫₀ᵗ f_{U1}(s) f_{U2}(t − s) ds
            = ∫₀ᵗ λ1 e^(−λ1 s) λ2 e^(−λ2(t−s)) ds
            = λ1 λ2 e^(−λ2 t) ∫₀ᵗ e^(−(λ1−λ2)s) ds
            = [λ1/(λ1 − λ2)] λ2 e^(−λ2 t) (1 − e^(−(λ1−λ2)t))
            = [λ1/(λ1 − λ2)] λ2 e^(−λ2 t) + [λ2/(λ2 − λ1)] λ1 e^(−λ1 t).

Using the display above and a similar computation, we get for n = 3,

f_{U1+U2+U3}(t) = Σ_{i=1}^3 λi e^(−λi t) (Π_{j≠i} λj/(λj − λi)),

which suggests the general result

f_{U1+···+Un}(t) = Σ_{i=1}^n C_{i,n} λi e^(−λi t)

where

C_{i,n} = Π_{j≠i} λj/(λj − λi).

We now prove the result above by induction. Since we have already established it for n = 2, assume it for n and consider n + 1 arbitrary independent exponential random variables Ui with distinct parameters λi, i = 1, . . . , n + 1. If necessary, renumber U1 and Un+1 so that λn+1 < λ1. Now,

f_{U1+···+Un+1}(t) = ∫₀ᵗ f_{U1+···+Un}(s) λn+1 e^(−λn+1(t−s)) ds
  = Σ_{i=1}^n C_{i,n} ∫₀ᵗ λi e^(−λi s) λn+1 e^(−λn+1(t−s)) ds
  = Σ_{i=1}^n C_{i,n} ([λi/(λi − λn+1)] λn+1 e^(−λn+1 t) + [λn+1/(λn+1 − λi)] λi e^(−λi t))
  = K_{n+1} λn+1 e^(−λn+1 t) + Σ_{i=1}^n C_{i,n+1} λi e^(−λi t) (9)

where

K_{n+1} = Σ_{i=1}^n C_{i,n} λi/(λi − λn+1)


is a constant that does not depend on t. But we also have that

f_{U1+···+Un+1}(t) = ∫₀ᵗ f_{U2+···+Un+1}(s) λ1 e^(−λ1(t−s)) ds,

which implies, by the same argument leading to (9), that for a constant K1,

f_{U1+···+Un+1}(t) = K1 λ1 e^(−λ1 t) + Σ_{i=2}^{n+1} C_{i,n+1} λi e^(−λi t).

Equating these two expressions for f_{U1+···+Un+1}(t) yields

K_{n+1} λn+1 e^(−λn+1 t) + Σ_{i=1}^n C_{i,n+1} λi e^(−λi t) = K1 λ1 e^(−λ1 t) + Σ_{i=2}^{n+1} C_{i,n+1} λi e^(−λi t).

Multiplying both sides by e^(λn+1 t), cancelling the common terms, and then letting t → ∞ yields (since e^(−(λ1−λn+1)t) → 0)

K_{n+1} = C_{n+1,n+1},

and this, by (9), completes the induction proof.
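The coefficients C_{i,n} make the hypoexponential density easy to evaluate. As a numeric sanity check, the density should integrate to 1. A sketch; the Riemann sum and the cutoff at t = 20 are crude but adequate here:

```python
import math

def hypoexp_density(t, rates):
    """Density of a sum of independent exponentials with distinct rates."""
    total = 0.0
    for i, li in enumerate(rates):
        c = 1.0                         # C_{i,n} = prod_{j != i} lam_j / (lam_j - lam_i)
        for j, lj in enumerate(rates):
            if j != i:
                c *= lj / (lj - li)
        total += c * li * math.exp(-li * t)
    return total

# Sanity check: the density should integrate to (approximately) 1
rates = [1.0, 2.0, 5.0]
dt = 0.001
mass = sum(hypoexp_density(k * dt, rates) * dt for k in range(20000))
```

Note also that for n ≥ 2 the density vanishes at t = 0 (the coefficients C_{i,n} λi cancel), which is a useful spot check on the product formula.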

5.3.4 Suppose that {Xt : t ≥ 0} is a pure birth process with distinct birth rates {λn : n ≥ 0} (i.e., λi ≠ λj for i ≠ j). For such a process, let Uk be the time the process spends in state k before making a transition to state k + 1, k ≥ 0. Suppose that the process is presently in state i, and let j > i. Then, as Ui is the time it spends in state i before moving to state i + 1, and Ui+1 the time it spends in state i + 1 before moving to state i + 2, and so on, it follows that Σ_{k=i}^{j−1} Uk is the time it takes until the process enters state j. Now if the process has not yet entered state j by time t, then its state at time t is smaller than j, and vice versa. That is,

Xt < j if and only if Ui + · · · + Uj−1 > t.

Therefore, for i < j,

P(Xt < j|X0 = i) = P(Σ_{k=i}^{j−1} Uk > t).

However, since Ui, . . . , Uj−1 are independent exponential random variables with respective parameters λi, . . . , λj−1, we obtain from the preceding display and the paragraph above that

P(Xt < j|X0 = i) = Σ_{k=i}^{j−1} e^(−λk t) Π_{r=i, r≠k}^{j−1} λr/(λr − λk).

Replacing j by j + 1 in the display above gives

P(Xt < j + 1|X0 = i) = Σ_{k=i}^{j} e^(−λk t) Π_{r=i, r≠k}^{j} λr/(λr − λk).


Since

P(Xt = j|X0 = i) = P(Xt < j + 1|X0 = i) − P(Xt < j|X0 = i)

and since Pii(t) = P(Ui > t) = e^(−λi t), we have proved that, for a pure birth process with distinct birth rates {λn : n ≥ 0} (i.e., λi ≠ λj for i ≠ j),

Pii(t) = e^(−λi t) (10)

Pij(t) = Σ_{k=i}^{j} e^(−λk t) Π_{r=i, r≠k}^{j} λr/(λr − λk) − Σ_{k=i}^{j−1} e^(−λk t) Π_{r=i, r≠k}^{j−1} λr/(λr − λk), i < j. (11)

5.3.5 Example 5: Consider the Yule process, which is a pure birth process in which each individual in the population independently gives birth at rate λ, and so λn = nλ, n ≥ 0. Letting i = 1 in (10)-(11), we get

P1j(t) = Σ_{k=1}^{j} e^(−kλt) Π_{r=1, r≠k}^{j} r/(r − k) − Σ_{k=1}^{j−1} e^(−kλt) Π_{r=1, r≠k}^{j−1} r/(r − k)

= e^(−jλt) Π_{r=1}^{j−1} r/(r − j) + Σ_{k=1}^{j−1} e^(−kλt) (Π_{r=1, r≠k}^{j} r/(r − k) − Π_{r=1, r≠k}^{j−1} r/(r − k))

= e^(−jλt) (−1)^(j−1) + Σ_{k=1}^{j−1} e^(−kλt) [j/(j − k) − 1] Π_{r=1, r≠k}^{j−1} r/(r − k).

Now

[k/(j − k)] Π_{r=1, r≠k}^{j−1} r/(r − k) = (j − 1)! / [(1 − k)(2 − k) · · · (k − 1 − k)(j − k)!] = (−1)^(k−1) C(j − 1, k − 1),

where C(j − 1, k − 1) denotes the binomial coefficient, so

P1j(t) = Σ_{k=1}^{j} C(j − 1, k − 1) e^(−kλt) (−1)^(k−1)
       = e^(−λt) Σ_{i=0}^{j−1} C(j − 1, i) e^(−iλt) (−1)^i
       = e^(−λt) (1 − e^(−λt))^(j−1).


Thus, starting with a single individual, the population at time t has a geometric distribution with parameter e^(−λt).

If the population starts with i individuals, then we can regard each of these individuals as starting her own independent Yule process, so the population at time t will be the sum of

    starting her own independent Yule process, so the population at time t will be the sum of

    i independent and geometrically distributed random variables with parameter e−λt. This

    means that the conditional distribution of Xt, given X0 = i, is the same as the distribution

    of the number of times that a coin, with probability e−λt coming up heads, must be flipped

    to amass a total of i heads. Hence the population size at time t has a negative binomial

distribution with parameters (i, e^(−λt)), so

Pij(t) = C(j − 1, i − 1) e^(−iλt) (1 − e^(−λt))^(j−i), j ≥ i ≥ 1.
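Formula (11) can be checked numerically against the geometric/negative binomial expressions for the Yule process. A sketch; the helper names are mine:

```python
import math

def pure_birth_pij(i, j, t, rates):
    """P_ij(t) for a pure birth process with distinct rates, via formula (11)."""
    def below(m):  # P(X_t < m | X_0 = i), for m > i
        s = 0.0
        for k in range(i, m):
            prod = 1.0
            for r in range(i, m):
                if r != k:
                    prod *= rates[r] / (rates[r] - rates[k])
            s += math.exp(-rates[k] * t) * prod
        return s
    if j == i:
        return math.exp(-rates[i] * t)
    return below(j + 1) - below(j)

# Yule process (lam_n = n * lam): P_1j(t) should be geometric with p = e^{-lam t}
lam, t = 0.7, 1.3
rates = [n * lam for n in range(12)]
p = math.exp(-lam * t)
approx = pure_birth_pij(1, 4, t, rates)
exact = p * (1 - p) ** 3
```

Starting from i = 2, the same function should match the negative binomial expression C(j − 1, i − 1) p^i (1 − p)^(j−i).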

5.3.6 Now we return to the general case. We will derive a set of differential equations for the transition probabilities of a general continuous-time Markov chain. We give a definition and make some preparations first.

    5.3.7 Recall that for any pair of states i and j, vi is the rate at which the continuous-time

    Markov chain makes a transition when in state i, and pij is the probability this transition

    is into state j. Define

qij = vi pij.

qij is the rate at which, when in state i, the continuous-time Markov chain makes a transition into state j. The quantities qij are called the instantaneous transition rates. Since

vi = Σ_j vi pij = Σ_j qij

and

pij = qij/vi = qij / Σ_j qij,

it follows that specifying the instantaneous rates determines the parameters of the continuous-time Markov chain.

5.3.8 For a continuous-time Markov chain,

(a) for any state i,

lim_{h↓0} [1 − Pii(h)]/h = vi;

(b) for any states i ≠ j,

lim_{h↓0} Pij(h)/h = qij.


Here is a proof of the assertions above. The time until a transition occurs is an exponential random variable. Thus the probability of two or more transitions in a time interval of length h is o(h) (something small compared to h). Thus 1 − Pii(h), the probability that the process starting in state i at time 0 will not be in state i at time h, equals the probability that a transition occurs within time h, plus o(h). Therefore

1 − Pii(h) = vi h + o(h).

    Hence (a) is valid.

    To prove (b), we note that Pij(h), the probability that the process goes from state i to

    state j in time h, equals the probability that a transition occurs in this time multiplied by

    the probability that the transition is into state j, plus o(h). That is,

Pij(h) = vi pij h + o(h) = qij h + o(h).

    Hence (b) is valid.

5.3.9 For all s, t ≥ 0 and all states i, j,

Pij(t + s) = Σ_k Pik(t) Pkj(s). (12)

    The equations (12) are known as the Chapman-Kolmogorov equations.

    Here is a proof of the Chapman-Kolmogorov equations. In order for the process to go from

    state i to state j in time t+ s, it must be somewhere at time t and thus

Pij(t + s) = P(Xt+s = j|X0 = i)
          = Σ_k P(Xt+s = j, Xt = k|X0 = i)
          = Σ_k P(Xt+s = j|Xt = k, X0 = i) · P(Xt = k|X0 = i)
          = Σ_k P(Xt+s = j|Xt = k) · P(Xt = k|X0 = i)
          = Σ_k Pkj(s) Pik(t),

and the proof is complete.

5.3.10 For all states i, j and times t ≥ 0,

P′ij(t) = Σ_{k≠i} qik Pkj(t) − vi Pij(t).

This set of equations is known as Kolmogorov's backward equations.


Here is a proof of Kolmogorov's backward equations. It follows from the Chapman-Kolmogorov equations that

Pij(h + t) − Pij(t) = Σ_k Pik(h) Pkj(t) − Pij(t)
                   = Σ_{k≠i} Pik(h) Pkj(t) − [1 − Pii(h)] Pij(t),

and thus

lim_{h↓0} [Pij(h + t) − Pij(t)]/h = lim_{h↓0} ( Σ_{k≠i} [Pik(h)/h] Pkj(t) − {[1 − Pii(h)]/h} Pij(t) ).

It can be justified that one can change the order of the limit and the summation in the display above. Then Kolmogorov's backward equations follow from 5.3.8.

5.3.11 Example 6: The backward equations for the pure birth process are

P′ij(t) = λi Pi+1,j(t) − λi Pij(t).

5.3.12 Example 7: The backward equations for the birth and death process are

P′0j(t) = λ0 P1j(t) − λ0 P0j(t)
P′ij(t) = (λi + µi) ([λi/(λi + µi)] Pi+1,j(t) + [µi/(λi + µi)] Pi−1,j(t)) − (λi + µi) Pij(t), i > 0,

or equivalently

P′0j(t) = λ0 P1j(t) − λ0 P0j(t)
P′ij(t) = λi Pi+1,j(t) + µi Pi−1,j(t) − (λi + µi) Pij(t), i > 0. (13)

    5.3.13 Example 8 (A continuous-time Markov chain consisting of two states) Consider a machine

    that works for an exponential amount of time with parameter λ before breaking down;

suppose that it takes an exponential amount of time with parameter µ to repair the machine. If the machine is in working condition at time 0, what is the probability that it will

    be working at time t = 10?

    Let 0 denote that the machine is in working condition and 1 denote that the machine is

    being repaired. Let Xt be the status of the machine at time t. Then {Xt : t ≥ 0} is a birth

and death process with parameters

λ0 = λ, µ1 = µ,
λi = 0, i ≠ 0, µi = 0, i ≠ 1.

    We will find the desired probability, P00(10), by solving the backward equations in Example

    7. It follows from (13) that

P′00(t) = λ[P10(t) − P00(t)] (14)
P′10(t) = µ[P00(t) − P10(t)]. (15)

    Multiplying (14) by µ and (15) by λ and then adding the two equations, we get

µP′00(t) + λP′10(t) = 0.

    Integrating, we get

    µP00(t) + λP10(t) = c.

    However, since P00(0) = 1 and P10(0) = 0, we get c = µ and hence

    µP00(t) + λP10(t) = µ (16)

    or equivalently

λP10(t) = µ[1 − P00(t)].

By substituting this into (14), we get

P′00(t) = µ[1 − P00(t)] − λP00(t) = µ − (µ + λ)P00(t).

Letting

h(t) = P00(t) − µ/(µ + λ),

we have

h′(t) = µ − (µ + λ)(h(t) + µ/(µ + λ)) = −(µ + λ)h(t),

or

h′(t)/h(t) = −(µ + λ).

Integrating both sides, we get

log h(t) = −(µ + λ)t + c,


or

h(t) = K e^(−(µ+λ)t),

and thus

P00(t) = K e^(−(µ+λ)t) + µ/(µ + λ),

which finally yields, by setting t = 0 and using the fact that P00(0) = 1,

P00(t) = [λ/(µ + λ)] e^(−(µ+λ)t) + µ/(µ + λ).

By (16), this also implies that

P10(t) = µ/(µ + λ) − [µ/(µ + λ)] e^(−(µ+λ)t).

Hence, the desired probability is

P00(10) = [λ/(µ + λ)] e^(−10(µ+λ)) + µ/(µ + λ).
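The two transition functions just derived are easy to code and to check against the backward equation (14) by numerical differentiation. A sketch:

```python
import math

def p00(t, lam, mu):
    """P00(t): machine working at time t given it was working at time 0."""
    return (lam / (mu + lam)) * math.exp(-(mu + lam) * t) + mu / (mu + lam)

def p10(t, lam, mu):
    """P10(t): machine working at time t given it was under repair at time 0."""
    return mu / (mu + lam) - (mu / (mu + lam)) * math.exp(-(mu + lam) * t)

# Check the backward equation P00'(t) = lam * (P10(t) - P00(t)) numerically
lam, mu, t, h = 1.0, 2.0, 0.5, 1e-6
lhs = (p00(t + h, lam, mu) - p00(t - h, lam, mu)) / (2 * h)   # central difference
rhs = lam * (p10(t, lam, mu) - p00(t, lam, mu))
```

As t grows, both P00(t) and P10(t) approach µ/(µ + λ), foreshadowing the limiting probabilities of Section 5.4.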

5.3.14 It follows from the Chapman-Kolmogorov equations that

Pij(t + h) − Pij(t) = Σ_k Pik(t) Pkj(h) − Pij(t)
                   = Σ_{k≠j} Pik(t) Pkj(h) − [1 − Pjj(h)] Pij(t),

and thus

lim_{h↓0} [Pij(t + h) − Pij(t)]/h = lim_{h↓0} ( Σ_{k≠j} Pik(t) [Pkj(h)/h] − {[1 − Pjj(h)]/h} Pij(t) ).

If we could change the order of the limit and summation, we would obtain from 5.3.8 that

P′ij(t) = Σ_{k≠j} qkj Pik(t) − vj Pij(t).

Unfortunately, we cannot always justify changing the order of the limit and summation. However, this can be justified in most models, including all birth and death processes and all finite-state models. Thus we have the following.

5.3.15 Under suitable conditions,

P′ij(t) = Σ_{k≠j} qkj Pik(t) − vj Pij(t). (17)

This set of equations is known as Kolmogorov's forward equations.

5.3.16 For a pure birth process,

Pii(t) = e^(−λi t), i ≥ 0
Pij(t) = λj−1 e^(−λj t) ∫₀ᵗ e^(λj s) Pi,j−1(s) ds, j ≥ i + 1.

    We now prove the above by solving the forward equations for the pure birth process. For

    this process, (17) reduces to

P′ij(t) = λj−1 Pi,j−1(t) − λj Pij(t).

    However, by noting that Pij(t) = 0 whenever j < i, we can rewrite the previous equation

    to get

P′ii(t) = −λi Pii(t), i ≥ 0 (18)
P′ij(t) = λj−1 Pi,j−1(t) − λj Pij(t), j ≥ i + 1. (19)

The fact that Pii(t) = e^(−λi t) follows from (18) by integrating and using the fact that Pii(0) = 1. To get the result for Pij(t), we note by (19) that

e^(λj t) [P′ij(t) + λj Pij(t)] = e^(λj t) λj−1 Pi,j−1(t)

or

d/dt [e^(λj t) Pij(t)] = λj−1 e^(λj t) Pi,j−1(t).

Hence, since Pij(0) = 0, we obtain the desired result.
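For constant rates λj ≡ λ (the Poisson process), the integral recursion above reproduces the Poisson probabilities Pij(t) = e^(−λt)(λt)^(j−i)/(j − i)!. A numeric sketch using a simple trapezoidal rule; the helper name is mine:

```python
import math

def poisson_pij_via_recursion(i, j, t, lam, steps=2000):
    """Evaluate P_ij(t) for the pure birth process with lam_n = lam for all n
    (the Poisson process), via the recursion
    P_ij(t) = lam * e^{-lam t} * integral_0^t e^{lam s} P_{i,j-1}(s) ds."""
    h = t / steps
    grid = [k * h for k in range(steps + 1)]
    prev = [math.exp(-lam * s) for s in grid]      # P_ii(s) = e^{-lam s}
    for _ in range(i + 1, j + 1):                  # build P_{i,i+1}, ..., P_ij
        integrand = [math.exp(lam * s) * q for s, q in zip(grid, prev)]
        cur = [0.0]                                # P_ij(0) = 0 for j > i
        integral = 0.0
        for k in range(1, steps + 1):
            integral += 0.5 * h * (integrand[k - 1] + integrand[k])  # trapezoid
            cur.append(lam * math.exp(-lam * grid[k]) * integral)
        prev = cur
    return prev[-1]

# Compare with the Poisson probability e^{-lam t} (lam t)^{j-i} / (j-i)!
approx = poisson_pij_via_recursion(0, 3, t=1.5, lam=2.0)
exact = math.exp(-2.0 * 1.5) * (2.0 * 1.5) ** 3 / math.factorial(3)
```

Each pass of the loop performs one integration step of the recursion, so the discrepancy is only the quadrature error.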

    5.4 Limiting Probabilities

    5.4.1 In analogy with a basic result in discrete-time Markov chains, the probability that a

    continuous-time Markov chain will be in state j at time t often converges to a limiting

    value that is independent of the initial state. If we call this value Pj, then

Pj = lim_{t→∞} Pij(t).

    We will assume that the limit above exists and is independent of the initial state i. In

    this section, we will derive a set of equations for Pj.


  • 5.4.2 By the forward equations, we have

    P ′ij(t) =∑k 6=j

    qkjPik(t)− vjPij(t). (20)

Letting t → ∞ and assuming that we can change the order of the limit and the summation, we get

lim_{t→∞} P'_{ij}(t) = lim_{t→∞} [∑_{k≠j} q_{kj} P_{ik}(t) − v_j P_{ij}(t)] = ∑_{k≠j} q_{kj} P_k − v_j P_j.

However, since P_{ij}(t) is a bounded function, it follows that if P'_{ij}(t) converges as t → ∞, then it must converge to 0. Hence, we must have

0 = ∑_{k≠j} q_{kj} P_k − v_j P_j,

or

v_j P_j = ∑_{k≠j} q_{kj} P_k, for all states j. (21)

The preceding set of equations, along with the equation

∑_j P_j = 1, (22)

    can be used to solve for the limiting probabilities.
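As a sketch, equations (21) and (22) form a linear system that can be solved directly. The three-state instantaneous rates q[k][j] below are made-up values for illustration.

```python
# Hypothetical three-state chain: q[k][j] is the instantaneous rate from k
# to j, and v_j = sum_k q[j][k] is the total rate of leaving state j.
q = [[0.0, 2.0, 1.0],
     [3.0, 0.0, 0.0],
     [1.0, 4.0, 0.0]]
n = len(q)
v = [sum(q[j]) for j in range(n)]

# Build the system A x = b: each balance row says
#   v_j P_j - sum_{k != j} q_kj P_k = 0,
# and one redundant row is replaced by the normalization sum_j P_j = 1.
A = [[(v[j] if k == j else -q[k][j]) for k in range(n)] for j in range(n)]
b = [0.0] * n
A[n - 1] = [1.0] * n
b[n - 1] = 1.0

# Tiny Gaussian elimination with partial pivoting.
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(A[r][col]))
    A[col], A[piv] = A[piv], A[col]
    b[col], b[piv] = b[piv], b[col]
    for r in range(col + 1, n):
        f = A[r][col] / A[col][col]
        for c in range(col, n):
            A[r][c] -= f * A[col][c]
        b[r] -= f * b[col]
P = [0.0] * n
for r in range(n - 1, -1, -1):
    P[r] = (b[r] - sum(A[r][c] * P[c] for c in range(r + 1, n))) / A[r][r]
print(P)  # limiting probabilities P_0, P_1, P_2
```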

    5.4.3 A sufficient condition for the limiting probabilities to exist is that (a) all states of the

    Markov chain communicate in the sense that starting in state i there is positive probability

    of ever being in state j, for all i, j and (b) the Markov chain is positive recurrent in the

    sense that, starting in any state, the mean time to return to that state is finite.

If conditions (a) and (b) hold, then the limiting probabilities will exist and satisfy (21) and

    (22). In addition, Pj also will have the interpretation of being the long-run proportion of

    time that the process is in state j.

    5.4.4 Equations (21) and (22) have a nice interpretation. In any interval (0, t), the number of

transitions into state j must equal, to within 1, the number of transitions out of state j.

    Hence in the long run, the rate at which transitions into state j occur must equal the rate

at which transitions out of state j occur. When the process is in state j, it leaves at rate

    vj, and, as Pj is the proportion of time it is in state j, it thus follows that

v_j P_j = rate at which the process leaves state j.


Similarly, when the process is in state k, it enters j at rate q_{kj}. Hence, as P_k is the proportion of time in state k, we see that the rate at which transitions from k to j occur is just q_{kj} P_k; thus

∑_{k≠j} q_{kj} P_k = rate at which the process enters state j.

    So, (21) is just a statement of the equality of the rates at which the process enters and

    leaves state j. Because it balances these rates, (21) is sometimes referred to as a set of

"balance equations".

5.4.5 When the limiting probabilities P_j exist, we say that the Markov chain is ergodic. The P_j's are sometimes called the stationary probabilities, since it can be shown that if the initial state is chosen according to the distribution {P_j}, then the probability of being in state j at time t is P_j, for all t.

    5.4.6 Now we determine the limiting probabilities for a birth and death process. From (21) we

    get

state        rate at which leave = rate at which enter
0            λ_0 P_0 = μ_1 P_1
1            (λ_1 + μ_1) P_1 = μ_2 P_2 + λ_0 P_0
2            (λ_2 + μ_2) P_2 = μ_3 P_3 + λ_1 P_1
n, n ≥ 1     (λ_n + μ_n) P_n = μ_{n+1} P_{n+1} + λ_{n−1} P_{n−1}.

    By adding to each equation the equation preceding it, we get

λ_0 P_0 = μ_1 P_1
λ_1 P_1 = μ_2 P_2
λ_2 P_2 = μ_3 P_3
···
λ_n P_n = μ_{n+1} P_{n+1}, n ≥ 0.

Solving in terms of P_0, we get

P_1 = (λ_0/μ_1) P_0

P_2 = (λ_1/μ_2) P_1 = (λ_1 λ_0)/(μ_2 μ_1) P_0

P_3 = (λ_2/μ_3) P_2 = (λ_2 λ_1 λ_0)/(μ_3 μ_2 μ_1) P_0

···

P_n = (λ_{n−1}/μ_n) P_{n−1} = (λ_{n−1} λ_{n−2} ··· λ_1 λ_0)/(μ_n μ_{n−1} ··· μ_2 μ_1) P_0.


Using the fact that ∑_{n=0}^∞ P_n = 1, we get

1 = P_0 + P_0 ∑_{n=1}^∞ (λ_{n−1} λ_{n−2} ··· λ_1 λ_0)/(μ_n μ_{n−1} ··· μ_2 μ_1),

    or

P_0 = 1 / (1 + ∑_{n=1}^∞ (λ_{n−1} λ_{n−2} ··· λ_1 λ_0)/(μ_n μ_{n−1} ··· μ_2 μ_1)),

and so

P_n = (λ_0 λ_1 ··· λ_{n−1}) / [μ_1 μ_2 ··· μ_n (1 + ∑_{n=1}^∞ (λ_0 λ_1 ··· λ_{n−1})/(μ_1 μ_2 ··· μ_n))], n ≥ 1. (23)

The equations above also show us what condition is necessary for these limiting probabilities to exist. Namely, it is necessary that

∑_{n=1}^∞ (λ_0 λ_1 ··· λ_{n−1})/(μ_1 μ_2 ··· μ_n) < ∞.
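A minimal numerical sketch of formula (23): for the assumed constant rates λ_n = 1 and μ_n = 2 the infinite sums converge, and truncating them at a large N gives the limiting probabilities, which should satisfy the relations λ_n P_n = μ_{n+1} P_{n+1} derived above.

```python
from math import isclose

# Made-up birth and death rates with convergent sums: lam_n = 1, mu_n = 2.
lam = [1.0] * 201              # lam[n] = lam_n
mu = [0.0] + [2.0] * 200       # mu[n] = mu_n (mu_0 is unused)

# Products lam_0 ... lam_{n-1} / (mu_1 ... mu_n), with the n = 0 term equal to 1.
prod = [1.0]
for n in range(1, 201):
    prod.append(prod[-1] * lam[n - 1] / mu[n])

P0 = 1.0 / sum(prod)           # truncation of the infinite sum in (23)
P = [P0 * p for p in prod]

# The relations lam_n P_n = mu_{n+1} P_{n+1} should hold for every n.
assert all(isclose(lam[n] * P[n], mu[n + 1] * P[n + 1]) for n in range(199))
print(P0, P[1])                # here P_n = (1/2)^{n+1}
```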

Solution If we say that the system is in state n whenever n machines are not in use, then the preceding is a birth and death process with parameters

μ_n = μ, n ≥ 1,

λ_n = (M − n)λ, n ≤ M,

λ_n = 0, n > M.

This is so in the sense that a failing machine is regarded as an arrival and a fixed machine as a departure. If any machines are broken down, then since the serviceman's rate is μ, μ_n = μ. On the other hand, if n machines are not in use, then since the M − n machines in use each fail at rate λ, it follows that λ_n = (M − n)λ. From (23) we have that P_n, the limiting probability that n machines will not be in use, is given by

P_0 = 1 / (1 + ∑_{n=1}^M [Mλ · (M−1)λ ··· (M−n+1)λ / μ^n])

    = 1 / (1 + ∑_{n=1}^M (λ/μ)^n M!/(M−n)!),

P_n = (λ/μ)^n M!/(M−n)! / (1 + ∑_{n=1}^M (λ/μ)^n M!/(M−n)!), n = 0, 1, ..., M.

    Hence the long-run average number of machines not in use is given by

∑_{n=0}^M n P_n = [∑_{n=0}^M n (λ/μ)^n M!/(M−n)!] / [1 + ∑_{n=1}^M (λ/μ)^n M!/(M−n)!]. (25)
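The sketch below evaluates P_n and the average (25) for sample values of M, λ, and μ (made-up numbers; the example itself leaves them symbolic).

```python
import math

# Sample parameters for the machine example: M machines, failure rate lam
# per machine, repair rate mu for the single serviceman.
M, lam, mu = 5, 0.2, 1.0
rho = lam / mu

# Unnormalized weights (lam/mu)^n M!/(M-n)!, n = 0..M; the n = 0 term is 1.
weights = [rho ** n * math.factorial(M) // math.factorial(M - n) * 1.0
           if False else rho ** n * math.factorial(M) / math.factorial(M - n)
           for n in range(M + 1)]
Z = sum(weights)                     # = 1 + sum_{n=1}^M (lam/mu)^n M!/(M-n)!
P = [w / Z for w in weights]         # limiting probabilities P_n
avg_not_in_use = sum(n * P[n] for n in range(M + 1))   # formula (25)
print(P[0], avg_not_in_use)
```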

5.4.10 Example 10 In the M/M/1 queue, λ_n = λ, μ_n = μ, and thus by (23)

P_n = (λ/μ)^n / (1 + ∑_{n=1}^∞ (λ/μ)^n) = (λ/μ)^n (1 − λ/μ),

    provided λ/µ < 1.

    5.4.11 Example 11 Let us revisit the shoe shine shop of Example 1 and determine the proportion

    of time the process is in each of the states 0, 1, 2. Because this is not a birth and death

    process (since the process can go directly from state 2 to state 0), we start with the balance

    equations for the limiting probabilities.

State        rate at which leave = rate at which enter
0            λ P_0 = μ_2 P_2
1            μ_1 P_1 = λ P_0
2            μ_2 P_2 = μ_1 P_1.


Solving in terms of P_0 yields

P_2 = (λ/μ_2) P_0, P_1 = (λ/μ_1) P_0,

which implies, since P_0 + P_1 + P_2 = 1, that

P_0 [1 + λ/μ_1 + λ/μ_2] = 1,

or

P_0 = μ_1 μ_2 / (μ_1 μ_2 + λ(μ_1 + μ_2)),

and

P_1 = λ μ_2 / (μ_1 μ_2 + λ(μ_1 + μ_2)),

P_2 = λ μ_1 / (μ_1 μ_2 + λ(μ_1 + μ_2)).

5.4.12 Example 12 Suppose that taxis arrive according to a Poisson process with rate λ_T = 2 per minute and that potential passengers arrive according to a Poisson process with rate λ_C = 3 per minute. Suppose that a taxi will wait no matter how many taxis are present and that an arriving potential passenger who does not find a taxi waiting will leave. Find the long-run proportion of arriving potential customers that find taxis.

Solution We will solve this problem by modeling it with a birth and death process. Let X_t be the number of taxis waiting at time t. Then {X_t : t ≥ 0} is a birth and death process with birth rate λ_n = 2 for all n ≥ 0 and death rate μ_n = 3 for all n ≥ 1. Hence the long-run proportion of time that there are no taxis waiting is

P_0 = 1 / (1 + ∑_{n=1}^∞ (2/3)^n) = 1/3.

Therefore the long-run proportion of arriving potential customers that find taxis is 1 − P_0 = 2/3.
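The geometric sum behind P_0 = 1/3 can be checked directly, truncating the series at a large index:

```python
# Geometric series sum_{n>=1} (2/3)^n = (2/3)/(1 - 2/3) = 2, so P0 = 1/3.
s = sum((2.0 / 3.0) ** n for n in range(1, 200))   # truncated series, ~2
P0 = 1.0 / (1.0 + s)
print(P0, 1 - P0)   # 1/3 of the time no taxi waits; 2/3 of customers find one
```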

