7/31/2019 Monte-Carlo Methods for the Estimation of Rare Event Probabilities_Leder
Monte-Carlo Methods for the Estimation of
Rare Event Probabilities
Kevin Leder
December 2, 2011
Outline
1 Introduction
2 Importance Sampling
3 Splitting Method
4 Jackson Network
Introduction
Estimation of Small Probabilities via Monte Carlo
Why try to estimate the probability of rare events? Aren't they just 0?
Phrased in terms of probabilities and random variables: there is a random variable $Z$ and a set $A$ such that $P(Z \in A) \approx 0$.
It is useful to embed the rare event into a sequence of rare events and study asymptotic properties, i.e., consider the events $\{Z_n \in A\}$.
We are interested in the setting where the probabilities decay exponentially, i.e., there exists a $\gamma > 0$ such that
\[ \lim_{n\to\infty} \frac{1}{n}\log P(Z_n \in A) = -\gamma < 0. \]
Introduction
Estimating Rare Event Probabilities via Standard Monte Carlo
Suppose we are interested in estimating $P(Z_n \in A)$ for some fixed $n$.
For a large integer $k$, draw an i.i.d. vector $(Z_n^1, \dots, Z_n^k)$ and form the estimator
\[ \hat p_{n,k} = \frac{1}{k}\sum_{j=1}^k 1_A(Z_n^j), \]
which is unbiased and consistent.
Consider, though, the relative error of the estimator:
\[ \mathrm{RE}(\hat p_{n,k}) = \frac{\mathrm{sd}(\hat p_{n,k})}{E[\hat p_{n,k}]} = \sqrt{\frac{1 - P(Z_n \in A)}{k\,P(Z_n \in A)}} \approx \frac{1}{\sqrt{k\,P(Z_n \in A)}}. \]
The number of replications $k$ has to grow like $1/P(Z_n \in A)$ to keep the relative error bounded.
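A quick numerical illustration (hypothetical numbers, not from the talk) of how the predicted relative error $\sqrt{(1-p)/(kp)}$ blows up as $p$ shrinks with $k$ held fixed:

```python
import numpy as np

rng = np.random.default_rng(0)

def naive_mc(p_true, k):
    """Standard Monte Carlo estimate of a probability p_true from k draws."""
    return float((rng.random(k) < p_true).mean())

k = 10**6
for p in [1e-1, 1e-3, 1e-5]:
    re = np.sqrt((1 - p) / (k * p))    # predicted relative error
    print(f"p={p:.0e}  k={k:.0e}  predicted RE={re:.3f}")
```

With $k = 10^6$ the relative error is negligible for $p = 10^{-1}$ but already around 30% for $p = 10^{-5}$; keeping it bounded requires $k \propto 1/p$.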
Introduction
Two Solutions
Importance sampling: simulate the system under alternative dynamics so that the event of interest is no longer rare. Keep track of the likelihood ratio so that you can renormalize the final answer to create an unbiased estimator.
Particle-based methods: simulate many correlated copies of the system under the original dynamics; these methods can be viewed as a type of branching random walk.
Importance Sampling
For estimating $p_n = P(Z_n \in A)$, first construct a new sampling measure $Q$, then form the estimator by averaging independent replications of
\[ \hat p_n = \frac{dP}{dQ}(Z_n)\, 1_A(Z_n), \]
where $Z_n$ is sampled according to the measure $Q$.
Judge the performance of $\hat p_n$ via its variance (or, equivalently, its 2nd moment):
\[ E^Q[\hat p_n^{\,2}] = E[\hat p_n]. \]
In order to control the relative error we would like strong efficiency:
\[ \sup_n \frac{E^Q[\hat p_n^{\,2}]}{p_n^2} < \infty. \]
Importance Sampling for Random Walks
Consider estimating $p_n(A) = P(Z_n/n \in A)$, where $Z_n = X_1 + \dots + X_n$ and $\{X_i\}$ is an i.i.d. sequence of $d$-dimensional random vectors satisfying
\[ \Lambda(\theta) = \log E[e^{\langle \theta, X_1\rangle}] < \infty \]
for $\theta$ in a neighborhood of the origin. Assume that $E[X_1] \notin A$.
Create a sampling measure by an exponential tilt of each increment $X_i$: for each $\theta \in \mathbb{R}^d$ define a sampling measure
\[ Q_\theta(X_1 \in dx_1, \dots, X_n \in dx_n) = \frac{e^{\langle\theta, x_1\rangle}}{e^{\Lambda(\theta)}} P(X_1 \in dx_1) \cdots \frac{e^{\langle\theta, x_n\rangle}}{e^{\Lambda(\theta)}} P(X_n \in dx_n). \]
For this change of measure denote the estimator of $p_n(A)$ by $\hat p_n(A, \theta)$; the 2nd moment of a single replication of the estimator is
\[ E^Q[\hat p_n(A, \theta)^2] = E\big[1_{\{Z_n/n \in A\}}\, e^{\,n(\Lambda(\theta) - \langle\theta, Z_n/n\rangle)}\big]. \]
Study the asymptotic properties of the 2nd moment via large deviations theory.
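A minimal sketch of the exponential tilt in the simplest case (standard Gaussian increments, my choice of parameters): $\Lambda(\theta) = \theta^2/2$, so tilting by $\theta$ just shifts each increment's mean to $\theta$, and $Z_n$ under $Q_\theta$ is $N(n\theta, n)$, which we sample directly.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def is_estimate(n, a, k):
    """IS estimate of P(Z_n/n >= a) for i.i.d. N(0,1) increments.

    Lambda(theta) = theta^2/2, so the tilt theta = a (solving
    Lambda'(theta) = a) shifts each increment's mean from 0 to a."""
    theta = a
    Z = rng.normal(n * theta, np.sqrt(n), size=k)   # Z_n sampled under Q
    lr = np.exp(-theta * Z + n * theta**2 / 2)      # likelihood ratio dP/dQ
    return float(np.mean((Z >= n * a) * lr))

n, a = 100, 0.3
est = is_estimate(n, a, 200_000)
exact = 0.5 * math.erfc(a * math.sqrt(n) / math.sqrt(2))  # P(N(0,n) >= na)
print(est, exact)
```

The event $\{Z_n/n \ge a\}$ is hit on roughly half of the replications under $Q$, and the likelihood-ratio weights renormalize the answer; a naive estimator would need on the order of $1/p_n \approx 10^3$ draws per hit for these parameters.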
Large Deviations Principle
A sequence of random variables $\{Z_n\}$ taking values in a Polish space $\mathcal{X}$ satisfies a large deviations principle (LDP) with rate function $I : \mathcal{X} \to [0, \infty]$ if
1. $I$ has compact level sets;
2. for every Borel set $A \subset \mathcal{X}$,
\[ -\inf_{x \in A^{\circ}} I(x) \le \liminf_{n\to\infty} \frac{1}{n}\log P(Z_n \in A) \le \limsup_{n\to\infty} \frac{1}{n}\log P(Z_n \in A) \le -\inf_{x \in \bar{A}} I(x). \]
A useful alternative formulation: for any bounded and continuous $f : \mathcal{X} \to \mathbb{R}$ the following holds:
\[ \lim_{n\to\infty} \frac{1}{n}\log E\big[e^{-nf(Z_n)}\big] = -\inf_{x\in\mathcal{X}} [f(x) + I(x)]. \]
Large Deviations for Random Walks
Suppose that $Z_n = X_1 + \dots + X_n$ for an i.i.d. sequence $\{X_i\}$ of $d$-dimensional vectors satisfying
\[ \Lambda(\theta) = \log E[e^{\langle\theta, X_1\rangle}] < \infty \]
for $\theta$ in a neighborhood of the origin.
Then $Z_n/n$ satisfies an LDP with rate function (Cramér's Theorem)
\[ I(\beta) = \sup_{\theta\in\mathbb{R}^d} \big[\langle\theta, \beta\rangle - \Lambda(\theta)\big]. \]
In the 1-d setting, if $a > E[X_1]$ then Cramér's theorem gives
\[ P(Z_n > na) = e^{-n(I(a) + o(1))}, \]
where $\inf_{x > a} I(x) = I(a) = a\theta_a - \Lambda(\theta_a)$ and $\theta_a$ solves $\Lambda'(\theta_a) = a$.
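A worked 1-d example (standard Gaussian increments; my addition, not from the slides):

```latex
% For X_i \sim N(0,1):
\Lambda(\theta) = \log E[e^{\theta X_1}] = \frac{\theta^2}{2},
\qquad
I(a) = \sup_{\theta}\Big[\theta a - \frac{\theta^2}{2}\Big] = \frac{a^2}{2},
\qquad \theta_a = a,
```

so for $a > 0$, $P(Z_n > na) = e^{-na^2/2 + o(n)}$, consistent to exponential order with the exact tail $P(N(0,n) > na) = \bar\Phi(a\sqrt{n})$.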
Using Large Deviations for Importance Sampling
Using Cramér's Theorem and the alternative formulation of the LDP, we can approximate the 2nd moment of the IS estimator for large $n$:
\[ \frac{1}{n}\log E\big[1_{\{Z_n/n\in A\}}\, e^{n(\Lambda(\theta) - \langle\theta, Z_n/n\rangle)}\big] \approx -\inf_{x\in A}\big[I(x) - \Lambda(\theta) + \langle\theta, x\rangle\big]. \]
The goal of importance sampling is to minimize variance, which gives the following max-min problem:
\[ \sup_{\theta\in\mathbb{R}^d} \inf_{x\in A}\big[I(x) - \Lambda(\theta) + \langle\theta, x\rangle\big]. \]
If $A$ is convex then
\[ \sup_{\theta\in\mathbb{R}^d} \inf_{x\in A}\big[I(x) - \Lambda(\theta) + \langle\theta, x\rangle\big] = \inf_{x\in A} \sup_{\theta\in\mathbb{R}^d}\big[I(x) - \Lambda(\theta) + \langle\theta, x\rangle\big] = 2\inf_{x\in A} I(x). \]
Which $\theta \in \mathbb{R}^d$ to use? Let $x^* = \arg\inf_{x\in A} I(x)$; we then use the change of measure defined by the tilt $\theta_{x^*}$, which is the solution of $\nabla\Lambda(\theta) = x^*$. (Note that $\sup_\theta[\langle\theta, x^*\rangle - \Lambda(\theta)] = I(x^*)$, so this choice attains the value $2I(x^*)$.)
If $A$ is convex, this is a logarithmically efficient estimator.
Importance of Convexity
Glasserman and Wang (97) consider the problem of estimating $P(|Z_n| > 1.5n)$, where the increments are $X_i = A_i - B_i$ with $A_i \sim N(1.5, 1)$ and $B_i \sim \mathrm{Exp}(1)$.
If the set were convex then we would find $x^* = \arg\inf_{x : |x| > 1.5} I(x) = 1.5$, and use the change of measure based on the tilt $\theta_{1.5}$, i.e.,
\[ \frac{dP}{dQ}(x_1) = e^{-\theta_{1.5} x_1 + \Lambda(\theta_{1.5})}. \]
However, by pretending the target set is convex we end up with a terrible estimator:
\[ \limsup_{n\to\infty} e^{2nI(1.5)}\, E^Q\big[\hat p_n(A, \theta_{1.5})^2\big] = \infty. \]
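A quick numerical check of this failure (a sketch; the Legendre transform is done by crude grid search): the exponent governing the 2nd-moment contribution from the "rogue" branch $x \le -1.5$ is negative, so the 2nd moment grows exponentially relative to $p_n^2$.

```python
import numpy as np

def Lam(t):
    """log-mgf of X = A - B with A ~ N(1.5, 1), B ~ Exp(1); valid for t > -1."""
    return 1.5 * t + t**2 / 2 - np.log1p(t)

def I(x):
    """Legendre transform sup_t [t*x - Lam(t)] by grid search."""
    t = np.linspace(-0.99, 5.0, 200001)
    return float(np.max(t * x - Lam(t)))

# tilt aimed at the right-hand minimizer x* = 1.5: Lam'(t) = 1.5
# reduces to t = 1/(1 + t), i.e. theta = (sqrt(5) - 1) / 2
theta = (np.sqrt(5) - 1) / 2

# exponent of the 2nd-moment contribution from the rogue point x = -1.5;
# a negative value means the relative error blows up exponentially in n
rate = I(-1.5) + theta * (-1.5) - Lam(theta)
print(rate)
```

The computed rate is negative: paths drifting toward $-1.5$ are rare under the tilted measure, but the likelihood ratio they carry is so large that they dominate the 2nd moment.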
What went wrong?
[Figure: sample paths under the tilted sampling measure — "normal" paths drift toward $+1.5$, while rare "rogue" paths drift toward $-1.5$ and carry enormous likelihood ratios.]
A procedure for non-convex A
Dupuis and Wang showed that for non-convex $A$, logarithmic efficiency requires state-dependent changes of measure.
Suppose that $A = A_1 \cup \dots \cup A_m$, where the $A_j$ are closed convex sets. Define $y_j = \arg\inf_{y \in A_j} I(y)$ and $\theta_j := \theta_{y_j}$. Then a logarithmically efficient change of measure is given by the transition kernel
\[ Q(X_i \in dx_i \mid Z_{i-1} = z) = \sum_{j=1}^m r_i^j(z)\, e^{\langle\theta_j, x_i\rangle - \Lambda(\theta_j)}\, P(X_i \in dx_i). \]
The state-dependent mixture probabilities are described as follows:
\[ r_i^j(z) = \frac{w_i^j(z)}{\sum_{k=1}^m w_i^k(z)}, \qquad \text{where } w_i^k(z) = \exp\big[\langle\theta_k, z - n y_k\rangle + (n - i)\Lambda(\theta_k)\big]. \]
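To make the mixture concrete, here is a sketch for the 1-d symmetric case $P(|Z_n/n| \ge a)$ with $N(0,1)$ increments, where $A = (-\infty, -a] \cup [a, \infty)$, $y_{1,2} = \pm a$ and $\theta_{1,2} = \pm a$. The mixture weights are simplified to their dominant state-dependent term (favor the side the walk has drifted toward), so this illustrates the idea rather than reproducing the exact Dupuis-Wang weights.

```python
import numpy as np

rng = np.random.default_rng(4)

def mixture_is(n, a, k):
    """State-dependent mixture IS for P(|Z_n/n| >= a), X_i ~ N(0,1)."""
    theta = np.array([a, -a])          # one tilt per convex piece of A
    ests = np.empty(k)
    for rep in range(k):
        z, loglr = 0.0, 0.0
        for _ in range(n):
            # mixture weights ~ exp(theta_j * z), numerically stabilized
            w = np.exp(theta * z - np.abs(theta * z).max())
            rj = w / w.sum()
            comp = rng.choice(2, p=rj)
            x = rng.normal(theta[comp], 1.0)
            # one-step likelihood ratio dP/dQ(x | z), since
            # Q(dx | z) = sum_j r_j exp(theta_j x - theta_j^2 / 2) P(dx)
            loglr -= np.log((rj * np.exp(theta * x - theta**2 / 2)).sum())
            z += x
        ests[rep] = np.exp(loglr) if abs(z) >= n * a else 0.0
    return float(ests.mean())

est = mixture_is(20, 0.8, 4000)
print(est)
```

Because the mixture can chase either branch of $A$, no "rogue" path carries an uncontrolled likelihood ratio, unlike the single fixed tilt of the previous slide.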
Importance Sampling in Finance
Many option pricing problems can be viewed as rare-event calculations: the option has value only on a small set of the sample space, so the expected value is dominated by values on a rare set.
Glasserman et al. (99) looked at importance sampling as a computational tool for pricing a variety of path-dependent options. In the setting of a concave payoff function they present a logarithmically efficient procedure for pricing options.
Guasoni and Robertson extended the framework of the Glasserman paper to a continuous-time setting and established that the optimal change of measure in continuous time can be found by solving an Euler-Lagrange equation. They too assume that the payoff function is concave.
Several works by Glasserman have considered the use of importance sampling for estimating value at risk and conditional value at risk.
Dupuis and Wang show that, under very weak conditions on the payoff functional, adaptive importance sampling can be used to evaluate option prices with logarithmic efficiency.
Particle-Based Methods
Splitting Method
We will focus on a specific particle method, the splitting method, first developed by Villén-Altamirano and Villén-Altamirano (94), who called it RESTART.
Dean and Dupuis (08) presented a procedure for the construction of efficient and stable splitting schemes; we will follow their notation.
Model problem: $X^n$ is a sequence of stochastic processes on a domain $D \subset \mathbb{R}^d$, and, for two disjoint sets $A$ and $B$, define the sequence of stopping times $\tau_n = \min\{i : X^n(i) \in A \cup B\}$.
Goal: estimate the probabilities
\[ p_n(x) = P\big(X^n(\tau_n) \in B \mid X^n(0) = x\big). \]
Assume that there is a non-negative measurable function $L$ such that
\[ \lim_{n\to\infty} -\frac{1}{n}\log p_n(x) = \inf\Big\{\int_0^t L(\phi(s), \dot\phi(s))\,ds : \phi(0) = x,\ \phi(t) \in B,\ \phi(s) \in A^c \text{ for all } s \le t\Big\}. \]
The Splitting Algorithm
Consider a collection of nested sets $B = C_n(0) \subset C_n(1) \subset \dots \subset C_n(M_n)$.
1. Initiate the simulation procedure with a single particle starting from position $x \in C_n(k)$ for some $k \ge 1$. Let $w_1 = 1$ be the initial weight associated with the particle.
2. Evolve the initial particle according to the original transition kernel until either it hits $A$ (dies) or hits level $C_n(k-1)$. If it hits $C_n(k-1)$, it is replaced by $r$ identical particles ($r > 1$). The weight of each descendant particle is the weight of the parent particle times $1/r$.
3. The procedure from step 2 is replicated for each descendant particle, carrying over the value of the weights at each level for the surviving particles.
4. Steps 2-3 are repeated until all particles have either died or reached level $C_n(0) = B$.
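The steps above can be sketched on a toy problem (my choice, not from the talk): a biased random walk with down-probability $0.7$, started at $1$, with $B = \{10\}$, $A = \{0\}$, a level at each integer, and splitting factor $r = 2$.

```python
import random

random.seed(2)

p_up = 0.3   # up-step probability: negative drift, so reaching m is rare
m = 10       # target level (the set B); the set A is {0}
r = 2        # splitting factor at each level crossing

def run_to(x, level):
    """Evolve the walk from x until it hits 0 (death) or `level`."""
    while 0 < x < level:
        x += 1 if random.random() < p_up else -1
    return x

def splitting_run():
    """One run of the splitting scheme, started from one particle at 1."""
    particles = [(1, 1.0)]                    # (position, weight) pairs
    for level in range(2, m + 1):
        survivors = []
        for x, w in particles:
            if run_to(x, level) == level:     # crossed a level: split
                survivors.extend([(level, w / r)] * r)
        particles = survivors
    return sum(w for _, w in particles)       # sum of weights reaching B

k = 2000
est = sum(splitting_run() for _ in range(k)) / k
print(est)
```

The estimator is unbiased; the exact gambler's-ruin value here is $((q/p) - 1)/((q/p)^m - 1) \approx 2.8 \times 10^{-4}$ with $q = 1 - p$, an event a naive simulation would rarely see.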
[Figure: a particle started at $x$ splits each time it crosses one of the nested levels $C_n(2) \supset C_n(1) \supset C_n(0) = B$; particles that hit $A$ die.]
The Splitting Estimator
Consider the collection of nested sets $B = C_n(0) \subset C_n(1) \subset \dots \subset C_n(M_n)$ (note: we will want $M_n = cn$ for some $c > 0$).
The nested sets are based on the level sets of an importance function $U$. Specifically, define $L_z = \{y \in D : U(y) \le z\}$; then
\[ C_n(j) = L_{(j-1)/n}. \]
An important function is the level function
\[ \ell_n(y) = \min\{j \ge 0 : y \in C_n(j)\}. \]
The estimator for $p_n(x)$ is
\[ R_n(x) = N_n(x) / r^{\ell_n(x)}, \]
where $N_n(x)$ is the number of particles that make it to $B$.
Analysis of Splitting Estimators
For numerical stability we want $E[N_n(x)] \approx r^{\ell_n(x)} p_n(x)$ to grow subexponentially, i.e., $r^{\ell_n(x)} p_n(x) = \exp(o(n))$.
For a logarithmically optimal 2nd moment we require that $r^{-\ell_n(x)} = p_n(x)\exp(o(n))$.
Suppose we have a function $W(x)$ such that
\[ p_n(x) = \exp(-nW(x) + o(n)); \]
then it suffices to establish that
\[ \ell_n(x)\log r - nW(x) = o(n). \]
It is easy to see that $\ell_n(x) \approx nU(x)$; therefore we choose our importance function as $U(x) = W(x)/\log(r)$.
See Dean and Dupuis for details.
Performance Comparison: Overflow in Jackson Networks
Open Jackson Networks
Consider a network of $d$ stations. Customers arrive to the network with arrival rates $\lambda = (\lambda_1, \dots, \lambda_d)^T$, and the service rates of the $d$ stations are encoded by $\mu = (\mu_1, \dots, \mu_d)^T$.
A job that leaves station $i$ joins station $j$ with probability $P_{i,j}$, and leaves the system with probability
\[ P_{i,0} = 1 - \sum_{j=1}^d P_{i,j}; \]
$P$ is called the routing matrix.
We are interested in stable open Jackson networks, that is:
(i) for every $i$, either $\lambda_i > 0$ or $\lambda_{j_1} P_{j_1 j_2} \cdots P_{j_k i} > 0$ for some $j_1, \dots, j_k$;
(ii) for every $i$, either $P_{i0} > 0$ or $P_{i j_1} P_{j_1 j_2} \cdots P_{j_k 0} > 0$ for some $j_1, \dots, j_k$;
(iii) the network is stable (i.e., a stationary distribution exists).
Basic Properties of Jackson Networks
Assume without loss of generality that $\sum_{j=1}^d (\lambda_j + \mu_j) = 1$.
Under the stability assumption the traffic equations
\[ \bar\lambda_i = \lambda_i + \sum_{j=1}^d \bar\lambda_j P_{ji}, \quad i = 1, 2, \dots, d, \]
have a unique solution $\bar\lambda^T = \lambda^T (I - P)^{-1}$.
The traffic intensity at station $i$ in equilibrium is given by $\rho_i = \bar\lambda_i/\mu_i \in (0, 1)$.
Define $\rho = \max_{1\le i\le d} \rho_i$, and then set $\ell = |\{i : \rho_i = \rho\}|$, the number of bottleneck stations.
Study the system through the embedded discrete-time Markov chain $Q = \{Q(k) : k \ge 0\}$, where $Q(k) = (Q_1(k), \dots, Q_d(k))$ and $Q_i(k)$ represents the number of customers at station $i$ immediately after the $k$th transition.
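The traffic equations are a small linear solve. A sketch on a hypothetical 3-station network (rates are my choice, normalized so that $\sum_j(\lambda_j + \mu_j) = 1$ as above):

```python
import numpy as np

# a hypothetical 3-station open network
lam = np.array([0.10, 0.05, 0.00])   # external arrival rates
mu  = np.array([0.30, 0.25, 0.30])   # service rates
P   = np.array([[0.0, 0.5, 0.3],     # routing matrix P[i, j]
                [0.0, 0.0, 0.6],
                [0.2, 0.0, 0.0]])

# traffic equations: lam_bar_i = lam_i + sum_j lam_bar_j * P[j, i],
# i.e. lam_bar^T = lam^T (I - P)^{-1}
lam_bar = np.linalg.solve(np.eye(3) - P.T, lam)
rho = lam_bar / mu                   # per-station traffic intensities
bottlenecks = np.flatnonzero(np.isclose(rho, rho.max()))
print(lam_bar, rho, bottlenecks)
```

For these numbers all $\rho_i < 1$ (a stable network) and station 2 is the unique bottleneck.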
Overflow Probabilities in Jackson Networks
Consider a subset of stations encoded by a binary vector $v$, and denote the total population in this subset by $N_v(x) = \langle x, v\rangle$.
We will be interested in the following probability:
\[ p_n^v = P\{\text{total population in the stations encoded by } v \text{ reaches } n \text{ before returning to } 0\text{, starting from } 0\}. \]
We can also define $p_n^v$ via stopping times:
\[ T_{\{x\}} = \inf\{k \ge 1 : Q(k) = x\}, \qquad T_n^v = \inf\{k \ge 1 : N_v(Q(k)) \ge n\}. \]
If we define $P_x(\cdot) := P(\cdot \mid Q(0) = x)$, then
\[ p_n^v = P_0\big(T_n^v < T_{\{0\}}\big), \]
or, more generally,
\[ p_n^v(x) = P_x\big(T_n^v < T_{\{0\}}\big). \]
Dynamics of Q
The queue length process is just a state-dependent random walk:
\[ Q(k+1) = Q(k) + \pi\big(Q(k), Y(k+1)\big), \]
where $\pi$ is a reflection function that prevents the queue-length process from taking negative values.
The noise term $Y(k)$ represents the outcome of the next transition and has the following pmf:
\[ P(Y(k) = w) = \begin{cases} \lambda_i & \text{arrival at station } i, \\ \mu_i P_{ij} & \text{departure at station } i \text{ goes to station } j, \\ \mu_i P_{i0} & \text{departure at station } i \text{ leaves the system.} \end{cases} \]
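A minimal sketch of one transition of the embedded chain (illustrative rates of my choosing; service events at an empty station are treated as dummy events, which plays the role of the reflection map $\pi$):

```python
import numpy as np

rng = np.random.default_rng(3)

# small illustrative network with rates summing to 1
lam = np.array([0.10, 0.05, 0.00])
mu  = np.array([0.30, 0.25, 0.30])
P   = np.array([[0.0, 0.5, 0.3],
                [0.0, 0.0, 0.6],
                [0.2, 0.0, 0.0]])

def step(q):
    """One transition of the embedded chain Q, keeping queues non-negative."""
    d = len(lam)
    events, probs = [], []
    for i in range(d):
        events.append(("arr", i, i));   probs.append(lam[i])
        for j in range(d):
            events.append(("route", i, j)); probs.append(mu[i] * P[i, j])
        events.append(("exit", i, i));  probs.append(mu[i] * (1.0 - P[i].sum()))
    probs = np.array(probs)
    kind, i, j = events[rng.choice(len(events), p=probs / probs.sum())]
    q = q.copy()
    if kind == "arr":
        q[i] += 1
    elif q[i] > 0:            # service event at a non-empty station
        q[i] -= 1
        if kind == "route":
            q[j] += 1
    return q

q = np.zeros(3, dtype=int)
for _ in range(1000):
    q = step(q)
print(q)
```

Running many steps from the empty state, the queue-length vector stays componentwise non-negative, exactly as the reflection map guarantees.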
Logarithmic Asymptotics of Overflow Probabilities in Jackson Networks I
Large deviations theory dictates the existence of a function $W^V$ such that
\[ p_n^V(x/n) = \exp\big(-nW^V(x/n) + o(n)\big). \]
By looking at $Q/n$ we have the following via a formal Taylor expansion:
\begin{align*}
1 &= \frac{1}{p_n^V(x/n)}\, E\Big[p_n^V\Big(x/n + \tfrac{1}{n}\pi(x/n, Y(1))\Big)\Big] \\
 &\approx E\exp\Big\{-nW^V\Big[x/n + \tfrac{1}{n}\pi(x/n, Y(1))\Big] + nW^V(x/n)\Big\} \\
 &= E\exp\big\{-\nabla W^V(x/n)^T \pi(x/n, Y(1)) + o(1)\big\} \\
 &= \exp\big(\psi(x/n, -\nabla W^V(x/n)) + o(1)\big),
\end{align*}
where $\psi(x, \theta) = \log E\exp\{\theta^T \pi(x, Y(k))\}$.
Logarithmic Asymptotics of Overflow Probabilities in Jackson Networks II
In order to characterize the logarithmic asymptotics of $p_n^V(x/n)$ we need to find a function $W^V$ that satisfies
\[ \psi\big(x/n, -\nabla W^V(x/n)\big) = 0, \]
or, for an asymptotic logarithmic upper bound, find a $W^V$ that satisfies
\[ \psi\big(x/n, -\nabla W^V(x/n)\big) \le 0. \]
A function that satisfies this condition is
\[ W^V(x/n) = \langle\bar\alpha, x/n\rangle - \log\rho_V, \]
where $\bar\alpha_i = \log\rho_i$ and $\rho_V = \max\{\rho_i : v_i = 1\}$.
We build our splitting scheme out of this function, i.e., the importance function is given by $U(x/n) = W^V(x/n)/\log(r)$.
Logarithmically Efficient Estimation of Overflow Probabilities
Dean and Dupuis established that if we use the importance function $U$ then the splitting estimator for $p_n^V(x)$ is logarithmically efficient, and the number of particles created grows subexponentially in $n$.
Similarly, Dupuis and Wang (09) established that, using subsolutions to the PDE from the previous slide, one can construct logarithmically efficient IS estimators for overflow probabilities in Jackson networks.
How do we then evaluate the relative merits of the two algorithms? This requires refined knowledge of their performance characteristics, not just at the logarithmic scale.
Asymptotics of Overflow Probabilities in Jackson Networks
The stationary distribution of a Jackson network is product-form:
\[ \pi(m_1, \dots, m_d) = \prod_{j=1}^d P\big(Q_j(\infty) = m_j\big) = \prod_{j=1}^d (1 - \rho_j)\,\rho_j^{m_j}, \qquad m_j \ge 0. \]
One can use this result and a time-reversal argument to show that if $x$ is in a compact set then there exist $k_0$ and $k_1$ such that
\[ k_0 \le \liminf_{n\to\infty} \frac{p_n^v(x)}{e^{-n\gamma_v}\, n^{\ell_v - 1}} \le \limsup_{n\to\infty} \frac{p_n^v(x)}{e^{-n\gamma_v}\, n^{\ell_v - 1}} \le k_1, \]
where $\gamma_v = -\log\rho_v$, in which $\rho_v = \max\{\rho_i : v_i = 1\}$, and $\ell_v = \sum_i 1\{\rho_i = \rho_v,\ v_i = 1\}$ is the number of bottleneck stations in $v$. See Blanchet (11), or Blanchet, Leder, Shi (11).
Computational Effort for a Single Run of the Splitting Algorithm
In Blanchet, Leder, and Shi (11) we looked at the computational effort necessary to run a well-designed splitting algorithm.
Define $C = -\log\rho_v/\log r$; then rewrite the importance function and level function as
\[ U(x/n) = C\Big(1 - \frac{\langle\bar\alpha, x/n\rangle}{\log\rho_v}\Big), \qquad \ell_n(x) = C\Big(n - \frac{\langle\bar\alpha, x\rangle}{\log\rho_v}\Big). \]
Consider the total number of particles that make it to the overflow set; one can see that
\[ E[N_n(x)] = r^{\ell_n(x)}\, p_n^V(x) \le c\, e^{-n\gamma_v}\, n^{\ell_v - 1}\, r^{\ell_n(x)}. \]
Notice that $e^{-\gamma_v} = e^{\log\rho_v} = e^{-C\log r} = r^{-C}$, so that
\[ E[N_n(x)] \le c\, n^{\ell_v - 1}\, r^{\ell_n(x) - nC}, \]
and if we assume that $x/n \to 0$ then we have $E[N_n(x)] \le c\, n^{\ell_v - 1}$.
Refined Performance of Splitting
From the previous slide, the number of particles that survive is of order $n^{\ell_v - 1}$; the actual computational effort is of order $n^{\ell_v + 1}$, as established in Blanchet, Leder and Shi (11).
The computational effort required to achieve a fixed level of relative error is given by
\[ C_n\, \frac{E[R_n(x)^2]}{p_n^V(x)^2}, \]
where $C_n$ is the computational cost per replication of the estimator, i.e., roughly $n^{\ell_v + 1}$.
In Blanchet, Leder, Shi (11) we establish that
\[ E[R_n(x)^2] = p_n^V(x)^2\, O\big(n^{\ell_v}\big). \]
Thus the computational cost of a well-designed splitting algorithm is $O\big(n^{2\ell_v + 1}\big)$.
Importance Sampling for Tandem Jackson Network
Dupuis, Sezer, and Wang considered estimating total population overflow in a $d$-node tandem network using the sampling measure defined by
\[ \frac{Q(Y(k) = z \mid Q(k-1) = x)}{P(Y(k) = z \mid Q(k-1) = x)} = \sum_{j=0}^d r_j(x/n)\exp\big(\langle\alpha_j, z\rangle - \psi(\alpha_j, x/n)\big), \]
where
\[ r_j(x/n) = \frac{w_j(x/n)}{\sum_{k=0}^d w_k(x/n)}, \qquad w_j(x/n) = \exp\big(n\langle\alpha_j, x/n\rangle - n\gamma + jn\delta\big) \]
(with $\delta > 0$ a small parameter separating the affine pieces), and
\[ (\alpha_j)_i = \begin{cases} \gamma, & 1 \le i \le d - j, \\ 0, & \text{otherwise.} \end{cases} \]
Dupuis, Sezer and Wang established that this estimator is logarithmically efficient.
Call the associated estimator $\hat p_n$.
Refined Performance of Importance Sampling
In Blanchet, Glynn and Leder (11) we performed a refined analysis of this estimator to compare with splitting and other methods.
We know that the cost of the algorithm is roughly
\[ \frac{E[\hat p_n^2]}{e^{-2n\gamma}\, n^{2(\ell - 1)}}, \]
where $\gamma = \gamma_v$ and $\ell = \ell_v$ for $v = (1, \dots, 1)$.
By direct analysis of the likelihood ratio on the event of interest we are able to establish that
\[ E[\hat p_n^2] = O\big(e^{-2n\gamma}\, n^{2d}\big). \]
We conclude that the computational complexity of this algorithm is $O\big(n^{2(d - \ell + 1)}\big)$.
Comparing Performance on Estimating Overflow Probabilities in Tandem Networks
The computational cost of the splitting algorithm is $O(n^{2\ell + 1})$.
The computational cost of the importance sampling algorithm is $O(n^{2(d - \ell + 1)})$.
Thus we prefer importance sampling if more than half the stations are bottlenecks, and splitting otherwise.
Conjecture: this property holds for all Jackson networks, not just the tandem network topology.