Optimal scaling in MTM algorithms


Transcript of Optimal scaling in MTM algorithms

Page 1: Optimal scaling in MTM algorithms

Scaling analysis of multiple-try MCMC methods

Randal Douc ([email protected])

Joint work with Mylène Bédard and Eric Moulines.

Page 2: Optimal scaling in MTM algorithms

Themes

1 MCMC algorithms with multiple proposals: MCTM, MTM-C.

2 Analysis through optimal scaling (introduced by Roberts, Gelman, Gilks, 1998).

3 Hit-and-Run algorithm.


Page 6: Optimal scaling in MTM algorithms

Outline of the talk

1 Introduction

2 MH algorithms with multiple proposals
  Random Walk MH
  MCTM algorithm
  MTM-C algorithms

3 Optimal scaling
  Main results

4 Optimising the speed-up process
  MCTM algorithm
  MTM-C algorithms

5 Conclusion



Page 12: Optimal scaling in MTM algorithms

Metropolis-Hastings (MH) algorithm

1 We wish to approximate

    I = ∫ h(x) [π(x) / ∫ π(u) du] dx = ∫ h(x) π̄(x) dx

2 x ↦ π(x) is known, but not ∫ π(u) du.

3 Approximate I with Î = (1/n) Σ_{t=1}^{n} h(X[t]), where (X[t]) is a Markov chain with limiting distribution π̄.

4 In the MH algorithm, this last condition is obtained from a detailed balance condition:

    ∀ x, y,   π(x) p(x, y) = π(y) p(y, x)

5 The quality of the approximation is controlled by the Law of Large Numbers and the CLT for Markov chains.




Page 21: Optimal scaling in MTM algorithms

Random Walk MH

Notation: w.p. = with probability

Algorithm (MCMC)

If X[t] = x, how is X[t+1] simulated?

(a) Y ∼ q(x; ·).

(b) Accept the proposal X[t+1] = Y w.p. α(x, Y), where

    α(x, y) = 1 ∧ [π(y) q(y; x)] / [π(x) q(x; y)]

(c) Otherwise X[t+1] = x.

The chain is π-reversible since:

    π(x) α(x, y) q(x; y) = π(y) α(y, x) q(y; x)


Page 23: Optimal scaling in MTM algorithms

Random Walk MH

Assume that q(x; y) = q(y; x), i.e. the instrumental kernel is symmetric. Typically Y = X + U, where U has a symmetric distribution.

Notation: w.p. = with probability

Algorithm (MCMC with symmetric proposal)

If X[t] = x, how is X[t+1] simulated?

(a) Y ∼ q(x; ·).

(b) Accept the proposal X[t+1] = Y w.p. α(x, Y), where

    α(x, y) = 1 ∧ π(y)/π(x)

(c) Otherwise X[t+1] = x.

The chain is π-reversible since:

    π(x) α(x, y) q(x; y) = π(y) α(y, x) q(y; x)
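To make the step concrete, here is a minimal NumPy sketch (not from the slides) of the symmetric random-walk MH chain above; `log_pi` (the unnormalised log-target) and `sigma` (the proposal scale) are placeholders.

```python
import numpy as np

def rwmh(log_pi, x0, sigma, n_iter, rng=None):
    """Symmetric random-walk Metropolis: Y = X + U with Gaussian U.

    log_pi is the *unnormalised* log-target; as on the slide, the
    normalising constant of pi is never needed.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    chain = np.empty((n_iter,) + x.shape)
    for t in range(n_iter):
        y = x + sigma * rng.standard_normal(x.shape)
        # alpha(x, y) = 1 ^ pi(y)/pi(x), evaluated on the log scale
        if np.log(rng.uniform()) < log_pi(y) - log_pi(x):
            x = y
        chain[t] = x
    return chain

# usage: standard normal target in dimension 1
chain = rwmh(lambda x: -0.5 * float(x @ x), np.zeros(1), sigma=2.38, n_iter=5000)
```

Working on the log scale avoids overflow and makes explicit that π is only needed up to its normalising constant.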

Page 24: Optimal scaling in MTM algorithms

MCTM algorithm

Multiple proposal MCMC

1 Liu, Liang, Wong (2000) introduced the multiple-proposal MCMC. Generalized to multiple correlated proposals by Craiu and Lemieux (2007).

2 A pool of candidates is drawn from (Y¹, …, Yᴷ) | X[t] ∼ q(X[t]; ·).

3 We select one candidate a priori according to some "informative" criterion (favouring high values of π, for example).

4 We accept the candidate with some well-chosen probability.

◮ Diversity of the candidates: some candidates are close to the current state, others are far away. Some additional notation:

    Yʲ | X[t] ∼ qj(X[t]; ·)   (◮ MARGINAL DIST.)   (1)

    (Yⁱ)_{i≠j} | X[t], Yʲ ∼ Qj(X[t], Yʲ; ·)   (◮ SIM. OTHER CAND.)   (2)


Page 30: Optimal scaling in MTM algorithms

MCTM algorithm

Assume that qj(x; y) = qj(y; x).

Algorithm (MCTM: Multiple Correlated Try Metropolis algorithm)

If X[t] = x, how is X[t+1] simulated?

(a) (Y¹, …, Yᴷ) ∼ q(x; ·).   (◮ POOL OF CANDIDATES)

(b) Draw an index J ∈ {1, …, K} with probability proportional to [π(Y¹), …, π(Yᴷ)].   (◮ A PRIORI SELECTION)

(c) {Y^{J,i}}_{i≠J} ∼ QJ(Y^J, x; ·).   (◮ AUXILIARY VARIABLES)

(d) Accept the proposal X[t+1] = Y^J w.p. αJ(x, (Yⁱ)_{i=1}^{K}, (Y^{J,i})_{i≠J}), where

    αj(x, (yⁱ)_{i=1}^{K}, (y^{j,i})_{i≠j}) = 1 ∧ [Σ_{i≠j} π(yⁱ) + π(yʲ)] / [Σ_{i≠j} π(y^{j,i}) + π(x)]   (3)

    (◮ MH ACCEPTANCE PROBABILITY)

(e) Otherwise, X[t+1] = X[t].
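Below is a sketch of one MCTM transition, specialised to independent Gaussian proposals (Σ diagonal) so that the pool kernel of step (a) and the auxiliary kernel of step (c) are easy to write down; `log_pi` and `scales` are placeholders, and the general correlated case would only change the two sampling lines.

```python
import numpy as np

def mctm_step(log_pi, x, scales, rng):
    """One MCTM transition for independent Gaussian proposals.

    scales[i] is the std deviation of candidate i; x is a 1-d ndarray.
    """
    K = len(scales)
    pi = lambda z: np.exp(log_pi(z))
    # (a) pool of candidates around the current state x
    ys = [x + s * rng.standard_normal(x.shape) for s in scales]
    w = np.array([pi(y) for y in ys])
    # (b) a priori selection, proportional to [pi(Y^1), ..., pi(Y^K)]
    J = rng.choice(K, p=w / w.sum())
    # (c) auxiliary variables: the reverse pool, drawn around Y^J
    aux = [ys[J] + scales[i] * rng.standard_normal(x.shape)
           for i in range(K) if i != J]
    # (d) acceptance probability (3); the numerator is sum_i pi(Y^i)
    alpha = min(1.0, w.sum() / (sum(pi(z) for z in aux) + pi(x)))
    # (e) accept the selected candidate, otherwise stay at x
    return ys[J] if rng.uniform() < alpha else x
```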



Page 36: Optimal scaling in MTM algorithms

MCTM algorithm

1 It generalises the classical Random Walk Metropolis-Hastings algorithm (which is the case K = 1).

2 It satisfies the detailed balance condition w.r.t. π: the flow from x to y is

    π(x) Σ_{j=1}^{K} ∫···∫ qj(x; y) Qj(x, y; Π_{i≠j} dyⁱ) Qj(y, x; Π_{i≠j} dy^{j,i})
        × π(y) / [π(y) + Σ_{i≠j} π(yⁱ)] × ( 1 ∧ [π(y) + Σ_{i≠j} π(yⁱ)] / [π(x) + Σ_{i≠j} π(y^{j,i})] )

    = π(x) π(y) Σ_{j=1}^{K} qj(x; y) ∫···∫ Qj(x, y; Π_{i≠j} dyⁱ) Qj(y, x; Π_{i≠j} dy^{j,i})
        × ( 1 / [π(y) + Σ_{i≠j} π(yⁱ)] ∧ 1 / [π(x) + Σ_{i≠j} π(y^{j,i})] )

    ◮ symmetric w.r.t. (x, y)

Page 37: Optimal scaling in MTM algorithms

MCTM algorithm

1 The MCTM uses the simulation of K random variables for the pool of candidates and K − 1 auxiliary variables to compute the MH acceptance ratio.

2 Can we reduce the number of simulated variables while keeping the diversity of the pool?

3 Draw one random variable and use transformations to create the pool of candidates and the auxiliary variables.


Page 40: Optimal scaling in MTM algorithms

MTM-C algorithms

Let

    Ψi : X × [0, 1)^r → X
    Ψ^{j,i} : X × X → X.

Assume that:

1 For all j ∈ {1, …, K}, set

    Yʲ = Ψj(x, V)   (◮ COMMON R.V.)

  where V ∼ U([0, 1)^r).

2 For any (i, j) ∈ {1, …, K}²,

    Yⁱ = Ψ^{j,i}(x, Yʲ).   (◮ RECONSTRUCTION OF THE OTHER CAND.)   (4)

Examples:

    ψi(x, v) = x + σ Φ⁻¹(⟨vᵢ + v⟩), where vᵢ = ⟨(i/K) a⟩, a ∈ ℝ^r, and Φ is the cumulative distribution function of the normal distribution. ◮ Korobov sequence + Cranley-Patterson rotation.

    ψi(x, v) = x + γᵢ Φ⁻¹(v). ◮ Hit-and-Run algorithm.
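The following sketch instantiates the two constructions in dimension r = 1, writing ⟨·⟩ for the fractional part; the Korobov generator `a` is a placeholder, and `scipy.stats.norm.ppf` plays the role of Φ⁻¹.

```python
import numpy as np
from scipy.stats import norm

def pool_korobov_cp(x, v, K, sigma, a=1.0):
    """psi_i(x, v) = x + sigma * Phi^{-1}(<v_i + v>) with v_i = <(i/K) a>:
    one uniform V gives K stratified, correlated candidates
    (Korobov points shifted by a Cranley-Patterson rotation)."""
    i = np.arange(1, K + 1)
    vi = np.mod(i * a / K, 1.0)
    return x + sigma * norm.ppf(np.mod(vi + v, 1.0))

def pool_hit_and_run(x, v, gammas):
    """psi_i(x, v) = x + gamma_i * Phi^{-1}(v): all candidates lie on a
    common line through x (hit-and-run flavour)."""
    return x + np.asarray(gammas) * norm.ppf(v)

rng = np.random.default_rng(0)
v = rng.uniform()                      # the single common random variable V
print(pool_korobov_cp(0.0, v, K=4, sigma=2.6))
print(pool_hit_and_run(0.0, v, gammas=[-2.0, -1.0, 1.0, 2.0]))
```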


Page 44: Optimal scaling in MTM algorithms

MTM-C algorithms

Algorithm (MTM-C: Multiple Try Metropolis algorithm with common proposal)

(a) Draw V ∼ U([0, 1)^r) and set Yⁱ = Ψi(x, V) for i = 1, …, K.

(b) Draw an index J ∈ {1, …, K} with probability proportional to [π(Y¹), …, π(Yᴷ)].

(c) Accept X[t+1] = Y^J with probability αJ(x, Y^J), where, for j ∈ {1, …, K},

    αj(x, yʲ) = αj(x, {Ψ^{j,i}(x, yʲ)}_{i=1}^{K}, {Ψ^{j,i}(yʲ, x)}_{i≠j}),   (5)

    with αj given in (3).

(d) Otherwise X[t+1] = x.
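For the hit-and-run maps ψi(x, v) = x + γᵢ Φ⁻¹(v), the reconstruction maps of (4) can be taken as Ψ^{j,i}(x, y) = x + (γᵢ/γⱼ)(y − x); this explicit form is derived here by eliminating Φ⁻¹(v), it is not stated on the slide. A sketch of one MTM-HR transition under that choice, with a one-dimensional state and placeholder `log_pi`:

```python
import numpy as np

def mtm_hr_step(log_pi, x, gammas, rng):
    """One MTM-C transition with hit-and-run maps (scalar state x)."""
    g = np.asarray(gammas, dtype=float)     # nonzero direction scales
    K = len(g)
    pi = lambda z: np.exp(log_pi(z))
    z = rng.standard_normal()               # Phi^{-1}(V), V ~ U[0, 1)
    ys = x + g * z                          # (a) pool Y^i = psi_i(x, V)
    w = np.array([pi(y) for y in ys])
    J = rng.choice(K, p=w / w.sum())        # (b) selection prop. to pi(Y^i)
    # (c)-(5): auxiliary points Psi^{J,i}(Y^J, x) = Y^J + (g_i/g_J)(x - Y^J)
    aux = ys[J] + (g / g[J]) * (x - ys[J])
    den = sum(pi(a) for i, a in enumerate(aux) if i != J) + pi(x)
    alpha = min(1.0, w.sum() / den)         # acceptance via (3)
    return ys[J] if rng.uniform() < alpha else x   # (d) otherwise stay at x
```

One can check that {Ψ^{J,i}(x, Y^J)}ᵢ reproduces the pool exactly, so nothing beyond the single V needs to be simulated, which is the point of the MTM-C construction.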



Page 49: Optimal scaling in MTM algorithms

How to compare two MH algorithms

◮ PESKUN: If P1 and P2 are two π-reversible kernels and

    ∀ x ≠ y,   p1(x, y) ≤ p2(x, y),

then P2 is better than P1 in terms of the asymptotic variance of N⁻¹ Σ_{i=1}^{N} h(Xi).

1 Off-diagonal order: not always easy to compare!

2 Moreover, one expression of the asymptotic variance is:

    V = Varπ(h) + 2 Σ_{t=1}^{∞} Covπ(h(X0), h(Xt))
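In practice this variance can be estimated from a simulated chain; below is a crude truncated-autocovariance plug-in sketch (the cutoff `max_lag` is a placeholder, and window selection is a topic of its own).

```python
import numpy as np

def asymptotic_variance(h_vals, max_lag=200):
    """Estimate V = Var_pi(h) + 2 * sum_{t>=1} Cov_pi(h(X_0), h(X_t))
    from a (supposedly stationary) scalar chain h(X[0]), ..., h(X[n-1]),
    truncating the autocovariance sum at max_lag."""
    z = np.asarray(h_vals, dtype=float)
    z = z - z.mean()
    n = len(z)
    acov = [z[: n - t] @ z[t:] / n for t in range(max_lag + 1)]
    return acov[0] + 2.0 * sum(acov[1:])
```

Comparing two kernels then amounts to comparing these estimates on the same test function h; the optimal scaling analysis below does this in an asymptotic, simulation-free way.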


Page 52: Optimal scaling in MTM algorithms

Original idea of optimal scaling

For the RW-MH algorithm:

1 Increase the dimension T.

2 Target distribution πT(x_{0:T}) = Π_{t=0}^{T} f(xt).

3 Assume that XT[0] ∼ πT.

4 Make the random walk increasingly conservative: draw the candidate YT = XT[t] + (ℓ/√T) UT[t], where UT[t] is centered standard normal.

5 What is the "best" ℓ?



Page 59: Optimal scaling in MTM algorithms

Theorem

The first component of (XT[⌊Ts⌋])_{0≤s≤1} converges weakly in the Skorokhod topology to (W[λℓ s], s ∈ ℝ₊), the stationary solution of the Langevin SDE

    dW[s] = dB[s] + (1/2) [ln f]′(W[s]) ds.

In particular, the first component of

    (XT[0], XT[α₁T], …, XT[αₚT])

converges weakly to the distribution of

    (W[0], W[λℓ α₁], …, W[λℓ αₚ]).

Then ℓ is chosen to maximize the speed λℓ = 2ℓ² Φ(−(ℓ/2)√I), where I = ∫ {[ln f]′(x)}² f(x) dx.
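As a sanity check (not on the slide), maximizing λℓ numerically for I = 1 recovers the classical constants ℓ⋆ ≈ 2.38, λ⋆ ≈ 1.32 and mean acceptance rate 2Φ(−ℓ⋆/2) ≈ 0.234, i.e. the K = 1 column of the tables below.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

I = 1.0  # I = integral of {[ln f]'(x)}^2 f(x) dx; equals 1 for standard normal f

def speed(l):
    """lambda_l = 2 l^2 Phi(-(l/2) sqrt(I))"""
    return 2.0 * l**2 * norm.cdf(-0.5 * l * np.sqrt(I))

res = minimize_scalar(lambda l: -speed(l), bounds=(0.1, 10.0), method="bounded")
l_star = res.x
print(l_star, speed(l_star), 2 * norm.cdf(-0.5 * l_star * np.sqrt(I)))
# prints approximately 2.38, 1.32, 0.234
```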

Page 60: Optimal scaling in MTM algorithms

Main results

Optimal scaling for the MCTM algorithm

◮ The pool of candidates:

    Yⁱ_{T,t}[n+1] = X_{T,t}[n] + T^{−1/2} Uⁱ_t[n+1],   0 ≤ t ≤ T, 1 ≤ i ≤ K,

  where, for any t ∈ {0, …, T},

    (Uⁱ_t[n+1])_{i=1}^{K} ∼ N(0, Σ),   (◮ MCTM)

    Uⁱ_t[n+1] = ψi(Vt), with Vt ∼ U[0, 1].   (◮ MTM-C)

◮ The auxiliary variables:

    Y^{j,i}_{T,t}[n+1] = X_{T,t}[n] + T^{−1/2} U^{j,i}_t[n+1],   i ≠ j.


Page 62: Optimal scaling in MTM algorithms

Main results

Theorem

Suppose that XT[0] is distributed according to the target density πT. Then the process (X_{T,0}[sT], s ∈ ℝ₊) converges weakly in the Skorokhod topology to the stationary solution (W[s], s ∈ ℝ₊) of the Langevin SDE

    dW[s] = λ^{1/2} dB[s] + (λ/2) [ln f]′(W[s]) ds,

with λ ≜ λ(I, (Γj)_{j=1}^{K}), where Γj, 1 ≤ j ≤ K, denotes the covariance matrix of the random vector (Uʲ₀, (Uⁱ₀)_{i≠j}, (U^{j,i}₀)_{i≠j}). For the MCTM, Γj = Γj(Σ).

    α(Γ) = E[ A( {Gᵢ − Var[Gᵢ]/2}_{i=1}^{2K−1} ) ],   (6)

where A is bounded Lipschitz and (Gᵢ)_{i=1}^{2K−1} ∼ N(0, Γ).

    λ(I, (Γj)_{j=1}^{K}) ≜ Σ_{j=1}^{K} Γj_{1,1} × α(I Γj).   (7)



Page 66: Optimal scaling in MTM algorithms

MCTM algorithm

We optimize the speed λ ≜ λ(I, (Γj(Σ))_{j=1}^{K}) over a subset G:

    G = { Σ = diag(ℓ₁², …, ℓ_K²), (ℓ₁, …, ℓ_K) ∈ ℝᴷ }: the proposals have different scales but are independent.

    G = { Σ = ℓ² Σa, ℓ² ∈ ℝ }, where Σa is the extreme antithetic covariance matrix:

        Σa ≜ (K/(K−1)) I_K − (1/(K−1)) 1_K 1_K^T,  with 1_K = (1, …, 1)^T.
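A sketch constructing Σa and drawing one increment vector; note that Σa is singular (it annihilates the direction 1_K), so a symmetric square root via eigendecomposition is used instead of a Cholesky factor.

```python
import numpy as np

def extreme_antithetic_cov(K):
    """Sigma_a = K/(K-1) I_K - 1/(K-1) 1_K 1_K^T: unit variances and
    pairwise correlation -1/(K-1), the most negative exchangeable value."""
    return (K * np.eye(K) - np.ones((K, K))) / (K - 1)

def antithetic_increments(K, ell, rng):
    """Draw (U^1, ..., U^K) ~ N(0, ell^2 Sigma_a) via an eigendecomposition."""
    vals, vecs = np.linalg.eigh(extreme_antithetic_cov(K))
    root = vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T
    return ell * (root @ rng.standard_normal(K))

rng = np.random.default_rng(1)
print(extreme_antithetic_cov(3))                  # unit diagonal, off-diagonal -1/2
print(antithetic_increments(3, 2.64, rng).sum())  # ~ 0: increments cancel exactly
```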

Page 67: Optimal scaling in MTM algorithms

MCTM algorithms

Table: Optimal scaling constants, value of the speed, and mean acceptance rate for independent proposals

    K    1     2     3     4     5
    ℓ⋆   2.38  2.64  2.82  2.99  3.12
    λ⋆   1.32  2.24  2.94  3.51  4.00
    a⋆   0.23  0.32  0.37  0.39  0.41

Page 68: Optimal scaling in MTM algorithms

MCTM algorithms

Table: Optimal scaling constants, value of the speed, and mean acceptance rate for extreme antithetic proposals

    K    1     2     3     4     5
    ℓ⋆   2.38  2.37  2.64  2.83  2.99
    λ⋆   1.32  2.64  3.66  4.37  4.91
    a⋆   0.23  0.46  0.52  0.54  0.55

Table: Optimal scaling constants, value of the speed, and mean acceptance rate for the optimal covariance

    K    1     2     3     4     5
    ℓ⋆   2.38  2.37  2.66  2.83  2.98
    λ⋆   1.32  2.64  3.70  4.40  4.93
    a⋆   0.23  0.46  0.52  0.55  0.56

Page 69: Optimal scaling in MTM algorithms

MTM-C algorithms

Table: Optimal scaling constants, optimal value of the speed, and mean acceptance rate for the RQMC MTM algorithm based on the Korobov sequence and Cranley-Patterson rotations

    K    1     2     3     4     5
    σ⋆   2.38  2.59  2.77  2.91  3.03
    λ⋆   1.32  2.43  3.31  4.01  4.56
    a⋆   0.23  0.36  0.42  0.47  0.50

Table: Optimal scaling constants, value of the speed, and mean acceptance rate for the hit-and-run algorithm

    K    1     2     4      6      8
    ℓ⋆   2.38  2.37  7.11   11.85  16.75
    λ⋆   1.32  2.64  2.65   2.65   2.65
    a⋆   0.23  0.46  0.46   0.46   0.46


Page 71: Optimal scaling in MTM algorithms

Conclusion

◮ MCTM algorithm:

1 Extreme antithetic proposals improve upon the MTM with independent proposals.

2 Still, the improvement is not overly impressive, and the introduction of correlation makes the computation of the acceptance ratio more complex.

◮ MTM-C algorithm:

1 The advantage of the MTM-C algorithms: only one simulation is required to obtain the pool of proposals and the auxiliary variables.

2 The MTM-RQMC behaves like the extreme antithetic proposals.

3 Our preferred choice: the MTM-HR algorithm. In particular, the case K = 2 yields a speed twice that of the Metropolis algorithm, whereas the computational cost is almost the same in many scenarios.
