Advanced Macroeconomics II
Lecture 1
Stochastic Dynamic Programming
Isaac Baley
UPF & Barcelona GSE
January 7, 2016
Introduction
• Stochastic dynamic programming is a very useful tool for formulating and solving many economic problems.
• We review the theory of dynamic programming and extend it to include uncertainty.
• Straightforward extension of what you did in Advanced Macro I.
• References:
◦ Acemoglu (2009), Ch. 16.
◦ Adda and Cooper (2003), Ch. 2–3.
◦ Ljungqvist and Sargent (2014), Ch. 3–4.
◦ Stokey, Lucas, and Prescott (1989), Ch. 3–4 (deterministic), Ch. 7–9 (stochastic, with measure theory).
Roadmap
1. Sequence Problem
2. Recursive Formulation
3. Role of Uncertainty
4. Contraction Mapping Theorem
5. Characterization: Euler + Transversality
6. Solution Methods
Sequence Problem (1): Setup
• Time is infinite and discrete.
• An agent solves the following problem:

      max_{ {y_t}_{t=0}^∞ }  E_0 [ ∑_{t=0}^∞ β^t U(x_t, y_t, z_t) ]

  subject to:
      (control)              y_t ∈ G̃(x_t, z_t)
      (state)                x_{t+1} = f̃(x_t, y_t, z_t)
      (initial conditions)   x_0, z_0 given

  ◦ y_t ∈ Y ⊂ R^{K_y}: control variables (choice variables, e.g. investment)
  ◦ x_t ∈ X ⊂ R^{K_x}: endogenous state variables (predetermined, e.g. capital)
  ◦ z_t ∈ Z: exogenous state variables (stochastic shocks, e.g. productivity)
  ◦ U: X × Y × Z → R: instantaneous payoff
  ◦ β ∈ (0, 1]: discount factor
Sequence Problem (2): Exogenous Shocks
• z_t is a stationary shock.
• For simplicity, we assume z_t follows a first-order Markov chain.
  ◦ N possible realizations:

        Z ≡ {z_1, z_2, ..., z_N}

  ◦ Transition probability from i to j, denoted q_ji:

        pr[z_t = z_j | z_{t−1} = z_i] = q_ji

    such that:

        q_ji ≥ 0  and  ∑_{j=1}^N q_ji = 1  for any i = 1, ..., N

  ◦ Collect the q_ji in an N × N matrix with i in rows and j in columns, so that each row sums to 1.
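The transition matrix just described is easy to write down and simulate. A minimal sketch — the two-state chain and its probabilities below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical 2-state chain for z in {z_1, z_2}; entry (i, j) is the
# probability of moving from state i today to state j tomorrow, so each
# row must sum to 1, as on the slide.
Q = np.array([[0.9, 0.1],
              [0.2, 0.8]])
assert np.allclose(Q.sum(axis=1), 1.0)

def simulate(Q, i0, T, seed=0):
    """Draw a T-period path of state indices, starting from index i0."""
    rng = np.random.default_rng(seed)
    path = [i0]
    for _ in range(T - 1):
        path.append(rng.choice(len(Q), p=Q[path[-1]]))
    return path

path = simulate(Q, i0=0, T=10)
```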
Sequence Problem (3): Constraint Sets
      max_{ {y_t}_{t=0}^∞ }  E_0 [ ∑_{t=0}^∞ β^t U(x_t, y_t, z_t) ]

      (control)              y_t ∈ G̃(x_t, z_t)
      (state)                x_{t+1} = f̃(x_t, y_t, z_t)
      (initial conditions)   x_0, z_0 given

• Constraint sets:
  ◦ G̃(x_t, z_t): constraint on admissible controls, for given states.
  ◦ f̃(x_t, y_t, z_t): law of motion of the state.
• Using f̃ we can substitute y_t as a function of x_{t+1}, x_t, and z_t.
• Example: k_{t+1} = (1 − δ)k_t + i_t. The control variable can be either i_t or k_{t+1}.
Sequence Problem (4): Solution
• Substitute y_t using f̃ to obtain the Sequence Problem (P1):

      V(x_0, z_0) = max_{ {x_{t+1}}_{t=0}^∞ }  E_0 [ ∑_{t=0}^∞ β^t U(x_t, z_t, x_{t+1}) ]
      s.t.  x_{t+1} ∈ G(x_t, z_t),   x_0, z_0 given

• V: X × Z → R is the value function.
• The problem is stationary in that U and G do not depend on time.
• The solution is an infinite sequence {x*_{t+1}}_{t=0}^∞.
• Idea of dynamic programming: transform the problem into one of finding a time-invariant policy function π(x_t, z_t) rather than an infinite sequence.
  ◦ Example: instead of the infinite sequence of optimal capital {k*_{t+1}}_{t=0}^∞, you find k*_{t+1} = π(k_t, z_t), the time-invariant capital policy function.
Recursive Formulation (1): Principle of Optimality
• The Principle of Optimality allows us to express the problem in recursive form.
  ◦ In an optimal policy, whatever the initial state and decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.
• Consider the original sequence problem (P1):

      V(x_0, z_0) = max_{ {x_{t+1}}_{t=0}^∞ }  E_0 [ U(x_0, z_0, x_1) + βU(x_1, z_1, x_2) + ... ]
      s.t.  x_{t+1} ∈ G(x_t, z_t),   x_0, z_0 given

• Suppose {x*_{t+1}}_{t=0}^∞ is a solution to the problem and V(x_0, z_0) is finite.
Recursive Formulation (2): Bellman Equation
• In period 1, the state variables are x*_1 and z_1, and the continuation of {x*_{t+1}}_{t=0}^∞ is also optimal from period 1 onwards:

      V(x_0, z_0) = U(x_0, z_0, x*_1) + βE_0 [ U(x*_1, z_1, x*_2) + ... ]
                  = U(x_0, z_0, x*_1) + βE_0 [ V(x*_1, z_1) ]

• Therefore P1 must be equivalent to maximizing the two-period problem:

      V(x_0, z_0) = max_{x_1 = π(x_0, z_0)}  U(x_0, z_0, x_1) + βE_0 [ V(x_1, z_1) ],   x_0, z_0 given

• For any t, we obtain the Recursive Problem (P2), a Bellman equation:

      V(x_t, z_t) = max_{x_{t+1} = π(x_t, z_t)} { U(x_t, z_t, x_{t+1}) + βE_t [ V(x_{t+1}, z_{t+1}) ] }   ∀x ∈ X

  ◦ x_t and z_t are states, and x_{t+1} is the vector of controls (tomorrow's state).
  ◦ We usually assume that z is an exogenous first-order stochastic process: E_t(z_{t+1}) depends only on z_t.
Recursive Formulation (3): Important Notes
1. The infinite-horizon plan is reduced to a two-period problem (today + continuation value).
  ◦ Often gives better economic intuition.
2. Once we have V(·), the policy function x_{t+1} = π(x_t, z_t) can be found from:

      V(x_t, z_t) = U(x_t, z_t, π(x_t, z_t)) + βE_t [ V(π(x_t, z_t), z_{t+1}) ]   ∀x_t ∈ X

3. P2 is a functional equation, i.e. an equation whose unknown is a function.
  ◦ Under certain assumptions, we can guarantee existence and properties of the solution.
  ◦ Contraction Mapping Theorem.
4. P2 is recursive in that V(·) appears on both the LHS and the RHS.
  ◦ Powerful numerical tools exist to find the solution.
  ◦ Contraction Mapping Theorem.
Recursive Formulation (4): Additional Assumptions
• We will make the following assumptions:
  (i) lim_{n→∞} E_0 [ ∑_{t=0}^n β^t U(x*_t[z^{t−1}], x*_{t+1}[z^t], z_t) ] exists and is finite.
  (ii) X is a compact subset of R^K.
  (iii) G(x, z) is nonempty, compact-valued, and continuous. Moreover, it is convex in x for any z ∈ Z.
  (iv) U is continuous, concave, differentiable, and increasing in the state x for any z ∈ Z.
• With these assumptions, we can establish the following results:
  1. Equivalence of P1 (sequence problem) and P2 (recursive problem).
  2. V: X × Z → R exists and is unique, bounded, continuous, concave, increasing, and differentiable.
  3. There exists a unique optimal plan with

        x*_{t+1} = π(x*_t, z_t)
Role of Uncertainty
• What are the "practical" complications introduced by uncertainty?
• Once we make sure that the assumptions needed to solve the recursive problem hold, not much.
• The expectation is just a weighted average of outcomes in different states:

      V(x_t, z_t) = max_{x_{t+1} = π(x_t, z_t)} { U(x_t, z_t, x_{t+1}) + β ∑_{j=1}^N pr[z_{t+1} = z_j | z_t] V(x_{t+1}, z_j) }

• The key fact is that the process is exogenous, so pr[z_{t+1} = z_j | z_t] is not affected by the control variable x_{t+1}. This implies that:

      ∂E_t[V(x_{t+1}, z_{t+1})]/∂x_{t+1} = ∑_{j=1}^N pr[z_{t+1} = z_j | z_t] · ∂V(x_{t+1}, z_j)/∂x_{t+1} = E_t [ ∂V(x_{t+1}, z_{t+1})/∂x_{t+1} ]
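The interchange of derivative and expectation above can be checked numerically. A toy sketch with a made-up value function V(x, z) = z·√x and hypothetical transition probabilities (none of these numbers come from the slides):

```python
import numpy as np

# Hypothetical two-state shock and a made-up smooth value function.
probs = np.array([0.3, 0.7])           # pr[z' = z_j | z]
zvals = np.array([0.5, 1.5])
V  = lambda x, z: z * np.sqrt(x)       # illustrative V(x, z)
dV = lambda x, z: z * 0.5 / np.sqrt(x) # its derivative in x

x, h = 2.0, 1e-6
# d/dx E[V(x, z')]: differentiate the probability-weighted average
EV = lambda x: probs @ V(x, zvals)
lhs = (EV(x + h) - EV(x - h)) / (2 * h)
# E[dV/dx(x, z')]: average the derivatives, state by state
rhs = probs @ dV(x, zvals)
assert abs(lhs - rhs) < 1e-6           # the two orders of operations agree
```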
Contraction Mapping Theorem (1): Motivation
• A map is a function that transforms functions into functions (rather than numbers into numbers):

      T(f(x)) = g(x)

• The Bellman equation can be written as a map on value functions. For any function W, define the map T as:

      T(W)(x_t, z_t) = max_{x_{t+1} ∈ G(x_t, z_t)}  U(x_t, z_t, x_{t+1}) + βE_t [ W(x_{t+1}, z_{t+1}) ]

• The solution is a fixed point of the mapping:

      V* = T(V*)

• The Contraction Mapping Theorem ensures that you can find the fixed point with an iterative procedure.
Contraction Mapping Theorem (2): Example
• Example: a household maximizes intertemporal consumption utility:

      V(a_t) = max_{c_t}  u(c_t) + βE_t [ V(a_{t+1}) ]
      s.t.  a_{t+1} = R a_t − c_t + y_t

• Substitute the constraint into the value function:

      V(a_t) = max_{c_t}  u(c_t) + βE_t [ V(R a_t − c_t + y_t) ]

  where the mapping is defined as:

      T(W(a)) ≡ max_c  u(c) + βE [ W(Ra − c + y) ]

• For every value of a_t, finding the optimal consumption c*_t is equivalent to finding the fixed point of the mapping, i.e. the function that maps into itself:

      T(V*(a)) = V*(a)
Contraction Mapping Theorem (3): Definition
• Definition: Let (F, ||·||) be a metric space. A map T: F → F is a contraction map iff there exists a number β ∈ [0, 1) such that

      ||Tf_1 − Tf_2|| ≤ β ||f_1 − f_2||,   ∀ f_1, f_2 ∈ F

  i.e. the functions Tf_1 and Tf_2 are "closer" to each other than f_1 and f_2 are.
• Why is this useful? Consider a sequence of functions {f_n(x)}_{n=0}^∞ given by:

      f_n(x) = Tf_{n−1}(x)

  If T is a contraction map, then

      ||f_n − f_{n−1}|| = ||Tf_{n−1} − Tf_{n−2}|| ≤ β ||f_{n−1} − f_{n−2}|| ≤ ||f_{n−1} − f_{n−2}||

  ⇒ Functions in the sequence become closer and closer together.
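The shrinking-distance property is easy to see in a toy scalar case. Here T acts on numbers rather than functions — an illustrative stand-in for the Bellman operator, not the operator itself: T(x) = βx + 1 is a contraction with modulus β and fixed point x* = 1/(1 − β).

```python
beta = 0.9
T = lambda x: beta * x + 1.0      # contraction on the real line

x, dists = 0.0, []                # arbitrary initial guess
for _ in range(200):
    x_new = T(x)
    dists.append(abs(x_new - x))  # distance between successive iterates
    x = x_new

# Successive distances shrink by exactly the factor beta each step,
# and the iterates converge to the fixed point 1 / (1 - beta) = 10.
ratios = [dists[n] / dists[n - 1] for n in range(1, 20)]
```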
Contraction Mapping Theorem (4): Theorem
Theorem 1. Let (F, ||·||) be a complete metric space and T: F → F a contraction mapping. Then T has a unique fixed point, Tf* = f*. Moreover, for any initial guess f_0, the sequence f_n = Tf_{n−1} converges to f*:

      f_n(x) → f*(x)  as n → ∞

• Why is it unique? Assume not; then there exist f* ≠ f̃ such that Tf* = f* and Tf̃ = f̃. But then

      0 < ||f* − f̃|| = ||Tf* − Tf̃|| ≤ β ||f* − f̃||  ⟹  β ≥ 1  (contradiction!)

• Very useful in practice: take any arbitrary initial guess for V and iterate the Bellman equation until convergence.
• The key is to check that our Bellman equation is a contraction map!
Contraction Mapping Theorem (5): Blackwell Conditions
• In general, it is hard to prove that a map is a contraction.
• But we have the following useful sufficient conditions (Blackwell):
  (B1) Monotonicity:

      f_1(x) ≤ f_2(x) for all x  ⟹  Tf_1(x) ≤ Tf_2(x) for all x

  (B2) Discounting: there exists a β ∈ [0, 1) such that, for any constant k ≥ 0 and any function f, we have:

      T(f + k) ≤ Tf + βk

• If the map T satisfies (B1) and (B2), then T is a contraction.
• Let us check whether a map T given by a Bellman equation is a contraction.
Contraction Mapping Theorem (6): Bellman & Blackwell
• Bellman and Monotonicity: suppose V(a) ≥ W(a) for all a. Then it must be that TV(a) ≥ TW(a). Let c*_W denote the maximizer in TW(a):

      TV(a) = max_c { u(c) + βE[V(Ra − c + y)] }
            ≥ u(c*_W) + βE[V(Ra − c*_W + y)]
            ≥ u(c*_W) + βE[W(Ra − c*_W + y)]
            = max_c { u(c) + βE[W(Ra − c + y)] } = TW(a)

• Bellman and Discounting (the constant k passes through the max and the expectation):

      T(V(a) + k) = max_c { u(c) + βE[V(Ra − c + y) + k] }
                  = max_c { u(c) + βE[V(Ra − c + y)] } + βk
                  = TV(a) + βk

• Conclusion: if β < 1, then T (the Bellman equation) is a contraction map.
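The two Blackwell properties can also be verified numerically for a discretized version of the consumption example. A sketch under illustrative assumptions (the grid, parameter values, and initial guesses below are made up, and income risk is dropped for simplicity):

```python
import numpy as np

beta, R, y = 0.95, 1.02, 1.0
agrid = np.linspace(0.1, 5.0, 200)   # illustrative asset grid

def T(V):
    """Bellman operator on the grid: choose a' in agrid, with c = Ra + y - a'."""
    c = R * agrid[:, None] + y - agrid[None, :]
    flow = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -1e10)
    return (flow + beta * V[None, :]).max(axis=1)

V = np.log(1.0 + agrid)              # arbitrary bounded guess
k = 3.0

# Discounting: adding a constant k to V raises T V by exactly beta * k
assert np.allclose(T(V + k), T(V) + beta * k)

# Monotonicity: W >= V pointwise implies T W >= T V pointwise
W = V + np.abs(np.sin(agrid))
assert np.all(T(W) >= T(V) - 1e-12)
```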
Characterization (1): First Order Conditions
• Bellman equation:

      V(x, z) = max_{x′ = π(x, z)}  U(x, z, x′) + βE[V(x′, z′)]   ∀x ∈ X

  ◦ By the above assumptions, the maximization problem is strictly concave and differentiable.
  ◦ For interior solutions, the first-order conditions (FOC) are necessary.
• The FOC with respect to the control x′:

      D_{x′} U(x, z, x′) + βD E[V(x′, z′)] = 0

  where D denotes the gradient and D_{x′} the gradient with respect to the vector x′.
• How do we evaluate D E[V(x′, z′)]?
Characterization (2): Envelope Conditions
• As we saw earlier, since the stochastic process is exogenous, we can exchange the derivative and the expectation:

      D E[V(x′, z′)] = E[D V(x′, z′)]

• Now use V(x, z) = U(x, z, x′) + βE[V(x′, z′)] to compute D V(x, z):

      D V(x, z) = D_x U(x, z, x′) + D_{x′} U(x, z, x′) · dx′/dx + βE[D V(x′, z′)] · dx′/dx
                = D_x U(x, z, x′) + { D_{x′} U(x, z, x′) + βE[D V(x′, z′)] } · dx′/dx

  The term in braces is zero by the FOC, so

      D V(x, z) = D_x U(x, z, x′)

• Intuition: V is maximized with respect to x′, so small changes in x′ do not affect it (envelope theorem).
Characterization (3): Euler Equation
• Back to the FOC:

      D_{x′} U(x, z, x′) + βE[D V(x′, z′)] = 0

• The second term is the derivative we just computed with the envelope condition, moved one period forward:

      E[D V(x′, z′)] = E[D_{x′} U(x′, z′, x′′)]

• Substituting back:

      D_{x′} U(x, z, x′) + βE[D_{x′} U(x′, z′, x′′)] = 0

• This Euler equation implicitly characterizes the (unknown) optimal policy π(x, z):

      D_{x′} U(x, z, π(x, z)) + βE[D_{x′} U(π(x, z), z′, π(π(x, z), z′))] = 0
Characterization (4): Transversality Condition
• The Euler equation establishes the optimality of the solution between two contiguous periods (one-period deviations from the optimal policy are not profitable).
• What about an infinite deviation?
• The solution must also satisfy the transversality condition:

      lim_{t→∞}  β^t E_t [ D_x U(x*_t, z_t, x*_{t+1}) ] · x*_t = 0

  ◦ The discounted value of the state x*_t, valued at its marginal utility, must approach zero at infinity.
  ◦ This is the infinite-horizon equivalent of a "terminal" condition in the finite case, which requires that all wealth be consumed by the end of the horizon.
Characterization (5): One-dimensional case
• Suppose x, z, and x′ are real numbers.
• The first-order condition is:

      −∂U(x, z, x′)/∂x′ = βE [ ∂V(x′, z′)/∂x′ ]

  ◦ Indifference condition: today's cost of increasing x′ (i.e. capital tomorrow) has to equal its expected discounted marginal gain in future utility (i.e. profits).
• The envelope condition:

      ∂V(x, z)/∂x = ∂U(x, z, x′)/∂x
Characterization (5): One-dimensional case (cont...)
• Forward the envelope condition one period:

      ∂V(x′, z′)/∂x′ = ∂U(x′, z′, x′′)/∂x′

• The Euler equation (substitute the forwarded envelope into the FOC):

      −∂U(x, z, x′)/∂x′ = βE [ ∂U(x′, z′, x′′)/∂x′ ]

  ◦ The new indifference condition involves only today and tomorrow (the effect on the future continuation value is second order, by the envelope theorem).
Characterization (5): One-dimensional case (cont...)
• Finally, the transversality condition.
• Suppose the last period is t = T; we choose x_{T+1} to maximize β^T U(x_T, z_T, x_{T+1}).
• Because of a potential corner solution, the FOC reads:

      β^T [∂U(x_T, z_T, x_{T+1})/∂x_{T+1}] · x_{T+1} = 0

  ◦ Either an interior solution is optimal (∂U(x_T, z_T, x_{T+1})/∂x_{T+1} = 0), or we go to a corner solution x_{T+1} = 0.
• When T → ∞, we take the limit, using the Euler equation to shift the derivative one period forward:

      lim_{T→∞} β^T [∂U(x_T, z_T, x_{T+1})/∂x_{T+1}] · x_{T+1}
        = lim_{T→∞} β^{T+1} [∂U(x_{T+1}, z_{T+1}, x_{T+2})/∂x_{T+1}] · x_{T+1} = 0
Characterization (6): Our previous example
• Bellman:

      V(a, y) = max_{a′}  u(Ra + y − a′) + βE [ V(a′, y′) ]

• FOC:

      ∂u(c)/∂c = βE [ ∂V(a′, y′)/∂a′ ]

• Envelope:

      ∂V(a, y)/∂a = R ∂u(c)/∂c   ⟹ (forward)   ∂V(a′, y′)/∂a′ = R ∂u(c′)/∂c′

• Euler = FOC + forwarded envelope:

      ∂u(c)/∂c = βR E [ ∂u(c′)/∂c′ ]

• Transversality:

      lim_{t→∞} β^t [∂u(c_t)/∂c_t] · a_t = 0
Solution Methods
• The Bellman equation is a functional equation.
• How do we solve functional equations? There is no general method; several approaches are used.
• Closed form: guess a functional form (often the same form as U) with undetermined coefficients and verify.
• Numerical dynamic programming:
  a) Value function iteration.
  b) Policy function iteration (Howard improvement algorithm).
  c) Projection methods (approximate the policy with polynomials).
Guess and Verify (Undetermined Coefficients)
• In some special cases, one can obtain closed form solutions.
• Example: stochastic growth with log utility and Cobb–Douglas production.
  ◦ Consider the problem:

      max_{ {c_t, k_{t+1}}_{t=0}^∞ }  E_0 [ ∑_{t=0}^∞ β^t ln c_t ]
      s.t.  k_{t+1} = θ_t k_t^α − c_t,   k_0, θ_0 given
            log(θ_t) ~ iid(0, σ²)

  ◦ In this case k_t is the state, k_{t+1} the control, and c_t = θ_t k_t^α − k_{t+1}.
• Bellman equation:

      V(k_t, θ_t) = max_{k_{t+1}}  ln(θ_t k_t^α − k_{t+1}) + βE_t [ V(k_{t+1}, θ_{t+1}) ]
Guess and Verify (Undetermined Coefficients)
• First-order condition:

      1/(θ_t k_t^α − k_{t+1}) = βE_t [ ∂V(k_{t+1}, θ_{t+1})/∂k_{t+1} ]

• Envelope condition:

      ∂V(k_t, θ_t)/∂k_t = αθ_t k_t^{α−1} / (θ_t k_t^α − k_{t+1})

• Forwarding by one period:

      E_t [ ∂V(k_{t+1}, θ_{t+1})/∂k_{t+1} ] = E_t [ αθ_{t+1} k_{t+1}^{α−1} / (θ_{t+1} k_{t+1}^α − k_{t+2}) ]

• Substituting back into the FOC:

      1/(θ_t k_t^α − k_{t+1}) = E_t [ αβ θ_{t+1} k_{t+1}^{α−1} / (θ_{t+1} k_{t+1}^α − k_{t+2}) ]
Guess and Verify (Undetermined Coefficients)
• We guess that the value function is log-linear in the states:

      V(k_t, θ_t) = v_1 + v_2 log k_t + v_3 log θ_t

• Since log(θ_t) ~ iid(0, σ²), the guess implies that:

      E_t [ V(k_{t+1}, θ_{t+1}) ] = v_1 + v_2 log k_{t+1}

• Substituting the guess into the Bellman equation:

      V(k_t, θ_t) = max_{k_{t+1}}  ln(θ_t k_t^α − k_{t+1}) + βv_1 + βv_2 log k_{t+1}
Guess and Verify (Undetermined Coefficients)
• Bellman equation:

      V(k_t, θ_t) = max_{k_{t+1}}  ln(θ_t k_t^α − k_{t+1}) + βv_1 + βv_2 log k_{t+1}

• FOC:

      −1/(θ_t k_t^α − k_{t+1}) + βv_2/k_{t+1} = 0   ⟹   k_{t+1} = [βv_2/(1 + βv_2)] θ_t k_t^α

• Substitute the solution into the value function:

      V(k_t, θ_t) = ln( θ_t k_t^α − [βv_2/(1 + βv_2)] θ_t k_t^α ) + βv_1 + βv_2 log( [βv_2/(1 + βv_2)] θ_t k_t^α )

• Rearrange as follows (homework: verify this claim):

      V(k_t, θ_t) = constant + (1 + βv_2) ln(θ_t k_t^α)

  and conclude that:

      V(k_t, θ_t) = constant + α(1 + βv_2) ln k_t + (1 + βv_2) ln θ_t

  where the coefficient on ln k_t must equal v_2 and the coefficient on ln θ_t must equal v_3.
Guess and Verify (Undetermined Coefficients)
• If the guess is right, then the following equations must have a solution:

      α(1 + βv_2) = v_2
      1 + βv_2 = v_3

• Solving for v_2 and v_3 we obtain:

      v_2 = α/(1 − αβ),   v_3 = 1/(1 − αβ)

• Hence the policy function:

      k_{t+1} = [1 + 1/(βv_2)]^{−1} θ_t k_t^α = αβ θ_t k_t^α

• And the value function:

      V(k_t, θ_t) = constant + [α/(1 − αβ)] ln k_t + [1/(1 − αβ)] ln θ_t
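The closed form gives a convenient benchmark for the numerical methods listed earlier. A minimal sketch for the deterministic case θ_t ≡ 1 (grid bounds, grid size, and tolerance are illustrative choices): brute-force value function iteration on a grid should recover the policy k′ = αβ k^α and the coefficient restriction α(1 + βv₂) = v₂.

```python
import numpy as np

alpha, beta = 0.3, 0.95

# The coefficient equations have the claimed solution v2 = alpha/(1 - alpha*beta)
v2 = alpha / (1.0 - alpha * beta)
assert abs(alpha * (1.0 + beta * v2) - v2) < 1e-12

# Value function iteration for theta = 1: U(k, k') = ln(k^alpha - k')
kgrid = np.linspace(0.05, 0.5, 300)
c = kgrid[:, None] ** alpha - kgrid[None, :]
U = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -1e10)

V = np.zeros(len(kgrid))
for _ in range(2000):
    V_new = (U + beta * V[None, :]).max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

policy = kgrid[(U + beta * V[None, :]).argmax(axis=1)]
closed_form = alpha * beta * kgrid ** alpha   # k' = alpha * beta * k^alpha

# The grid-based policy should track the closed form up to grid resolution
assert np.max(np.abs(policy - closed_form)) < 3 * (kgrid[1] - kgrid[0])
```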
Homework: Autocorrelated shocks
• Assume now that the productivity shocks θ_t are autocorrelated:

      log θ_t = γ log θ_{t−1} + ε_t

  where ε_t is iid(0, σ²_ε) and γ < 1.
• Verify that the guess V(k_t, θ_t) = v_1 + v_2 log k_t + v_3 log θ_t is still correct.
• Compute the optimal policy and the value function in this case.
• How does the semi-elasticity of the value function with respect to θ change with the persistence parameter γ?