Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey...

28
Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University & University of Nottingham

Transcript of Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey...

Page 1: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Characterizing International Policy Learning:

A Decision Theoretic Approach

Oliver MorrisseyUniversity of Nottingham

&Doug Nelson

Tulane University&

University of Nottingham

Page 2: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Outline of Presentation

• Three Basic Models of Policy Learning– Decision-Theoretic Learning

• Policy Experiments and Learning by Doing

– Social Learning• Learning from Others• Information Cascades

– Hierarchical Social Learning

• Extensions of the Basic Models– Heterogeneous states– Strategic/Political Economy Issues

Page 3: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Decision Theoretic Learning:The One Country Case, 1

• Suppose the state chooses a policy xt X– For concreteness, suppose the policies are

progressive tax (P) and flat tax (F), so

X = {P,F}.– Furthermore, we summarize those aspects of

reality that cannot be controlled by policy, but effect the outcome, as: θ Θ.

– Together, these result in a signal y(x,θ) Y = {Good, Bad}.

Page 4: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Decision Theoretic Learning:The One Country Case, 2

• Beliefs about policy effectiveness– We will suppose that policy x(i)t produces good states

with unknown probability p(i) and bad states with [1 – p(i)] and that

– the policymaker begins with prior belief about the likelihood of a good outcome under policy i, i0 [0,1], which is commonly taken as deriving from a private signal of bounded accuracy that the policymaker receives in period t = 0.

– Knowing xt and yt, the policymaker can update his beliefs, t-1, using Bayes rule to get t.

– Only the element of t referring to the active policy in period t changes in the updating, since there is no information about the effectiveness of a policy that is inactive

Page 5: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Decision Theoretic Learning:The One Country Case, 3

• We assume that the state’s objective is to maximize the expected number of “good” realizations.– Specifically, if we let yt = good = 1 and yt =

bad = 0, and assume that the policymaker applies geometric discounting with discount factor [0,1), we can write this objective as:

10

argmax , ; .t t

tt t

x t

V E y x

Page 6: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Decision Theoretic Learning:The One Country Case, 4

• This is a Bernoulli two-armed bandit problem– Let the policy history at t = k, hk, be the set of all

policy choices, xt, and realizations, yt, from t = 1 to t = k -1.

– Let Hk be the set of all possible period k histories.

• A strategy, , for the policymaker specifies a policy choice to be made in any period as a function of initial beliefs and Bayesian updating on the history up to that point.

Page 7: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Decision Theoretic Learning:The One Country Case, 5

• Gittins indexes and index strategies– Gittins and Jones (1974) proved a striking result for problems of

this sort: to every policy there is associated an index which depends only on the current prior on that policy, ik(i0,hk), and

– the optimal strategy at time k is to adopt the strategy (“play the arm”) with the highest index.

• Furthermore, this index is essentially the value of a payment that would make the policymaker indifferent between stopping and continuing with strategy i.

• As a result, solving the policymaker’s problem involves solving an optimal stopping problem for each of the policies in X.

Page 8: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Decision Theoretic Learning:The One Country Case, 6

• Using this framework, we can ask whether the state will learn the optimal policy with certainty. That is, will it be the case that:– ρi = pi i X; and– ρi > ρj if pi > pj (i j X).

• The usual answer to this class of question is that complete learning generically fails.– Specifically, with strictly positive probability, the

policymaker may eventually select and stay with the “wrong” policy—i.e. the policy j in the preceding inequality.

– Furthermore, in finite time, a policymaker might switch between policies many times.

Page 9: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Decision Theoretic Learning:The One Country Case, 7

• What is the role of policy advice in this model?• Essentially one of decision support

– Increasing understanding of y(x;θ); and– Support in the updating process.

• Constructing the initial prior: ρ0

– Lessons of history (e.g. globalization)– Implications of crisis (e.g. depression, etc.)

Page 10: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Social Learning, 1

• Now we presume that, when making its policy decision, the state constructs its prior, i0 [0,1], based on:– A private signal of bounded accuracy; and– Observation of the policy actions of all states that

have already adopted policies.

• However, we start with the assumption that a state cannot observe:– The private signals of other states; or– The outcomes of other states’ policies.

Page 11: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Social Learning, 2

• To begin, we suppose that states take their policy actions in a fixed, and commonly known, order.– Thus, we can denote states with the by their

policy action and the time in which they move: at A.

– The vector of policy choices at t, xt, is common knowledge at t, but

– The yta(xt

a, θ) are not observed.– The history at k is now: 1

1, .

kka at t t

h y

x

Page 12: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Social Learning, 3

• Now prior to making its policy choice, each state, ak, develops its policy belief based on:– The common t0 prior;– The t = k public prior, πk(ρ0, hka), based on the

history at t = k, hka; and– Its private signal.

Page 13: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Social Learning, 4

• How does social learning affect outcomes?• As in the private learning context, social learning will not

generally be complete:– while i = pi for some i X, this will only be true for the i finally

selected, and– j pj j i X. It need not even be the case that pi > pj if i is

actually selected. • Thus, herding occurs with probability 1 and what are

essentially information cascades occur.– That is, because policymaker’s herd, potentially useful collective

information is lost. – Note that the possibility of cascading or herding on an inefficient

policy does not imply that social learning is in any sense worse than private learning.

– Note that analogues of these occur in the private learning case.

Page 14: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Social Learning, 5

• Distinctive aspects of social learning:– First, every policymaker observes more information at

each t, at least until a herd occurs.– However, where the private learner internalizes the

trade-off between expected current reward and accumulation of information (that is what the Gittins index does), in the social learning context only private learning is internalised in this fashion.

• That is, there is an information externality.

Page 15: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Social Learning, 6

• On word-of-mouth learning– It is possible that, in addition to observing a private

signal and observing the policy choices of previous states,

– A state at time k can “hear about” the outcomes of some subset of previous movers.

– Smith and Sørenson (1996) and Cao and Hirshleifer (2000) present models of this sort showing that, under plausible assumptions, cascades still occur with positive probability.

Page 16: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Social Learning, 7

• Implications for policy advice:– Similar to the private learning case:

• Assistance in forming initial prior; and• Assistance in updating; but

– Cascades are fragile in the face of even very noisy public information release.

• Thus, a public release of, for example, consensus on the part of public leaders, or even economists, could have sizable effects on the public belief;

• This might be particularly true of cross-national comparative research—to which we now turn.

Page 17: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 1

• Broadly speaking, we are interested in a model in which:– Every country receives a private signal, of bounded

accuracy;– A “research authority” observes each state’s policy

choice, computes what is essentially the public prior; and

– Offers to insure countries against bad realizations if they adopt the authority’s preferred policy.

Page 18: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 2

• The insurance component of this model moves us from a model of statistical cascades to one of reputational cascades (Gul and Lundholm, 1995).

• a reputational cascade is driven by an agency relationship embedded in the sequential decision problem. – Scharfstein and Stein (1990), who consider a model of

investment by managers in an environment characterized by common prediction error in their decision to make one or another investment.

– This means that owners, in evaluating the performance of managers, consider both outcomes and whether a given manager did the same thing that other managers did.

– This creates an incentive for herding, even if there is no convergence in beliefs.

Page 19: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 3

• A simple model:– in t0: nature selects ( ); the policymakers have

initial beliefs 0a ( a A); and the international agency

announces its initial beliefs and the terms of insurance against a bad realization.

– The model then proceeds as above:• policymakers choose a policy (xt

a X);

• receive a signal (yta(xt

a;) Y);

• if the realization is bad, and they followed the preferred policy of the agency, they get a transfer; and

• update their prior to ta.

Page 20: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 4

• Implications– It should be clear that this environment would induce

herding, and an information cascade, without inducing convergence of beliefs.

– In fact, if the insurance were large enough it would induce a herd in t1, so there could be no social learning.

– Unless we are quite sure that the international agency’s prior beliefs are accurate, then this sort of institutional environment is clearly harmful.

Page 21: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 5

• Now add policy research– Suppose that, in addition to the international agency and the

policymakers, there is now a finite set of economists.– Now suppose that it is the economists, not the policymakers,

who observe the vector of policies selected by the policymakers.• Note that policymakers and economists observe different things:• each of the former observes a country-specific signal, while• each of the latter observes the full set of policies adopted in each

period.

– In this extended version of the above model, the international agency forms its prior beliefs exclusively by aggregating the expressed conclusions of the economists.

Page 22: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 6• If there were no international agency, neither the economists nor the

policymakers would herd.• If the international agency played a purely informational role,

publicly reporting an aggregate prior based on the reports of the economists’ work, both groups would herd in essentially the same fashion as the policymakers alone herd in the second subsection.

• However, now suppose that international agency offers to (partially) insure countries against bad realizations if an orthodox policy was pursued in the previous period.

• An orthodox policy will be a policy such that:– 1) it is consistent with the international agency’s current belief about the

best policy; and – 2) a majority of other policymakers are pursuing that policy.

• This again creates a strong reputational incentive to herd, and an incentive to herd on the agency’s preferred policy, with a concomitant loss of socially valuable information.

Page 23: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 7

• If, as a result of elective affinity, common training, or some other factors, economists are more prone to herding than policymakers, the existence of an agency that enforces the beliefs of economists will have two effects.– To the extent that, because they are aggregating information

from a number of countries, their conclusions are more accurate, this should raise welfare by encouraging the adoption of better policies (think of this as the Krueger effect).

– Because this institutional arrangement encourages rapid herding, information will be lost, increasing the likelihood of a herd on an inferior policy (think of this as an anti-Gittins effect, reflecting that the institution tilts decision making toward current welfare and away from learning).

Page 24: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Hierarchical Social Learning, 8

• Do economists herd?– economists do appear quite prone to herding.– The case discussed in detail in Krueger (1997) starts from a very

tight collective prior on the benefits of first-stage IS.• By some time in the 1980s there was an equally tight collective prior

on the benefits of XO. • What is striking is how little compelling empirical evidence was

developed in the interim. • As of the time that we are writing this paper, there seems to be a

substantial reaction to precisely this fact (e.g. Rodríguez and Rodrik, 2001).

– At this point, we do not have a particularly good story to explain how economists shift among quite tight collective priors on such apparently different policy conclusions, but the fact suggests the importance of taking into account the potential social costs implied by the above model.

Page 25: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Extending Social Learning Models: State Heterogeneity, 1

• Considerable comparative research suggests state heterogeneity:– A large body of research examines optimal

macroeconomic policy based on, e.g.:• Labor market structure;• Political center of gravity;• Degree of openness; etc.

– Similarly, recent research on optimal trade policy stresses the role of institutions.

• While these claims seem plausible:– Their exact nature is uncertain; and– Characterizing countries with respect to the relevant

variables is difficult.

Page 26: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Extending Social Learning Models:State Heterogeneity, 2

• Suppose we go back to the social learning model, but now we assume:– Preferences over policy are heterogeneous, so we need to

include type in the characterization of agents; and– We include some crazy types (i.e. states that adopt a given

policy independent of information).• In a model of this sort, Smith and Sørenson (2000) show

that cascades remain a robust property, but they identify another important phenomenon: confounded learning—an interior rational expectations dynamic steady state in which: – It may be impossible to draw any clear inference from history

even while it continues to accumulate privately-informed decisions; and

– This incomplete learning informational pooling outcome exists even with unbounded beliefs, when an incorrect herd is impossible

Page 27: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Extending Social Learning Models:Heterogeneous States, 3

• In the social learning context this doesn’t change what policy advisors do—for better and worse.

• With type heterogeneity the risks of hierarchical social learning would appear to be even higher.– This is one way of representing one of the main line of

criticism of IMF policy during the Asian crisis.

Page 28: Characterizing International Policy Learning: A Decision Theoretic Approach Oliver Morrissey University of Nottingham & Doug Nelson Tulane University &

Extending Social Learning Models:Political Economy Problems

• To this point we have assumed that:– All states are:

• Self interested, but• Non-competitive; and

– The research authority seeks to maximize the number of good realizations.

• It should be clear that these assumptions are not universally shared.– These concerns take us into the realm of:

• game learning models; and • dynamic principal agent models

– We hope to pursue these issues in future work.