When Does Learning in Games Generate Convergence to Nash Equilibria? The Role of Supermodularity in an Experimental Setting

By YAN CHEN AND ROBERT GAZZALE*

This study clarifies the conditions under which learning in games produces convergence to Nash equilibria in practice. We experimentally investigate the role of supermodularity, which is closely related to the more familiar concept of strategic complementarities, in achieving convergence through learning. Using a game from the literature on solutions to externalities, we find that supermodular and "near-supermodular" games converge significantly better than those far below the threshold of supermodularity. From a little below the threshold to the threshold, the improvement is statistically insignificant. Increasing the parameter far beyond the threshold does not significantly improve convergence. (JEL C91, D83)

When do players learn to play Nash equilibria? The answer to this important question will help us identify when the outcomes predicted by theory will be realized in competitive environments involving real people. This question has been examined both theoretically (see Drew Fudenberg and David Levine, 1998, for a survey) and experimentally (see Colin Camerer, 2003, for a survey).

According to the theoretical literature, games with strategic complementarities (Paul Milgrom and John Roberts, 1991; Milgrom and Chris Shannon, 1994) have robust dynamic stability properties: under numerous learning dynamics, they converge to the set of Nash equilibria that bound the serially undominated set. The learning dynamics include Bayesian learning, fictitious play, adaptive learning, Cournot best reply, and many others. These games include the supermodular games of Donald Topkis (1979), Xavier Vives (1985, 1990), Russell Cooper and Andrew John (1988), and Milgrom and Roberts (1990). In supermodular games, each player's marginal utility of increasing her strategy rises with increases in her rivals' strategies, so that (roughly) the players' strategies are "strategic complements."

Existing literature recognizes that games with strategic complementarities encompass important economic applications of noncooperative game theory, for example, macroeconomics under imperfect competition (Cooper and John, 1988), search (Peter A. Diamond, 1982), bank runs (Douglas W. Diamond and Philip H. Dybvig, 1983; Andrew Postlewaite and Vives, 1987), network and adoption externalities (Dybvig and Chester S. Spatt, 1983), and mechanism design (Chen, 2002).

* Chen: School of Information, University of Michigan, 1075 Beal Avenue, Ann Arbor, MI 48109 (email: [email protected]); Gazzale: Department of Economics, Williams College, Fernald House, Williamstown, MA 01267 (email: [email protected]). We thank Jim Andreoni, Charles Brown, David Cooper, Julie Cullen, John DiNardo, Jacob Goeree, Ernan Haruvy, Daniel Houser, Peter Katuscak, and Jean Krisch for helpful discussions; seminar participants at the University of Michigan, Virginia Tech, and ESA 2001 (Barcelona and Tucson) for their comments; Emre Sucu for programming for the experiment; and Yuri Khoroshilov, Jim Leady, and Belal Sabki for excellent research assistance. We are grateful for the constructive comments of B. Douglas Bernheim, the co-editor, and three anonymous referees, which significantly improved the paper. The research support provided by NSF Grant SES-0079001 to Chen is gratefully acknowledged. Any errors are our own.

Past experimental studies of learning and mechanism design suggest that, in addition to equilibrium efficiency, mechanism choice should depend on whether players learn to play the equilibrium and the nature of play on the path to equilibrium. In reviewing the experimental literature on incentive-compatible


mechanisms for pure public goods, Chen (forthcoming) finds that mechanisms with strategic complementarities, such as the Groves-Ledyard mechanism under a high punishment parameter, converge robustly to the efficient equilibrium (Chen and Charles R. Plott, 1996; Chen and Fang-Fang Tang, 1998). Conversely, those far away from the threshold of strategic complementarities do not seem to converge (Vernon L. Smith, 1979; Ronald M. Harstad and Michael Marrese, 1982). In previous experiments, parameters are set either far away from the threshold for strategic complementarities (e.g., Chen and Tang, 1998) or very close to the threshold (e.g., Josef Falkinger et al., 2000).¹ These experiments do not, however, systematically set the parameters below, close to, at, and above the threshold to assess the effect of strategic complementarities on convergence. This is the first such systematic experimental study of games with strategic complementarities.² Consequently, this study answers three important questions that the theory on games with strategic complementarities does not address. First, as the parameters approach the threshold of strategic complementarities, will play converge to equilibrium gradually or abruptly? Second, is there a clear performance ranking among games with strategic complementarities? Third, how important is strategic complementarity compared to other factors? The answer to the first question can help us assess a priori whether a game "close" to being supermodular, such as the Falkinger mechanism (Falkinger, 1996), might also have good convergence properties. The answer to the second question will help us choose the best parameters within the class of games with strategic complementarities. The answer to the third question will help us assess the importance of strategic complementarities in learning and convergence.

To address these questions, this study adopts an experimental game from the literature on solutions to externalities. Hal R. Varian (1994)

proposes a simple class of two-stage mechanisms, the compensation mechanisms, in which subgame-perfect equilibria implement efficient allocations. In a generalized version of Varian's mechanism (John Cheng, 1998), one can vary, without altering the equilibrium, a parameter that determines whether the condition for strategic complementarities is satisfied.

There have been two experimental studies of the compensation mechanisms, neither of which adopts a version with strategic complementarities. James Andreoni and Varian (1999) study the mechanism in the context of the Prisoners' Dilemma. They find that adding a commitment stage to the standard Prisoners' Dilemma game nearly doubles the amount of cooperation, to two-thirds. Yasuyo Hamaguchi et al. (2003) investigate a version with a larger strategy space and find 20-percent Nash equilibrium play.

In this paper we examine the compensation mechanism in an economic environment with a much larger strategy space. Furthermore, we choose various versions to study systematically the effect of strategic complementarities on convergence.

The paper is organized as follows. Section I introduces games with strategic complementarities and presents theoretical properties of the compensation mechanisms. Section II presents the experimental design. Section III introduces the set of hypotheses. Section IV presents experimental results on the level and speed of convergence, as well as efficiency. Section V presents the calibration of three learning models, validation of the models on a hold-out sample, and simulation of performance in the long run using a calibrated learning model. Section VI discusses our findings, and Section VII concludes.

I. Strategic Complementarity and the Compensation Mechanisms

In this section, we first introduce games with strategic complementarities and their theoretical properties. We then introduce and analyze a family of mechanisms in this class of games, namely the compensation mechanisms.

Games with strategic complementarities (Milgrom and Shannon, 1994) are those where, given an ordering of strategies, higher actions by one player provide an incentive for the other

¹ Proofs of supermodularity of the Groves-Ledyard and the Falkinger mechanisms are presented in Chen (in press).

² By systematically varying the free parameter in the Groves-Ledyard mechanism, Jasmina Arifovic and John O. Ledyard (2001) study learning dynamics and mechanism convergence using genetic algorithms compared with experimental data.

1506 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

players to take higher actions as well. These games include supermodular games, in which the incremental return to any player from increasing her strategy is a nondecreasing function of the strategy choices of other players (increasing differences). Furthermore, if a player's strategy space has more than one dimension, components of a player's strategy are complements (supermodularity). Membership in this class of games is easy to check. Indeed, for smooth functions in Rⁿ, letting Pᵢ be the strategy space and πⁱ be the payoff function of player i, the following theorem characterizes increasing differences and supermodularity.

THEOREM 1 (Topkis, 1978): Let πⁱ be twice continuously differentiable on Pᵢ. Then πⁱ has increasing differences in (pᵢ, pⱼ) if and only if ∂²πⁱ/∂pᵢₕ∂pⱼₗ ≥ 0 for all i ≠ j and all 1 ≤ h ≤ kᵢ and all 1 ≤ l ≤ kⱼ; and πⁱ is supermodular in pᵢ if and only if ∂²πⁱ/∂pᵢₕ∂pᵢₗ ≥ 0 for all i and all 1 ≤ h < l ≤ kᵢ.

Increasing differences means that an increase in the strategy of player i's rivals raises her marginal utility of playing a high strategy. The supermodularity requirement ensures complementarity among components of a player's strategies and is automatically satisfied in a one-dimensional strategy space. Note that a supermodular game is a game with strategic complementarities, but the converse is not true.
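Theorem 1 reduces checking increasing differences to a sign condition on cross-partial derivatives, which is easy to verify numerically. A minimal sketch: the payoff function and step size below are illustrative choices of our own, not the paper's mechanism.

```python
# Numerical check of the cross-partial condition of Theorem 1.
# Illustrative payoff: pi(p1, p2) = p1*p2 - p1**2, whose cross-partial
# d^2 pi / dp1 dp2 = 1 >= 0, so increasing differences holds.

def cross_partial(pi, p1, p2, h=1e-4):
    """Central finite-difference estimate of d^2 pi / dp1 dp2."""
    return (pi(p1 + h, p2 + h) - pi(p1 + h, p2 - h)
            - pi(p1 - h, p2 + h) + pi(p1 - h, p2 - h)) / (4 * h * h)

pi = lambda p1, p2: p1 * p2 - p1 ** 2

# With one-dimensional strategies, it suffices that the cross-partial is
# nonnegative everywhere on the strategy space (here a grid on [0, 40]).
grid = [i * 0.5 for i in range(81)]
assert all(cross_partial(pi, a, b) >= -1e-6 for a in grid for b in grid)
print("increasing differences holds on the grid")
```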

Supermodular games have interesting theoretical properties. In particular, they are robustly stable. Milgrom and Roberts (1990) prove that in these games, learning algorithms consistent with adaptive learning converge to the set bounded by the largest and the smallest Nash equilibrium strategy profiles. Intuitively, a sequence is consistent with adaptive learning if players "eventually abandon strategies that perform consistently badly, in the sense that there exists some other strategy that performs strictly and uniformly better against every combination of what the competitors have played in the not-too-distant past" (Milgrom and Roberts, 1990). This includes numerous learning dynamics, such as Bayesian learning, fictitious play, adaptive learning, and Cournot best reply. While strategic complementarity is sufficient for convergence, it is not a necessary condition.

Thus, while games with strategic complementarities ought to converge robustly to the Nash equilibrium, games without strategic complementarities may also converge under specific learning algorithms. Whether these specific learning algorithms are a realistic description of human learning is an empirical question.

While the theory on games with strategic complementarities predicts convergence to equilibrium, it does not address four practical issues. First, as the parameters of a game approach the threshold of strategic complementarities, does play converge gradually or abruptly? Second, is convergence faster further past the threshold? Third, how important is strategic complementarity compared to other features of a game that might also induce convergence to equilibrium? Last, for supermodular games with multiple Nash equilibria, will players learn to coordinate on a particular equilibrium? We choose a game that allows us to answer the first three questions. The fourth question has been addressed by John Van Huyck et al. (1990) and James Cox and Mark Walker (1998).³

Specifically, we use the compensation mechanism to study the role of strategic complementarities in learning and convergence to equilibrium play. In the mechanism, each of two players offers to compensate the other for the "costs" of the efficient choice. Assume that when player 1's production equals x, her net profit is rx − c(x), where r is the market price and c(·) is a differentiable, positive, increasing, and convex cost function. Production causes an externality on player 2, whose payoff is −e(x),

³ Cox and Walker (1998) study whether subjects can learn to play Cournot duopoly strategies in games with two kinds of interior Nash equilibrium. Their type I duopoly has a stable interior Nash equilibrium under Cournot best-reply dynamics and therefore is dominance solvable (Herve Moulin, 1984). Their type II duopoly has an unstable interior Nash equilibrium and two boundary equilibria under Cournot best-reply dynamics, and therefore is not dominance solvable. They found that after a few periods, subjects did play stable interior dominance-solvable equilibria, but they played neither the unstable interior equilibria nor the boundary equilibria. It is interesting to note that these duopoly games are submodular games. Being two-player games, they are also supermodular (Rabah Amir, 1996). Results of Cox and Walker (1998) illustrate the importance of uniqueness together with supermodularity in inducing convergence.

VOL. 94 NO. 5  CHEN AND GAZZALE: LEARNING IN GAMES  1507

also assumed to be differentiable, positive, increasing, and convex. The mechanism is a two-stage game where the unique subgame-perfect Nash equilibrium induces the Pareto-efficient outcome of x* such that r − e′(x*) = c′(x*). In the first stage (the announcement stage), player 1 announces p₁, a per-unit subsidy to be paid to player 2, while player 2 simultaneously announces p₂, a per-unit tax to be paid by player 1. Announcements are revealed to both players. In the second stage (the production stage), player 1 chooses a production level x. The payoff to player 1 is π₁ = rx − c(x) − p₂x − α(p₁ − p₂)², while the payoff to player 2 is π₂ = p₁x − e(x), where α ≥ 0 is a free punishment parameter chosen by the designer.

We study a generalized version of the compensation mechanism (Cheng, 1998), which adds a punishment term −β(p₁ − p₂)² to player 2's payoff function, thus making the payoff functions

(1)  π₁ = rx − c(x) − p₂x − α(p₁ − p₂)²,
     π₂ = p₁x − e(x) − β(p₁ − p₂)².

Using the generalized version, we solve the game by backward induction. In the production stage, player 1 chooses the quantity that solves maxₓ rx − c(x) − p₂x − α(p₁ − p₂)². The first-order condition, r − c′(x) − p₂ = 0, characterizes the best response in the second stage, x*(p₂). In the announcement stage, player 1 solves max over p₁ of rx* − c(x*) − p₂x* − α(p₁ − p₂)². The first-order condition is

(2)  ∂π₁/∂p₁ = −2α(p₁ − p₂) = 0,

which yields the best response function for player 1 as p₁ = p₂. Player 2 simultaneously solves max over p₂ of p₁x*(p₂) − e(x*(p₂)) − β(p₁ − p₂)². The first-order condition is

(3)  ∂π₂/∂p₂ = p₁x*′(p₂) − e′(x*(p₂))x*′(p₂) + 2β(p₁ − p₂) = 0,

which characterizes player 2's best response function.
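The announcement-stage logic for player 1 can be verified numerically: since p₁ enters π₁ only through the punishment term, the best response is p₁ = p₂. A minimal sketch, assuming the quadratic cost c(x) = cx² introduced below; parameter values here are illustrative choices of our own.

```python
# Numerical sketch of the backward induction: player 1's announcement-
# stage best response is p1 = p2 (equation (2)). Quadratic cost assumed;
# c, e, r, alpha values are arbitrary placeholders for illustration.
c, r, alpha = 1.0, 2.0, 1.0

def x_star(p2):
    # Production-stage FOC: r - c'(x) - p2 = 0, with c'(x) = 2*c*x.
    return (r - p2) / (2 * c)

def pi1(p1, p2):
    x = x_star(p2)
    return r * x - c * x**2 - p2 * x - alpha * (p1 - p2)**2

# pi1 depends on p1 only via -alpha*(p1 - p2)^2, so a grid search over
# p1 should return (approximately) p1 = p2 for any fixed p2.
p2 = 0.7
candidates = [i * 0.001 for i in range(2001)]
best_p1 = max(candidates, key=lambda p1: pi1(p1, p2))
assert abs(best_p1 - p2) < 1e-6
print("player 1's best response matches p1 = p2")
```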

The unique subgame-perfect equilibrium has p₁ = p₂ = p*, where p* is the Pigovian tax, which induces the efficient quantity x*. As the equilibrium does not depend on the value of β, it holds for the original version, where β = 0. When β is set appropriately, however, the generalized version is a supermodular mechanism. The following proposition characterizes the necessary and sufficient condition for supermodularity.

PROPOSITION 1 (Cheng, 1998): The generalized version of the compensation mechanism is supermodular if and only if α > 0 and β ≥ −(1/2)x*′(p₂).

The proof is simple. First, as the strategy space is one-dimensional, the supermodularity condition is automatically satisfied. Second, we use Theorem 1 to check for increasing differences. For player 1, from equation (2), we have ∂²π₁/∂p₁∂p₂ = 2α ≥ 0, while for player 2, from equation (3), we have ∂²π₂/∂p₁∂p₂ = x*′(p₂) + 2β. Therefore, ∂²π₂/∂p₁∂p₂ ≥ 0 if and only if β ≥ −(1/2)x*′(p₂).

To obtain analytical solutions, we use a quadratic cost function, c(x) = cx², where c > 0, and a quadratic externality function, e(x) = ex², where e > 0. We now summarize the best response functions, equilibrium solutions, and stability analysis in this environment.

PROPOSITION 2: Quadratic cost and externality functions yield the following characterizations:

(i) The best response functions for players 1 and 2 are

(4)  p₁ = p₂,

(5)  p₂ = [(β − 1/(4c))p₁ + er/(4c²)] / (β + e/(4c²)),

and

x* = max{0, (r − p₂)/(2c)}.

(ii) The subgame-perfect Nash equilibrium is characterized as

p₁* = p₂* = er/(e + c),  x* = r/(2(e + c)).

(iii) If players follow Cournot best-reply dynamics, (p₁*, p₂*) is a globally asymptotically stable equilibrium of the continuous-time dynamical system for any α > 0 and β ≥ 0.

(iv) The game is supermodular if and only if α > 0 and β ≥ 1/(4c).

PROOF: See Appendix.

The best-response functions presented in part (i) of Proposition 2 reveal interesting incentives. While player 1 has an incentive to always match player 2's price, player 2 has an incentive to match only when player 1 plays the equilibrium strategy. Also, at the threshold for strategic complementarity, β = 1/(4c), player 2 has a dominant strategy, p₂ = er/(e + c) = p₂*. Part (iii) of Proposition 2 extends Cheng (1998), who shows that the original version of the compensation mechanism (β = 0) is globally stable under continuous-time Cournot best-reply dynamics.⁴ Cournot best-reply is, however, a relatively poor description of human learning (Richard Boylan and Mahmoud El-Gamal, 1993). Therefore, global stability under Cournot best reply for any β ≥ 0 does not imply equilibrium convergence among human subjects. Part (iv) characterizes the threshold for strategic complementarity in our experiment, a more robust stability criterion than that characterized by part (iii).

An intuition for how strategic complementarities affect the outcome of adaptive learning can be gained from analysis of the best response functions, equations (4) and (5). While player 1's best response function is always upward sloping, player 2's best response function, equation (5), is nondecreasing if and only if β ≥ 1/(4c), i.e., when the game is supermodular. Beyond the threshold for strategic complementarity, both best-response functions are upward sloping, and they intersect at the equilibrium. It is easy to verify graphically that adaptive learners, e.g., Cournot best reply, will converge to the equilibrium regardless of where they start.
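The graphical argument can be mimicked with a short simulation. A sketch assuming discrete-time simultaneous Cournot best replies (the paper's part (iii) concerns the continuous-time system), using player 2's best response from equation (5) with the experimental parameters c = 1/80, e = 1/40, r = 24; starting prices and the number of rounds are our own choices.

```python
# Simulated Cournot best-reply dynamics for the price announcements.
# Parameters are the paper's: c = 1/80, e = 1/40, r = 24, so the
# equilibrium is p1 = p2 = 16 and the supermodularity threshold is 20.
c, e, r = 1 / 80, 1 / 40, 24

def br2(p1, beta):
    # Equation (5): player 2's best response to p1.
    num = (beta - 1 / (4 * c)) * p1 + e * r / (4 * c**2)
    den = beta + e / (4 * c**2)
    return num / den

for beta in (0, 20, 40):
    p1, p2 = 40.0, 0.0                 # start far from equilibrium
    for _ in range(200):
        p1, p2 = p2, br2(p1, beta)     # simultaneous best replies
    print(f"beta={beta}: p1={p1:.3f}, p2={p2:.3f}")
    assert abs(p1 - 16) < 1e-3 and abs(p2 - 16) < 1e-3
```

Note that this toy dynamic converges even for beta = 0, consistent with the point in the text that stability under Cournot best reply is weaker evidence than supermodularity.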

To examine how α, which is unrelated to strategic complementarity, might affect behavior, we observe that when player 1 deviates from the best response by δ, i.e., p₁ = p₂ + δ, his profit loss is Δπ₁ = αδ². This profit loss, which is proportional to the magnitude of α, is the deviation cost for player 1. Based on previous experimental evidence, the incentive to deviate from best response decreases when the deviation cost increases. Chen and Plott (1996) call this the General Incentive Hypothesis, i.e., the error of game-theoretic models decreases as the incentive to best-respond increases. We therefore expect that an increase in α improves player 1's convergence to equilibrium. When player 1 plays the equilibrium strategy, player 2's best response is to play the equilibrium strategy as well. Therefore, we expect that an increase in α might improve player 2's convergence to equilibrium as well. It is not clear, however, whether the α-effects systematically change the effects of the supermodularity parameter β. We rely on experimental data to test the interaction of the α-effects on the β-effects.
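The deviation-cost claim follows directly from the payoff function in equation (1), since a change in p₁ affects π₁ only through the punishment term. A quick numerical check, using the paper's parameters for c, e, and r; the values of α, p₂, and δ below are illustrative.

```python
# Check that player 1's loss from deviating by delta from p1 = p2 is
# exactly alpha * delta^2 (the deviation cost in the text).
c, e, r, alpha = 1 / 80, 1 / 40, 24, 10   # alpha chosen for illustration

def pi1(p1, p2):
    x = (r - p2) / (2 * c)                # optimal production-stage x
    return r * x - c * x**2 - p2 * x - alpha * (p1 - p2)**2

p2, delta = 16, 3
loss = pi1(p2, p2) - pi1(p2 + delta, p2)
assert abs(loss - alpha * delta**2) < 1e-9   # loss = alpha * delta^2
```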

II. Experimental Design

Our experimental design reflects both theoretical and technical considerations. Specifically, we chose an environment that allows significant latitude in varying the free parameters to better assess the performance of the compensation mechanism around the threshold of strategic complementarity. We describe this environment and the experimental procedures below.

A. The Economic Environment

We use the general payoff functions presented in equation (1) with quadratic cost and externality functions to obtain analytical solutions: c(x) = cx² and e(x) = ex². We use the following parameters: c = 1/80, e = 1/40, and r = 24. From Proposition 2, the subgame-perfect Nash equilibrium is (p₁*, p₂*, x*) = (16, 16, 320), and the threshold for strategic complementarity is β = 1/(4c) = 20.
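These equilibrium values follow mechanically from Proposition 2. A sketch that plugs the experimental parameters into the closed-form solutions (the lump-sum payment of 250 points and the equilibrium payoffs appear later in this section):

```python
# Equilibrium and threshold implied by the experimental parameters,
# via the formulas of Proposition 2: c = 1/80, e = 1/40, r = 24.
c, e, r = 1 / 80, 1 / 40, 24

p_star = e * r / (e + c)          # equilibrium price (Pigovian tax)
x_star = r / (2 * (e + c))        # efficient quantity
beta_threshold = 1 / (4 * c)      # supermodularity threshold for beta

# Equilibrium payoffs; player 1 receives a 250-point lump sum per round.
pi1 = r * x_star - c * x_star**2 - p_star * x_star + 250
pi2 = p_star * x_star - e * x_star**2

assert abs(p_star - 16) < 1e-6 and abs(x_star - 320) < 1e-6
assert abs(beta_threshold - 20) < 1e-6
assert abs(pi1 - 1530) < 1e-6 and abs(pi2 - 2560) < 1e-6
```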

⁴ Cheng (1998) also shows that the original mechanism is locally stable under discrete-time Cournot best-reply dynamics.


In the experiment, each player chooses pᵢ ∈ {0, 1, ..., 40}. Without the compensation mechanism, the profit-maximizing production level is x = r/(2c) = 960, three times higher than the efficient level. To reduce the complexity of player 1's problem, we use a grid size of 10 for the quantity and truncate the strategy space, i.e., player 1 chooses X ∈ {0, 1, ..., 50}, where X = x/10. The payoff functions presented to the subjects are adjusted accordingly. Truncating the strategy space to X ≤ 50 also reduces the possibility of player 2's bankruptcy. To reduce payoff asymmetry, we give player 1 a lump-sum payment of 250 points each round. Therefore, the equilibrium payoffs under the mechanism for the two players are π₁* = 1,530 and π₂* = 2,560.

The functional forms and specific parameter values are chosen for several reasons. First, equilibrium solutions are integers. Second, equilibrium prices and quantities do not lie in the center of the strategy space, thus avoiding equilibrium convergence as a result of focal points. Third, there is a salient gap between efficiency with and without the mechanism. Without the mechanism, the profit-maximizing production level is X = 50, resulting in an efficiency level of 68.4 percent.⁵ With the mechanism, the system achieves 100-percent efficiency in equilibrium. Finally, since the threshold for strategic complementarity is β = 20, there is a large number of integer values to choose from both below and above the threshold.

To study how strategic complementarity affects equilibrium convergence, we keep α = 20 and vary β ∈ {0, 18, 20, 40}. To study whether α affects convergence, we also keep α = 10 and vary β ∈ {0, 20}.

B. Experimental Procedures

Our experiment involves 12 players per session: six player 1s (called Red players in the instructions) and six player 2s (Blue players). Each player remains the same type throughout the experiment. At the beginning of each session, subjects randomly draw a PC terminal number. Each then sits in front of the corresponding terminal and is given printed instructions. After the instructions are read aloud, subjects are encouraged to ask questions. The instruction period lasts between 15 and 30 minutes.

Each round, a player 1 is randomly matched with a player 2. Subjects are randomly rematched each round to minimize repeated-game effects. The random rematching protocol also minimizes the possibility that players collude on a high-subsidy and low-tax outcome.⁶ Each session consists of 60 rounds. As we are interested in learning, there are no practice rounds. Each round consists of two stages:

(i) Announcement Stage: Each player simultaneously and independently chooses a price, pᵢ ∈ {0, 1, ..., 40}.

(ii) Production Stage: After (p₁, p₂) are chosen, player 1's computer displays player 2's price and a payoff table showing her payoff for each X ∈ {0, 1, ..., 50}. Player 1 then chooses a quantity X. The server calculates payoffs and sends each player his payoff, the quantity chosen, and the prices submitted by him and his match.

To summarize, each subject knows both payoff functions, the choices made each round by himself and his match, as well as his per-period and cumulative payoffs. At any point, subjects have ready access to all of this information. The mechanism is thus implemented as a game of complete information. We do not know, however, how subjects processed this information, nor do we know their beliefs about the rationality of others. Both of these factors introduce uncertainty in the environment.

⁵ We compute the efficiency level by using the ratio of total earnings without the mechanism and total earnings with the mechanism. Without the mechanism, at the production level of X = 50 (or x = 500), total earnings are π₁ + π₂ = (rx − cx²) − ex² = 2,625. Therefore, the efficiency level is 2,625/(1,280 + 2,560) = 0.684.

⁶ If the players could commit to maximizing joint profits by choosing p₁ and p₂ cooperatively and x noncooperatively, they would choose, by equation (1), (p₁, p₂, x) = (4(α + β)er/[4(α + β)(e + c) − 1], r[4e(α + β) − 1]/[4(α + β)(e + c) − 1], 2r(α + β)/[4(α + β)(e + c) − 1]). With our choice of parameters, we get p₁ = 24, p₂ = 12 for α = 20 and β = 0; p₁ = 19.2, p₂ = 14.4 for α = 20 and β = 20; and p₁ = 18, p₂ = 15 for α = 20 and β = 40, etc. If, in addition, they could choose (p₁, p₂, x) cooperatively, then for our parameter values we get (p₁ − p₂, x) = (min{480/[3(α + β) − 20], 40}, min{960(α + β)/[3(α + β) − 20], 500}).

Table 1 presents features of the experimental sessions, including parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used z-Tree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one-and-a-half hours. The exchange rate for all treatments was one dollar for 4,250 points. The average earnings were $22.82. Data are available from the authors upon request.

III. Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: the level and the speed.⁷ In theory, convergence implies that all players play the stage-game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, Lb(t₁, t₂), measures the average proportion of Nash equilibrium play between rounds t₁ and t₂, i.e., Lb(t₁, t₂) = [Σ from t = t₁ to t₂ of L(t)]/(t₂ − t₁ + 1), where 0 < t₁ ≤ t₂ ≤ T, and T is the total number of rounds.

We define the level of convergence for both a round and a block of rounds. The block convergence measure smooths out inter-round variation.

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition for the speed of convergence.

DEFINITION 2: For a given level of convergence L̄ ∈ (0, 1], the speed of convergence is measured by the first round t* in which the level of convergence reaches L̄ and does not subsequently drop below this level, i.e., the smallest t* such that L(t) ≥ L̄ for all t ≥ t*.
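Definitions 1 and 2 translate directly into code. A sketch computing both measures from a hypothetical record of play (the data at the bottom is invented for illustration):

```python
# Convergence measures of Definitions 1 and 2. eq_play[t][i] is True if
# pair i played the stage-game equilibrium in round t+1 (rounds 1-indexed).

def level(eq_play_round):
    """L(t): proportion of equilibrium play in one round."""
    return sum(eq_play_round) / len(eq_play_round)

def block_level(eq_play, t1, t2):
    """Lb(t1, t2): average of L(t) over rounds t1..t2."""
    return sum(level(rd) for rd in eq_play[t1 - 1:t2]) / (t2 - t1 + 1)

def speed(eq_play, L_bar):
    """First round after which L(t) never drops below L_bar (or None)."""
    levels = [level(rd) for rd in eq_play]
    for t in range(len(levels)):
        if all(l >= L_bar for l in levels[t:]):
            return t + 1
    return None

# Hypothetical data: 6 pairs over 4 rounds, with play improving over time.
eq_play = [[False] * 6,
           [True, True, False, False, False, False],
           [True] * 4 + [False] * 2,
           [True] * 5 + [False]]
assert level(eq_play[2]) == 4 / 6
assert speed(eq_play, 0.5) == 3   # L(t) >= 0.5 from round 3 onward
```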

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in experimental data, Definition 2 is not practical there. We instead use the following observation, which enables us to measure convergence speed using estimates for the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let Ly(t) be the convergence level of treatment y at

⁷ We thank anonymous referees for suggesting this separation and the appropriate measures.

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

Treatment (α, β)   Sessions   Supermodular?   Equilibrium (p₁, p₂, X)
α = 10, β = 0      4          No              (16, 16, 32)
α = 20, β = 0      5          No              (16, 16, 32)
α = 20, β = 18     4          No              (16, 16, 32)
α = 10, β = 20     5          Yes             (16, 16, 32)
α = 20, β = 20     5          Yes             (16, 16, 32)
α = 20, β = 40     4          Yes             (16, 16, 32)


time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.

OBSERVATION 1: If L₁(1) ≥ L₂(1) and ΔL₁(t) ≥ ΔL₂(t) for all t ∈ [1, T − 1], then, given any L̄ ∈ (0, 1], the first treatment converges more quickly than the second treatment.

Based on theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with those near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or β = 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis. Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV. Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and speed of convergence to the subgame-perfect equilibrium and near equilibrium in each of the six treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2, and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that in all of our analyses the results for quantities largely mirror the results for player 2's price.8

8 Tables are available from the authors upon request.

1512 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

Recall that the subgame-perfect Nash equilibrium for a stage game is (p1, p2, X) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check for ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price and quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. The ε-equilibrium prediction is therefore (ε-p1, ε-p2, ε-x) = ({15, 16, 17}, {15, 16, 17}, {28, ..., 36}).
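The ε-equilibrium band just described (prices within 1 of the equilibrium price 16, quantities within 4 of the equilibrium quantity 32) can be checked mechanically; a sketch:

```python
def is_epsilon_equilibrium(p1, p2, x):
    """Price announcements within 1 of 16 and quantity within 4 of 32."""
    return abs(p1 - 16) <= 1 and abs(p2 - 16) <= 1 and abs(x - 32) <= 4
```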

Figures 1 and 2 contain box-and-whiskers plots of the prices in each treatment for all 60 rounds, for players 1 and 2, respectively. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence, L(41, 60), to the subgame-perfect Nash equilibrium (panels A and B) and the ε-Nash equilibrium (panels C and D) for each session under each of the six treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column

8) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects). Parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly10 increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effects on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result showing that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20),

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: by pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as large as the one observed when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.
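The permutation test of footnote 9 can be implemented exactly by enumerating every split of the pooled session-level observations; the session values used below are illustrative, not the paper's data:

```python
from itertools import combinations

def permutation_pvalue(low, high):
    """Exact one-tailed p-value for H1: mean(high) > mean(low).

    Pools the two samples, re-splits them into groups of the original sizes in
    every possible way, and counts the splits whose mean difference is at least
    as large as the observed one."""
    pooled = list(low) + list(high)
    observed = sum(high) / len(high) - sum(low) / len(low)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(low)):
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        if sum(b) / len(b) - sum(a) / len(a) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total
```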

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.

improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold, 20, to 40. Therefore, the marginal returns for being

[Box-and-whiskers plots omitted: six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 1's announced price (vertical axis, 0 to 40) against the round (horizontal axis, 1 to 60).]

FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1

"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β > 0. However, there is a significant difference in convergence level as β increases from 0 to 18, to 20, and

beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that at

[Box-and-whiskers plots omitted: six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 2's announced price (vertical axis, 0 to 40) against the round (horizontal axis, 1 to 60).]

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2

the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies, Dy, where y ∈ {α10β00, α20β00, β18, α10β20, β40}, ln(Round), and a constant. We use dummies for different values of α and β, rather than direct parameter values, to avoid assuming a linear relationship of parameter effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2. LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1                p-value
α10β00     0.000      0.008      0.158      0.008      --         0.044    α10β00 < α20β00   0.397
α20β00     0.000      0.083      0.083      0.042      0.058      0.053    α20β00 < α20β20   0.008***
α20β18     0.125      0.117      0.100      0.192      --         0.133    α20β00 < α20β18   0.008***
α10β20     0.242      0.067      0.067      0.067      0.208      0.130    α10β20 < α20β20   0.064*
α20β20     0.325      0.267      0.208      0.058      0.367      0.245    α20β18 < α20β20   0.064*
α20β40     0.175      0.400      0.158      0.133      --         0.217    α20β20 < α20β40   0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1                p-value
α10β00     0.025      0.042      0.025      0.067      --         0.040    α10β00 < α20β00   0.056*
α20β00     0.067      0.083      0.033      0.075      0.050      0.062    α20β00 < α20β20   0.032**
α20β18     0.133      0.292      0.108      0.258      --         0.198    α20β00 < α20β18   0.008***
α10β20     0.392      0.108      0.175      0.067      0.667      0.282    α10β20 < α20β20   0.504
α20β20     0.308      0.117      0.483      0.017      0.450      0.275    α20β18 < α20β20   0.238
α20β40     0.325      0.492      0.233      0.233      --         0.321    α20β20 < α20β40   0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1                p-value
α10β00     0.108      0.200      0.350      0.158      --         0.204    α10β00 < α20β00   0.183
α20β00     0.067      0.458      0.358      0.183      0.425      0.298    α20β00 < α20β20   0.087*
α20β18     0.317      0.625      0.300      0.583      --         0.456    α20β00 < α20β18   0.103
α10β20     0.475      0.200      0.300      0.375      0.492      0.368    α10β20 < α20β20   0.194
α20β20     0.450      0.558      0.475      0.133      0.775      0.478    α20β18 < α20β20   0.444
α20β40     0.458      0.492      0.375      0.442      --         0.442    α20β20 < α20β40   0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1                p-value
α10β00     0.133      0.200      0.242      0.225      --         0.200    α10β00 < α20β00   0.016**
α20β00     0.300      0.400      0.200      0.275      0.333      0.302    α20β00 < α20β20   0.016**
α20β18     0.717      0.667      0.342      0.700      --         0.606    α20β00 < α20β18   0.016**
α10β20     0.650      0.517      0.542      0.400      0.892      0.600    α10β20 < α20β20   0.564
α20β20     0.533      0.592      0.642      0.225      0.900      0.578    α20β18 < α20β20   0.556
α20β40     0.825      0.792      0.467      0.517      --         0.650    α20β20 < α20β40   0.294

Note: * Significant at the 10-percent level; ** 5-percent level; *** 1-percent level.

largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form, ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the

coefficient for the interaction term Dy × ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), ΔL(t), between treatments. From the probit specification, L(t) = Φ(constant + D·b·ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests on the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3. CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                          price 1      price 2      price 1      price 2
Dα10β00                   -0.143       -0.268
                          (0.048)      (0.050)
Dα20β00                   -0.090       -0.217
                          (0.060)      (0.057)
Dβ18                      -0.005        0.004
                          (0.071)      (0.069)
Dα10β20                   -0.080        0.025
                          (0.060)      (0.073)
Dβ40                       0.000        0.078
                          (0.068)      (0.078)
ln(Round)                  0.115        0.167        0.131        0.183
                          (0.014)      (0.017)      (0.021)      (0.022)
Dα10β00 × ln(Round)                                 -0.050       -0.095
                                                    (0.019)      (0.022)
Dα20β00 × ln(Round)                                 -0.031       -0.073
                                                    (0.020)      (0.021)
Dβ18 × ln(Round)                                    -0.001        0.004
                                                    (0.020)      (0.021)
Dα10β20 × ln(Round)                                 -0.024        0.008
                                                    (0.019)      (0.022)
Dβ40 × ln(Round)                                     0.002        0.023
                                                    (0.019)      (0.024)

Observations               9,720        9,720        9,720        9,720
Number of groups           162          162          162          162
Log pseudo-likelihood     -5,562.890   -5,824.055   -5,557.424   -5,793.013

H0: Dα10β00 = Dα20β00                                                  1.50   1.31
H0: Dα10β00 × ln(Round) = Dα20β00 × ln(Round)                          1.33   1.23
H0: Dα10β20 = Dα10β00 - Dα20β00                                        0.05   1.07
H0: Dα10β20 × ln(Round) = Dα10β00 × ln(Round) - Dα20β00 × ln(Round)    0.05   1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Dy is the dummy variable for treatment y; the excluded dummy is Dα20β20. The bottom panel presents null hypotheses and Wald chi-squared(1) test statistics.

the slope of the probability function for treatment y with respect to t: L'y(t) = φ(constant + D·b)·by/t. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + D·b)·by, relative to the baseline case of α20β20.
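The probit level and its slope can be sketched directly from the chain rule; the constant and coefficient values below are generic placeholders, not the estimates in Table 3:

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def level(const, b, t):
    """L(t) = Phi(const + b * ln t), computed via the error function."""
    z = const + b * math.log(t)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def slope(const, b, t):
    """dL/dt = phi(const + b * ln t) * b / t, by the chain rule."""
    z = const + b * math.log(t)
    return normal_pdf(z) * b / t
```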

RESULT 2 (Speed of Convergence)

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report the analyses of convergence speed. For each independent variable, the coefficient, standard error (in parentheses), and significance level are reported. The second Wald test examines whether the coefficient of Dα10β00 × ln(Round) equals that of Dα20β00 × ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As these treatments all start from, and converge to, similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α20 supermodular treatments. We use Round, ln(Round), and their interactions with Dα20β18 as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the α20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and the α20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on Dα20β18 × ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

TABLE 4. CONVERGENCE SPEED

Probit models with individual-level clustering, comparing α20β18 with the two α20 supermodular treatments, which achieve similar convergence levels

Dependent variable: ε-Nash equilibrium play

                        (1) ε-Nash price 1   (2) ε-Nash price 2
ln(Round)                0.054                0.216
                        (0.040)              (0.043)
Round                    0.004                0.001
                        (0.002)              (0.002)
Dα20β18 × ln(Round)      0.013               -0.066*
                        (0.032)              (0.035)
Dα20β18 × Round          0.001                0.006**
                        (0.002)              (0.003)

Observations             4,680                4,680
Log pseudo-likelihood   -2,879.07            -2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at the 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are Dα20β20 and Dα20β40.

α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on Dα20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in

equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r(x(t)) - c(x(t)) - e(x(t))] / [r(x*) - c(x*) - e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 <= t2 <= 60, is then defined as

e(t1, t2) = Σ_{t = t1}^{t2} e(t) / (t2 - t1 + 1).
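The two efficiency measures can be sketched as follows; the revenue, cost, and externality functions used in the example are hypothetical stand-ins, not the experiment's parameterization:

```python
def round_efficiency(x_t, x_star, r, c, e):
    """e(t): realized net surplus relative to the surplus at the efficient quantity."""
    surplus = lambda x: r(x) - c(x) - e(x)
    return surplus(x_t) / surplus(x_star)

def block_efficiency(per_round):
    """e(t1, t2): average of the per-round efficiencies over a block."""
    return sum(per_round) / len(per_round)

# Illustrative functional forms (assumptions, not the paper's):
r = lambda x: 10 * x     # revenue
c = lambda x: x ** 2     # production cost
e = lambda x: 2 * x      # externality cost
# net surplus 8x - x^2 is maximized at the efficient quantity x* = 4
```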

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5. EFFICIENCY MEASURE: LAST 20 ROUNDS

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1                p-value
α10β00     0.600      0.671      0.804      0.858      --         0.733    α10β00 < α20β00   0.087*
α20β00     0.650      0.870      0.907      0.873      0.825      0.825    α20β00 < α20β20   0.095*
α20β18     0.963      0.952      0.889      0.959      --         0.941    α20β00 < α20β18   0.016**
α10β20     0.958      0.919      0.905      0.881      0.984      0.929    α10β20 < α20β20   0.714
α20β20     0.888      0.937      0.939      0.780      0.971      0.903    α20β18 < α20β20   0.833
α20β40     0.973      0.962      0.943      0.949      --         0.957    α20β20 < α20β40   0.064*

Note: * Significant at the 10-percent level; ** 5-percent level; *** 1-percent level.

in the last 20 rounds, along with the alternative hypotheses and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold, 20, to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus the tax minus the subsidy in that round, i.e., s(t) = (α + β)(p1(t) - p2(t))^2 + p2(t)x(t) - p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
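A quick sanity check on the surplus expression; the transfer signs here follow our reading of the text and should be treated as an assumption:

```python
def budget_surplus(p1, p2, x, alpha, beta):
    """Per-round budget surplus s(t): penalty terms plus tax minus subsidy."""
    penalty = (alpha + beta) * (p1 - p2) ** 2   # penalty collected off-equilibrium
    return penalty + p2 * x - p1 * x            # plus tax, minus subsidy

# At the equilibrium (16, 16, 32) the penalty vanishes and the transfers
# cancel, so the budget balances exactly, as the text states.
```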

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but it has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment provides complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models that meet our criteria. We then look at how well the learning models predict the experimental data, by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the original papers for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃_i(t + 1), according to

(6) p̃_i(t + 1) = [p_i(t) + Σ_{u=1}^{t-1} r^u p_i(t - u)] / [1 + Σ_{u=1}^{t-1} r^u],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃_i(t + 1) = p_i(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by

(7) Pr(p_i | p̃_i) = exp(λ π(p_i, p̃_i)) / Σ_{j=0}^{40} exp(λ π(p_j, p̃_i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̃_i increases.
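Equations (6) and (7) translate directly into code; the discount factor and sensitivity values used below are arbitrary, not the calibrated estimates:

```python
import math

def sfp_forecast(prices, r):
    """Discounted fictitious-play forecast of the opponent's next price, eq. (6):
    the observation u rounds ago receives weight r**u."""
    t = len(prices)
    num = sum(r ** u * prices[t - 1 - u] for u in range(t))
    den = sum(r ** u for u in range(t))
    return num / den

def logit_choice_probs(payoffs, lam):
    """Logistic choice probabilities over the strategy grid, eq. (7)."""
    weights = [math.exp(lam * p) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]
```

The two boundary cases the text mentions fall out immediately: with r = 0 only the last observation matters (Cournot best reply), and with r = 1 the forecast is the unweighted historical mean (standard fictitious play).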

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with the actual payoffs, π(p_i, p̃_i), replaced by strategy attractions. In both variants, the strategy attraction, A_i^j(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = φ(1 - κ)N(t - 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) A_i^j(t) = {φ N(t - 1) A_i^j(t - 1) + [δ + (1 - δ) I(p_i^j, p_i(t))] π_i(p_i^j, p_-i(t))} / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are determined endogenously by the following functions. The change-detector function, φ_i(t), is given by

(9) φ_i(t) = 1 - 0.5 Σ_{j=1}^{m_i} [I(p_i^j, p_i(t)) - (1/t) Σ_{u=1}^{t} I(p_i^j, p_i(u))]^2.

This function will be close to one when recent history resembles previous history. The imagination weight, δ_i(t), equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0).
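One period of the attraction update in equation (8) can be sketched as follows, with φ, δ, and κ supplied directly (in fEWA they would be computed endogenously, as described above):

```python
def ewa_update(attractions, played, payoffs, N_prev, phi, delta, kappa):
    """One-period EWA update. `payoffs[j]` is the payoff strategy j would have
    earned this period against the opponent's realized play. Returns the new
    attractions and the new experience weight."""
    N = phi * (1 - kappa) * N_prev + 1
    updated = []
    for j, A in enumerate(attractions):
        # played strategies get full weight; unplayed ones get delta
        weight = delta + (1 - delta) * (1 if j == played else 0)
        updated.append((phi * N_prev * A + weight * payoffs[j]) / N)
    return updated, N
```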

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, f_jk(h, t), to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_jk(h, t) = 1 - |j - k|/h, if |j - k| < h; 0, otherwise.

In this function, the parameter h determines the h - 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_jk(1, t) degenerates into an indicator function, equal to one if strategy j is chosen in round t and zero otherwise.
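The Bartlett similarity function is a simple triangular window; a sketch:

```python
def bartlett_similarity(j, k, h):
    """f_jk(h): similarity between unplayed strategy j and played strategy k,
    decaying linearly to zero at distance h."""
    return 1 - abs(j - k) / h if abs(j - k) < h else 0.0
```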

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs

and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10) u_j(t + 1) = [1 - r f_jk(h, t)] u_j(t) + r f_jk(h, t) π_k(t), for all j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11) ũ_k(t) - ũ_j(t) ≥ 0, for all p_j ≠ p_k.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
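Equations (10) and (11) can be sketched as follows; the strategy grid, payoff, and shock values are illustrative:

```python
import random

def pa_update(assessments, k, payoff, r, h):
    """Eq. (10): assessments of strategies within the similarity window of the
    played strategy k move toward the realized payoff; others are unchanged."""
    updated = []
    for j, u in enumerate(assessments):
        f = 1 - abs(j - k) / h if abs(j - k) < h else 0.0   # Bartlett similarity
        updated.append((1 - r * f) * u + r * f * payoff)
    return updated

def pa_choose(assessments, shock, rng):
    """Eq. (11): pick the strategy with the highest shock-distorted assessment.
    Shocks distort the choice only; the update above never sees them."""
    distorted = [u + rng.uniform(-shock, shock) for u in assessments]
    return max(range(len(distorted)), key=distorted.__getitem__)
```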

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In

all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find the parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting long-run mechanism performance. In doing so, we consider k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, their use of information may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and of a normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and via equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.

1522 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p2.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block; otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.
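Steps (ii) through (vi) can be sketched as a simulation loop for one matched pair (a minimal sketch; the callables `choose_price`, `best_response`, `payoffs`, and `update` are hypothetical stand-ins for the paper's components — the learning model's choice rule, player 1's best response with block-level error rates, the compensation-mechanism payoffs, and the assessment-update rule):

```python
import random

def simulate_pair(n_rounds, first_round_prices, choose_price, best_response,
                  deviation_prob, payoffs, update, rng=random):
    """Skeleton of one simulated game following steps (ii)-(vi)."""
    state1, state2 = {}, {}  # per-player learning-model state (assessments)
    for t in range(n_rounds):
        # (ii) price announcements: empirical draw in round 1, model afterward
        if t == 0:
            p1 = rng.choice(first_round_prices)
            p2 = rng.choice(first_round_prices)
        else:
            p1, p2 = choose_price(state1), choose_price(state2)
        # (iii) player 1's quantity: best response, with occasional deviation
        q = best_response(p2)
        if rng.random() < deviation_prob:
            q += rng.gauss(0.0, 1.0)  # deviation drawn from block-level stats
        # (iv) payoffs from the compensation mechanism
        u1, u2 = payoffs(p1, p2, q)
        # (v) update assessments; (vi) proceed to the next round
        update(state1, p1, u1)
        update(state2, p2, u2)
    return state1, state2
```

Step (i), the random rematching, sits one level up: the full simulation would draw fresh pairings each round across the simulated population.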

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The sensitivity parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ {1, ..., 10}, is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
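The calibration step amounts to an exhaustive search over these grids; a sketch follows (the parameter names and the `msd_of` callable, which would run the Monte Carlo simulation and score it against the calibration sessions, are our own stand-ins):

```python
import itertools

def grid_search(msd_of, grids):
    """Return the grid point minimizing the MSD score.

    `grids` maps parameter names to candidate values, e.g. r in steps
    of 0.1 on [0, 1], lambda in steps of 0.1 on its interval, the shock
    bound a in steps of 50 on [0, 500], and h in {1, ..., 10}.
    """
    names = list(grids)
    best, best_score = None, float("inf")
    for combo in itertools.product(*(grids[n] for n in names)):
        params = dict(zip(names, combo))
        score = msd_of(params)
        if score < best_score:
            best, best_score = params, score
    return best, best_score
```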

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, where shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performances of the three versions of PA were statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report those results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r):
                   sFP                      PA
Treatment     1    2    3    4         1    2    3    4
α10β00       1.0  0.7  0.9  1.0       0.1  0.0  0.9  1.0
α20β00       0.7  0.6  0.1  0.1       0.5  1.0  0.2  0.1
α20β18       1.0  1.0  0.9  0.9       0.2  1.0  0.3  0.2
α10β20       1.0  1.0  1.0  0.9       0.1  0.3  0.6  0.3
α20β20       1.0  1.0  1.0  0.9       0.2  0.1  0.5  0.2
α20β40       1.0  1.0  1.0  0.7       0.1  1.0  0.4  0.3

Sensitivity (λ) for sFP and fEWA; shock interval (a) for PA:
                   sFP (λ)                  fEWA (λ)                PA (a)
Treatment     1    2     3     4       1    2    3    4       1    2    3    4
α10β00       1.5  4.4   4.5   2.8     1.6  4.5  4.4  3.7     500  500  250   50
α20β00       1.5  4.3  14.0  18.0     1.7  4.0  5.1  7.4      50  100  150   50
α20β18       1.6  8.0  11.5  18.3     1.5  5.3  6.2  9.5      50  300  500  150
α10β20       1.9  5.9   8.6  13.8     1.8  4.7  6.2  8.2     100  500  450  300
α20β20       2.8  5.6   8.1  11.2     2.4  3.8  6.1  6.7     200  300  350  350
α20β40       3.4  6.5  14.0  20.5     3.2  4.9  6.7  9.9     150   50  150   50

Similarity window (h) for PA:
Treatment     1    2    3    4
α10β00        1    1   10    9
α20β00       10   10    9    6
α20β18        5    4    6    1
α10β20        1    1    8    3
α20β20        2    2    7    1
α20β40        1    3    2    2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds; indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency over long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.
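Under our reading of footnote 20, the decayed deviation probability can be written as follows (the placement of the floor is our interpretation of the stated lower bound):

```python
def decayed_deviation_prob(block_k, pr_last):
    """Deviation probability beyond the last calibrated block (k = 4):
    Pr_k = Pr_4 / (k - 4)^2, floored at 1e-8 to avoid division-by-zero
    problems downstream.  For k <= 4 the calibrated value is used as-is.
    """
    if block_k <= 4:
        return pr_last
    return max(pr_last / (block_k - 4) ** 2, 1e-8)
```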

21 Different sequences of random numbers produce slightly different parameter estimates.

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play
Session     α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1            0.955   0.934   0.940   0.930   0.888   0.940
2            0.960   0.946   0.890   0.917   0.973   0.878
3                    0.948           0.883   0.873
Overall      0.957   0.943   0.915   0.910   0.912   0.909

fEWA
Session     α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1            0.956   0.934   0.942   0.928   0.887   0.948
2            0.960   0.946   0.890   0.922   0.978   0.886
3                    0.946           0.879   0.871
Overall      0.958   0.942   0.916   0.910   0.912   0.917

Payoff assessment
Session     α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1            0.952   0.933   0.914   0.941   0.888   0.916
2            0.957   0.944   0.899   0.922   0.917   0.898
3                    0.947           0.894   0.903
Overall      0.955   0.941   0.907   0.919   0.902   0.907

Equilibrium play
Session     α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1            1.867   1.853   1.833   1.833   1.489   1.792
2            1.889   1.919   1.606   1.839   1.953   1.669
3                    1.917           1.447   1.539
Overall      1.878   1.896   1.719   1.706   1.660   1.731

Random play
Overall      0.976   0.976   0.976   0.976   0.976   0.976


results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ≈ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ≈ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ≈ α10β20 > α20β20 > α20β00 > α10β00; and

The convergence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, but it is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest convergence level. The relative rankings of treatments are quite robust to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ: α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β = 0 treatments. In addition, when α = 20, an increase in β significantly improves convergence, by a large margin (40 to 80 percent), for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.



In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the β-effect depends on the α-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2: ε-equilibrium play (percent, 0 to 100) by round (0 to 500) for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                       Player 1                          Player 2
Threshold      40%   50%   60%   70%       40%   50%   60%   70%   80%   90%
α20β18          54    83   126   316        38    52    62    87   127    —
α10β20         129   221    —     —         36    48    63    91   148    —
α20β20          61    91   149    —         39    51    61    84   137    —
α20β40         101   166   263   496        30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

Player 1 equilibrium price (p1):
Treatment    Probability   H1                  p-value
α10β00         0.073
α20β00         0.188       α20β00 > α10β00      0.000
α20β18         0.265       α20β18 > α20β40      0.187
α10β20         0.211       α10β20 > α20β00      0.000
α20β20         0.243       α20β20 > α10β20      0.000
α20β40         0.259       α20β40 > α20β20      0.007

Player 2 equilibrium price (p2):
Treatment    Probability   H1                  p-value
α10β00         0.082
α20β00         0.213       α20β00 > α10β00      0.000
α20β18         0.381       α20β18 > α10β20      0.264
α10β20         0.376       α10β20 > α20β20      0.001
α20β20         0.353       α20β20 > α20β00      0.000
α20β40         0.442       α20β40 > α20β18      0.000

Player 1 ε-equilibrium price (ε-p1):
Treatment    Probability   H1                  p-value
α10β00         0.216
α20β00         0.519       α20β00 > α10β00      0.000
α20β18         0.713       α20β18 > α20β40      0.045
α10β20         0.584       α10β20 > α20β00      0.000
α20β20         0.662       α20β20 > α10β20      0.000
α20β40         0.701       α20β40 > α20β20      0.000

Player 2 ε-equilibrium price (ε-p2):
Treatment    Probability   H1                  p-value
α10β00         0.232
α20β00         0.573       α20β00 > α10β00      0.000
α20β18         0.870       α20β18 > α10β20      0.008
α10β20         0.857       α10β20 > α20β20      0.000
α20β20         0.834       α20β20 > α20β00      0.000
α20β40         0.925       α20β40 > α20β18      0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel, simulation data: fraction of maximum welfare over rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare over rounds 0 to 60. Treatments: α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − γ, and each of the other 40 prices with probability γ/40, where γ is a noise parameter. We estimate γ using the first two sessions of each treatment.24 We then compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006); that is, best response to a predicted price explains behavior significantly better than matching.
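The stochastic matching model assigns choice probabilities as follows (a sketch generalized to an arbitrary price grid; `noise` plays the role of the estimated parameter):

```python
def matching_probs(predicted_price, price_grid, noise):
    """Choice probabilities for player 2 under the stochastic matching
    model: the predicted player 1 price is played with probability
    1 - noise, and each of the remaining prices with probability
    noise / (len(price_grid) - 1), i.e. noise/40 on a 41-price grid.
    """
    others = len(price_grid) - 1
    return {p: (1.0 - noise) if p == predicted_price else noise / others
            for p in price_grid}
```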

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price, p̂1, for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated values of γ in each block are as follows. α10β00: 1.0, 0.7, 0.6, 0.8; α20β00: 0.7, 0.6, 0.5, 0.4; α20β18: 0.7, 0.2, 0.3, 0.3; α10β20: 0.5, 0.3, 0.3, 0.2; α20β20: 0.1, 0.0, 0.0, 0.0; α20β40: 0.2, 0.0, 0.0, 0.1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games that have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding that is not consistent with a focal-point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to player 2 of matching the anticipated price of player 1.)

Dependent variable: probability that player 2's price equals the anticipated player 1 price

                          (1)          (2)
D10                      0.014        0.017
                        (0.043)      (0.041)
D00                     −0.077       −0.043
                        (0.035)      (0.036)
D18                      0.046        0.032
                        (0.053)      (0.056)
D40                      0.008        0.041
                        (0.061)      (0.042)
ln(Round)                0.041        0.028
                        (0.011)      (0.010)
Match Cost                           −0.001
                                     (0.000)
ln(Round) × D10          0.014        0.012
                        (0.012)      (0.012)
ln(Round) × D00          0.003        0.000
                        (0.013)      (0.012)
ln(Round) × D18          0.006        0.002
                        (0.021)      (0.020)
ln(Round) × D40          0.001        0.015
                        (0.018)      (0.016)
Observations             9,588        9,588
Log pseudo-likelihood  −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Dy is the dummy variable for treatment y; the excluded dummy is D2020 (treatment α20β20). Match Cost is the difference, in cents, between matching and best-responding to the anticipated player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),
p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential equation system is

(12) ṗ1 = p2 − p1,
     ṗ2 = m p1 − p2 + n.

We now use Lyapunov's second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function for system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − n p1 − [(1/2)(1 − m)p0² − n p0].

Since c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the first-order condition

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, the term (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1) ṗ1 + (∂V/∂p2) ṗ2
  = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)
  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of system (12).
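A quick numeric illustration of the discrete best-reply system above (the values of m and n here are illustrative stand-ins for the expressions derived in the proof; the same contraction logic drives the continuous system):

```python
def best_reply_path(m, n, p1, p2, n_rounds):
    """Iterate the Cournot best-reply system p1(t+1) = p2(t),
    p2(t+1) = m*p1(t) + n.  With |m| < 1 the iteration contracts
    toward the fixed point p* = n / (1 - m)."""
    for _ in range(n_rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2
```

For example, with m = 0.5 and n = 2 the prices approach p* = 4 from any starting pair.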

REFERENCES

Amir Rabah ldquoCournot Oligopoly and the The-ory of Supermodular Gamesrdquo Games andEconomic Behavior 1996 15(2) pp 132ndash48

Andreoni James and Varian Hal R ldquoPre-PlayContracting in the Prisonersrsquo Dilemmardquo Pro-ceedings of the National Academy of Science1999 96(19) pp 10933ndash38

Arifovic Jasmina and Ledyard John O ldquoCom-puter Testbeds and Mechanism Design Ap-plication to the Class of Groves-LedyardMechanisms for Provision of Public GoodsrdquoUnpublished Paper 2001

Boylan Richard T and El-Gamal Mahmoud AldquoFictitious Play A Statistical Study of Mul-tiple Economic Experimentsrdquo Games andEconomic Behavior 1993 5(2) pp 205ndash22

Camerer Colin F Behavioral game theory Ex-periments in strategic interaction (Round-table Series in Behavioral Economics)Princeton Princeton University Press 2003

Camerer Colin F and Ho Teck-Hua ldquoExperience-Weighted Attraction Learning in NormalForm Gamesrdquo Econometrica 1999 67(4)pp 827ndash74

Chen Yan ldquoA Family of Supermodular NashMechanisms Implementing Lindahl Alloca-tionsrdquo Economic Theory 2002 19(4) pp773ndash90

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335-64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633-62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1-25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46-76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441-63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141-61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401-19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881-94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231-47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413-22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247-64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

1534 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85-96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367-83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157-80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255-77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82-100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83-102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485-91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294-309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198-215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305-21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773-87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234-48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278-93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper No. 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305-21.

1535 VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES


mechanisms for pure public goods, Chen (forthcoming) finds that mechanisms with strategic complementarities, such as the Groves-Ledyard mechanism under a high punishment parameter, converge robustly to the efficient equilibrium (Chen and Charles R. Plott, 1996; Chen and Fang-Fang Tang, 1998). Conversely, those far away from the threshold of strategic complementarities do not seem to converge (Vernon L. Smith, 1979; Ronald M. Harstad and Michael Marrese, 1982). In previous experiments, parameters are set either far away from the threshold for strategic complementarities (e.g., Chen and Tang, 1998) or very close to the threshold (e.g., Josef Falkinger et al., 2000).¹ These experiments do not, however, systematically set the parameters below, close to, at, and above the threshold to assess the effect of strategic complementarities on convergence. This is the first such systematic experimental study of games with strategic complementarities.² Consequently, this study answers three important questions that the theory on games with strategic complementarities does not address. First, as the parameters approach the threshold of strategic complementarities, will play converge to equilibrium gradually or abruptly? Second, is there a clear performance ranking among games with strategic complementarities? Third, how important is strategic complementarity compared to other factors? The answer to the first question can help us assess a priori whether a game "close" to being supermodular, such as the Falkinger mechanism (Falkinger, 1996), might also have good convergence properties. The answer to the second question will help us choose the best parameters within the class of games with strategic complementarities. The answer to the third question will help us assess the importance of strategic complementarities in learning and convergence.

To address these questions, this study adopts an experimental game from the literature on solutions to externalities. Hal R. Varian (1994)

proposes a simple class of two-stage mechanisms, the compensation mechanisms, in which subgame-perfect equilibria implement efficient allocations. In a generalized version of Varian's mechanism (John Cheng, 1998), one can vary, without altering the equilibrium, a parameter that determines whether the condition for strategic complementarities is satisfied.

There have been two experimental studies of the compensation mechanisms, neither of which adopts a version with strategic complementarities. James Andreoni and Varian (1999) study the mechanism in the context of the Prisoners' Dilemma. They find that adding a commitment stage to the standard Prisoners' Dilemma game nearly doubles the amount of cooperation, to two-thirds. Yasuyo Hamaguchi et al. (2003) investigate a version with a larger strategy space and find 20-percent Nash equilibrium play.

In this paper, we examine the compensation mechanism in an economic environment with a much larger strategy space. Furthermore, we choose various versions to study systematically the effect of strategic complementarities on convergence.

The paper is organized as follows. Section I introduces games with strategic complementarities and presents theoretical properties of the compensation mechanisms. Section II presents the experimental design. Section III introduces the set of hypotheses. Section IV presents experimental results on the level and speed of convergence, as well as on efficiency. Section V presents the calibration of three learning models, validation of the models on a hold-out sample, and simulation of long-run performance using a calibrated learning model. Section VI discusses our findings, and Section VII concludes.

I. Strategic Complementarity and the Compensation Mechanisms

In this section, we first introduce games with strategic complementarities and their theoretical properties. We then introduce and analyze a family of mechanisms in this class of games, namely the compensation mechanisms.

Games with strategic complementarities (Milgrom and Shannon, 1994) are those where, given an ordering of strategies, higher actions by one player provide an incentive for the other

¹ Proofs of the supermodularity of the Groves-Ledyard and the Falkinger mechanisms are presented in Chen (in press).

² By systematically varying the free parameter in the Groves-Ledyard mechanism, Jasmina Arifovic and John O. Ledyard (2001) study learning dynamics and mechanism convergence using genetic algorithms, compared with experimental data.


players to take higher actions as well. These games include supermodular games, in which the incremental return to any player from increasing her strategy is a nondecreasing function of the strategy choices of other players (increasing differences). Furthermore, if a player's strategy space has more than one dimension, components of a player's strategy are complements (supermodularity). Membership in this class of games is easy to check. Indeed, for smooth functions in Rⁿ, letting Pi be the strategy space and πi be the payoff function of player i, the following theorem characterizes increasing differences and supermodularity.

THEOREM 1 (Topkis, 1978): Let πi be twice continuously differentiable on Pi. Then πi has increasing differences in (pi, pj) if and only if ∂²πi/∂pih∂pjl ≥ 0 for all i ≠ j and all 1 ≤ h ≤ ki and all 1 ≤ l ≤ kj; and πi is supermodular in pi if and only if ∂²πi/∂pih∂pil ≥ 0 for all i and all 1 ≤ h < l ≤ ki.

Increasing differences means that an increase in the strategy of player i's rivals raises her marginal utility of playing a high strategy. The supermodularity requirement ensures complementarity among the components of a player's strategy and is automatically satisfied in a one-dimensional strategy space. Note that a supermodular game is a game with strategic complementarities, but the converse is not true.

Supermodular games have interesting theoretical properties. In particular, they are robustly stable. Milgrom and Roberts (1990) prove that, in these games, learning algorithms consistent with adaptive learning converge to the set bounded by the largest and the smallest Nash equilibrium strategy profiles. Intuitively, a sequence is consistent with adaptive learning if players "eventually abandon strategies that perform consistently badly in the sense that there exists some other strategy that performs strictly and uniformly better against every combination of what the competitors have played in the not-too-distant past" (Milgrom and Roberts, 1990). This includes numerous learning dynamics, such as Bayesian learning, fictitious play, adaptive learning, and Cournot best reply. While strategic complementarity is sufficient for convergence, it is not a necessary condition.

Thus, while games with strategic complementarities ought to converge robustly to the Nash equilibrium, games without strategic complementarities may also converge under specific learning algorithms. Whether these specific learning algorithms are a realistic description of human learning is an empirical question.

While the theory on games with strategic complementarities predicts convergence to equilibrium, it does not address four practical issues. First, as the parameters of a game approach the threshold of strategic complementarities, does play converge gradually or abruptly? Second, is convergence faster further past the threshold? Third, how important is strategic complementarity compared to other features of a game that might also induce convergence to equilibrium? Last, for supermodular games with multiple Nash equilibria, will players learn to coordinate on a particular equilibrium? We choose a game that allows us to answer the first three questions. The fourth question has been addressed by John Van Huyck et al. (1990) and James Cox and Mark Walker (1998).³

Specifically, we use the compensation mechanism to study the role of strategic complementarities in learning and convergence to equilibrium play. In the mechanism, each of two players offers to compensate the other for the "costs" of the efficient choice. Assume that when player 1's production equals x, her net profit is rx − c(x), where r is the market price and c(·) is a differentiable, positive, increasing, and convex cost function. Production causes an externality on player 2, whose payoff is −e(x),

³ Cox and Walker (1998) study whether subjects can learn to play Cournot duopoly strategies in games with two kinds of interior Nash equilibrium. Their type I duopoly has a stable interior Nash equilibrium under Cournot best-reply dynamics and is therefore dominance solvable (Hervé Moulin, 1984). Their type II duopoly has an unstable interior Nash equilibrium and two boundary equilibria under Cournot best-reply dynamics, and is therefore not dominance solvable. They found that, after a few periods, subjects did play the stable interior dominance-solvable equilibria, but they played neither the unstable interior equilibria nor the boundary equilibria. It is interesting to note that these duopoly games are submodular games. Being two-player games, they are also supermodular (Rabah Amir, 1996). The results of Cox and Walker (1998) illustrate the importance of uniqueness, together with supermodularity, in inducing convergence.


where e(·) is also assumed to be differentiable, positive, increasing, and convex. The mechanism is a two-stage game whose unique subgame-perfect Nash equilibrium induces the Pareto-efficient outcome x* such that r − e′(x*) − c′(x*) = 0. In the first stage (the announcement stage), player 1 announces p1, a per-unit subsidy to be paid to player 2, while player 2 simultaneously announces p2, a per-unit tax to be paid by player 1. Announcements are revealed to both players. In the second stage (the production stage), player 1 chooses a production level, x. The payoff to player 1 is π1 = rx − c(x) − p2x − α(p1 − p2)², while the payoff to player 2 is π2 = p1x − e(x), where α > 0 is a free punishment parameter chosen by the designer.

We study a generalized version of the compensation mechanism (Cheng, 1998), which adds a punishment term, −β(p1 − p2)², to player 2's payoff function, thus making the payoff functions

(1) π1 = rx − c(x) − p2x − α(p1 − p2)²,

π2 = p1x − e(x) − β(p1 − p2)².

Using the generalized version, we solve the game by backward induction. In the production stage, player 1 chooses the quantity that solves max_x rx − c(x) − p2x − α(p1 − p2)². The first-order condition, r − c′(x) − p2 = 0, characterizes the best response in the second stage, x(p2). In the announcement stage, player 1 solves max_p1 rx − c(x) − p2x − α(p1 − p2)². The first-order condition is

(2) ∂π1/∂p1 = −2α(p1 − p2) = 0,

which yields the best response function for player 1 as p1 = p2. Player 2 simultaneously solves max_p2 p1x(p2) − e(x(p2)) − β(p1 − p2)². The first-order condition is

(3) ∂π2/∂p2 = p1x′(p2) − e′(x)x′(p2) + 2β(p1 − p2) = 0,

which characterizes player 2's best response function.

The unique subgame-perfect equilibrium has p1* = p2* = p*, where p* is the Pigovian tax, which induces the efficient quantity, x*. As the equilibrium does not depend on the value of β, it holds for the original version, where β = 0. When β is set appropriately, however, the generalized version is a supermodular mechanism. The following proposition characterizes the necessary and sufficient condition for supermodularity.

PROPOSITION 1 (Cheng, 1998): The generalized version of the compensation mechanism is supermodular if and only if α > 0 and β ≥ −½x′(p2).

The proof is simple. First, as the strategy space is one-dimensional, the supermodularity condition is automatically satisfied. Second, we use Theorem 1 to check for increasing differences. For player 1, from equation (2), we have ∂²π1/∂p1∂p2 = 2α ≥ 0, while for player 2, from equation (3), we have ∂²π2/∂p1∂p2 = x′(p2) + 2β. Therefore, ∂²π2/∂p1∂p2 ≥ 0 if and only if β ≥ −½x′(p2).
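This increasing-differences check can be illustrated numerically with finite cross-differences of player 2's reduced-form payoff. The sketch below is not from the paper; it uses the quadratic specification and parameter values introduced in Section II (c = 1/80, e = 1/40, r = 24), for which the threshold works out to β = 1/(4c) = 20, and the cross-difference is exact because the payoff is quadratic.

```python
# Finite cross-difference check of increasing differences (Theorem 1) for
# player 2's payoff in the generalized compensation mechanism.
# Parameter values are those of the experiment: c = 1/80, e = 1/40, r = 24.
c, e, r = 1 / 80, 1 / 40, 24

def x_of(p2):
    # Player 1's second-stage best response from r - c'(x) - p2 = 0, c(x) = c*x^2.
    return (r - p2) / (2 * c)

def pi2(p1, p2, beta):
    # pi2 = p1*x(p2) - e*x(p2)^2 - beta*(p1 - p2)^2, equation (1) after
    # substituting the second-stage best response.
    x = x_of(p2)
    return p1 * x - e * x ** 2 - beta * (p1 - p2) ** 2

def cross_diff(beta, p1=10.0, p2=10.0, h=1.0):
    # Discrete analogue of d^2(pi2)/dp1 dp2; exact here since pi2 is quadratic:
    # equals h^2 * (x'(p2) + 2*beta) = h^2 * (-1/(2c) + 2*beta).
    return (pi2(p1 + h, p2 + h, beta) - pi2(p1 + h, p2, beta)
            - pi2(p1, p2 + h, beta) + pi2(p1, p2, beta))

print(cross_diff(18))   # ~ -4.0: increasing differences fails below the threshold
print(cross_diff(20))   # ~ 0.0: increasing differences holds at the threshold
```

Any beta at or above 20 keeps the cross-difference nonnegative, matching Proposition 2(iv) below.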

To obtain analytical solutions, we use a quadratic cost function, c(x) = cx², where c > 0, and a quadratic externality function, e(x) = ex², where e > 0. We now summarize the best response functions, equilibrium solutions, and stability analysis in this environment.

PROPOSITION 2: Quadratic cost and externality functions yield the following characterizations:

(i) The best response functions for players 1 and 2 are

(4) p1 = p2, and

(5) p2 = [(β − 1/(4c))p1 + er/(4c²)] / [β + e/(4c²)],

and x = max{0, (r − p2)/(2c)}.

(ii) The subgame-perfect Nash equilibrium is characterized as

p1* = p2* = er/(e + c), x* = r/[2(e + c)].

(iii) If players follow Cournot best-reply dynamics, (p1*, p2*) is a globally asymptotically stable equilibrium of the continuous-time dynamical system for any α > 0 and β ≥ 0.

(iv) The game is supermodular if and only if α > 0 and β ≥ 1/(4c).

PROOF:
See Appendix.

The best-response functions presented in part (i) of Proposition 2 reveal interesting incentives. While player 1 has an incentive to always match player 2's price, player 2 has an incentive to match only when player 1 plays the equilibrium strategy. Also, at the threshold for strategic complementarity, β = 1/(4c), player 2 has a dominant strategy, p2 = er/(e + c) = p2*. Part (iii) of Proposition 2 extends Cheng (1998), who shows that the original version of the compensation mechanism (β = 0) is globally stable under continuous-time Cournot best-reply dynamics.⁴ Cournot best reply is, however, a relatively poor description of human learning (Richard Boylan and Mahmoud El-Gamal, 1993). Therefore, global stability under Cournot best reply for any β ≥ 0 does not imply equilibrium convergence among human subjects. Part (iv) characterizes the threshold for strategic complementarity in our experiment, a more robust stability criterion than that characterized by part (iii).

An intuition for how strategic complementarities affect the outcome of adaptive learning can be gained from analysis of the best response functions, equations (4) and (5). While player 1's best response function is always upward sloping, player 2's best response function, equation (5), is nondecreasing if and only if β ≥ 1/(4c), i.e., when the game is supermodular. Beyond the threshold for strategic complementarity, both best-response functions are upward

sloping, and they intersect at the equilibrium. It is easy to verify graphically that adaptive learners, e.g., Cournot best reply, will converge to the equilibrium regardless of where they start.
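The same point can be made by iterating the best responses directly. The sketch below (not from the paper) runs a simultaneous discrete-time Cournot best-reply map in the quadratic environment of Section II (c = 1/80, e = 1/40, r = 24); note that Proposition 2(iii) itself concerns the continuous-time system, but for these parameter values the discrete map also converges to (16, 16) from an arbitrary starting point for every β used in the experiment.

```python
# Discrete-time simultaneous Cournot best-reply dynamics for the generalized
# compensation mechanism, using equations (4) and (5) with the experimental
# parameter values c = 1/80, e = 1/40, r = 24.
c, e, r = 1 / 80, 1 / 40, 24

def br1(p2):
    return p2                          # equation (4): player 1 matches p2

def br2(p1, beta):
    k = 1 / (4 * c)                    # supermodularity threshold, = 20 here
    a = e / (4 * c ** 2)               # = 40 with these parameters
    return ((beta - k) * p1 + e * r / (4 * c ** 2)) / (beta + a)   # equation (5)

def cournot(p1, p2, beta, rounds=60):
    for _ in range(rounds):
        p1, p2 = br1(p2), br2(p1, beta)    # both players revise simultaneously
    return p1, p2

for beta in (0, 18, 20, 40):               # the experimental beta values
    p1, p2 = cournot(40.0, 0.0, beta)
    print(beta, round(p1, 3), round(p2, 3))    # each run ends at (16.0, 16.0)
```

The 60-round horizon mirrors the length of an experimental session; the starting prices (40, 0) are arbitrary corners of the strategy space.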

To examine how α, which is unrelated to strategic complementarity, might affect behavior, we observe that when player 1 deviates from the best response by δ, i.e., p1 = p2 + δ, her profit loss is Δπ1 = αδ². This profit loss, which is proportional to the magnitude of α, is the deviation cost for player 1. Based on previous experimental evidence, the incentive to deviate from best response decreases when the deviation cost increases. Chen and Plott (1996) call this the General Incentive Hypothesis, i.e., the error of game-theoretic models decreases as the incentive to best-respond increases. We therefore expect that an increase in α improves player 1's convergence to equilibrium. When player 1 plays the equilibrium strategy, player 2's best response is to play the equilibrium strategy as well. Therefore, we expect that an increase in α might improve player 2's convergence to equilibrium as well. It is not clear, however, whether the α-effects systematically change the effects of the supermodularity parameter β. We rely on experimental data to test the interaction of the α-effects and the β-effects.
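The deviation cost Δπ1 = αδ² is easy to confirm numerically. A small sketch, using the experimental parameter values; the choices α = 20, p2 = 16, and δ = 3 are illustrative.

```python
# Player 1's cost of deviating from the price best response p1 = p2 by delta
# is alpha * delta^2, regardless of the level of p2.
c, e, r = 1 / 80, 1 / 40, 24

def pi1(p1, p2, alpha):
    x = (r - p2) / (2 * c)             # optimal second-stage quantity x(p2)
    return r * x - c * x ** 2 - p2 * x - alpha * (p1 - p2) ** 2

alpha, p2, delta = 20, 16, 3
loss = pi1(p2, p2, alpha) - pi1(p2 + delta, p2, alpha)
print(loss)    # 180.0 = alpha * delta**2
```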

II. Experimental Design

Our experimental design reflects both theoretical and technical considerations. Specifically, we chose an environment that allows significant latitude in varying the free parameters to better assess the performance of the compensation mechanism around the threshold of strategic complementarity. We describe this environment and the experimental procedures below.

A. The Economic Environment

We use the general payoff functions presented in equation (1) with quadratic cost and externality functions to obtain analytical solutions: c(x) = cx² and e(x) = ex². We use the following parameters: c = 1/80, e = 1/40, and r = 24. From Proposition 2, the subgame-perfect Nash equilibrium is (p1*, p2*, x*) = (16, 16, 320), and the threshold for strategic complementarity is β = 1/(4c) = 20.

⁴ Cheng (1998) also shows that the original mechanism is locally stable under discrete-time Cournot best-reply dynamics.


In the experiment, each player chooses pi ∈ {0, 1, ..., 40}. Without the compensation mechanism, the profit-maximizing production level is x = r/(2c) = 960, three times higher than the efficient level. To reduce the complexity of player 1's problem, we use a grid size of 10 for the quantity and truncate the strategy space; i.e., player 1 chooses X ∈ {0, 1, ..., 50}, where X = x/10. The payoff functions presented to the subjects are adjusted accordingly. Truncating the strategy space to X ≤ 50 also reduces the possibility of player 2's bankruptcy. To reduce payoff asymmetry, we give player 1 a lump-sum payment of 250 points each round. Therefore, the equilibrium payoffs under the mechanism for the two players are π1* = 1,530 and π2* = 2,560.

The functional forms and specific parameter values are chosen for several reasons. First, the equilibrium solutions are integers. Second, equilibrium prices and quantities do not lie in the center of the strategy space, thus avoiding equilibrium convergence as a result of focal points. Third, there is a salient gap between efficiency with and without the mechanism. Without the mechanism, the profit-maximizing production level is X = 50, resulting in an efficiency level of 68.4 percent.⁵ With the mechanism, the system achieves 100-percent efficiency in equilibrium. Finally, since the threshold for strategic complementarity is β = 20, there is a large number of integer values to choose from both below and above the threshold.
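These equilibrium payoffs and the efficiency benchmark can be checked in a few lines. A sketch using the parameter values just given; the efficiency ratio follows the computation in footnote 5.

```python
# Numerical check of the equilibrium quantities, payoffs, and the no-mechanism
# efficiency benchmark (c = 1/80, e = 1/40, r = 24, lump sum of 250 to player 1).
c, e, r, lump = 1 / 80, 1 / 40, 24, 250
p1 = p2 = 16

x = r / (2 * (e + c))                        # efficient quantity x* = 320
pi1 = r * x - c * x ** 2 - p2 * x + lump     # alpha*(p1 - p2)^2 = 0 at equilibrium
pi2 = p1 * x - e * x ** 2                    # beta*(p1 - p2)^2 = 0 as well
print(round(x), round(pi1), round(pi2))      # 320 1530 2560

# Without the mechanism, player 1 would produce x = r/(2c) = 960; at the
# truncated maximum X = 50 (x = 500), joint surplus falls to 68.4 percent.
x0 = 500
surplus_without = (r * x0 - c * x0 ** 2) - e * x0 ** 2
print(round(surplus_without / (pi1 - lump + pi2), 3))    # 0.684
```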

To study how strategic complementarity affects equilibrium convergence, we keep α = 20 and vary β = 0, 18, 20, and 40. To study whether α affects convergence, we also keep α = 10 and vary β = 0, 20.

B. Experimental Procedures

Our experiment involves 12 players per session: six player 1s (called Red players in the

instructions) and six player 2s (Blue players). Each player remains the same type throughout the experiment. At the beginning of each session, subjects randomly draw a PC terminal number. Each then sits in front of the corresponding terminal and is given printed instructions. After the instructions are read aloud, subjects are encouraged to ask questions. The instruction period lasts between 15 and 30 minutes.

Each round, a player 1 is randomly matched with a player 2. Subjects are randomly rematched each round to minimize repeated-game effects. The random rematching protocol also minimizes the possibility that players collude on a high-subsidy and low-tax outcome.⁶ Each session consists of 60 rounds. As we are interested in learning, there are no practice rounds. Each round consists of two stages:

(i) Announcement stage: Each player simultaneously and independently chooses a price, pi ∈ {0, 1, ..., 40}.

(ii) Production stage: After (p1, p2) are chosen, player 1's computer displays player 2's price and a payoff table showing her payoff for each X ∈ {0, 1, ..., 50}. Player 1 then chooses a quantity, X. The server calculates payoffs and sends each player his payoff, the quantity chosen, and the prices submitted by him and his match.

To summarize, each subject knows both payoff functions, the choices made each round by himself and his match, and his per-period and cumulative payoffs. At any point, subjects have ready access to all of this information. The mechanism is thus implemented as a game of complete information. We do not know, however, how subjects processed this information, nor do we know their beliefs about the rationality of others. Both of these factors introduce uncertainty in the environment.

⁵ We compute the efficiency level as the ratio of total earnings without the mechanism to total earnings with the mechanism. Without the mechanism, at the production level of X = 50 (or x = 500), total earnings are π1 + π2 = (rx − cx²) − ex² = 2,625. Therefore, the efficiency level is 2,625/(1,280 + 2,560) = 0.684.

⁶ If the players could commit to maximizing joint profits by choosing p1 and p2 cooperatively and x noncooperatively, they would choose, by equation (1), (p1, p2, x) = (4(α + β)er/[4(α + β)(e + c) − 1], r[4e(α + β) − 1]/[4(α + β)(e + c) − 1], 2r(α + β)/[4(α + β)(e + c) − 1]). With our choice of parameters, we get p1 = 24, p2 = 12 for α = 20 and β = 0; p1 = 19.2, p2 = 14.4 for α = 20 and β = 20; and p1 = 18, p2 = 15 for α = 20 and β = 40; etc. If, in addition, they could choose (p1, p2, x) cooperatively, then for our parameter values we get (p1 − p2, x) = (min{480/[3(α + β) − 20], 40}, min{960(α + β)/[3(α + β) − 20], 50}).

Table 1 presents features of the experimental sessions, including the parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and the equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used z-Tree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one and a half hours. The exchange rate for all treatments was one dollar for 4,250 points. The average earnings were $22.82. Data are available from the authors upon request.

III. Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: the level and the speed of convergence.⁷ In theory, convergence implies that all players play the stage-game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, Lb(t1, t2), measures the average proportion of Nash equilibrium play between rounds t1 and t2, i.e., Lb(t1, t2) = Σ_{t=t1}^{t2} L(t)/(t2 − t1 + 1), where 0 < t1 ≤ t2 ≤ T, and T is the total number of rounds.

We define the level of convergence for both a round and a block of rounds. The block convergence measure smooths out inter-round variation.
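Definition 1 translates directly into code. A sketch, where `play` is a hypothetical data set: `play[t-1]` lists each pairing's (p1, p2, X) strategy profile in round t.

```python
# Convergence measures of Definition 1: L(t), the share of equilibrium play in
# round t, and Lb(t1, t2), its average over a block of rounds.
EQ = (16, 16, 32)    # stage-game equilibrium (p1, p2, X) on the experimental grid

def level(round_profiles, eq=EQ):
    """L(t): proportion of profiles equal to the equilibrium in one round."""
    return sum(p == eq for p in round_profiles) / len(round_profiles)

def block_level(play, t1, t2, eq=EQ):
    """Lb(t1, t2): sum of L(t) over rounds t1..t2 divided by (t2 - t1 + 1)."""
    return sum(level(rp, eq) for rp in play[t1 - 1:t2]) / (t2 - t1 + 1)

play = [
    [(16, 16, 32), (10, 20, 28)],    # round 1: one of two pairs at equilibrium
    [(16, 16, 32), (16, 16, 32)],    # round 2: both pairs at equilibrium
]
print(level(play[0]))             # 0.5
print(block_level(play, 1, 2))    # 0.75
```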

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition for the speed of convergence.

DEFINITION 2: For a given level of convergence, L̄ ∈ (0, 1], the speed of convergence is measured by the first round, t̄, in which the level of convergence reaches L̄ and does not subsequently drop below this level, i.e., the t̄ such that L(t) ≥ L̄ for all t ≥ t̄.
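Definition 2 can be sketched as a small scan over the per-round levels; the `levels` sequence below is illustrative, not experimental data.

```python
# Speed of convergence per Definition 2: the first round t-bar such that
# L(t) >= L-bar for every t >= t-bar.
def convergence_round(levels, l_bar):
    """Return the 1-indexed first round from which `levels` stays at or above
    l_bar through the end, or None if that never happens."""
    t_bar = None
    for t, lv in enumerate(levels, start=1):
        if lv >= l_bar:
            if t_bar is None:
                t_bar = t      # candidate first round of sustained convergence
        else:
            t_bar = None       # level dropped below the target: reset
    return t_bar

levels = [0.2, 0.5, 0.4, 0.6, 0.7, 0.6]
print(convergence_round(levels, 0.5))    # 4 (round 2 reaches 0.5, but round 3 dips)
print(convergence_round(levels, 0.8))    # None (0.8 is never reached)
```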

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in the experimental data, however, Definition 2 is not practical there. We instead use the following observation, which enables us to measure convergence speed using estimates of the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let Ly(t) be the convergence level of treatment y at

⁷ We thank anonymous referees for suggesting this separation and the appropriate measures.

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

β        α = 10                 α = 20                 Supermodular?   Equilibrium (p1, p2, X)
0        α10β0 (4 sessions)     α20β0 (5 sessions)     No              (16, 16, 32)
18       —                      α20β18 (4 sessions)    No              (16, 16, 32)
20       α10β20 (5 sessions)    α20β20 (5 sessions)    Yes             (16, 16, 32)
40       —                      α20β40 (4 sessions)    Yes             (16, 16, 32)


time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.

OBSERVATION 1: If L1(1) ≥ L2(1) and ΔL1(t) ≥ ΔL2(t) for all t ∈ [1, T − 1], then, given any L̄ ∈ (0, 1], the first treatment converges more quickly than the second treatment.

Based on the theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with that of games near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or β = 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis. Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV. Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and the speed of convergence to the subgame-perfect equilibrium and to near-equilibrium in each of the six different treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2 and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that in all of our analyses the results for quantities largely mirror the results for player 2's price.⁸

Recall that the subgame-perfect Nash equilibrium for the stage game is (p1*, p2*, X*) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near the equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we also check for ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price and quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. The ε-equilibrium prediction is therefore (ε-p1, ε-p2, ε-X) = ({15, 16, 17}, {15, 16, 17}, {28, 32, 36}).

⁸ Tables are available from the authors upon request.
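The band check can be written as a small classifier over strategy profiles on the experimental grid. A sketch, not the paper's code:

```python
# Classifier for the epsilon-equilibrium band of Section IV: prices within 1 of
# the equilibrium price (16) and quantity within 4 of the equilibrium quantity
# (X* = 32 on the experimental grid).
P_EQ, X_EQ = 16, 32

def is_eps_equilibrium(p1, p2, X):
    return abs(p1 - P_EQ) <= 1 and abs(p2 - P_EQ) <= 1 and abs(X - X_EQ) <= 4

print(is_eps_equilibrium(15, 17, 36))    # True: all three within the band
print(is_eps_equilibrium(16, 16, 27))    # False: quantity deviates by 5
```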

Figures 1 and 2 contain box-and-whiskers plots of the prices in each treatment for all 60 rounds, for players 1 and 2, respectively. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence, Lb(41, 60), to the subgame-perfect Nash equilibrium (panels A and B) and to ε-Nash equilibrium (panels C and D) for each session under each of the six different treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests⁹ under the null hypothesis that the convergence levels in the two treatments (column

[8]) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding the level of convergence. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects). Parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly (see footnote 10) increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effect on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than games far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than games far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result to show that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20),

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments at least as large as the one observed when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.
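The randomization test described in this footnote can be sketched as follows (a Python illustration taking session-level proportions as inputs; unlike the equal-sized splits described above, this sketch preserves the two observed group sizes, the standard generalization when treatments have different numbers of sessions):

```python
from itertools import combinations

def permutation_test(a, b):
    """One-tailed Fisher randomization test of H1: mean(b) > mean(a).

    Pools the independent session-level observations and returns the exact
    probability of a mean difference at least as large as the observed one
    under random reassignment of observations to the two groups."""
    pooled = a + b
    observed = sum(b) / len(b) - sum(a) / len(a)
    at_least_as_large = total = 0
    for idx in combinations(range(len(pooled)), len(b)):
        chosen = set(idx)
        grp_b = [pooled[i] for i in chosen]
        grp_a = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = sum(grp_b) / len(grp_b) - sum(grp_a) / len(grp_a)
        total += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            at_least_as_large += 1
    return at_least_as_large / total
```

With three sessions in each treatment, perfect separation yields the smallest attainable p-value, 1/20 = 0.05.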

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.

1513 VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES

improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold of 20 to 40. Therefore, the marginal returns for being

[FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1. Six panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 1's announced price (0 to 40) against round (1 to 60).]

1514 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β > 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and

beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and the subsequent discussion that at

[FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2. Six panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 2's announced price (0 to 40) against round (1 to 60).]


the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarity and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are the treatment dummies Dy, where y ∈ {1000, 2000, 2018, 1020, 2040}, ln(Round), and a constant. We use dummies for the different values of α and β, rather than the parameter values themselves, to avoid assuming a linear relationship between parameters and effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient on Dy captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1), with permutation tests

Treatment   Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00      0.000      0.008      0.158      0.008      —          0.044     α10β00 < α20β00   0.397
α20β00      0.000      0.083      0.083      0.042      0.058      0.053     α20β00 < α20β20   0.008***
α20β18      0.125      0.117      0.100      0.192      —          0.133     α20β00 < α20β18   0.008***
α10β20      0.242      0.067      0.067      0.067      0.208      0.130     α10β20 < α20β20   0.064*
α20β20      0.325      0.267      0.208      0.058      0.367      0.245     α20β18 < α20β20   0.064*
α20β40      0.175      0.400      0.158      0.133      —          0.217     α20β20 < α20β40   0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2), with permutation tests

Treatment   Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00      0.025      0.042      0.025      0.067      —          0.040     α10β00 < α20β00   0.056*
α20β00      0.067      0.083      0.033      0.075      0.050      0.062     α20β00 < α20β20   0.032**
α20β18      0.133      0.292      0.108      0.258      —          0.198     α20β00 < α20β18   0.008***
α10β20      0.392      0.108      0.175      0.067      0.667      0.282     α10β20 < α20β20   0.504
α20β20      0.308      0.117      0.483      0.017      0.450      0.275     α20β18 < α20β20   0.238
α20β40      0.325      0.492      0.233      0.233      —          0.321     α20β20 < α20β40   0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1), with permutation tests

Treatment   Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00      0.108      0.200      0.350      0.158      —          0.204     α10β00 < α20β00   0.183
α20β00      0.067      0.458      0.358      0.183      0.425      0.298     α20β00 < α20β20   0.087*
α20β18      0.317      0.625      0.300      0.583      —          0.456     α20β00 < α20β18   0.103
α10β20      0.475      0.200      0.300      0.375      0.492      0.368     α10β20 < α20β20   0.194
α20β20      0.450      0.558      0.475      0.133      0.775      0.478     α20β18 < α20β20   0.444
α20β40      0.458      0.492      0.375      0.442      —          0.442     α20β20 < α20β40   0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2), with permutation tests

Treatment   Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00      0.133      0.200      0.242      0.225      —          0.200     α10β00 < α20β00   0.016**
α20β00      0.300      0.400      0.200      0.275      0.333      0.302     α20β00 < α20β20   0.016**
α20β18      0.717      0.667      0.342      0.700      —          0.606     α20β00 < α20β18   0.016**
α10β20      0.650      0.517      0.542      0.400      0.892      0.600     α10β20 < α20β20   0.564
α20β20      0.533      0.592      0.642      0.225      0.900      0.578     α20β18 < α20β20   0.556
α20β40      0.825      0.792      0.467      0.517      —          0.650     α20β20 < α20β40   0.294

Note: * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level. Treatments α10β00, α20β18, and α20β40 have four sessions each.

1516 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form ln(Round), which yields a better log-likelihood than either a linear or a quadratic functional form, indicates that learning is rapid at the beginning and slows over time. We examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each treatment dummy with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the

coefficient on the interaction term Dy · ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments (see footnote 11), we can compare the slope of L(t), L′(t), across treatments. From the probit specification, L(t) = Φ(constant + Db · ln(t)), where Φ is the standard normal distribution function, D is the 1 × 5 vector of treatment dummies, and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                        (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                        price 1      price 2      price 1      price 2
D1000                   -0.143       -0.268
                        (0.048)      (0.050)
D2000                   -0.090       -0.217
                        (0.060)      (0.057)
D18                     0.005        0.004
                        (0.071)      (0.069)
D1020                   -0.080       -0.025
                        (0.060)      (0.073)
D40                     0.000        0.078
                        (0.068)      (0.078)
ln(Round)               0.115        0.167        0.131        0.183
                        (0.014)      (0.017)      (0.021)      (0.022)
D1000 · ln(Round)                                 -0.050       -0.095
                                                  (0.019)      (0.022)
D2000 · ln(Round)                                 -0.031       -0.073
                                                  (0.020)      (0.021)
D18 · ln(Round)                                   0.001        0.004
                                                  (0.020)      (0.021)
D1020 · ln(Round)                                 -0.024       -0.008
                                                  (0.019)      (0.022)
D40 · ln(Round)                                   0.002        0.023
                                                  (0.019)      (0.024)

Observations            9,720        9,720        9,720        9,720
Number of groups        162          162          162          162
Log pseudo-likelihood   -5,562.890   -5,824.055   -5,557.424   -5,793.013

Wald tests (null hypothesis; χ²(1) statistics):
D1000 = D2000: 1.50, 1.31
D1000 · ln(Round) = D2000 · ln(Round): 1.33, 1.23
D1020 = D1000 - D2000: 0.05, 1.07
D1020 · ln(Round) = D1000 · ln(Round) - D2000 · ln(Round): 0.05, 1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level. Dy is the dummy variable for treatment y (D18 and D40 abbreviate D2018 and D2040); the excluded dummy is D2020. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.

1517VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

the slope of the probability function for treatment y with respect to t: L′y(t) = φ(constant + Db · ln t) · by / t, where φ is the standard normal density. All coefficients reported in Table 3 are increments of the probability derivative relative to the baseline case of α20β20.
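The level and slope formulas can be checked numerically (a sketch; the constant and coefficient values used in the test are illustrative, not estimates from Table 3):

```python
from statistics import NormalDist
from math import log

_N = NormalDist()  # standard normal

def level(t, const, db):
    """L(t) = Phi(constant + D*b * ln t): predicted probability of epsilon-equilibrium play."""
    return _N.cdf(const + db * log(t))

def speed(t, const, db):
    """dL/dt = phi(constant + D*b * ln t) * (D*b) / t: per-round gain in that probability."""
    return _N.pdf(const + db * log(t)) * db / t
```

A finite-difference check confirms the derivative, and the 1/t factor reproduces the pattern in the data: learning is rapid early and slows over time.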

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report the analyses of convergence speed. For each independent variable, we report the coefficient, standard error (in parentheses), and significance level. The second Wald test examines whether the coefficient on D1000 · ln(Round) equals that on D2000 · ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40 (see footnote 12). In Result 1, we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment (see footnote 13). As the treatments all start from and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α = 20 supermodular treatments (α20β20 and α20β40). We use Round, ln(Round), and their interactions with D2018 as independent variables, to allow different convergence speeds toward the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play; there is no significant difference between α20β18 and the supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play; there are significant differences between the near-supermodular and supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D2018 · ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available upon request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing α20β18 with the α = 20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                       (1) ε-Nash price 1   (2) ε-Nash price 2
ln(Round)              0.054 (0.040)        0.216 (0.043)
Round                  0.004 (0.002)        0.001 (0.002)
D2018 · ln(Round)      0.013 (0.032)        -0.066 (0.035)
D2018 · Round          0.001 (0.002)        0.006 (0.003)

Observations           4,680                4,680
Log pseudo-likelihood  -2,879.07            -2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level. Dy is the dummy variable for treatment y; the excluded dummies are D2020 and D2040.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D2018 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach given convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α, however, allows us to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes as α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the changes in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in

equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. We therefore use two separate measures to capture the welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r(x(t)) - c(x(t)) - e(x(t))] / [r(x*) - c(x*) - e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 < t2 ≤ 60, is then defined as

e(t1, t2) = [Σ_{t=t1}^{t2} e(t)] / (t2 - t1 + 1).
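These two measures can be sketched directly (a Python illustration; the revenue, cost, and externality functions passed in below are placeholders, not the experimental payoff parameters):

```python
def round_efficiency(r, c, e_ext, x_t, x_star):
    """e(t) = [r(x(t)) - c(x(t)) - e(x(t))] / [r(x*) - c(x*) - e(x*)]."""
    return (r(x_t) - c(x_t) - e_ext(x_t)) / (r(x_star) - c(x_star) - e_ext(x_star))

def block_efficiency(e, t1, t2):
    """e(t1, t2): average per-round efficiency over rounds t1..t2 (1-indexed, inclusive)."""
    return sum(e[t1 - 1:t2]) / (t2 - t1 + 1)
```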

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE: LAST 20 ROUNDS

Treatment   Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00      0.600      0.671      0.804      0.858      —          0.733     α10β00 < α20β00   0.087*
α20β00      0.650      0.870      0.907      0.873      0.825      0.825     α20β00 < α20β20   0.095*
α20β18      0.963      0.952      0.889      0.959      —          0.941     α20β00 < α20β18   0.016**
α10β20      0.958      0.919      0.905      0.881      0.984      0.929     α10β20 < α20β20   0.714
α20β20      0.888      0.937      0.939      0.780      0.971      0.903     α20β18 < α20β20   0.833
α20β40      0.973      0.962      0.943      0.949      —          0.957     α20β20 < α20β40   0.064*

Note: * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level.


in the last 20 rounds, along with the alternative hypotheses and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus in round t is the sum of the penalty terms plus the tax minus the subsidy in that round, i.e., s(t) = (α + β)(p1(t) - p2(t))² + p2(t)x(t) - p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following. First, 22 out of 27 sessions result in a budget surplus; this comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter: before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
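Under the reconstructed surplus expression, budget balance in equilibrium is immediate: when p1 = p2, both the penalty term and the tax-minus-subsidy term vanish. A minimal sketch:

```python
def budget_surplus(p1, p2, x, alpha, beta):
    """s(t) = (alpha + beta) * (p1 - p2)**2 + p2 * x - p1 * x.

    The penalty term vanishes when the two price announcements agree, and the
    tax p2 * x then exactly offsets the subsidy p1 * x, so the budget balances
    in equilibrium."""
    return (alpha + beta) * (p1 - p2) ** 2 + (p2 - p1) * x
```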

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than games far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead supermodular treatments to reach given convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but carries the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds in several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. We therefore rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm that, when calibrated, closely approximates the observed dynamic paths over the 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models; rather, we look for models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, since our experiment provides complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models that meet these criteria. We then examine how well the models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. Finally, we report forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model; interested readers are referred to the original papers for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts her match's price in the next round, p̃_{-i}(t+1), according to

(6) p̃_{-i}(t+1) = [p_{-i}(t) + Σ_{τ=1}^{t-1} r^τ p_{-i}(t-τ)] / [1 + Σ_{τ=1}^{t-1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃_{-i}(t+1) = p_{-i}(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations carry greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by

(7) Pr(p_i | p̃_{-i}) = exp(λ π(p_i, p̃_{-i})) / Σ_{j=0}^{40} exp(λ π(p_j, p̃_{-i})).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter: as λ increases, the probability of a best response to p̃_{-i} increases.
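Equations (6) and (7) can be sketched together as follows (a Python illustration; the payoff function is supplied by the caller, and the 0-40 price grid is from the experiment):

```python
from math import exp

def predicted_price(history, r):
    """Eq. (6): discounted empirical average of the opponent's past prices.

    history[-1] is the most recent price. r = 0 reduces to the Cournot
    best-reply assessment (last observed price); r = 1 to standard
    fictitious play (plain average)."""
    weights = [r ** k for k in range(len(history))]  # weight r**k on the price k rounds back
    return sum(w * p for w, p in zip(weights, reversed(history))) / sum(weights)

def logit_choice_probs(payoff, predicted, lam, prices=range(41)):
    """Eq. (7): logistic choice probabilities over the price grid, given the
    predicted opponent price and sensitivity lam."""
    utils = [lam * payoff(p, predicted) for p in prices]
    m = max(utils)  # subtract the max before exponentiating to avoid overflow
    expu = [exp(u - m) for u in utils]
    total = sum(expu)
    return [w / total for w in expu]
```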

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with the actual payoffs π(p_i, p̃_{-i}) replaced by strategy attractions. In both variants, the strategy attraction A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ(t)(1 - κ)N(t-1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) A_i^j(t) = {φ N(t-1) A_i^j(t-1) + [δ + (1-δ) I(p_i^j, p_i(t))] π_i(p_i^j, p_{-i}(t))} / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are determined endogenously by the following functions. The change-detector function φ_i(t) is given by

(9) φ_i(t) = 1 - 0.5 Σ_{j=1}^{m_{-i}} [ I(p_{-i}^j, p_{-i}(t)) - Σ_{τ=1}^{t} I(p_{-i}^j, p_{-i}(τ)) / t ]².

This function will be close to 1 when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, κ = 0, φ = 1), and Cournot best reply (δ = 1, φ = 0).
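One period of the attraction updating in equation (8), together with the experience-weight recursion, can be sketched as follows (a Python illustration; φ, δ, and κ are passed in directly rather than computed from functions like equation (9)):

```python
def update_attractions(A, N_prev, played, payoff_fn, phi, delta, kappa):
    """Eq. (8): one EWA/fEWA attraction update over the strategy grid.

    A[j] is strategy j's current attraction, `played` the index actually
    chosen, and payoff_fn(j) strategy j's payoff against the realized
    opponent play."""
    N = phi * (1 - kappa) * N_prev + 1  # experience-weight recursion
    updated = []
    for j in range(len(A)):
        # unplayed strategies are reinforced by delta times their forgone payoff
        reinforcement = (delta + (1 - delta) * (j == played)) * payoff_fn(j)
        updated.append((phi * N_prev * A[j] + reinforcement) / N)
    return updated, N
```

With δ = 1 every strategy is reinforced by its (forgone) payoff, which is how the logit response over attractions mimics belief learning.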

Finally, we introduce the main components of the PA model. For simplicity, we omit the subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function f_{jk}(h, t) to denote the similarity between the played strategy k and an unplayed strategy j in period t:

f_{jk}(h, t) = 1 - |j - k|/h if |j - k| < h, and 0 otherwise.

In this function, the parameter h determines the h - 1 unplayed strategies on either side of the played strategy that are updated. When h = 1, f_{jk}(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer; that is, he chooses strategies based on assessed payoffs


and does not explicitly take into account the likelihood of alternative states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and let r be the discount factor. Payoff assessments are updated through a weighted average of the previous assessments and the payoff actually obtained at time t. If strategy k is chosen at time t, then

(10) u_j(t+1) = [1 - r f_{jk}(h, t)] u_j(t) + r f_{jk}(h, t) π_k(t), for all j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11) ũ_k(t) - ũ_j(t) ≥ 0, for all p_j ≠ p_k.

Note that the mood shocks affect only his choices, and not the manner in which assessments are updated.
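The three PA ingredients (similarity, assessment updating, and mood-shocked choice) can be sketched as follows (a Python illustration; the uniform shocks follow the calibration described below, and the function names are ours):

```python
import random

def bartlett(j, k, h):
    """Bartlett similarity between unplayed strategy j and played strategy k."""
    return 1 - abs(j - k) / h if abs(j - k) < h else 0.0

def update_assessments(u, played, realized_payoff, r, h):
    """Eq. (10): move each assessment toward the realized payoff, weighted by similarity."""
    return [(1 - r * bartlett(j, played, h)) * u[j]
            + r * bartlett(j, played, h) * realized_payoff
            for j in range(len(u))]

def choose(u, a, rng=random):
    """Eq. (11): pick the strategy maximizing the mood-shocked assessments
    u_j + z_j, with z_j uniform on [-a, a]."""
    shocked = [uj + rng.uniform(-a, a) for uj in u]
    return max(range(len(u)), key=shocked.__getitem__)
```

With h = 1 only the played strategy is updated, so the model reduces to pure payoff-assessment learning without strategy spillover.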

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, it is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters in later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In

all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find the parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach (see footnote 14), because we are interested in forecasting long-run mechanism performance. In doing so, we consider k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, their use of information may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and proceeds in the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements. (a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5 (see footnote 15). Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and of a normal distribution (d = 0.381, p-value = 0.000). We thus follow convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and via equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs between the two approaches.

15 For example, the seven most frequently selected prices are all divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps: (a) determine each player 1's best response to p2; (b) determine whether player 1 deviates from the best response, via the actual probability of errors for each block; (c) if so, deviate via the actual mean and standard deviation for the block; otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The sensitivity parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution (see footnote 16) on an interval [-a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitudes.
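The calibration loop is then a straightforward grid search over MSD scores (a sketch; `simulate` stands in for the Monte Carlo procedure described above and is hypothetical):

```python
def msd(simulated, observed):
    """Mean-squared deviation between simulated and observed statistics."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)

def grid_search(simulate, observed, r_grid, lam_grid):
    """Return the (r, lam) pair whose simulated path best matches the data."""
    best = None
    for r in r_grid:
        for lam in lam_grid:
            score = msd(simulate(r, lam), observed)
            if best is None or score < best[0]:
                best = (score, r, lam)
    return best[1], best[2]
```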

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks (see footnote 17). The estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of the PA model, in which shocks are drawn from uniform, logistic, and double-exponential distributions, on two sets of experimental data, and find that the performance of the three versions is statistically indistinguishable.

17 We also calibrated all parameters over the entire 60 rounds, but do not report those results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r), blocks 1-4:
Treatment   sFP: 1    2    3    4      PA: 1    2    3    4
α10β00      1.0  0.7  0.9  1.0        0.1  0.0  0.9  1.0
α20β00      0.7  0.6  0.1  0.1        0.5  1.0  0.2  0.1
α20β18      1.0  1.0  0.9  0.9        0.2  1.0  0.3  0.2
α10β20      1.0  1.0  1.0  0.9        0.1  0.3  0.6  0.3
α20β20      1.0  1.0  1.0  0.9        0.2  0.1  0.5  0.2
α20β40      1.0  1.0  1.0  0.7        0.1  1.0  0.4  0.3

Sensitivity (λ), blocks 1-4, and PA shock interval (a), blocks 1-4:
Treatment   sFP: 1    2     3     4      fEWA: 1   2    3    4      PA a: 1    2    3    4
α10β00      1.5  4.4  4.5   2.8         1.6  4.5  4.4  3.7          500  500  250  50
α20β00      1.5  4.3  14.0  18.0        1.7  4.0  5.1  7.4          50   100  150  50
α20β18      1.6  8.0  11.5  18.3        1.5  5.3  6.2  9.5          50   300  500  150
α10β20      1.9  5.9  8.6   13.8        1.8  4.7  6.2  8.2          100  500  450  300
α20β20      2.8  5.6  8.1   11.2        2.4  3.8  6.1  6.7          200  300  350  350
α20β40      3.4  6.5  14.0  20.5        3.2  4.9  6.7  9.9          150  50   150  50

Similarity window (h), blocks 1-4 (PA):
α10β00: 1, 1, 10, 9    α20β00: 10, 10, 9, 6    α20β18: 5, 4, 6, 1
α10β20: 1, 1, 8, 3     α20β20: 2, 2, 7, 1      α20β40: 1, 3, 2, 2


indicating that players keep track of all past play when forming beliefs about their opponents' next moves (see footnote 18). Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of a best response increases over time. The third column reports the calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibrated parameters of the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model, a, represents the upper bound of the interval from which shocks are drawn. Experimentation should decrease in the final rounds; indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).
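The MSD metric used to score the models can be computed as below. This is a generic sketch with made-up data, not the authors' code; normalization conventions for MSD vary across studies, so the absolute level of the random-choice benchmark here need not match the 0.976 reported in Table 7.

```python
import numpy as np

def msd(pred_probs, choices):
    """Mean squared deviation between predicted choice distributions and play.

    pred_probs: (rounds, strategies) array of predicted choice probabilities.
    choices: realized strategy index in each round.
    """
    n_rounds, n_strats = pred_probs.shape
    actual = np.zeros((n_rounds, n_strats))
    actual[np.arange(n_rounds), choices] = 1.0   # one-hot encoding of realized play
    return np.mean((pred_probs - actual) ** 2)

# A uniform-random model over k strategies gives a constant benchmark score.
k = 41
uniform = np.full((100, k), 1.0 / k)
rng = np.random.default_rng(0)
print(msd(uniform, rng.integers(0, k, size=100)))  # 40/41**2 ≈ 0.0238, for any choices
```

Because the uniform prediction is the same in every round, its MSD does not depend on the realized choice sequence, which is what makes it a clean minimum standard for the learning models.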

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = −1.825, p-value = 0.068) and strictly lower than those under sFP (z = −2.273, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the parameters calibrated on the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20

Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)^2, 10^−8}. We set a lower bound of 10^−8 to avoid the division-by-zero problem.
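The decay schedule in footnote 20 can be written directly in code. The quadratic-decay functional form and the 10^−8 floor are as reconstructed above, so treat the exact form as an assumption rather than the authors' verbatim specification.

```python
def deviation_prob(pr4, k):
    """Deviation probability in forecast block k > 4.

    pr4: calibrated final-block (block 4) deviation probability.
    The probability decays quadratically in (k - 4) and is floored at
    1e-8 so it never becomes exactly zero.
    """
    if k <= 4:
        raise ValueError("schedule applies only to forecast blocks k > 4")
    return max(pr4 / (k - 4) ** 2, 1e-8)
```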

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1           0.955    0.934    0.940    0.930    0.888    0.940
2           0.960    0.946    0.890    0.917    0.973    0.878
3                    0.948             0.883    0.873
Overall     0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1           0.956    0.934    0.942    0.928    0.887    0.948
2           0.960    0.946    0.890    0.922    0.978    0.886
3                    0.946             0.879    0.871
Overall     0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1           0.952    0.933    0.914    0.941    0.888    0.916
2           0.957    0.944    0.899    0.922    0.917    0.898
3                    0.947             0.894    0.903
Overall     0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1           1.867    1.853    1.833    1.833    1.489    1.792
2           1.889    1.919    1.606    1.839    1.953    1.669
3                    1.917             1.447    1.539
Overall     1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall     0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the

experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following convergence rankings in equilibrium price (p) and ε-equilibrium price (ε-p) for players 1 and 2:

(i) p1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3 SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β = 0 treatments. In addition, an increase in β from zero significantly improves convergence by a large margin (40 to 80 percent) for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3 CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (an α-effect on the β-effect). An increase in α significantly decreases the improvement in

convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, player 1 and player 2, plot the percentage of ε-equilibrium play over rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels of L. The picture for player 2 convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
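The speed measure in Table 9 (the first round at which the proportion of ε-equilibrium play crosses a threshold "for good", never falling below it again) can be computed as follows. This is a generic sketch of the measure, not the authors' code.

```python
def first_round_for_good(series, threshold):
    """Return the 1-based round at which `series` reaches `threshold` and
    stays at or above it for all remaining rounds; None if never achieved."""
    crossed = None
    for t, value in enumerate(series, start=1):
        if value >= threshold:
            if crossed is None:
                crossed = t          # candidate crossing round
        else:
            crossed = None           # fell back below: reset the candidate
    return crossed
```

A temporary crossing followed by a dip resets the candidate round, which is what distinguishes this measure from a simple first-passage time.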

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percentage of players playing an ε-equilibrium strategy. Each entry indicates the round in which a treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                      Player 1                      Player 2
Threshold      40   50   60   70         40   50   60   70   80   90

α20β18         54   83  126  316         38   52   62   87  127    —
α10β20        129  221    —    —         36   48   63   91  148    —
α20β20         61   91  149    —         39   51   61   84  137    —
α20β40        101  166  263  496         30   38   56   94  166  339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

Player 1 equilibrium price
Treatment   Probability   H1                   p-value
α10β00         0.073
α20β00         0.188      α20β00 > α10β00       0.000
α20β18         0.265      α20β18 > α20β40       0.187
α10β20         0.211      α10β20 > α20β00       0.000
α20β20         0.243      α20β20 > α10β20       0.000
α20β40         0.259      α20β40 > α20β20       0.007

Player 2 equilibrium price
Treatment   Probability   H1                   p-value
α10β00         0.082
α20β00         0.213      α20β00 > α10β00       0.000
α20β18         0.381      α20β18 > α10β20       0.264
α10β20         0.376      α10β20 > α20β20       0.001
α20β20         0.353      α20β20 > α20β00       0.000
α20β40         0.442      α20β40 > α20β18       0.000

Player 1 ε-equilibrium price
Treatment   Probability   H1                   p-value
α10β00         0.216
α20β00         0.519      α20β00 > α10β00       0.000
α20β18         0.713      α20β18 > α20β40       0.045
α10β20         0.584      α10β20 > α20β00       0.000
α20β20         0.662      α20β20 > α10β20       0.000
α20β40         0.701      α20β40 > α20β20       0.000

Player 2 ε-equilibrium price
Treatment   Probability   H1                   p-value
α10β00         0.232
α20β00         0.573      α20β00 > α10β00       0.000
α20β18         0.870      α20β18 > α10β20       0.008
α10β20         0.857      α10β20 > α20β20       0.000
α20β20         0.834      α20β20 > α20β00       0.000
α20β40         0.925      α20β40 > α20β18       0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(The top panel plots the simulated fraction of maximum welfare over rounds 50 to 450; the bottom panel plots the fraction of maximum welfare in the experimental data over rounds 0 to 60, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − γ, and each of the other 40 prices with probability γ/40. We estimate γ using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). That is, best response to a predicted price explains behavior significantly better than matching.
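The stochastic matching model's predicted distribution over the 41 prices on the grid can be written down directly. In this sketch, `gamma` stands in for the estimated noise parameter (the symbol is ours; the structure, probability 1 − γ on the predicted price and γ/40 on each alternative, is as described above).

```python
import numpy as np

def matching_model_probs(anticipated_price, gamma, n_prices=41):
    """Stochastic matching model for player 2.

    Player 2 matches the anticipated player 1 price with probability
    1 - gamma and trembles uniformly over the other n_prices - 1 prices.
    anticipated_price: index on the price grid (0 .. n_prices - 1).
    """
    probs = np.full(n_prices, gamma / (n_prices - 1))
    probs[anticipated_price] = 1.0 - gamma
    return probs
```

These predicted distributions can be scored against realized play with the same MSD metric used for the learning models, which is how the two hypotheses are compared on the hold-out sample.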

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V, subsection B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p1 for each player 2. We also calculate the opportunity cost of matching p1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 The estimated noise parameter in each of the four blocks is as follows. α10β00: 1.0, 0.7, 0.6, 0.8; α20β00: 0.7, 0.6, 0.5, 0.4; α20β18: 0.7, 0.2, 0.3, 0.3; α10β20: 0.5, 0.3, 0.3, 0.2; α20β20: 0.1, 0.0, 0.0, 0.0; α20β40: 0.2, 0.0, 0.0, 0.1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal-point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to player 2 of matching the anticipated price of player 1.)

Dependent variable: probability that the player 2 price equals the anticipated player 1 price

                           (1)           (2)
D10                       0.014         0.017
                         (0.043)       (0.041)
D00                      −0.077        −0.043
                         (0.035)       (0.036)
D18                       0.046         0.032
                         (0.053)       (0.056)
D40                       0.008         0.041
                         (0.061)       (0.042)
ln(Round)                 0.041         0.028
                         (0.011)       (0.010)
Match Cost                             −0.001
                                       (0.000)
ln(Round) × D10           0.014         0.012
                         (0.012)       (0.012)
ln(Round) × D00           0.003         0.000
                         (0.013)       (0.012)
ln(Round) × D18           0.006         0.002
                         (0.021)       (0.020)
ln(Round) × D40           0.001         0.015
                         (0.018)       (0.016)
Observations              9,588         9,588
Log pseudo-likelihood  −3,243.21     −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Dy is the dummy variable for treatment parameter y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best responding to the anticipated player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors unrelated to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = (er/(4c²))/[β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function for system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = (1 − m)p1²/2 − n p1 − [(1 − m)p0²/2 − n p0].

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, the term (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1) ṗ1 + (∂V/∂p2) ṗ2

   = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)

   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.

Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
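The discrete best-reply system above can be iterated numerically to check convergence to p* = er/(e + c). The parameter values below are illustrative, not taken from the experiment; note that the discrete-time system converges only when |m| < 1 (its eigenvalues have modulus √|m|), whereas the continuous-time system (12) is globally stable for any β ≥ 0.

```python
def cournot_path(beta, c, e, r, rounds=2000, p1=0.0, p2=40.0):
    """Iterate p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n, the Cournot
    best-reply rewriting of equations (4) and (5)."""
    m = (beta - 1.0 / (4.0 * c)) / (beta + e / (4.0 * c ** 2))
    n = (e * r / (4.0 * c ** 2)) / (beta + e / (4.0 * c ** 2))
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n   # simultaneous best replies
    return p1, p2

# With beta = 20, c = 1, e = 2, r = 10: m ≈ 0.963 < 1, so both prices
# approach the equilibrium price p* = e*r/(e + c) = 20/3.
p1, p2 = cournot_path(beta=20.0, c=1.0, e=2.0, r=10.0)
```

The fixed point of the iteration is p1 = p2 = n/(1 − m), which the algebra in the proof shows equals er/(e + c), independently of β.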

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Herve. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Responses." University of Pennsylvania, CARESS Working Paper No. 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.


Page 3: When Does Learning in Games Generate Convergence to Nash ...yanchen.people.si.umich.edu/published/Chen_Gazzale_AER_2004.pdf · When Does Learning in Games Generate Convergence to

players to take higher actions as well Thesegames include supermodular games in whichthe incremental return to any player from in-creasing her strategy is a nondecreasing func-tion of the strategy choices of other players(increasing differences) Furthermore if a play-errsquos strategy space has more than one dimen-sion components of a playerrsquos strategy arecomplements (supermodularity) Membershipin this class of games is easy to check Indeedfor smooth functions in n letting Pi be thestrategy space and i be the payoff function ofplayer i the following theorem characterizesincreasing differences and supermodularity

THEOREM 1 (Topkis 1978) Let i be twicecontinuously differentiable on Pi Then i hasincreasing differences in (pi pj) if and only if2ipihpjl 0 for all i j and all 1 h ki and all 1 l kj and i is supermodular inpi if and only if 2ipihpil 0 for all i and all1 h l ki

Increasing differences means that an increasein the strategy of player irsquos rivals raises hermarginal utility of playing a high strategy Thesupermodularity requirement ensures comple-mentarity among components of a playerrsquosstrategies and is automatically satisfied in aone-dimensional strategy space Note that asupermodular game is a game with strategiccomplementarities but the converse is nottrue

Supermodular games have interesting theo-retical properties In particular they are robustlystable Milgrom and Roberts (1990) prove thatin these games learning algorithms consistentwith adaptive learning converge to the setbounded by the largest and the smallest Nashequilibrium strategy profiles Intuitively a se-quence is consistent with adaptive learning ifplayers ldquoeventually abandon strategies that per-form consistently badly in the sense that thereexists some other strategy that performs strictlyand uniformly better against every combinationof what the competitors have played in thenot-too-distant pastrdquo (Milgrom and Roberts1990) This includes numerous learning dynam-ics such as Bayesian learning fictitious playadaptive learning and Cournot best replyWhile strategic complementarity is sufficientfor convergence it is not a necessary condition

Thus, while games with strategic complementarities ought to converge robustly to the Nash equilibrium, games without strategic complementarities may also converge under specific learning algorithms. Whether these specific learning algorithms are a realistic description of human learning is an empirical question.

While the theory on games with strategic complementarities predicts convergence to equilibrium, it does not address four practical issues. First, as the parameters of a game approach the threshold of strategic complementarities, does play converge gradually or abruptly? Second, is convergence faster further past the threshold? Third, how important is strategic complementarity compared to other features of a game that might also induce convergence to equilibrium? Last, for supermodular games with multiple Nash equilibria, will players learn to coordinate on a particular equilibrium? We choose a game that allows us to answer the first three questions. The fourth question has been addressed by John Van Huyck et al. (1990) and James Cox and Mark Walker (1998).³

Specifically we use the compensation mech-anism to study the role of strategic comple-mentarities in learning and convergence toequilibrium play In the mechanism each of twoplayers offers to compensate the other for theldquocostsrdquo of the efficient choice Assume thatwhen player 1rsquos production equals x her netprofit is rx c(x) where r is the market priceand c is a differentiable positive increasingand convex cost function Production causes anexternality on player 2 whose payoff is e(x)

³ Cox and Walker (1998) study whether subjects can learn to play Cournot duopoly strategies in games with two kinds of interior Nash equilibrium. Their type I duopoly has a stable interior Nash equilibrium under Cournot best-reply dynamics and therefore is dominance solvable (Herve Moulin, 1984). Their type II duopoly has an unstable interior Nash equilibrium and two boundary equilibria under Cournot best-reply dynamics and therefore is not dominance solvable. They found that, after a few periods, subjects did play the stable interior dominance-solvable equilibria, but they played neither the unstable interior equilibria nor the boundary equilibria. It is interesting to note that these duopoly games are submodular games. Being two-player games, they are also supermodular (Rabah Amir, 1996). The results of Cox and Walker (1998) illustrate the importance of uniqueness, together with supermodularity, in inducing convergence.

VOL. 94 NO. 5    CHEN AND GAZZALE: LEARNING IN GAMES    1507

which is also assumed to be differentiable, positive, increasing, and convex. The mechanism is a two-stage game whose unique subgame-perfect Nash equilibrium induces the Pareto-efficient outcome, x*, such that r − e′(x*) = c′(x*). In the first stage (the announcement stage), player 1 announces p₁, a per-unit subsidy to be paid to player 2, while player 2 simultaneously announces p₂, a per-unit tax to be paid by player 1. Announcements are revealed to both players. In the second stage (the production stage), player 1 chooses a production level, x. The payoff to player 1 is π₁ = rx − c(x) − p₂x − α(p₁ − p₂)², while the payoff to player 2 is π₂ = p₁x − e(x), where α ≥ 0 is a free punishment parameter chosen by the designer.

We study a generalized version of the compensation mechanism (Cheng, 1998), which adds a punishment term, −β(p₁ − p₂)², to player 2's payoff function, thus making the payoff functions

(1)  π₁ = rx − c(x) − p₂x − α(p₁ − p₂)²,

     π₂ = p₁x − e(x) − β(p₁ − p₂)².

Using the generalized version, we solve the game by backward induction. In the production stage, player 1 chooses the quantity that solves max_x rx − c(x) − p₂x − α(p₁ − p₂)². The first-order condition, r − c′(x) − p₂ = 0, characterizes the best response in the second stage, x(p₂). In the announcement stage, player 1 solves max_{p₁} rx − c(x) − p₂x − α(p₁ − p₂)². The first-order condition is

(2)  ∂π₁/∂p₁ = −2α(p₁ − p₂) = 0,

which yields the best response function for player 1 as p₁ = p₂. Player 2 simultaneously solves max_{p₂} p₁x(p₂) − e(x(p₂)) − β(p₁ − p₂)². The first-order condition is

(3)  ∂π₂/∂p₂ = [p₁ − e′(x(p₂))]x′(p₂) + 2β(p₁ − p₂) = 0,

which characterizes player 2's best response function.

The unique subgame-perfect equilibrium has p₁* = p₂* = p*, where p* is the Pigovian tax, which induces the efficient quantity x*. As the equilibrium does not depend on the value of β, it holds for the original version, where β = 0. When β is set appropriately, however, the generalized version is a supermodular mechanism. The following proposition characterizes the necessary and sufficient condition for supermodularity.

PROPOSITION 1 (Cheng, 1998): The generalized version of the compensation mechanism is supermodular if and only if α ≥ 0 and β ≥ −(1/2)x′(p₂).

The proof is simple. First, as the strategy space is one-dimensional, the supermodularity condition is automatically satisfied. Second, we use Theorem 1 to check for increasing differences. For player 1, from equation (2), we have ∂²π₁/∂p₁∂p₂ = 2α ≥ 0, while for player 2, from equation (3), we have ∂²π₂/∂p₁∂p₂ = x′(p₂) + 2β. Therefore, ∂²π₂/∂p₁∂p₂ ≥ 0 if and only if β ≥ −(1/2)x′(p₂).
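The sign condition in this proof is easy to check numerically. A minimal sketch in Python, using the quadratic cost and externality functions and the parameter values (c = 1/80, e = 1/40, r = 24) introduced later in the paper: it approximates ∂²π₂/∂p₁∂p₂ by central finite differences and confirms that the cross-partial turns nonnegative exactly at β = −(1/2)x′(p₂) = 1/(4c) = 20.

```python
# Numerical check of the increasing-differences condition for player 2:
# d^2(pi2)/dp1 dp2 = x'(p2) + 2*beta, which is >= 0 iff beta >= 1/(4c).
c, e, r = 1/80, 1/40, 24     # the paper's experimental parameters

def x_of(p2):
    # Player 1's second-stage best response: r - c'(x) - p2 = 0  =>  x = (r - p2)/(2c)
    return (r - p2) / (2 * c)

def pi2(p1, p2, beta):
    x = x_of(p2)
    return p1 * x - e * x**2 - beta * (p1 - p2)**2

def cross_partial(p1, p2, beta, h=1e-4):
    # Central finite-difference approximation of d^2(pi2)/dp1 dp2
    return (pi2(p1 + h, p2 + h, beta) - pi2(p1 + h, p2 - h, beta)
            - pi2(p1 - h, p2 + h, beta) + pi2(p1 - h, p2 - h, beta)) / (4 * h**2)

for beta in (0, 18, 20, 40):
    analytic = -1 / (2 * c) + 2 * beta          # x'(p2) + 2*beta
    assert abs(cross_partial(10.0, 10.0, beta) - analytic) < 1e-3
    print(beta, analytic, "supermodular" if analytic >= 0 else "not supermodular")
```

With these parameters x′(p₂) = −40, so the cross-partial is 2β − 40: negative for β = 0 and β = 18, and nonnegative for β = 20 and β = 40.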

To obtain analytical solutions, we use a quadratic cost function, c(x) = cx², where c > 0, and a quadratic externality function, e(x) = ex², where e > 0. We now summarize the best response functions, equilibrium solutions, and stability analysis in this environment.

PROPOSITION 2: Quadratic cost and externality functions yield the following characterizations:

(i) The best response functions for players 1 and 2 are

(4)  p₁ = p₂,

(5)  p₂ = [(β − 1/(4c)) / (β + e/(4c²))] p₁ + [er/(4c²)] / (β + e/(4c²)),

and

     x* = max{0, (r − p₂)/(2c)}.

(ii) The subgame-perfect Nash equilibrium is characterized as

1508 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(p₁*, p₂*, x*) = (er/(e + c), er/(e + c), r/(2(e + c))).

(iii) If players follow Cournot best-reply dynamics, (p₁*, p₂*) is a globally asymptotically stable equilibrium of the continuous-time dynamical system for any α > 0 and β ≥ 0.

(iv) The game is supermodular if and only if α ≥ 0 and β ≥ 1/(4c).

PROOF:
See Appendix.
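With the quadratic specification, the closed-form solutions in Proposition 2 can be verified directly. A sketch using the experimental parameters introduced in Section II (c = 1/80, e = 1/40, r = 24): the equilibrium should come out to prices of 16 and a quantity of 320, and should be a fixed point of both best-response functions for any β > 0.

```python
# Check Proposition 2 (ii) against the experimental parameters of Section II.
c, e, r = 1/80, 1/40, 24

p_star = e * r / (e + c)        # equilibrium prices: p1* = p2* = er/(e+c)
x_star = r / (2 * (e + c))      # efficient quantity: x* = r/(2(e+c))
print(round(p_star, 6), round(x_star, 6))

def br1(p2):                    # equation (4): player 1 matches player 2's price
    return p2

def br2(p1, beta):              # equation (5): player 2's best response
    return ((beta - 1/(4*c)) * p1 + e*r/(4*c**2)) / (beta + e/(4*c**2))

# The equilibrium is a fixed point of both best responses, regardless of beta.
for beta in (18, 20, 40):
    assert abs(br1(p_star) - p_star) < 1e-9
    assert abs(br2(p_star, beta) - p_star) < 1e-9
```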

The best-response functions presented in part (i) of Proposition 2 reveal interesting incentives. While player 1 has an incentive to always match player 2's price, player 2 has an incentive to match only when player 1 plays the equilibrium strategy. Also, at the threshold for strategic complementarity, β = 1/(4c), player 2 has a dominant strategy, p₂ = er/(e + c) = p₂*. Part (iii) of Proposition 2 extends Cheng (1998), who shows that the original version of the compensation mechanism (β = 0) is globally stable under continuous-time Cournot best-reply dynamics.⁴ Cournot best reply is, however, a relatively poor description of human learning (Richard Boylan and Mahmoud El-Gamal, 1993). Therefore, global stability under Cournot best reply for any β ≥ 0 does not imply equilibrium convergence among human subjects. Part (iv) characterizes the threshold for strategic complementarity in our experiment, a more robust stability criterion than that characterized by part (iii).

An intuition for how strategic complementarities affect the outcome of adaptive learning can be gained from analysis of the best response functions, equations (4) and (5). While player 1's best response function is always upward sloping, player 2's best response function, equation (5), is nondecreasing if and only if β ≥ 1/(4c), i.e., when the game is supermodular. Beyond the threshold for strategic complementarity, both best-response functions are upward sloping, and they intersect at the equilibrium. It is easy to verify graphically that adaptive learners, e.g., Cournot best reply, will converge to the equilibrium regardless of where they start.
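The graphical argument can be mimicked in a few lines of code: start both players at arbitrary announcements and iterate simultaneous Cournot best replies. A sketch under the experimental parameters; the starting point (40, 0) is an arbitrary choice.

```python
# Simultaneous Cournot best-reply dynamics in the announcement stage.
# Player 1's reply (eq. 4) is p1 = p2 and does not depend on alpha;
# player 2's reply is equation (5).
c, e, r = 1/80, 1/40, 24

def br2(p1, beta):
    return ((beta - 1/(4*c)) * p1 + e*r/(4*c**2)) / (beta + e/(4*c**2))

for beta in (0, 18, 20, 40):
    p1, p2 = 40.0, 0.0                  # arbitrary starting announcements
    for t in range(500):
        p1, p2 = p2, br2(p1, beta)      # both players revise simultaneously
    print(beta, round(p1, 3), round(p2, 3))
```

Consistent with part (iii) of Proposition 2, the iteration converges to p₁ = p₂ = 16 here even for β = 0; what supermodularity (β ≥ 20) adds is convergence for the much broader class of adaptive learning dynamics.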

To examine how α, which is unrelated to strategic complementarity, might affect behavior, we observe that when player 1 deviates from the best response by δ, i.e., p₁ = p₂ + δ, his profit loss is Δπ₁ = αδ². This profit loss, which is proportional to the magnitude of α, is the deviation cost for player 1. Based on previous experimental evidence, the incentive to deviate from best response decreases when the deviation cost increases. Chen and Plott (1996) call this the General Incentive Hypothesis: the error of game-theoretic models decreases as the incentive to best-respond increases. We therefore expect that an increase in α improves player 1's convergence to equilibrium. When player 1 plays the equilibrium strategy, player 2's best response is to play the equilibrium strategy as well. Therefore, we expect that an increase in α might improve player 2's convergence to equilibrium as well. It is not clear, however, whether the α-effects systematically change the effects of the supermodularity parameter. We rely on experimental data to test the interaction of the α-effects and the β-effects.
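The deviation cost is immediate from equation (1): holding p₂ and the production choice fixed, moving p₁ away from p₂ changes π₁ only through the punishment term. A small check in Python, using the experimental parameters:

```python
# Player 1's cost of deviating from the announcement-stage best response.
c, e, r = 1/80, 1/40, 24

def pi1(p1, p2, x, alpha):
    return r * x - c * x**2 - p2 * x - alpha * (p1 - p2)**2

alpha, p2 = 20, 16
x = (r - p2) / (2 * c)          # player 1's optimal quantity given p2

for delta in (0.5, 1, 2):
    loss = pi1(p2, p2, x, alpha) - pi1(p2 + delta, p2, x, alpha)
    assert abs(loss - alpha * delta**2) < 1e-9   # loss equals alpha * delta^2
    print(delta, loss)
```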

II. Experimental Design

Our experimental design reflects both theoretical and technical considerations. Specifically, we chose an environment that allows significant latitude in varying the free parameters to better assess the performance of the compensation mechanism around the threshold of strategic complementarity. We describe this environment and the experimental procedures below.

A. The Economic Environment

We use the general payoff functions presented in equation (1), with quadratic cost and externality functions to obtain analytical solutions: c(x) = cx² and e(x) = ex². We use the following parameters: c = 1/80, e = 1/40, and r = 24. From Proposition 2, the subgame-perfect Nash equilibrium is (p₁*, p₂*, x*) = (16, 16, 320), and the threshold for strategic complementarity is β = 1/(4c) = 20.

⁴ Cheng (1998) also shows that the original mechanism is locally stable under discrete-time Cournot best-reply dynamics.


In the experiment, each player chooses pᵢ ∈ {0, 1, ..., 40}. Without the compensation mechanism, the profit-maximizing production level is x = r/(2c) = 960, three times higher than the efficient level. To reduce the complexity of player 1's problem, we use a grid size of 10 for the quantity and truncate the strategy space; i.e., player 1 chooses X ∈ {0, 1, ..., 50}, where X = x/10. The payoff functions presented to the subjects are adjusted accordingly. Truncating the strategy space to X ≤ 50 also reduces the possibility of player 2's bankruptcy. To reduce payoff asymmetry, we give player 1 a lump-sum payment of 250 points each round. Therefore, the equilibrium payoffs under the mechanism for the two players are π₁ = 1530 and π₂ = 2560.

The functional forms and specific parameter values are chosen for several reasons. First, equilibrium solutions are integers. Second, equilibrium prices and quantities do not lie in the center of the strategy space, thus avoiding equilibrium convergence as a result of focal points. Third, there is a salient gap between efficiency with and without the mechanism. Without the mechanism, the profit-maximizing production level is X = 50, resulting in an efficiency level of 68.4 percent.⁵ With the mechanism, the system achieves 100-percent efficiency in equilibrium. Finally, since the threshold for strategic complementarity is β = 20, there is a large number of integer values to choose from both below and above the threshold.
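The 68.4-percent figure can be reproduced from the computation in footnote 5: total earnings without the mechanism at the truncated output x = 500, divided by total equilibrium earnings with the mechanism (excluding player 1's lump-sum payment). A sketch:

```python
c, e, r = 1/80, 1/40, 24

# Without the mechanism: player 1 produces at the truncated maximum, x = 500.
x_no = 500
total_no = (r * x_no - c * x_no**2) - e * x_no**2       # joint earnings: 2625

# With the mechanism: equilibrium x* = 320 and p1* = p2* = 16.
x_eq, p = 320, 16
pi1 = r * x_eq - c * x_eq**2 - p * x_eq                 # 1280, before the lump sum
pi2 = p * x_eq - e * x_eq**2                            # 2560

efficiency = total_no / (pi1 + pi2)
print(round(efficiency, 3))                             # 0.684
```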

To study how strategic complementarity affects equilibrium convergence, we keep α = 20 and vary β = 0, 18, 20, and 40. To study whether α affects convergence, we also keep α = 10 and vary β = 0, 20.

B. Experimental Procedures

Our experiment involves 12 players per session: six player 1s (called Red players in the instructions) and six player 2s (Blue players). Each player remains the same type throughout the experiment. At the beginning of each session, subjects randomly draw a PC terminal number. Each then sits in front of the corresponding terminal and is given printed instructions. After the instructions are read aloud, subjects are encouraged to ask questions. The instruction period lasts between 15 and 30 minutes.

Each round, a player 1 is randomly matched with a player 2. Subjects are randomly rematched each round to minimize repeated-game effects. The random rematching protocol also minimizes the possibility that players collude on a high-subsidy and low-tax outcome.⁶ Each session consists of 60 rounds. As we are interested in learning, there are no practice rounds. Each round consists of two stages:

(i) Announcement stage: Each player simultaneously and independently chooses a price, pᵢ ∈ {0, 1, ..., 40}.

(ii) Production stage: After (p₁, p₂) are chosen, player 1's computer displays player 2's price and a payoff table showing her payoff for each X ∈ {0, 1, ..., 50}. Player 1 then chooses a quantity, X. The server calculates payoffs and sends each player his payoff, the quantity chosen, and the prices submitted by him and his match.

To summarize each subject knows both pay-off functions the choices made each round byhimself and his match as well as his per-periodand cumulative payoffs At any point subjectshave ready access to all of this information Themechanism is thus implemented as a game ofcomplete information We do not know how-ever how subjects processed this informationnor do we know their beliefs about the rational-

⁵ We compute the efficiency level by using the ratio of total earnings without the mechanism to total earnings with the mechanism. Without the mechanism, at the production level of X = 50 (or x = 500), total earnings are π₁ + π₂ = (rx − cx²) − ex² = 2625. Therefore, the efficiency level is 2625/(1280 + 2560) = 0.684.

⁶ If the players could commit to maximizing joint profits by choosing p₁ and p₂ cooperatively and x noncooperatively, they would choose, by equation (1), (p₁, p₂, x) = (4(α + β)er/[4(α + β)(e + c) − 1], r[4e(α + β) − 1]/[4(α + β)(e + c) − 1], 2r(α + β)/[4(α + β)(e + c) − 1]). With our choice of parameters, we get p₁ = 24 and p₂ = 12 for α = 20 and β = 0; p₁ = 19.2 and p₂ = 14.4 for α = 20 and β = 20; and p₁ = 18 and p₂ = 15 for α = 20 and β = 40; etc. If, in addition, they could choose (p₁, p₂, x) cooperatively, then for our parameter values we get p₁ − p₂ = min{480/[3(α + β) − 20], 40} and x = min{960(α + β)/[3(α + β) − 20], 500}.


ity of others. Both of these factors introduce uncertainty into the environment.

Table 1 presents features of the experimental sessions, including parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used zTree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one and a half hours. The exchange rate for all treatments was one dollar for 4,250 points. The average earnings were $22.82. Data are available from the authors upon request.

III. Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: the level and the speed.⁷ In theory, convergence implies that all players play the stage-game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, Lb(t₁, t₂), measures the average proportion of Nash equilibrium play between rounds t₁ and t₂, i.e., Lb(t₁, t₂) = [Σ from t = t₁ to t₂ of L(t)] / (t₂ − t₁ + 1), where 0 < t₁ ≤ t₂ ≤ T, and T is the total number of rounds.

We define the level of convergence both for a round and for a block of rounds. The block convergence measure smooths out inter-round variation.

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition for the speed of convergence.

DEFINITION 2: For a given level of convergence, L̄ ∈ (0, 1], the speed of convergence is measured by the first round, t̄, in which the level of convergence reaches L̄ and does not subsequently drop below this level, i.e., t̄ such that L(t) ≥ L̄ for any t ≥ t̄.
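Definitions 1 and 2 translate directly into code. A sketch, with a made-up sequence of per-round convergence levels standing in for data (the series L below is hypothetical):

```python
# Convergence measures from Definitions 1 and 2.
def block_level(L, t1, t2):
    # Lb(t1, t2): average proportion of equilibrium play in rounds t1..t2 (1-indexed).
    return sum(L[t1 - 1:t2]) / (t2 - t1 + 1)

def convergence_speed(L, L_bar):
    # First round t_bar such that L(t) >= L_bar for every t >= t_bar; None if never.
    t_bar = None
    for t, level in enumerate(L, start=1):
        if level >= L_bar:
            if t_bar is None:
                t_bar = t
        else:
            t_bar = None    # a later dip resets the candidate round
    return t_bar

# Hypothetical per-round convergence levels for a 10-round example.
L = [0.1, 0.2, 0.5, 0.4, 0.6, 0.7, 0.7, 0.8, 0.9, 0.9]
print(round(block_level(L, 6, 10), 3))   # average over the last five rounds
print(convergence_speed(L, 0.5))         # round 5: L dips below 0.5 in round 4
```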

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in the experimental data, Definition 2 is not practical there. We instead use the following observation, which enables us to measure convergence speed using estimates of the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let Lᵧ(t) be the convergence level of treatment y at

⁷ We thank anonymous referees for suggesting this separation and the appropriate measures.

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

Treatment    α     β     Sessions     Supermodular     Equilibrium (p₁*, p₂*, X*)
α10β00       10    0     4            No               (16, 16, 32)
α20β00       20    0     5            No               (16, 16, 32)
α20β18       20    18    4            No               (16, 16, 32)
α10β20       10    20    5            Yes              (16, 16, 32)
α20β20       20    20    5            Yes              (16, 16, 32)
α20β40       20    40    4            Yes              (16, 16, 32)


time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.

OBSERVATION 1: If L₁(1) = L₂(1) and ΔL₁(t) ≥ ΔL₂(t) for all t ∈ [1, T − 1], then, given any L̄ ∈ (0, 1], the first treatment converges more quickly than the second treatment.

Based on the theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with that of games near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or β = 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis. Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV. Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and the speed of convergence to subgame-perfect equilibrium and near-equilibrium in each of the six treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p₂ and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that, in all of our analyses, the results for quantities largely mirror the results for player 2's price.⁸

Recall that the subgame-perfect Nash equilibrium of the stage game is (p₁*, p₂*, X*) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check for ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price and quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p₂ results in a four-unit change in the best-response quantity. The ε-equilibrium prediction is therefore (ε-p₁, ε-p₂, ε-X) = ({15, 16, 17}, {15, 16, 17}, {28, 32, 36}).

⁸ Tables are available from the authors upon request.

Figures 1 and 2 contain box-and-whiskers plots for the prices of each treatment for all 60 rounds, for players 1 and 2, respectively. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence (Lb(41, 60)) to subgame-perfect Nash equilibrium (panels A and B) and to ε-Nash equilibrium (panels C and D) for each session under each of the six treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests⁹ under the null hypothesis that the convergence levels in the two treatments being compared are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects); parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly¹⁰ increases the level of convergence in p₁, p₂, and ε-p₂.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p₂ and significantly increases the level of convergence in ε-p₂, but has no significant effects on p₁ or ε-p₁.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p₁, but not in ε-p₁, p₂, or ε-p₂.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result to show that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20), the

⁹ The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as large as the one observed when the pooled observations are randomly divided into two groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.
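The test in this footnote can be implemented exactly by enumerating every split of the pooled session-level observations into groups of the original sizes. A sketch, applied to the player 2 price data of Table 2, Panel B, for the α20β00 versus α20β18 comparison; because the two samples are completely separated, only the original split attains the observed difference, giving p = 1/126 ≈ 0.008, the value reported in the table.

```python
from itertools import combinations

def permutation_test(sample_a, sample_b):
    # Exact one-tailed test of H1: mean(sample_b) > mean(sample_a).
    pooled = sample_a + sample_b
    n_a = len(sample_a)
    observed = sum(sample_b) / len(sample_b) - sum(sample_a) / n_a
    count = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = sum(group_b) / len(group_b) - sum(group_a) / n_a
        total += 1
        if diff >= observed:
            count += 1
    return count / total    # exact p-value

# Session-level convergence levels for p2 (Table 2, Panel B):
a20b00 = [0.067, 0.083, 0.033, 0.075, 0.050]    # alpha = 20, beta = 0
a20b18 = [0.133, 0.292, 0.108, 0.258]           # alpha = 20, beta = 18
print(round(permutation_test(a20b00, a20b18), 3))   # 0.008
```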

¹⁰ When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.


improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold, 20, to 40. Therefore, the marginal returns for being

[Figure 1 here: six box-and-whiskers panels, one per treatment (α10β00, α20β00, α20β18, α10β20, α20β20, α20β40), each plotting player 1's announced price (0 to 40) by round (1 to 60).]

FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1


"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best-reply dynamics for any β ≥ 0. However, there is a significant difference in convergence level as β increases from 0 to 18, to 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and the subsequent discussion that at

[Figure 2 here: six box-and-whiskers panels, one per treatment (α10β00, α20β00, α20β18, α10β20, α20β20, α20β40), each plotting player 2's announced price (0 to 40) by round (1 to 60).]

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2


the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p₁ in specifications (1) and (3) and ε-p₂ in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies, Dᵧ, where y ∈ {α10β00, α20β00, β18, α10β20, β40}, ln(Round), and a constant. We use dummies for different values of α and β, rather than the direct parameter values, to avoid assuming a linear relationship of parameter effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dᵧ captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p₁), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall    H1                 p-value
α10β00      0.000       0.008       0.158       0.008                   0.044      α10β00 < α20β00    0.397
α20β00      0.000       0.083       0.083       0.042       0.058       0.053      α20β00 < α20β20    0.008
α20β18      0.125       0.117       0.100       0.192                   0.133      α20β00 < α20β18    0.008
α10β20      0.242       0.067       0.067       0.067       0.208       0.130      α10β20 < α20β20    0.064
α20β20      0.325       0.267       0.208       0.058       0.367       0.245      α20β18 < α20β20    0.064
α20β40      0.175       0.400       0.158       0.133                   0.217      α20β20 < α20β40    0.643

Panel B. Proportion of player 2 Nash equilibrium price (p₂), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall    H1                 p-value
α10β00      0.025       0.042       0.025       0.067                   0.040      α10β00 < α20β00    0.056
α20β00      0.067       0.083       0.033       0.075       0.050       0.062      α20β00 < α20β20    0.032
α20β18      0.133       0.292       0.108       0.258                   0.198      α20β00 < α20β18    0.008
α10β20      0.392       0.108       0.175       0.067       0.667       0.282      α10β20 < α20β20    0.504
α20β20      0.308       0.117       0.483       0.017       0.450       0.275      α20β18 < α20β20    0.238
α20β40      0.325       0.492       0.233       0.233                   0.321      α20β20 < α20β40    0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p₁), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall    H1                 p-value
α10β00      0.108       0.200       0.350       0.158                   0.204      α10β00 < α20β00    0.183
α20β00      0.067       0.458       0.358       0.183       0.425       0.298      α20β00 < α20β20    0.087
α20β18      0.317       0.625       0.300       0.583                   0.456      α20β00 < α20β18    0.103
α10β20      0.475       0.200       0.300       0.375       0.492       0.368      α10β20 < α20β20    0.194
α20β20      0.450       0.558       0.475       0.133       0.775       0.478      α20β18 < α20β20    0.444
α20β40      0.458       0.492       0.375       0.442                   0.442      α20β20 < α20β40    0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p₂), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall    H1                 p-value
α10β00      0.133       0.200       0.242       0.225                   0.200      α10β00 < α20β00    0.016
α20β00      0.300       0.400       0.200       0.275       0.333       0.302      α20β00 < α20β20    0.016
α20β18      0.717       0.667       0.342       0.700                   0.606      α20β00 < α20β18    0.016
α10β20      0.650       0.517       0.542       0.400       0.892       0.600      α10β20 < α20β20    0.564
α20β20      0.533       0.592       0.642       0.225       0.900       0.578      α20β18 < α20β20    0.556
α20β40      0.825       0.792       0.467       0.517                   0.650      α20β20 < α20β40    0.294

Note: * Significant at the 10-percent level; ** significant at the 5-percent level; *** significant at the 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients on ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form, ln(Round), which yields a better log-likelihood than either the linear or the quadratic functional form, indicates that learning is rapid at the beginning and slows over time. We examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each treatment dummy with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the coefficient of the interaction term Dᵧ × ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,¹¹ we can compare the slope of L(t), ΔL(t), between treatments. From the probit specification, L(t) = Φ(constant + Db ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

¹¹ When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment-dummy coefficients are all zero yield p-values of 0.2703 for ε-p₁ and 0.3383 for ε-p₂.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                              (1) ε-Nash    (2) ε-Nash    (3) ε-Nash    (4) ε-Nash
                              price 1       price 2       price 1       price 2
D_α10β00                      -0.143        -0.268
                              (0.048)       (0.050)
D_α20β00                      -0.090        -0.217
                              (0.060)       (0.057)
D_β18                         0.005         0.004
                              (0.071)       (0.069)
D_α10β20                      -0.080        0.025
                              (0.060)       (0.073)
D_β40                         0.000         0.078
                              (0.068)       (0.078)
ln(Round)                     0.115         0.167         0.131         0.183
                              (0.014)       (0.017)       (0.021)       (0.022)
D_α10β00 × ln(Round)                                      -0.050        -0.095
                                                          (0.019)       (0.022)
D_α20β00 × ln(Round)                                      -0.031        -0.073
                                                          (0.020)       (0.021)
D_β18 × ln(Round)                                         0.001         0.004
                                                          (0.020)       (0.021)
D_α10β20 × ln(Round)                                      -0.024        0.008
                                                          (0.019)       (0.022)
D_β40 × ln(Round)                                         0.002         0.023
                                                          (0.019)       (0.024)
Observations                  9,720         9,720         9,720         9,720
Number of groups              162           162           162           162
Log pseudo-likelihood         -5,562.890    -5,824.055    -5,557.424    -5,793.013

H0: D_α10β00 = D_α20β00                                            1.50          1.31
H0: D_α10β00 × ln(Round) = D_α20β00 × ln(Round)                    1.33          1.23
H0: D_α10β20 = D_α10β00 + D_α20β00                                 0.05          1.07
H0: D_α10β20 × ln(Round) = D_α10β00 × ln(Round) + D_α20β00 × ln(Round)   0.05    1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at the 10-percent level; ** significant at the 5-percent level; *** significant at the 1-percent level. Dᵧ is the dummy variable for treatment y; the excluded dummy is D_α20β20. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.

1517 VOL. 94, NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES

the slope of the probability function for treatment y with respect to t: ∂Ly(t)/∂t = φ(constant + Db)·by/t. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + Db)·by, compared to the baseline case of α20β20.
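The marginal effect implied by this probit specification, φ(constant + by ln t)·by/t, can be illustrated numerically. The coefficients below are hypothetical, chosen only to show the shrinking slope over rounds; this is a sketch, not the paper's estimation.

```python
import math

def norm_pdf(x: float) -> float:
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def probit_slope(constant: float, b_y: float, t: int) -> float:
    """d/dt Phi(constant + b_y * ln t) = phi(constant + b_y * ln t) * b_y / t:
    the marginal effect of round t on the probability of equilibrium play."""
    return norm_pdf(constant + b_y * math.log(t)) * b_y / t

# Hypothetical coefficients: the slope is large early and shrinks over time,
# matching the rapid-early / slow-late learning pattern described in the text.
early = probit_slope(-0.5, 0.15, 2)
late = probit_slope(-0.5, 0.15, 50)
```

The 1/t factor is what makes the concave ln(Round) specification imply fast initial learning that tapers off.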

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test looks at whether the coefficient of Dα10β00 · ln(Round) equals that of Dα20β00 · ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between β18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution for each treatment.13 As the treatments all start from and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α20 supermodular treatments. We use Round, ln(Round), and their interactions with Dα20β18 as independent variables, allowing different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the α20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and α20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on Dα20β18 · ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available upon request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing α20β18 with the α20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash price 1    (2) ε-Nash price 2
ln(Round)                   0.054 (0.040)         0.216 (0.043)
Round                       0.004 (0.002)         0.001 (0.002)
Dα20β18 · ln(Round)         0.013 (0.032)        −0.066 (0.035)
Dα20β18 · Round             0.001 (0.002)         0.006 (0.003)
Observations                    4,680                 4,680
Log pseudo-likelihood        −2,879.07             −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. *Significant at the 10-percent level; **significant at the 5-percent level; ***significant at the 1-percent level. Dy is the dummy variable for treatment y; the excluded dummies are Dα20β20 and Dα20β40.


α20β18 relative to the supermodular treatments. The β18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on Dα20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds, and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters, α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in

equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture the welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r(x(t)) − c(x(t)) − e(x(t))] / [r(x*) − c(x*) − e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 < t2 ≤ 60, is then defined as

e(t1, t2) = Σ_{t=t1}^{t2} e(t) / (t2 − t1 + 1).
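The two efficiency measures defined above transcribe directly into code. The quadratic revenue, cost, and externality functions below are hypothetical stand-ins for illustration, not the experiment's actual parameters.

```python
def round_efficiency(x_t, x_star, r, c, e):
    """Per-round efficiency e(t): realized net surplus over the surplus at
    the efficient quantity x*. r, c, e are the revenue, cost, and
    externality functions; tax/subsidy and penalty terms are excluded."""
    return (r(x_t) - c(x_t) - e(x_t)) / (r(x_star) - c(x_star) - e(x_star))

def block_efficiency(quantities, x_star, r, c, e):
    """Average efficiency over a block of rounds, e(t1, t2)."""
    es = [round_efficiency(x, x_star, r, c, e) for x in quantities]
    return sum(es) / len(es)

# Hypothetical functional forms: net surplus 8x - x^2 is maximized at x* = 4.
r = lambda x: 10 * x   # revenue
c = lambda x: x ** 2   # private cost
e = lambda x: 2 * x    # externality
```

At the efficient quantity the measure equals 1; any other quantity yields a value below 1.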

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency, last 20 rounds

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1                p-value
α10β00       0.600      0.671      0.804      0.858                0.733   α10β00 < α20β00    0.087
α20β00       0.650      0.870      0.907      0.873      0.825     0.825   α20β00 < α20β20    0.095
α20β18       0.963      0.952      0.889      0.959                0.941   α20β00 < α20β18    0.016
α10β20       0.958      0.919      0.905      0.881      0.984     0.929   α10β20 < α20β20    0.714
α20β20       0.888      0.937      0.939      0.780      0.971     0.903   α20β18 < α20β20    0.833
α20β40       0.973      0.962      0.943      0.949                0.957   α20β20 < α20β40    0.064

Note: *Significant at the 10-percent level; **significant at the 5-percent level; ***significant at the 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold, 20, to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
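The round-t budget surplus formula transcribes in a few lines (the function name is ours). In equilibrium, where p1 = p2, the penalty vanishes and the tax exactly offsets the subsidy.

```python
def budget_surplus(p1, p2, x, alpha, beta):
    """Round-t budget surplus of the compensation mechanism:
    s(t) = (alpha + beta) * (p1 - p2)**2 + p2 * x - p1 * x,
    i.e., the penalty terms plus tax minus subsidy. The surplus is
    exactly zero whenever the announced prices agree (p1 == p2)."""
    return (alpha + beta) * (p1 - p2) ** 2 + p2 * x - p1 * x
```

With α = β = 20, a two-unit price disagreement at quantity 3 already produces a sizable surplus, which is one way the out-of-equilibrium surpluses observed in 22 of 27 sessions can arise.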

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but it has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data, by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter shortened to sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃_{−i}(t + 1), according to

(6)  p̃_{−i}(t + 1) = [p_{−i}(t) + Σ_{u=1}^{t−1} r^u p_{−i}(t − u)] / [1 + Σ_{u=1}^{t−1} r^u]

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃_{−i}(t + 1) = p_{−i}(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization, and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by

(7)  Pr(p_i | p̃_{−i}) = exp(λπ(p_i, p̃_{−i})) / Σ_{j=0}^{40} exp(λπ(p_j, p̃_{−i})).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter: as λ increases, the probability of a best response to p̃_{−i} increases.
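The logit response in equation (7) can be sketched in a numerically stable form; the payoff vector below is hypothetical, standing in for the payoffs of each price against the forecast.

```python
import math

def logit_choice_probs(payoffs, lam):
    """Logit response, equation (7): Pr(p_i) is proportional to
    exp(lam * payoff of p_i against the forecast price). Higher lam
    concentrates probability on better replies."""
    # Subtract the max payoff before exponentiating for numerical stability.
    m = max(payoffs)
    weights = [math.exp(lam * (u - m)) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical payoffs against a forecast price, for prices 0..3:
payoffs = [1.0, 2.0, 4.0, 3.0]
low = logit_choice_probs(payoffs, 0.1)    # near-uniform play
high = logit_choice_probs(payoffs, 10.0)  # near-best response
```

As λ grows, probability mass collapses onto the best reply, which is exactly the sensitivity interpretation in the text.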

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs, π(p_i, p̃_{−i}), replaced by strategy attractions. In both variants, a strategy attraction, A_i^j(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)  A_i^j(t) = {φN(t − 1)A_i^j(t − 1) + [δ + (1 − δ)I(p_i^j, p_i(t))]π_i(p_i^j, p_{−i}(t))} / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function, φ_i(t), is given by

(9)  φ_i(t) = 1 − (1/2) Σ_{j=1}^{m_{−i}} [I(p_{−i}^j, p_{−i}(t)) − (1/t) Σ_{u=1}^{t} I(p_{−i}^j, p_{−i}(u))]².

This function will be close to 1 when recent history resembles previous history. The imagination weight, δ_i(t), equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0, κ = 1).
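A stylized sketch of the two fEWA ingredients (attraction updating in the spirit of equation (8), and the change detector in the spirit of equation (9)) is below. The helper names and the toy payoff function are ours, and the experience-weight decay is simplified relative to the full model.

```python
def update_attractions(A, N_prev, phi, kappa, delta, played, opp_price, payoff_fn):
    """One EWA-style attraction update. A: attractions A_i^j; played: index
    of the price actually chosen; payoff_fn(j, opp_price): the payoff price
    j would have earned this round against the opponent's price."""
    N = phi * (1 - kappa) * N_prev + 1  # experience-weight update
    new_A = []
    for j in range(len(A)):
        hit = 1.0 if j == played else 0.0
        # The chosen strategy is reinforced by its realized payoff; unchosen
        # strategies by delta (the imagination weight) times foregone payoff.
        reinforce = (delta + (1 - delta) * hit) * payoff_fn(j, opp_price)
        new_A.append((phi * N_prev * A[j] + reinforce) / N)
    return new_A, N

def change_detector(opp_history, num_strategies):
    """Surprise index: close to 1 when the latest opponent choice matches
    its historical frequency, and smaller after a sudden change of play."""
    t = len(opp_history)
    latest = opp_history[-1]
    total = 0.0
    for j in range(num_strategies):
        freq = sum(1 for p in opp_history if p == j) / t
        hit = 1.0 if j == latest else 0.0
        total += (hit - freq) ** 2
    return 1 - 0.5 * total
```

A stable history yields a detector value of 1 (no surprise), while a deviation pulls it down, which in fEWA translates into faster discounting of old attractions.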

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, f_jk(h, t), to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_jk(h, t) = 1 − |j − k|/h  if |j − k| < h;  0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_jk(1, t) degenerates into an indicator function, equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs


and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of the player's previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)  u_j(t + 1) = [1 − rf_jk(h, t)]u_j(t) + rf_jk(h, t)π_k(t),  ∀j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11)  ũ_k(t) − ũ_j(t) ≥ 0,  ∀ p_j ≠ p_k.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
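The Bartlett similarity function and the updates in equations (10)-(11) transcribe as follows (function names are ours; drawing the mood shocks is left to the caller).

```python
def bartlett(j, k, h):
    """Bartlett similarity between played strategy k and strategy j:
    1 - |j - k|/h inside the window, zero outside it."""
    return max(0.0, 1.0 - abs(j - k) / h)

def pa_update(u, played, realized_payoff, r, h):
    """Payoff-assessment update, equation (10): each strategy j moves
    toward the realized payoff with weight r * f_jk(h, t)."""
    return [(1 - r * bartlett(j, played, h)) * u[j]
            + r * bartlett(j, played, h) * realized_payoff
            for j in range(len(u))]

def pa_choose(u, shocks):
    """Choice rule, equation (11): maximize shock-distorted assessments."""
    distorted = [uj + zj for uj, zj in zip(u, shocks)]
    return max(range(len(u)), key=lambda j: distorted[j])
```

With h = 2, a payoff earned by the played strategy spills over (at half weight) to its immediate neighbors, which is exactly the strategy-similarity mechanism described above; zero shocks reduce the choice rule to plain maximization.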

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find the parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.
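One common convention for an MSD score (squared deviations between simulated choice frequencies and observed choices, averaged over rounds) is sketched below; we are not claiming this is the paper's exact normalization, only the general shape of the criterion being minimized.

```python
def msd(sim_freqs, actual_choices, num_strategies):
    """Mean-squared deviation between simulated choice frequencies and
    actual play. sim_freqs[t][j] is the simulated probability of strategy j
    in round t; actual_choices[t] is the strategy actually played."""
    total, count = 0.0, 0
    for t, choice in enumerate(actual_choices):
        for j in range(num_strategies):
            hit = 1.0 if j == choice else 0.0
            total += (sim_freqs[t][j] - hit) ** 2
        count += 1
    return total / count
```

A perfect prediction scores 0; a uniform prediction over m strategies scores (m − 1)/m per round, which is why random play provides a natural upper benchmark.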

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements:

(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and a normal distribution (d = 0.381, p-value = 0.000). We thus follow convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and via equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:

(a) Determine each player 1's best response to p2.

(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.

(c) If yes, deviate via the actual mean and standard deviation for the block; otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.
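The loop in steps (i)-(vi) can be sketched as below. This is a heavily simplified skeleton under our own assumptions (a small fixed population, a generic choose/update interface, and the quantity/payoff stage collapsed into a single payoff function), not the paper's actual simulation code.

```python
import random

def simulate_game(num_rounds, first_round_prices, choose, update, payoff,
                  num_players=4, rng=None):
    """Skeleton of the simulation loop. choose(state, player) returns a
    price; update(state, player, outcome) updates that player's learning
    state; payoff(p_own, p_other) is the stage payoff; first_round_prices
    is an empirical distribution for round-1 draws."""
    rng = rng or random.Random(0)
    state = [{} for _ in range(num_players)]  # per-player learning state
    history = []
    for t in range(num_rounds):
        players = list(range(num_players))
        rng.shuffle(players)                  # (i) random matching
        for a, b in zip(players[::2], players[1::2]):
            if t == 0:                        # (ii)(a) empirical round-1 draw
                pa = rng.choice(first_round_prices)
                pb = rng.choice(first_round_prices)
            else:                             # (ii)(b) model-based choice
                pa, pb = choose(state, a), choose(state, b)
            # (iii)-(iv): quantity stage and payoffs collapsed into payoff()
            ua, ub = payoff(pa, pb), payoff(pb, pa)
            update(state, a, (pa, pb, ua))    # (v) update assessments
            update(state, b, (pb, pa, ub))
            history.append((t, a, b, pa, pb))
    return history                            # (vi) loop over rounds
```

Any of the three learning models plugs in through the choose/update pair, which is what lets the same harness calibrate sFP, fEWA, and PA.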

The discount factor, r ∈ [0, 1], is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ {1, ..., 10}, is searched at a grid size of 1. Mood shocks, z, are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, in which shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA was statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Model:             sFP                  fEWA                 PA
Block:       1    2    3    4     1    2    3    4     1    2    3    4

            Discount rate (r)                          Discount rate (r)
α10β00     1.0  0.7  0.9  1.0                         0.1  0.0  0.9  1.0
α20β00     0.7  0.6  0.1  0.1                         0.5  1.0  0.2  0.1
α20β18     1.0  1.0  0.9  0.9                         0.2  1.0  0.3  0.2
α10β20     1.0  1.0  1.0  0.9                         0.1  0.3  0.6  0.3
α20β20     1.0  1.0  1.0  0.9                         0.2  0.1  0.5  0.2
α20β40     1.0  1.0  1.0  0.7                         0.1  1.0  0.4  0.3

            Sensitivity (λ)       Sensitivity (λ)      Shock interval (a)
α10β00     1.5  4.4  4.5  2.8    1.6  4.5  4.4  3.7   500  500  250   50
α20β00     1.5  4.3 14.0 18.0    1.7  4.0  5.1  7.4    50  100  150   50
α20β18     1.6  8.0 11.5 18.3    1.5  5.3  6.2  9.5    50  300  500  150
α10β20     1.9  5.9  8.6 13.8    1.8  4.7  6.2  8.2   100  500  450  300
α20β20     2.8  5.6  8.1 11.2    2.4  3.8  6.1  6.7   200  300  350  350
α20β40     3.4  6.5 14.0 20.5    3.2  4.9  6.7  9.9   150   50  150   50

                                                       Similarity window (h)
α10β00                                                  1    1   10    9
α20β00                                                 10   10    9    6
α20β18                                                  5    4    6    1
α10β20                                                  1    1    8    3
α20β20                                                  2    2    7    1
α20β40                                                  1    3    2    2


indicating that players keep track of all past play when forming beliefs about their opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of a best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model, a, represents the upper bound of the interval from which shocks are drawn. Experimentation should decrease in the final rounds; indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-ranks tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black boxes) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency over long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated on rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session   α10β00  α20β00  α20β18  α10β20  α20β20  α20β40

Stochastic fictitious play
1          0.955   0.934   0.940   0.930   0.888   0.940
2          0.960   0.946   0.890   0.917   0.973   0.878
3                  0.948           0.883   0.873
Overall    0.957   0.943   0.915   0.910   0.912   0.909

fEWA
1          0.956   0.934   0.942   0.928   0.887   0.948
2          0.960   0.946   0.890   0.922   0.978   0.886
3                  0.946           0.879   0.871
Overall    0.958   0.942   0.916   0.910   0.912   0.917

Payoff assessment
1          0.952   0.933   0.914   0.941   0.888   0.916
2          0.957   0.944   0.899   0.922   0.917   0.898
3                  0.947           0.894   0.903
Overall    0.955   0.941   0.907   0.919   0.902   0.907

Equilibrium play
1          1.867   1.853   1.833   1.833   1.489   1.792
2          1.889   1.919   1.606   1.839   1.953   1.669
3                  1.917           1.447   1.539
Overall    1.878   1.896   1.719   1.706   1.660   1.731

Random play
Overall    0.976   0.976   0.976   0.976   0.976   0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following rankings of convergence levels:

(i) p1: α20β18 ≻ α20β40 ∼ α20β20 ∼ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ∼ α20β20 ∼ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ∼ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment, for a given set of parameter estimates, is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 ≻ α20β18 ∼ α10β20 ∼ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses, for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ: α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β = 0 treatments. In addition, an increase in α significantly improves convergence, by a large margin (40 to 80 percent), for both players when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3 CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2, plot the percentage of ε-equilibrium play over rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

1528 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β0 treatments, with α = 20 superior to α = 10 among the β0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                       Player 1                      Player 2
Threshold:   0.4   0.5   0.6   0.7      0.4   0.5   0.6   0.7   0.8   0.9
α20β18        54    83   126   316       38    52    62    87   127    —
α10β20       129   221    —     —        36    48    63    91   148    —
α20β20        61    91   149    —        39    51    61    84   137    —
α20β40       101   166   263   496       30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games)

             Player 1 equilibrium price            Player 2 equilibrium price
Treatment    Probability  H1                p-value    Probability  H1                p-value
α10β00       0.073                                     0.082
α20β00       0.188        α20β00 > α10β00   0.000      0.213        α20β00 > α10β00   0.000
α20β18       0.265        α20β18 > α20β40   0.187      0.381        α20β18 > α10β20   0.264
α10β20       0.211        α10β20 > α20β00   0.000      0.376        α10β20 > α20β20   0.001
α20β20       0.243        α20β20 > α10β20   0.000      0.353        α20β20 > α20β00   0.000
α20β40       0.259        α20β40 > α20β20   0.007      0.442        α20β40 > α20β18   0.000

             Player 1 ε-equilibrium price          Player 2 ε-equilibrium price
Treatment    Probability  H1                p-value    Probability  H1                p-value
α10β00       0.216                                     0.232
α20β00       0.519        α20β00 > α10β00   0.000      0.573        α20β00 > α10β00   0.000
α20β18       0.713        α20β18 > α20β40   0.045      0.870        α20β18 > α10β20   0.008
α10β20       0.584        α10β20 > α20β00   0.000      0.857        α10β20 > α20β20   0.000
α20β20       0.662        α20β20 > α10β20   0.000      0.834        α20β20 > α20β00   0.000
α20β40       0.701        α20β40 > α20β20   0.000      0.925        α20β40 > α20β18   0.000


the compensation mechanism, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel: simulation data, fraction of maximum welfare by round (rounds 50 to 450). Bottom panel: experimental data, fraction of maximum welfare by round (rounds 0 to 60). Series in each panel: α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis by the stochastic fictitious play (sFP) model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using the Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than the matching model.
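The stochastic matching model and its MSD scoring can be sketched as follows. This is our own illustration of the model as described above, not the authors' estimation code, and the function names are hypothetical.

```python
# Sketch of the stochastic matching model: player 2 plays the predicted
# player-1 price with probability 1 - epsilon, and each of the other 40
# prices on the grid {0, 1, ..., 40} with probability epsilon/40.

def matching_model_probs(predicted_price, epsilon, n_prices=41):
    """Choice probabilities over the price grid."""
    probs = [epsilon / (n_prices - 1)] * n_prices
    probs[predicted_price] = 1.0 - epsilon
    return probs

def msd(observed_choices, predicted_prices, epsilon, n_prices=41):
    """Mean squared deviation between the model's choice probabilities
    and the 0/1 indicators of the observed choices."""
    total = 0.0
    for choice, pred in zip(observed_choices, predicted_prices):
        probs = matching_model_probs(pred, epsilon, n_prices)
        total += sum((probs[p] - (1.0 if p == choice else 0.0)) ** 2
                     for p in range(n_prices))
    return total / len(observed_choices)
```

With ε = 0 and a correct prediction the score is zero; a looser model (ε = 0.2) scores 0.041 per observation even on a correct prediction, which is how the MSD criterion penalizes diffuse models.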

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V, subsection B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε in each block are as follows. α10β00: 0.10, 0.07, 0.06, 0.08; α20β00: 0.07, 0.06, 0.05, 0.04; α20β18: 0.07, 0.02, 0.03, 0.03; α10β20: 0.05, 0.03, 0.03, 0.02; α20β20: 0.01, 0, 0, 0; α20β40: 0.02, 0, 0, 0.01.

25 In all treatments, the calibrated discount rate is 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by a half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26
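Player 2's opportunity cost of matching can be computed directly from the payoff functions in equation (1) and the best-response function in Proposition 2. The sketch below is our own illustration with the Section II parameters, not the authors' code; it shows the mechanism behind the regression result, namely that the cost of matching falls as β rises.

```python
# Sketch: player 2's cost of matching an anticipated player-1 price,
# defined as best-response profit minus matching profit.
c, e, r = 1/80, 1/40, 24   # parameters from Section II

def x_of(p2):
    """Player 1's production response x(p2) = (r - p2)/(2c), floored at 0."""
    return max(0.0, (r - p2) / (2 * c))

def pi2(p1, p2, beta):
    """Player 2's payoff from equation (1)."""
    return p1 * x_of(p2) - e * x_of(p2) ** 2 - beta * (p1 - p2) ** 2

def br2(p1, beta):
    """Player 2's best response from equation (5)."""
    return ((beta - 1/(4*c)) * p1 + e*r/(4*c**2)) / (beta + e/(4*c**2))

def match_cost(p1, beta):
    """Profit forgone by matching p1 instead of best-responding to it."""
    return pi2(p1, br2(p1, beta), beta) - pi2(p1, p1, beta)
```

At an anticipated price of 20, for example, the cost falls from 1,440 points at β = 0 to 960 at β = 20 and 720 at β = 40; at the equilibrium price of 16 with β = 20, matching is itself the best response, so the cost is zero.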

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to player 2 of matching the anticipated price of player 1)

Dependent variable: probability player 2's price equals anticipated player 1 price

                          (1)          (2)
D10                      0.014        0.017
                        (0.043)      (0.041)
D00                     −0.077       −0.043
                        (0.035)      (0.036)
D18                      0.046        0.032
                        (0.053)      (0.056)
D40                      0.008        0.041
                        (0.061)      (0.042)
ln(Round)                0.041        0.028
                        (0.011)      (0.010)
Match Cost                           −0.001
                                     (0.000)
ln(Round) × D10          0.014        0.012
                        (0.012)      (0.012)
ln(Round) × D00          0.003        0.000
                        (0.013)      (0.012)
ln(Round) × D18          0.006        0.002
                        (0.021)      (0.020)
ln(Round) × D40          0.001        0.015
                        (0.018)      (0.016)
Observations             9,588        9,588
Log pseudo-likelihood   −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Dy is the dummy variable for treatment y (e.g., D00 denotes α20β00); the excluded dummy is that for α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)] / [β + e/(4c²)] and n = [er/(4c²)] / [β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m p1 − p2 + n.
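As an aside, the discrete best-reply system above can be iterated directly; the sketch below is our own illustration (not part of the paper) using the Section II parameters at the supermodularity threshold β = 1/(4c) = 20, where the iteration converges to the Pigovian tax p* = er/(e + c) = 16 from an arbitrary start.

```python
# Sketch: iterate the Cournot best-reply system
#   p1(t+1) = p2(t),  p2(t+1) = m*p1(t) + n.
c, e, r = 1/80, 1/40, 24    # parameters from Section II
beta = 20                   # supermodularity threshold 1/(4c)

m = (beta - 1/(4*c)) / (beta + e/(4*c**2))
n = (e*r/(4*c**2)) / (beta + e/(4*c**2))

p1, p2 = 0.0, 40.0          # arbitrary starting prices
for _ in range(200):
    p1, p2 = p2, m*p1 + n   # simultaneous best replies

p_star = e*r/(e + c)        # Pigovian tax, equals 16 here
```

Because m < 1 (and, for these parameters, |m| < 1) for any β ≥ 0, the iteration contracts toward the fixed point p* = n/(1 − m).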

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫_{p0}^{p1} (p − mp − n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is the Lyapunov function of system (12).

Define G(p1) ≡ ∫_{p0}^{p1} (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − np1 − [½(1 − m)p0² − np0].

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore p1 = n/(1 − m) = er/(e + c) = p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)ṗ1 + (∂V/∂p2)ṗ2

  = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)

  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.
Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.
Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.
Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.
Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.
Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.
Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.
Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.
Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.
Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.
Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.
Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.
Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.
Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.
Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.
Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.
Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.
Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.
Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.
Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.
Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.
Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.
Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.
Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.
Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.
Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.
Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.
Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.
Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.
Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.
Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.
Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.
Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.
Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.
Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.
Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.
Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.
Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.
Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.
Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.
Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



also assumed to be differentiable, positive, increasing, and convex. The mechanism is a two-stage game where the unique subgame-perfect Nash equilibrium induces the Pareto-efficient outcome x* such that r − e′(x*) = c′(x*). In the first stage (the announcement stage), player 1 announces p1, a per-unit subsidy to be paid to player 2, while player 2 simultaneously announces p2, a per-unit tax to be paid by player 1. Announcements are revealed to both players. In the second stage (the production stage), player 1 chooses a production level x. The payoff to player 1 is π1 = rx − c(x) − p2x − α(p1 − p2)², while the payoff to player 2 is π2 = p1x − e(x), where α > 0 is a free punishment parameter chosen by the designer.

We study a generalized version of the compensation mechanism (Cheng, 1998), which adds a punishment term, −β(p1 − p2)², to player 2's payoff function, thus making the payoff functions

(1)  π1 = rx − c(x) − p2x − α(p1 − p2)²,

     π2 = p1x − e(x) − β(p1 − p2)².

Using the generalized version, we solve the game by backward induction. In the production stage, player 1 chooses the quantity that solves max_x rx − c(x) − p2x − α(p1 − p2)². The first-order condition, r − c′(x) − p2 = 0, characterizes the best response in the second stage, x(p2). In the announcement stage, player 1 solves max_{p1} rx − c(x) − p2x − α(p1 − p2)². The first-order condition is

(2)  ∂π1/∂p1 = −2α(p1 − p2) = 0,

which yields the best-response function for player 1 as p1 = p2. Player 2 simultaneously solves max_{p2} p1x(p2) − e(x(p2)) − β(p1 − p2)². The first-order condition is

(3)  ∂π2/∂p2 = p1x′(p2) − e′(x)x′(p2) + 2β(p1 − p2) = 0,

which characterizes player 2's best-response function.

The unique subgame-perfect equilibrium has p1 = p2 = p*, where p* is the Pigovian tax, which induces the efficient quantity x*. As the equilibrium does not depend on the value of β, it holds for the original version, where β = 0. When β is set appropriately, however, the generalized version is a supermodular mechanism. The following proposition characterizes the necessary and sufficient condition for supermodularity.

PROPOSITION 1 (Cheng, 1998): The generalized version of the compensation mechanism is supermodular if and only if α ≥ 0 and β ≥ −½x′(p2).

The proof is simple. First, as the strategy space is one-dimensional, the supermodularity condition is automatically satisfied. Second, we use Theorem 1 to check for increasing differences. For player 1, from equation (2), we have ∂²π1/∂p1∂p2 = 2α ≥ 0, while for player 2, from equation (3), we have ∂²π2/∂p1∂p2 = x′(p2) + 2β. Therefore, ∂²π2/∂p1∂p2 ≥ 0 if and only if β ≥ −½x′(p2).

To obtain analytical solutions, we use a quadratic cost function, c(x) = cx², where c > 0, and a quadratic externality function, e(x) = ex², where e > 0. We now summarize the best-response functions, equilibrium solutions, and stability analysis in this environment.

PROPOSITION 2: Quadratic cost and externality functions yield the following characterizations.

(i) The best-response functions for players 1 and 2 are

(4)  p1 = p2,

(5)  p2 = [(β − 1/(4c)) / (β + e/(4c²))] p1 + [er/(4c²)] / [β + e/(4c²)],

and

     x = max{0, (r − p2)/(2c)}.

(ii) The subgame-perfect Nash equilibrium is characterized as

     p1* = p2* = er/(e + c),  x* = r/[2(e + c)].

(iii) If players follow Cournot best-reply dynamics, (p1*, p2*) is a globally asymptotically stable equilibrium of the continuous-time dynamical system for any α > 0 and β ≥ 0.

(iv) The game is supermodular if and only if α ≥ 0 and β ≥ 1/(4c).

PROOF:
See Appendix.

The best-response functions presented in part (i) of Proposition 2 reveal interesting incentives. While player 1 has an incentive to always match player 2's price, player 2 has an incentive to match only when player 1 plays the equilibrium strategy. Also, at the threshold for strategic complementarity, β = 1/(4c), player 2 has a dominant strategy, p2 = er/(e + c) = p2*. Part (iii) of Proposition 2 extends Cheng (1998), who shows that the original version of the compensation mechanism (β = 0) is globally stable under continuous-time Cournot best-reply dynamics.4 Cournot best-reply is, however, a relatively poor description of human learning (Richard Boylan and Mahmoud El-Gamal, 1993). Therefore, global stability under Cournot best reply for any β ≥ 0 does not imply equilibrium convergence among human subjects. Part (iv) characterizes the threshold for strategic complementarity in our experiment, a more robust stability criterion than that characterized by part (iii).

An intuition for how strategic complementarities affect the outcome of adaptive learning can be gained from analysis of the best-response functions, equations (4) and (5). While player 1's best-response function is always upward sloping, player 2's best-response function, equation (5), is nondecreasing if and only if β ≥ 1/(4c), i.e., when the game is supermodular. Beyond the threshold for strategic complementarity, both best-response functions are upward sloping, and they intersect at the equilibrium. It is easy to verify graphically that adaptive learners, e.g., Cournot best-repliers, will converge to the equilibrium regardless of where they start.

To examine how α, which is unrelated to strategic complementarity, might affect behavior, we observe that when player 1 deviates from the best response by δ, i.e., |p1 − p2| = δ, his profit loss is Δπ1 = αδ². This profit loss, which is proportional to the magnitude of α, is the deviation cost for player 1. Based on previous experimental evidence, the incentive to deviate from best response decreases when the deviation cost increases. Chen and Plott (1996) call it the General Incentive Hypothesis, i.e., the error of game-theoretic models decreases as the incentive to best-respond increases. We therefore expect that an increase in α improves player 1's convergence to equilibrium. When player 1 plays the equilibrium strategy, player 2's best response is to play the equilibrium strategy as well. Therefore, we expect that an increase in α might improve player 2's convergence to equilibrium as well. It is not clear, however, whether the α-effects systematically change the effects of the supermodularity parameter β. We rely on experimental data to test the interaction of the α-effects on the β-effects.

II. Experimental Design

Our experimental design reflects both theoretical and technical considerations. Specifically, we chose an environment that allows significant latitude in varying the free parameters to better assess the performance of the compensation mechanism around the threshold of strategic complementarity. We describe this environment and the experimental procedures below.

A. The Economic Environment

We use the general payoff functions presented in equation (1), with quadratic cost and externality functions to obtain analytical solutions: c(x) = cx² and e(x) = ex². We use the following parameters: c = 1/80, e = 1/40, r = 24. From Proposition 2, the subgame-perfect Nash equilibrium is (p1*, p2*, x*) = (16, 16, 320), and the threshold for strategic complementarity is β = 1/(4c) = 20.

4 Cheng (1998) also shows that the original mechanism is locally stable under discrete-time Cournot best-reply dynamics.


In the experiment, each player chooses pi ∈ {0, 1, ..., 40}. Without the compensation mechanism, the profit-maximizing production level is x = r/(2c) = 960, three times higher than the efficient level. To reduce the complexity of player 1's problem, we use a grid size of 10 for the quantity and truncate the strategy space, i.e., player 1 chooses X ∈ {0, 1, ..., 50}, where X = x/10. The payoff functions presented to the subjects are adjusted accordingly. Truncating the strategy space to X = 50 also reduces the possibility of player 2's bankruptcy. To reduce payoff asymmetry, we give player 1 a lump-sum payment of 250 points each round. Therefore, the equilibrium payoffs under the mechanism for the two players are π1* = 1,530 and π2* = 2,560.
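These payoff figures can be checked directly against equation (1). The sketch below is our own arithmetic check, not part of the paper; it plugs the equilibrium values into the two payoff functions.

```python
# Sketch: verify the quoted equilibrium payoffs (1,530 and 2,560 points).
c, e, r = 1/80, 1/40, 24
alpha, beta = 20, 20        # the penalty terms vanish at p1 = p2 anyway
p1 = p2 = 16                # equilibrium prices
x = 320                     # equilibrium quantity
lump_sum = 250              # player 1's per-round payment

pi1 = r*x - c*x**2 - p2*x - alpha*(p1 - p2)**2 + lump_sum
pi2 = p1*x - e*x**2 - beta*(p1 - p2)**2
```

Here pi1 = 7,680 − 1,280 − 5,120 + 250 = 1,530 and pi2 = 5,120 − 2,560 = 2,560, matching the text.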

The functional forms and specific parameter values are chosen for several reasons. First, equilibrium solutions are integers. Second, equilibrium prices and quantities do not lie in the center of the strategy space, thus avoiding equilibrium convergence as a result of focal points. Third, there is a salient gap between efficiency with and without the mechanism. Without the mechanism, the profit-maximizing production level is X = 50, resulting in an efficiency level of 68.4 percent.5 With the mechanism, the system achieves 100-percent efficiency in equilibrium. Finally, since the threshold for strategic complementarity is β = 20, there is a large number of integer values to choose from, both below and above the threshold.
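The 68.4-percent figure can be reproduced from the primitives. The sketch below is our own illustration of the efficiency measure defined in footnote 5 (joint earnings gross of the price transfers, which cancel between the two players).

```python
# Sketch: efficiency without the mechanism relative to equilibrium earnings.
c, e, r = 1/80, 1/40, 24

def total_earnings(x):
    """Joint earnings (r*x - c*x^2) - e*x^2; the transfers p1*x, p2*x cancel."""
    return (r*x - c*x**2) - e*x**2

x_no_mechanism = 500        # X = 50, the capped stand-alone optimum
x_equilibrium = 320
efficiency = total_earnings(x_no_mechanism) / total_earnings(x_equilibrium)
```

This gives total_earnings(500) = 2,625 and total_earnings(320) = 3,840 = 1,280 + 2,560, so efficiency is about 0.684, as quoted.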

To study how strategic complementarity affects equilibrium convergence, we keep α = 20 and vary β = 0, 18, 20, and 40. To study whether α affects convergence, we also keep α = 10 and vary β = 0 and 20.

B. Experimental Procedures

Our experiment involves 12 players per session: six player 1s (called Red players in the instructions) and six player 2s (Blue players). Each player remains the same type throughout the experiment. At the beginning of each session, subjects randomly draw a PC terminal number. Each then sits in front of the corresponding terminal and is given printed instructions. After the instructions are read aloud, subjects are encouraged to ask questions. The instruction period lasts between 15 and 30 minutes.

Each round, a player 1 is randomly matched with a player 2. Subjects are randomly rematched each round to minimize repeated-game effects. The random rematching protocol also minimizes the possibility that players collude on a high-subsidy and low-tax outcome.6 Each session consists of 60 rounds. As we are interested in learning, there are no practice rounds. Each round consists of two stages:

(i) Announcement stage: each player simultaneously and independently chooses a price, pi ∈ {0, 1, ..., 40}.

(ii) Production stage: after (p1, p2) are chosen, player 1's computer displays player 2's price and a payoff table showing her payoff for each X ∈ {0, 1, ..., 50}. Player 1 then chooses a quantity, X. The server calculates payoffs and sends each player his payoff, the quantity chosen, and the prices submitted by him and his match.

To summarize, each subject knows both payoff functions, the choices made each round by himself and his match, as well as his per-period and cumulative payoffs. At any point, subjects have ready access to all of this information. The mechanism is thus implemented as a game of complete information. We do not know, however, how subjects processed this information, nor do we know their beliefs about the rationality of others. Both of these factors introduce uncertainty in the environment.

5 We compute the efficiency level by using the ratio of total earnings without the mechanism and total earnings with the mechanism. Without the mechanism, at the production level of X = 50 (or x = 500), total earnings are π1 + π2 = (rx − cx²) − ex² = 2,625. Therefore, the efficiency level is 2,625/(1,280 + 2,560) = 0.684.

6 If the players could commit to maximizing joint profits by choosing p1 and p2 cooperatively and x noncooperatively, they would choose, by equation (1), (p1, p2, x) = (4(α + β)er/[4(α + β)(e + c) − 1], r[4e(α + β) − 1]/[4(α + β)(e + c) − 1], 2r(α + β)/[4(α + β)(e + c) − 1]). With our choice of parameters, we get p1 = 24, p2 = 12 for α = 20 and β = 0; p1 = 19.2, p2 = 14.4 for α = 20 and β = 20; and p1 = 18, p2 = 15 for α = 20 and β = 40; etc. If, in addition, they could choose (p1, p2, x) cooperatively, then for our parameter values we get (p1, p2, x) = (min{480/[3(α + β) − 20], 40}, 0, min{960(α + β)/[3(α + β) − 20], 500}).

Table 1 presents features of the experimental sessions, including parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used zTree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one and a half hours. The exchange rate for all treatments was one dollar for 4,250 points. The average earnings were $22.82. Data are available from the authors upon request.

III. Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: the level and the speed.7 In theory, convergence implies that all players play the stage game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, L_b(t1, t2), measures the average proportion of Nash equilibrium play between rounds t1 and t2, i.e., L_b(t1, t2) = Σ_{t=t1}^{t2} L(t)/(t2 − t1 + 1), where 0 < t1 ≤ t2 ≤ T, and T is the total number of rounds.

We define the level of convergence for both a round and a block of rounds. The block convergence measure smooths out inter-round variation.
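As a concrete illustration, the two level measures can be computed as follows (a minimal Python sketch; the tuple encoding of a pair's play and the function names are our own, not the authors'):

```python
from typing import List, Tuple

EQUILIBRIUM = (16, 16, 32)  # stage-game equilibrium (p1, p2, X) from Table 1

def convergence_level(round_plays: List[Tuple[int, int, int]]) -> float:
    """L(t): proportion of pairs playing the Nash equilibrium in round t."""
    return sum(play == EQUILIBRIUM for play in round_plays) / len(round_plays)

def block_convergence_level(levels: List[float], t1: int, t2: int) -> float:
    """L_b(t1, t2): average of L(t) over rounds t1..t2 (1-indexed, inclusive)."""
    return sum(levels[t1 - 1:t2]) / (t2 - t1 + 1)
```

For example, `block_convergence_level(levels, 41, 60)` gives the last-20-rounds measure L_b(41, 60) used in Table 2.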

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition for the speed of convergence.

DEFINITION 2: For a given level of convergence, L̄ ∈ (0, 1], the speed of convergence is measured by the first round t̄ in which the level of convergence reaches L̄ and does not subsequently drop below this level, i.e., t̄ such that L(t) ≥ L̄ for any t ≥ t̄.
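Definition 2 translates directly into code; a sketch (the function name is ours, and it returns `None` when the target level is never reached and sustained):

```python
from typing import List, Optional

def convergence_speed(levels: List[float], target: float) -> Optional[int]:
    """First round t_bar (1-indexed) at which L(t) reaches `target` and never
    subsequently drops below it; None if no such round exists."""
    for t in range(len(levels)):
        # The condition must hold for every remaining round, not just round t.
        if all(level >= target for level in levels[t:]):
            return t + 1
    return None
```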

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in the experimental data, Definition 2 is not practical there. We use the following observation, which enables us to measure convergence speed using estimates of the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let L_y(t) be the convergence level of treatment y at

7 We thank anonymous referees for suggesting this separation and appropriate measures.

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

Treatment   Parameters (α, β)   Sessions   Supermodular?   Equilibrium (p1, p2, X)
α10β00      (10, 0)             4          No              (16, 16, 32)
α20β00      (20, 0)             5          No              (16, 16, 32)
α20β18      (20, 18)            4          No              (16, 16, 32)
α10β20      (10, 20)            5          Yes             (16, 16, 32)
α20β20      (20, 20)            5          Yes             (16, 16, 32)
α20β40      (20, 40)            4          Yes             (16, 16, 32)

VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1511

time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.

OBSERVATION 1: If L1(1) ≥ L2(1) and ΔL1(t) ≥ ΔL2(t) for all t ∈ [1, T − 1], then, given any L̄ ∈ (0, 1], the first treatment converges more quickly than the second treatment.

Based on theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with those near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis. Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV. Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and speed of convergence to subgame-perfect equilibrium and near equilibrium in each of the six treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2, and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that in all of our analyses the results for quantities largely mirror the results for player 2's price.8

Recall that the subgame-perfect Nash equilibrium of the stage game is (p1, p2, X) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check for ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price and quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. The ε-equilibrium prediction is therefore (ε-p1, ε-p2, ε-x) = ({15, 16, 17}, {15, 16, 17}, {28, 32, 36}).

8 Tables are available from the authors upon request.
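These bands are easy to check mechanically; a sketch (the function name and the ±1/±4 default bands are ours, following the definitions in the text):

```python
EQ_P1, EQ_P2, EQ_X = 16, 16, 32  # stage-game equilibrium (p1, p2, X)

def is_epsilon_equilibrium(p1: int, p2: int, x: int,
                           price_band: int = 1, quantity_band: int = 4) -> bool:
    """True if both announced prices are within the price band and the
    quantity is within the quantity band of the equilibrium values."""
    return (abs(p1 - EQ_P1) <= price_band
            and abs(p2 - EQ_P2) <= price_band
            and abs(x - EQ_X) <= quantity_band)
```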

Figures 1 and 2 contain box-and-whiskers plots of the prices in each treatment for all 60 rounds, for players 1 and 2, respectively. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence (L_b(41, 60)) to subgame-perfect Nash equilibrium (panels A and B) and ε-Nash equilibrium (panels C and D) for each session under each of the six treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column 8) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects). Parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly10 increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effects on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result showing that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20), the

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: by pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as extreme as the one observed, when the pooled observations are randomly divided into two groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.
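The footnote's randomization logic can be sketched as follows (an exact one-tailed test enumerating all relabellings into groups of the original sizes; the function name and the direction of the alternative are our own conventions):

```python
from itertools import combinations
from statistics import mean

def permutation_test(sample_a, sample_b):
    """Exact one-tailed permutation test of H1: mean(sample_a) < mean(sample_b).
    Returns the share of relabellings of the pooled observations whose mean
    difference is at least as extreme as the observed one."""
    pooled = list(sample_a) + list(sample_b)
    observed = mean(sample_b) - mean(sample_a)
    n_a = len(sample_a)
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        extreme += mean(group_b) - mean(group_a) >= observed
    return extreme / total
```

With session-level convergence measures as observations (four or five per treatment), full enumeration is cheap, so no sampling approximation is needed.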

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.


improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold, 20, to 40. Therefore, the marginal returns for being

[Figure 1 contains six box-and-whiskers panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 1's announced price (0 to 40) by round (1 to 60).]

FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1


"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β ≥ 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that at

[Figure 2 contains six box-and-whiskers panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 2's announced price (0 to 40) by round (1 to 60).]

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2


the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies Dy, where y ∈ {α10β00, α20β00, α20β18, α10β20, α20β40}, ln(Round), and a constant. We use dummies for the different values of α and β, rather than direct parameter values, to avoid assuming a linear relationship of parameter effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatments y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall    H1                p-value
α10β00     0.000      0.008      0.158      0.008      —          0.044      α10β00 < α20β00   0.397
α20β00     0.000      0.083      0.083      0.042      0.058      0.053      α20β00 < α20β20   0.008
α20β18     0.125      0.117      0.100      0.192      —          0.133      α20β00 < α20β18   0.008
α10β20     0.242      0.067      0.067      0.067      0.208      0.130      α10β20 < α20β20   0.064
α20β20     0.325      0.267      0.208      0.058      0.367      0.245      α20β18 < α20β20   0.064
α20β40     0.175      0.400      0.158      0.133      —          0.217      α20β20 < α20β40   0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall    H1                p-value
α10β00     0.025      0.042      0.025      0.067      —          0.040      α10β00 < α20β00   0.056
α20β00     0.067      0.083      0.033      0.075      0.050      0.062      α20β00 < α20β20   0.032
α20β18     0.133      0.292      0.108      0.258      —          0.198      α20β00 < α20β18   0.008
α10β20     0.392      0.108      0.175      0.067      0.667      0.282      α10β20 < α20β20   0.504
α20β20     0.308      0.117      0.483      0.017      0.450      0.275      α20β18 < α20β20   0.238
α20β40     0.325      0.492      0.233      0.233      —          0.321      α20β20 < α20β40   0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall    H1                p-value
α10β00     0.108      0.200      0.350      0.158      —          0.204      α10β00 < α20β00   0.183
α20β00     0.067      0.458      0.358      0.183      0.425      0.298      α20β00 < α20β20   0.087
α20β18     0.317      0.625      0.300      0.583      —          0.456      α20β00 < α20β18   0.103
α10β20     0.475      0.200      0.300      0.375      0.492      0.368      α10β20 < α20β20   0.194
α20β20     0.450      0.558      0.475      0.133      0.775      0.478      α20β18 < α20β20   0.444
α20β40     0.458      0.492      0.375      0.442      —          0.442      α20β20 < α20β40   0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2)

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall    H1                p-value
α10β00     0.133      0.200      0.242      0.225      —          0.200      α10β00 < α20β00   0.016
α20β00     0.300      0.400      0.200      0.275      0.333      0.302      α20β00 < α20β20   0.016
α20β18     0.717      0.667      0.342      0.700      —          0.606      α20β00 < α20β18   0.016
α10β20     0.650      0.517      0.542      0.400      0.892      0.600      α10β20 < α20β20   0.564
α20β20     0.533      0.592      0.642      0.225      0.900      0.578      α20β18 < α20β20   0.556
α20β40     0.825      0.792      0.467      0.517      —          0.650      α20β20 < α20β40   0.294

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form, ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We will examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the coefficient of the interaction term Dy·ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), ΔL(t), between treatments. From the probit specification, L(t) = Φ(constant + D·b·ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                          price 1      price 2      price 1      price 2
Dα10β00                   −0.143       −0.268
                          (0.048)      (0.050)
Dα20β00                   −0.090       −0.217
                          (0.060)      (0.057)
Dα20β18                   0.005        0.004
                          (0.071)      (0.069)
Dα10β20                   −0.080       −0.025
                          (0.060)      (0.073)
Dα20β40                   0.000        0.078
                          (0.068)      (0.078)
ln(Round)                 0.115        0.167        0.131        0.183
                          (0.014)      (0.017)      (0.021)      (0.022)
Dα10β00·ln(Round)                                   −0.050       −0.095
                                                    (0.019)      (0.022)
Dα20β00·ln(Round)                                   −0.031       −0.073
                                                    (0.020)      (0.021)
Dα20β18·ln(Round)                                   0.001        0.004
                                                    (0.020)      (0.021)
Dα10β20·ln(Round)                                   −0.024       −0.008
                                                    (0.019)      (0.022)
Dα20β40·ln(Round)                                   0.002        0.023
                                                    (0.019)      (0.024)
Observations              9,720        9,720        9,720        9,720
Number of groups          162          162          162          162
Log pseudo-likelihood     −5,562.890   −5,824.055   −5,557.424   −5,793.013

H0: Dα10β00 = Dα20β00                                            1.50    1.31
H0: Dα10β00·ln(Round) = Dα20β00·ln(Round)                        1.33    1.23
H0: Dα10β20 = Dα10β00 − Dα20β00                                  0.05    1.07
H0: Dα10β20·ln(Round) = Dα10β00·ln(Round) − Dα20β00·ln(Round)    0.05    1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is Dα20β20. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L′_y(t) = φ(constant + D·b)·b_y/t. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + D·b)·b_y, compared to the baseline case of α20β20.
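To make the derivative concrete, a sketch of this marginal-effect calculation (`norm_pdf` is the standard normal density φ; the function and argument names are ours):

```python
import math

def norm_pdf(z: float) -> float:
    """Standard normal density, phi(z)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def probit_slope(constant: float, d, b, b_y: float, t: float) -> float:
    """Slope of L_y(t) = Phi(constant + D.b.ln(t)) with respect to t,
    evaluated as phi(constant + D.b) * b_y / t, as in the text."""
    index = constant + sum(di * bi for di, bi in zip(d, b))
    return norm_pdf(index) * b_y / t
```

The 1/t factor reflects the concave ln(Round) learning curve: the same coefficient implies a smaller per-round gain in later rounds.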

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test examines whether the coefficient of Dα10β00·ln(Round) equals that of Dα20β00·ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start at and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α = 20 supermodular treatments. We use Round, ln(Round), and their interactions with the α20β18 dummy as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the α = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and the α = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on Dα20β18·ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available upon request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing α20β18 with the α = 20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                        (1) ε-Nash price 1   (2) ε-Nash price 2
ln(Round)               0.054                0.216
                        (0.040)              (0.043)
Round                   0.004                0.001
                        (0.002)              (0.002)
Dα20β18·ln(Round)       0.013                −0.066
                        (0.032)              (0.035)
Dα20β18·Round           0.001                0.006
                        (0.002)              (0.003)
Observations            4,680                4,680
Log pseudo-likelihood   −2,879.07            −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are Dα20β20 and Dα20β40.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on Dα20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters, α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in

equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r·x(t) − c·x(t)² − e·x(t)²] / [r·x* − c·x*² − e·x*²],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = Σ_{t=t1}^{t2} e(t) / (t2 − t1 + 1).
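Under this quadratic payoff structure, the efficiency measure can be sketched as follows (parameter names r, c, and e follow the paper's notation; the function names are ours):

```python
def surplus(x: float, r: float, c: float, e: float) -> float:
    """Joint surplus excluding taxes, subsidies, and penalties: rx - cx^2 - ex^2."""
    return r * x - c * x**2 - e * x**2

def per_round_efficiency(x_t, x_star, r, c, e):
    """e(t): realized surplus relative to surplus at the efficient quantity x*."""
    return surplus(x_t, r, c, e) / surplus(x_star, r, c, e)

def block_efficiency(x_series, x_star, r, c, e, t1, t2):
    """Average per-round efficiency over rounds t1..t2 (1-indexed, inclusive)."""
    values = [per_round_efficiency(x, x_star, r, c, e)
              for x in x_series[t1 - 1:t2]]
    return sum(values) / (t2 - t1 + 1)
```

For instance, with illustrative (not the paper's) parameters r = 10, c = e = 0.1, the efficient quantity is x* = r/(2(c + e)) = 25, and producing x* every round yields efficiency 1.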

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency, last 20 rounds

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall    H1                p-value
α10β00     0.600      0.671      0.804      0.858      —          0.733      α10β00 < α20β00   0.087
α20β00     0.650      0.870      0.907      0.873      0.825      0.825      α20β00 < α20β20   0.095
α20β18     0.963      0.952      0.889      0.959      —          0.941      α20β00 < α20β18   0.016
α10β20     0.958      0.919      0.905      0.881      0.984      0.929      α10β20 < α20β20   0.714
α20β20     0.888      0.937      0.939      0.780      0.971      0.903      α20β18 < α20β20   0.833
α20β40     0.973      0.962      0.943      0.949      —          0.957      α20β20 < α20β40   0.064

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold, 20, to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
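As a sketch (the function name is ours), the surplus identity makes the budget-balance property visible: when p1 = p2, the penalty term vanishes and the tax exactly cancels the subsidy.

```python
def budget_surplus(p1: float, p2: float, x: float,
                   alpha: float, beta: float) -> float:
    """s(t): penalty terms plus tax minus subsidy.
    Zero whenever p1 == p2, so the budget balances exactly in equilibrium."""
    return (alpha + beta) * (p1 - p2)**2 + p2 * x - p1 * x
```

Off the equilibrium path, larger α + β scales the quadratic penalty term, which is one way to see why the high-β treatment risks a larger budget imbalance before play equilibrates.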

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. We look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter shortened as sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̂−i(t + 1), according to

(6) p̂−i(t + 1) = [p−i(t) + Σ_{τ=1}^{t−1} r^τ p−i(t − τ)] / [1 + Σ_{τ=1}^{t−1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̂−i(t + 1) = p−i(t), while r = 1 yields the standard fictitious play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pi is given by

(7) Pr(pi | p̂−i) = exp(λ·πi(pi, p̂−i)) / Σ_{j=0}^{40} exp(λ·πi(pj, p̂−i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̂−i increases.

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs (πi(pi, p̂−i)) replaced by strategy attractions. In both variants, a strategy attraction, A_i^j(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)·N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) A_i^j(t) = {φ·N(t − 1)·A_i^j(t − 1) + [δ + (1 − δ)·I(p_i^j, p_i(t))]·π_i(p_i^j, p_−i(t))} / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function, φ_i(t), is given by

(9)    \phi_i(t) = 1 - \frac{1}{2} \sum_{j=1}^{m_i} \left[ I(p_i^j, p_i(t)) - \frac{\sum_{\tau=1}^{t} I(p_i^j, p_i(\tau))}{t} \right]^2

This function will be close to 1 when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0, κ = 1).
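One fEWA updating step, combining the experience-weight rule and equation (8), can be sketched as follows; the `payoff` callable is a hypothetical stand-in for π_i, and parameter names follow the text:

```python
def fewa_update(attractions, N_prev, played, opponent_price, payoff,
                phi, kappa, delta):
    """One attraction/experience-weight update for a single player.

    attractions -- list of A_i^j(t-1), one entry per strategy j
    played      -- index of the strategy actually chosen this period
    payoff      -- payoff(j, opponent_price); stand-in for pi_i(p_i^j, p_-i(t))
    phi, kappa, delta -- change-detection, exploration, imagination parameters
                 (estimated in EWA, endogenous in fEWA)
    """
    N = (1 - kappa) * phi * N_prev + 1          # experience-weight update
    new_attractions = []
    for j, A in enumerate(attractions):
        # played strategy gets full payoff weight; others get imagination weight delta
        weight = delta + (1 - delta) * (1 if j == played else 0)
        new_attractions.append((phi * N_prev * A + weight * payoff(j, opponent_price)) / N)
    return new_attractions, N
```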

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agent use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function f_jk(h, t) to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_{jk}(h, t) = \begin{cases} 1 - \dfrac{|j - k|}{h} & \text{if } |j - k| < h \\ 0 & \text{otherwise.} \end{cases}

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy that are updated. When h = 1, f_jk(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

1521 VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)    u_j(t+1) = [1 - r\, f_{jk}(h, t)]\, u_j(t) + r\, f_{jk}(h, t)\, \pi_k(t), \quad \forall j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11)    \tilde{u}_k(t) - \tilde{u}_j(t) \ge 0, \quad \forall p_j \ne p_k.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
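The three PA components above (the Bartlett similarity function, the assessment update in equation (10), and the shock-distorted choice rule in equation (11)) can be sketched together; names are ours, and the shock distribution is the uniform one used later in the calibration:

```python
import random

def bartlett_similarity(j, k, h):
    """Bartlett similarity f_jk(h) between strategy j and the played strategy k:
    decays linearly within a window of width h, zero outside it."""
    return max(0.0, 1.0 - abs(j - k) / h)

def pa_update(assessments, played, realized_payoff, r, h):
    """Equation (10): every strategy j moves toward the realized payoff of the
    played strategy, with speed r * f_jk(h); strategies outside the similarity
    window are left unchanged."""
    return [(1 - r * bartlett_similarity(j, played, h)) * u
            + r * bartlett_similarity(j, played, h) * realized_payoff
            for j, u in enumerate(assessments)]

def pa_choose(assessments, shock_bound, rng=random):
    """Equation (11): maximize assessments distorted by i.i.d. uniform mood
    shocks on [-shock_bound, shock_bound]. Shocks affect only the choice,
    not the stored assessments."""
    distorted = [u + rng.uniform(-shock_bound, shock_bound) for u in assessments]
    return max(range(len(distorted)), key=distorted.__getitem__)
```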

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.
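A minimal MSD score of the kind used to compare simulated and actual play might look as follows; the paper's exact normalization is not reproduced in this section, so this sketch simply averages squared deviations between two frequency vectors:

```python
def msd(predicted_freqs, observed_freqs):
    """Mean-squared deviation between a model's predicted choice frequencies
    and the observed frequencies over the same strategy set (lower is better)."""
    assert len(predicted_freqs) == len(observed_freqs)
    return sum((p - q) ** 2
               for p, q in zip(predicted_freqs, observed_freqs)) / len(predicted_freqs)
```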

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
    (a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and a normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.
    (b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

(iii) Simulated player 1's quantity choice is based on the following steps.
    (a) Determine each player 1's best response to p2.
    (b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
    (c) If yes, deviate via the actual mean and standard deviation for the block; otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.
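Steps (i) through (vi) can be organized as a simulation skeleton; every callable below is a placeholder for a component described in the text, not code from the study:

```python
import random

def simulate_session(n_pairs, n_rounds, first_round_prices, choose_price,
                     best_quantity, payoffs, update, rng=random):
    """Skeleton of one simulated session following steps (i)-(vi).

    first_round_prices -- empirical first-round price distribution, step (ii)(a)
    choose_price(state)   -- equation (7) or (11), step (ii)(b)
    best_quantity(price)  -- player 1's (possibly noisy) production choice, step (iii)
    payoffs(p1, p2, q)    -- the compensation mechanism, step (iv)
    update(state, p, pi)  -- the learning rule, equation (8) or (10), step (v)
    """
    states = [{} for _ in range(2 * n_pairs)]                 # per-player learning state
    prices = [rng.choice(first_round_prices) for _ in states] # (ii)(a) empirical draw
    for t in range(n_rounds):
        players = list(range(2 * n_pairs))
        rng.shuffle(players)                                  # (i) random matching
        for a, b in zip(players[::2], players[1::2]):
            if t > 0:                                         # (ii)(b) learned choices
                prices[a], prices[b] = choose_price(states[a]), choose_price(states[b])
            q = best_quantity(prices[b])                      # (iii) quantity stage
            pi_a, pi_b = payoffs(prices[a], prices[b], q)     # (iv) mechanism payoffs
            update(states[a], prices[a], pi_a)                # (v) learning update
            update(states[b], prices[b], pi_b)
    return prices                                             # (vi) loop to next round
```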

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
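The grid search over the PA model's parameters, at the grid sizes just described, can be sketched as follows; `simulate` is a placeholder for one Monte Carlo run of the calibrated model:

```python
import itertools

def grid_search_pa(simulate, actual_freqs, msd):
    """Exhaustive grid search for the PA model at the grid sizes in the text:
    r on [0, 1] with step 0.1, similarity window h in {1, ..., 10},
    shock bound a on [0, 500] with step 50.

    simulate(r, h, a) -- placeholder returning simulated choice frequencies
    msd               -- scoring function comparing simulated and actual play
    """
    best = None
    for r10, h, a in itertools.product(range(11), range(1, 11), range(0, 501, 50)):
        r = r10 / 10
        score = msd(simulate(r, h, a), actual_freqs)
        if best is None or score < best[0]:
            best = (score, r, h, a)     # keep the lowest-MSD parameter vector
    return best
```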

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0, indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

16 Chen and Yuri Khoroshilov (2003) compare three versions of the PA model, in which shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions was statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r), by block:
              sFP:  1    2    3    4      PA:  1    2    3    4
α10β00             1.0  0.7  0.9  1.0         0.1  0.0  0.9  1.0
α20β00             0.7  0.6  0.1  0.1         0.5  1.0  0.2  0.1
α20β18             1.0  1.0  0.9  0.9         0.2  1.0  0.3  0.2
α10β20             1.0  1.0  1.0  0.9         0.1  0.3  0.6  0.3
α20β20             1.0  1.0  1.0  0.9         0.2  0.1  0.5  0.2
α20β40             1.0  1.0  1.0  0.7         0.1  1.0  0.4  0.3

Sensitivity (λ) and shock interval (a), by block:
              sFP λ: 1    2    3    4     fEWA λ: 1    2    3    4     PA a:  1    2    3    4
α10β00              1.5  4.4  4.5  2.8           1.6  4.5  4.4  3.7         500  500  250   50
α20β00              1.5  4.3 14.0 18.0           1.7  4.0  5.1  7.4          50  100  150   50
α20β18              1.6  8.0 11.5 18.3           1.5  5.3  6.2  9.5          50  300  500  150
α10β20              1.9  5.9  8.6 13.8           1.8  4.7  6.2  8.2         100  500  450  300
α20β20              2.8  5.6  8.1 11.2           2.4  3.8  6.1  6.7         200  300  350  350
α20β40              3.4  6.5 14.0 20.5           3.2  4.9  6.7  9.9         150   50  150   50

Similarity window (h), PA, by block:
α10β00       1    1   10    9
α20β00      10   10    9    6
α20β18       5    4    6    1
α10β20       1    1    8    3
α20β20       2    2    7    1
α20β40       1    3    2    2

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. The Wilcoxon signed-ranks tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for fEWA in every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the convergence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play
Session      α10β00   α20β00   α20β18   α10β20   α20β20   α20β40
1             0.955    0.934    0.940    0.930    0.888    0.940
2             0.960    0.946    0.890    0.917    0.973    0.878
3                      0.948             0.883    0.873
Overall       0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1             0.956    0.934    0.942    0.928    0.887    0.948
2             0.960    0.946    0.890    0.922    0.978    0.886
3                      0.946             0.879    0.871
Overall       0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1             0.952    0.933    0.914    0.941    0.888    0.916
2             0.957    0.944    0.899    0.922    0.917    0.898
3                      0.947             0.894    0.903
Overall       0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1             1.867    1.853    1.833    1.833    1.489    1.792
2             1.889    1.919    1.606    1.839    1.953    1.669
3                      1.917             1.447    1.539
Overall       1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall       0.976    0.976    0.976    0.976    0.976    0.976

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings in:

(i) p1: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

(iv) ε-p2: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β = 0 treatments. In addition, an increase in α significantly improves convergence, by a large margin (40 to 80 percent) for both players, when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.



In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (an α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0 improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY
(Two panels, Player 1 and Player 2, plot ε-equilibrium play in percent over rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
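Table 9's entries record the first round at which a treatment crosses a convergence threshold "for good," i.e., and never falls below it afterward. That notion can be pinned down in a few lines (a sketch; names are ours):

```python
def first_round_over_for_good(series, threshold):
    """Return the 1-indexed round at which `series` crosses `threshold` and
    stays at or above it for the remainder of the horizon, as in Table 9;
    return None if the threshold is never achieved for good."""
    crossed = None
    for t, value in enumerate(series, start=1):
        if value >= threshold:
            if crossed is None:
                crossed = t          # tentative first crossing
        else:
            crossed = None           # fell back below: reset
    return crossed
```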

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α20 superior to α10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players.

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which a treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                      Player 1                         Player 2
Threshold      0.4    0.5    0.6    0.7      0.4   0.5   0.6   0.7   0.8   0.9
α20β18          54     83    126    316       38    52    62    87   127     —
α10β20         129    221      —      —       36    48    63    91   148     —
α20β20          61     91    149      —       39    51    61    84   137     —
α20β40         101    166    263    496       30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

              Player 1 equilibrium price              Player 2 equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00         0.073                                     0.082
α20β00         0.188      α20β00 > α10β00    0.000       0.213      α20β00 > α10β00    0.000
α20β18         0.265      α20β18 > α20β40    0.187       0.381      α20β18 > α10β20    0.264
α10β20         0.211      α10β20 > α20β00    0.000       0.376      α10β20 > α20β20    0.001
α20β20         0.243      α20β20 > α10β20    0.000       0.353      α20β20 > α20β00    0.000
α20β40         0.259      α20β40 > α20β20    0.007       0.442      α20β40 > α20β18    0.000

              Player 1 ε-equilibrium price            Player 2 ε-equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00         0.216                                     0.232
α20β00         0.519      α20β00 > α10β00    0.000       0.573      α20β00 > α10β00    0.000
α20β18         0.713      α20β18 > α20β40    0.045       0.870      α20β18 > α10β20    0.008
α10β20         0.584      α10β20 > α20β00    0.000       0.857      α10β20 > α20β20    0.000
α20β20         0.662      α20β20 > α10β20    0.000       0.834      α20β20 > α20β00    0.000
α20β40         0.701      α20β40 > α20β20    0.000       0.925      α20β40 > α20β18    0.000


In the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA
(Top panel: simulation data, fraction of maximum welfare over rounds 50 to 450. Bottom panel: experimental data, fraction of maximum welfare over rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices, each with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then compare the performance of the matching model with that of the sFP model on the hold-out sample. Using the Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). That is, best response to a predicted price explains behavior significantly better than matching.

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price, p̃1, for each player 2. We also calculate the opportunity cost of matching p̃1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̃1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε in each block are as follows: α10β00: .10, .07, .06, .08; α20β00: .07, .06, .05, .04; α20β18: .07, .02, .03, .03; α10β20: .05, .03, .03, .02; α20β20: .01, .00, .00, .00; α20β40: .02, .00, .00, .01.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the β dummies nor their interactions are significant; the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant: in that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1.)

Dependent variable: probability that player 2's price equals the anticipated player 1 price

                          (1)          (2)
D10                      0.014        0.017
                        (0.043)      (0.041)
D00                     −0.077       −0.043
                        (0.035)      (0.036)
D18                      0.046        0.032
                        (0.053)      (0.056)
D40                      0.008        0.041
                        (0.061)      (0.042)
ln(Round)                0.041        0.028
                        (0.011)      (0.010)
Match Cost                           −0.001
                                     (0.000)
ln(Round) × D10          0.014        0.012
                        (0.012)      (0.012)
ln(Round) × D00          0.003        0.000
                        (0.013)      (0.012)
ln(Round) × D18          0.006        0.002
                        (0.021)      (0.020)
ln(Round) × D40          0.001        0.015
                        (0.018)      (0.016)
Observations             9,588        9,588
Log pseudo-likelihood   −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significance levels: * 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1 t 1 p2 t

p2 t 1 mp1 t n

where m [ (14c)][ (e4c2)] and n (er4c2)[ (e4c2)] The analogous differen-tial equation system is

(12) p1 p2 p1

p2 mp1 p2 n

We now use the Lyapunov second method toshow that system (12) is globally asymptoti-cally stable Define

V p1 p2 p1 p2 2

2

p0

p1

p mp n dp

where p0 0 is a constant We now show thatV(p1 p2) is the Lyapunov function of system(12)

Define G(p1) p0

p1 (p mp n) dp We get

G p1 12

1 mp12 np1

121 mp0

2 np0

As c 0 and e 0 for any 0 we alwayshave m 1 Therefore the function G(p1) hasa global minimum which is characterized bythe following first-order condition

dG p1

dp1 1 mp1 n 0

Therefore p1 n(1 m) er(e c) pis the global minimum When p1 p2 p(p1 p2)22 also reaches its global minimumTherefore V(p1 p2) is at its global minimumwhen p1 p2 p

Next we show that V 0

V V

p1p1

V

p2p2

p1 p2 1 mp1 np2 p1

p2 p1mp1 p2 n

2p1 p22 0

Let B be an open ball around (p p) in theplane For all (p1 p2) (p p) V 0

1533VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Therefore (p p) is a globally asymptoticallystable equilibrium of (12)
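The discrete-time system above is straightforward to simulate. The following sketch (our own illustration, not the authors' code) iterates the Cournot best-reply map with the experiment's parameters (c = 1/80, e = 1/40, r = 24) and confirms numerically that prices converge to p* = er/(e + c) = 16 for β below, near, at, and above the supermodularity threshold, as part (iii) asserts; the function name and starting prices are our own choices.

```python
# Cournot best-reply dynamics from the Appendix:
#   p1(t+1) = p2(t),   p2(t+1) = m * p1(t) + n,
# with m = (beta - 1/(4c)) / (beta + e/(4c^2)) and
#      n = (e * r / (4c^2)) / (beta + e/(4c^2)).
def cournot_best_reply(beta, c=1/80, e=1/40, r=24, p1=0.0, p2=40.0, rounds=500):
    denom = beta + e / (4 * c**2)
    m = (beta - 1 / (4 * c)) / denom
    n = (e * r / (4 * c**2)) / denom
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2

p_star = (1/40) * 24 / (1/40 + 1/80)  # er/(e + c) = 16
for beta in (0, 18, 20, 40):          # below, near, at, and above the threshold
    p1, p2 = cournot_best_reply(beta)
    assert abs(p1 - p_star) < 1e-6 and abs(p2 - p_star) < 1e-6
```

Under these parameters |m| < 1 for every β ≥ 0, so the iteration contracts to p* regardless of β; the experimental question is precisely whether human learning shares this robustness.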

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.


(p_1*, p_2*, x*) = ( er/(e + c), er/(e + c), r/[2(e + c)] ).

(iii) If players follow Cournot best-reply dynamics, (p_1*, p_2*) is a globally asymptotically stable equilibrium of the continuous-time dynamical system for any α ≥ 0 and β ≥ 0.

(iv) The game is supermodular if and only if α ≥ 0 and β ≥ 1/(4c).

PROOF: See Appendix.

The best-response functions presented in part (i) of Proposition 2 reveal interesting incentives. While player 1 has an incentive to always match player 2's price, player 2 has an incentive to match only when player 1 plays the equilibrium strategy. Also, at the threshold for strategic complementarity, β = 1/(4c), player 2 has a dominant strategy, p2 = er/(e + c) = p2*. Part (iii) of Proposition 2 extends Cheng (1998), who shows that the original version of the compensation mechanism (α = β = 0) is globally stable under continuous-time Cournot best-reply dynamics.4 Cournot best-reply is, however, a relatively poor description of human learning (Richard Boylan and Mahmoud El-Gamal, 1993). Therefore, global stability under Cournot best reply for any β ≥ 0 does not imply equilibrium convergence among human subjects. Part (iv) characterizes the threshold for strategic complementarity in our experiment, a more robust stability criterion than that characterized by part (iii).

An intuition for how strategic complementarities affect the outcome of adaptive learning can be gained from analysis of the best-response functions, equations (4) and (5). While player 1's best-response function is always upward sloping, player 2's best-response function, equation (5), is nondecreasing if and only if β ≥ 1/(4c), i.e., when the game is supermodular. Beyond the threshold for strategic complementarity, both best-response functions are upward sloping and they intersect at the equilibrium. It is easy to verify graphically that adaptive learners, e.g., Cournot best reply, will converge to the equilibrium regardless of where they start.

To examine how α, which is unrelated to strategic complementarity, might affect behavior, we observe that when player 1 deviates from the best response by δ, i.e., p1 = p2 + δ, his profit loss is Δπ1 = αδ². This profit loss, which is proportional to the magnitude of α, is the deviation cost for player 1. Based on previous experimental evidence, the incentive to deviate from best response decreases when the deviation cost increases. Chen and Plott (1996) call this the General Incentive Hypothesis, i.e., the error of game-theoretic models decreases as the incentive to best-respond increases. We therefore expect that an increase in α improves player 1's convergence to equilibrium. When player 1 plays the equilibrium strategy, player 2's best response is to play the equilibrium strategy as well. Therefore, we expect that an increase in α might improve player 2's convergence to equilibrium as well. It is not clear, however, whether the α-effects systematically change the effects of the supermodularity parameter β. We rely on experimental data to test the interaction of the α-effects and the β-effects.
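The deviation cost can be checked directly if, as in the generalized compensation mechanism, player 1's announcement enters his payoff only through a quadratic penalty term α(p1 − p2)² (a minimal sketch under that assumption; the function name is ours):

```python
# Player 1's announcement p1 affects his payoff only through the penalty
# alpha * (p1 - p2)**2, so deviating from the best response p1 = p2 by
# delta costs exactly alpha * delta**2.
def deviation_cost(alpha, delta):
    return alpha * delta ** 2

# Raising alpha from 10 to 20 doubles the cost of any given deviation,
# which is the basis of the hypothesized alpha-effect on convergence.
assert deviation_cost(20, 2) == 2 * deviation_cost(10, 2)
```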

II Experimental Design

Our experimental design reflects both theoretical and technical considerations. Specifically, we chose an environment that allows significant latitude in varying the free parameters to better assess the performance of the compensation mechanism around the threshold of strategic complementarity. We describe this environment and the experimental procedures below.

A The Economic Environment

We use the general payoff functions presented in equation (1) with quadratic cost and externality functions to obtain analytical solutions: c(x) = cx² and e(x) = ex². We use the following parameters: c = 1/80, e = 1/40, and r = 24. From Proposition 2, the subgame-perfect Nash equilibrium is (p1*, p2*, x*) = (16, 16, 320), and the threshold for strategic complementarity is β = 1/(4c) = 20.
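Under these functional forms the equilibrium and the supermodularity threshold can be verified with a few lines of arithmetic (our own sketch using only the parameter values above; variable names are ours):

```python
c, e, r = 1/80, 1/40, 24

p_star = e * r / (e + c)        # equilibrium price  er/(e + c)
x_star = r / (2 * (e + c))      # equilibrium quantity  r/(2(e + c))
beta_threshold = 1 / (4 * c)    # supermodularity threshold  1/(4c)

assert abs(p_star - 16) < 1e-9
assert abs(x_star - 320) < 1e-9
assert abs(beta_threshold - 20) < 1e-9
```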

4 Cheng (1998) also shows that the original mechanism is locally stable under discrete-time Cournot best-reply dynamics.


In the experiment, each player chooses pi ∈ {0, 1, ..., 40}. Without the compensation mechanism, the profit-maximizing production level is x = r/(2c) = 960, three times higher than the efficient level. To reduce the complexity of player 1's problem, we use a grid size of 10 for the quantity and truncate the strategy space, i.e., player 1 chooses X ∈ {0, 1, ..., 50}, where X = x/10. The payoff functions presented to the subjects are adjusted accordingly. Truncating the strategy space to X ≤ 50 also reduces the possibility of player 2's bankruptcy. To reduce payoff asymmetry, we give player 1 a lump-sum payment of 250 points each round. Therefore, the equilibrium payoffs under the mechanism for the two players are π1 = 1,530 and π2 = 2,560.

The functional forms and specific parameter values are chosen for several reasons. First, equilibrium solutions are integers. Second, equilibrium prices and quantities do not lie in the center of the strategy space, thus avoiding equilibrium convergence as a result of focal points. Third, there is a salient gap between efficiency with and without the mechanism. Without the mechanism, the profit-maximizing production level is X = 50, resulting in an efficiency level of 68.4 percent.5 With the mechanism, the system achieves 100-percent efficiency in equilibrium. Finally, since the threshold for strategic complementarity is β = 20, there is a large number of integer values to choose from both below and above the threshold.

To study how strategic complementarity affects equilibrium convergence, we keep α = 20 and vary β = 0, 18, 20, and 40. To study whether α affects convergence, we also keep α = 10 and vary β = 0 and 20.

B Experimental Procedures

Our experiment involves 12 players per session: six player 1s (called Red players in the instructions) and six player 2s (Blue players). Each player remains the same type throughout the experiment. At the beginning of each session, subjects randomly draw a PC terminal number. Each then sits in front of the corresponding terminal and is given printed instructions. After the instructions are read aloud, subjects are encouraged to ask questions. The instruction period lasts between 15 and 30 minutes.

Each round, a player 1 is randomly matched with a player 2. Subjects are randomly rematched each round to minimize repeated-game effects. The random rematching protocol also minimizes the possibility that players collude on a high-subsidy and low-tax outcome.6 Each session consists of 60 rounds. As we are interested in learning, there are no practice rounds. Each round consists of two stages:

(i) Announcement Stage: Each player simultaneously and independently chooses a price pi ∈ {0, 1, ..., 40}.

(ii) Production Stage: After (p1, p2) are chosen, player 1's computer displays player 2's price and a payoff table showing her payoff for each X ∈ {0, 1, ..., 50}. Player 1 then chooses a quantity X. The server calculates payoffs and sends each player his payoff, the quantity chosen, and the prices submitted by him and his match.

To summarize, each subject knows both payoff functions, the choices made each round by himself and his match, as well as his per-period and cumulative payoffs. At any point, subjects have ready access to all of this information. The mechanism is thus implemented as a game of complete information. We do not know, however, how subjects processed this information, nor do we know their beliefs about the rationality of others. Both of these factors introduce uncertainty into the environment.

5 We compute the efficiency level by using the ratio of total earnings without the mechanism and total earnings with the mechanism. Without the mechanism, at the production level of X = 50 (or x = 500), total earning is π1 + π2 = (rx − cx²) − ex² = 2,625. Therefore, the efficiency level is 2,625/(1,280 + 2,560) = 0.684.

6 If the players could commit to maximizing joint profits by choosing p1 and p2 cooperatively and x noncooperatively, they would choose, by equation (1), (p1, p2, x) = (4(α + β)er/[4(α + β)(e + c) − 1], r[4e(α + β) − 1]/[4(α + β)(e + c) − 1], 2r(α + β)/[4(α + β)(e + c) − 1]). With our choice of parameters, we get p1 = 24 and p2 = 12 for α = 20 and β = 0; p1 = 19.2 and p2 = 14.4 for α = 20 and β = 20; and p1 = 18 and p2 = 15 for α = 20 and β = 40; etc. If, in addition, they could choose (p1, p2, x) cooperatively, then for our parameter values we get (p2 − p1, x) = (min{480/[3(α + β) − 20], 40}, min{960(α + β)/[3(α + β) − 20], 500}).
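The efficiency calculation in footnote 5 can be reproduced directly (our own arithmetic sketch; payoffs are in points):

```python
c, e, r = 1/80, 1/40, 24

# Without the mechanism, player 1 produces at the truncated maximum x = 500
# (X = 50); joint earnings are (rx - cx^2) - ex^2.
x = 500
total_without = (r * x - c * x**2) - e * x**2      # = 2,625 points

# With the mechanism, equilibrium joint earnings (net of player 1's
# 250-point lump-sum payment) are 1,280 + 2,560 = 3,840 points.
total_with = 1280 + 2560

efficiency = total_without / total_with
assert abs(total_without - 2625) < 1e-6
assert round(efficiency, 3) == 0.684
```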

Table 1 presents features of the experimental sessions, including parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used z-Tree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one and a half hours. The exchange rate for all treatments was one dollar for 4,250 points. The average earnings were $22.82. Data are available from the authors upon request.

III Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: level and speed.7 In theory, convergence implies that all players play the stage-game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, Lb(t1, t2), measures the average proportion of Nash equilibrium play between rounds t1 and t2, i.e., Lb(t1, t2) = Σ_{t=t1}^{t2} L(t)/(t2 − t1 + 1), where 0 < t1 ≤ t2 ≤ T, and T is the total number of rounds.

We define the level of convergence for both a round and a block of rounds. The block convergence measure smooths out inter-round variation.

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition for the speed of convergence.

DEFINITION 2: For a given level of convergence, L̄ ∈ (0, 1], the speed of convergence is measured by the first round, t̄, in which the level of convergence reaches L̄ and does not subsequently drop below this level, i.e., t̄ such that L(t) ≥ L̄ for any t ≥ t̄.
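Definitions 1 and 2 translate directly into code. The sketch below (our own illustration on synthetic data, not the authors' analysis code) computes the round-level convergence L(t), the block average Lb(t1, t2), and the speed of convergence for a target level L̄:

```python
def level(plays, t):
    """L(t): proportion of Nash equilibrium play in round t (1-indexed)."""
    round_plays = plays[t - 1]
    return sum(round_plays) / len(round_plays)

def block_level(plays, t1, t2):
    """Lb(t1, t2): average proportion of equilibrium play over a block."""
    return sum(level(plays, t) for t in range(t1, t2 + 1)) / (t2 - t1 + 1)

def speed(plays, L_bar):
    """First round t_bar with L(t) >= L_bar for all t >= t_bar (Definition 2)."""
    T = len(plays)
    for t_bar in range(1, T + 1):
        if all(level(plays, t) >= L_bar for t in range(t_bar, T + 1)):
            return t_bar
    return None  # level L_bar is never sustained

# Synthetic data: each entry is one round; 1 = equilibrium play, 0 = not.
plays = [[0, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 1], [1, 1, 0, 1]]
assert level(plays, 4) == 1.0
assert block_level(plays, 4, 5) == 0.875
assert speed(plays, 0.75) == 3
```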

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in the experimental data, Definition 2 is not practical there. We instead use the following observation, which enables us to measure convergence speed using estimates for the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let Ly(t) be the convergence level of treatment y at

7 We thank anonymous referees for suggesting this separation and the appropriate measures.

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

Treatment   Parameters (α, β)   Sessions   Supermodular?   Equilibrium (p1, p2, X)
α10β00      (10, 0)             4          No              (16, 16, 32)
α20β00      (20, 0)             5          No              (16, 16, 32)
α20β18      (20, 18)            4          No              (16, 16, 32)
α10β20      (10, 20)            5          Yes             (16, 16, 32)
α20β20      (20, 20)            5          Yes             (16, 16, 32)
α20β40      (20, 40)            4          Yes             (16, 16, 32)

time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.

OBSERVATION 1: If L1(1) ≥ L2(1) and ΔL1(t) ≥ ΔL2(t) for all t ∈ [1, T − 1], then, given any L̄ ∈ (0, 1], the first treatment converges more quickly than the second treatment.

Based on the theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with those near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or β = 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis. Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and speed of convergence to the subgame-perfect equilibrium and to near-equilibrium in each of the six treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2, and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. It is therefore not surprising that, in all of our analyses, the results for quantities largely mirror the results for player 2's price.8

Recall that the subgame-perfect Nash equilibrium for the stage game is (p1*, p2*, X*) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check for ε-equilibrium play by looking at the proportion of price announcements within ±1 of the equilibrium price and quantity announcements within ±4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. The ε-equilibrium prediction is therefore (ε-p1, ε-p2, ε-x) = ({15, 16, 17}, {15, 16, 17}, {28, 32, 36}).

8 Tables are available from the authors upon request.

Figures 1 and 2 contain box-and-whiskers plots of the prices in each treatment for all 60 rounds, for players 1 and 2, respectively. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence (Lb(41, 60)) to the subgame-perfect Nash equilibrium (panels A and B) and to ε-Nash equilibrium (panels C and D) for each session under each of the six treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column 8) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects); parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly10 increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effects on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result to show that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20),

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: by pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as large as the one observed when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.
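The permutation test described in footnote 9 can be made concrete. The sketch below (our own implementation, not the authors' code) enumerates all C(10, 5) = 252 equal-sized splits of the pooled session-level data for the α20β00 and α20β20 treatments from Table 2, panel A, and counts splits whose difference in means is at least as large as the observed one. Note that exact p-values depend on how ties at the observed statistic are counted, so the figure computed here need not match the table's reported rounding.

```python
from itertools import combinations

def permutation_p(group_a, group_b):
    """One-tailed exact permutation test of H1: mean(group_b) > mean(group_a)."""
    pooled = group_a + group_b
    observed = sum(group_b) / len(group_b) - sum(group_a) / len(group_a)
    count = total = 0
    for idx in combinations(range(len(pooled)), len(group_b)):
        b = [pooled[i] for i in idx]
        a = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = sum(b) / len(b) - sum(a) / len(a)
        count += diff >= observed - 1e-12   # weak inequality, tolerant of float ties
        total += 1
    return count / total

# Session-level convergence levels, Table 2, panel A.
a20b00 = [0.000, 0.083, 0.083, 0.042, 0.058]   # alpha = 20, beta = 0
a20b20 = [0.325, 0.267, 0.208, 0.058, 0.367]   # alpha = 20, beta = 20
p = permutation_p(a20b00, a20b20)
assert p < 0.05   # the difference is significant, as reported in Table 2
```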

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.


the improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold of 20 to 40. Therefore, the marginal returns for being

FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1

[Figure: six box-and-whiskers panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting Price Announced (0–40) against Round (1–60).]


"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β ≥ 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that at

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2

[Figure: six box-and-whiskers panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting Price Announced (0–40) against Round (1–60).]


the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3) and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies, Dy, where y ∈ {α10β00, α20β00, α20β18, α10β20, α20β40}; ln(Round); and a constant. We use dummies for different values of α and β, rather than direct parameter values, to avoid assuming a linear relationship of parameter effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.000       0.008       0.158       0.008                   0.044     α10β00 < α20β00   0.397
α20β00      0.000       0.083       0.083       0.042       0.058       0.053     α20β00 < α20β20   0.008
α20β18      0.125       0.117       0.100       0.192                   0.133     α20β00 < α20β18   0.008
α10β20      0.242       0.067       0.067       0.067       0.208       0.130     α10β20 < α20β20   0.064
α20β20      0.325       0.267       0.208       0.058       0.367       0.245     α20β18 < α20β20   0.064
α20β40      0.175       0.400       0.158       0.133                   0.217     α20β20 < α20β40   0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.025       0.042       0.025       0.067                   0.040     α10β00 < α20β00   0.056
α20β00      0.067       0.083       0.033       0.075       0.050       0.062     α20β00 < α20β20   0.032
α20β18      0.133       0.292       0.108       0.258                   0.198     α20β00 < α20β18   0.008
α10β20      0.392       0.108       0.175       0.067       0.667       0.282     α10β20 < α20β20   0.504
α20β20      0.308       0.117       0.483       0.017       0.450       0.275     α20β18 < α20β20   0.238
α20β40      0.325       0.492       0.233       0.233                   0.321     α20β20 < α20β40   0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.108       0.200       0.350       0.158                   0.204     α10β00 < α20β00   0.183
α20β00      0.067       0.458       0.358       0.183       0.425       0.298     α20β00 < α20β20   0.087
α20β18      0.317       0.625       0.300       0.583                   0.456     α20β00 < α20β18   0.103
α10β20      0.475       0.200       0.300       0.375       0.492       0.368     α10β20 < α20β20   0.194
α20β20      0.450       0.558       0.475       0.133       0.775       0.478     α20β18 < α20β20   0.444
α20β40      0.458       0.492       0.375       0.442                   0.442     α20β20 < α20β40   0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2), and permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                p-value
α10β00      0.133       0.200       0.242       0.225                   0.200     α10β00 < α20β00   0.016
α20β00      0.300       0.400       0.200       0.275       0.333       0.302     α20β00 < α20β20   0.016
α20β18      0.717       0.667       0.342       0.700                   0.606     α20β00 < α20β18   0.016
α10β20      0.650       0.517       0.542       0.400       0.892       0.600     α10β20 < α20β20   0.564
α20β20      0.533       0.592       0.642       0.225       0.900       0.578     α20β18 < α20β20   0.556
α20β40      0.825       0.792       0.467       0.517                   0.650     α20β20 < α20β40   0.294

Notes: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.

1516 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We will examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the

coefficient for the interaction term Dy × ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), L′(t), between treatments. From the probit specification, L(t) = Φ(constant + D·b·ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                              (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                              price 1      price 2      price 1      price 2

Dα10β00                       −0.143***    −0.268***
                              (0.048)      (0.050)
Dα20β00                       −0.090       −0.217***
                              (0.060)      (0.057)
Dα20β18                       0.005        0.004
                              (0.071)      (0.069)
Dα10β20                       −0.080       0.025
                              (0.060)      (0.073)
Dα20β40                       0.000        0.078
                              (0.068)      (0.078)
ln(Round)                     0.115***     0.167***     0.131***     0.183***
                              (0.014)      (0.017)      (0.021)      (0.022)
Dα10β00×ln(Round)                                       −0.050***    −0.095***
                                                        (0.019)      (0.022)
Dα20β00×ln(Round)                                       −0.031       −0.073***
                                                        (0.020)      (0.021)
Dα20β18×ln(Round)                                       0.001        0.004
                                                        (0.020)      (0.021)
Dα10β20×ln(Round)                                       −0.024       0.008
                                                        (0.019)      (0.022)
Dα20β40×ln(Round)                                       −0.002       0.023
                                                        (0.019)      (0.024)

Observations                  9,720        9,720        9,720        9,720
Number of groups              162          162          162          162
Log pseudo-likelihood         −5,562.890   −5,824.055   −5,557.424   −5,793.013

H0: Dα10β00 = Dα20β00                                                1.50     1.31
H0: Dα10β00×ln(Round) = Dα20β00×ln(Round)                            1.33     1.23
H0: Dα10β20 − Dα10β00 + Dα20β00 = 0                                  0.05     1.07
H0: Dα10β20×ln(Round) − Dα10β00×ln(Round) + Dα20β00×ln(Round) = 0    0.05     1.03

Notes: Coefficients are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level. Dy is the dummy variable for treatment y. Excluded dummy is Dα20β20. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.

1517VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

the slope of the probability function for treatment y with respect to t: L′y(t) = φ(constant + D·b·ln(t))·by/t. All coefficients reported in Table 3 are increments of these probability derivatives compared to the baseline case of α20β20.
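The slope formula above can be checked numerically. A minimal sketch (with an illustrative constant and coefficient, not the estimated values; Phi and phi denote the standard normal CDF and density):

```python
import math

def Phi(x):
    # standard normal CDF
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def L(t, const, b_y):
    # probability of epsilon-equilibrium play in round t: Phi(const + b_y * ln t)
    return Phi(const + b_y * math.log(t))

def slope(t, const, b_y):
    # d/dt Phi(const + b_y ln t) = phi(const + b_y ln t) * b_y / t
    return phi(const + b_y * math.log(t)) * b_y / t

# illustrative values, not the paper's estimates
const, b_y, t = -1.0, 0.15, 30.0
numeric = (L(t + 1e-5, const, b_y) - L(t - 1e-5, const, b_y)) / 2e-5
print(abs(numeric - slope(t, const, b_y)) < 1e-8)
```

The central-difference check confirms that the analytic slope matches the derivative of the probit probability.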

RESULT 2 (Speed of Convergence)

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test examines whether the coefficient of Dα10β00×ln(Round) equals that of Dα20β00×ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the relationship between strategic complementarity and the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β18 treatment and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution for each treatment.13 As the treatments all start from and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two β20 supermodular treatments. We use Round, ln(Round), and their interactions with the α20β18 dummy as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the β20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and the β20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on Dα20β18×ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing α20β18 with the β20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash     (2) ε-Nash
                          price 1        price 2

ln(Round)                 0.054          0.216***
                          (0.040)        (0.043)
Round                     0.004          0.001
                          (0.002)        (0.002)
Dα20β18×ln(Round)         0.013          −0.066*
                          (0.032)        (0.035)
Dα20β18×Round             0.001          0.006**
                          (0.002)        (0.003)

Observations              4,680          4,680
Log pseudo-likelihood     −2,879.07      −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are Dα20β20 and Dα20β40.


α20β18 relative to the supermodular treatments. The β18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on Dα20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters, α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in

equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r·x(t) − c(x(t)) − e(x(t))] / [r·x* − c(x*) − e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 < t2 ≤ 60, is then defined as

e(t1, t2) = Σ_{t=t1}^{t2} e(t) / (t2 − t1 + 1).
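As a concrete illustration of these two measures, the sketch below uses placeholder quadratic cost and externality functions, not the experiment's actual parameterization:

```python
def efficiency(x, r, c, e, x_star):
    # per-round measure: e(t) = [r*x - c(x) - e(x)] / [r*x* - c(x*) - e(x*)]
    return (r * x - c(x) - e(x)) / (r * x_star - c(x_star) - e(x_star))

def block_efficiency(xs, r, c, e, x_star):
    # block measure: average of per-round efficiencies over the block
    return sum(efficiency(x, r, c, e, x_star) for x in xs) / len(xs)

# placeholder functional forms: c(x) = x^2/2, e(x) = x^2/4, marginal revenue r = 12
c = lambda x: x * x / 2.0
e = lambda x: x * x / 4.0
r = 12.0
# the efficient quantity maximizes r*x - c(x) - e(x), i.e., r = c'(x) + e'(x) = 1.5x
x_star = r / 1.5

print(efficiency(x_star, r, c, e, x_star))   # 1.0 at the efficient quantity
print(block_efficiency([6.0, 7.0, 8.0], r, c, e, x_star))
```

Efficiency equals one only when the chosen quantity is efficient, so block averages below one reflect out-of-equilibrium play.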

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency: Last 20 rounds
                                                                                  Permutation tests
Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                 p-value
α10β00      0.600       0.671       0.804       0.858       —           0.733     α10β00 < α20β00    0.087*
α20β00      0.650       0.870       0.907       0.873       0.825       0.825     α20β00 < α20β20    0.095*
α20β18      0.963       0.952       0.889       0.959       —           0.941     α20β00 < α20β18    0.016**
α10β20      0.958       0.919       0.905       0.881       0.984       0.929     α10β20 < α20β20    0.714
α20β20      0.888       0.937       0.939       0.780       0.971       0.903     α20β18 < α20β20    0.833
α20β40      0.973       0.962       0.943       0.949       —           0.957     α20β20 < α20β40    0.064*

Note: * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus the tax minus the subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
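A minimal sketch of this surplus calculation, following the definition just given (parameter and price values are illustrative):

```python
def budget_surplus(alpha, beta, p1, p2, x):
    # s(t) = (alpha + beta)*(p1 - p2)^2 + p2*x - p1*x
    # i.e., penalty terms plus tax minus subsidy in round t
    return (alpha + beta) * (p1 - p2) ** 2 + p2 * x - p1 * x

# off the equilibrium path, price disagreement generates a surplus
print(budget_surplus(20, 18, 10.0, 8.0, 5.0))  # 142.0
# in equilibrium (p1 = p2) the mechanism balances the budget
print(budget_surplus(20, 18, 9.0, 9.0, 5.0))   # 0.0
```

The second call illustrates the point made above: the budget balances exactly when announced prices agree, which happens only in equilibrium.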

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. We look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data, by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃−i(t + 1), according to

(6) p̃−i(t + 1) = [p−i(t) + Σ_{τ=1}^{t−1} r^τ p−i(t − τ)] / [1 + Σ_{τ=1}^{t−1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃−i(t + 1) = p−i(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pi is given by

(7) Pr(pi | p̃−i) = exp(λ·π(pi, p̃−i)) / Σ_{j=0}^{40} exp(λ·π(pj, p̃−i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter: as λ increases, the probability of a best response to p̃−i increases.
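Equations (6) and (7) can be sketched as follows; the quadratic payoff function is a placeholder for the compensation-mechanism payoff, and the parameter values are illustrative:

```python
import math

def predicted_price(history, r):
    # eq. (6): discounted fictitious-play forecast of the match's next price;
    # history = [p(1), ..., p(t)], most recent observation last
    t = len(history)
    num = history[-1] + sum(r ** tau * history[t - 1 - tau] for tau in range(1, t))
    den = 1.0 + sum(r ** tau for tau in range(1, t))
    return num / den

def choice_probs(payoff, p_tilde, lam, prices=range(41)):
    # eq. (7): logit choice probabilities over the price grid 0..40
    weights = [math.exp(lam * payoff(p, p_tilde)) for p in prices]
    total = sum(weights)
    return [w / total for w in weights]

# placeholder payoff, peaked at the predicted price
payoff = lambda p, p_tilde: -(p - p_tilde) ** 2

p_tilde = predicted_price([20, 18, 17], r=0.9)
probs = choice_probs(payoff, p_tilde, lam=0.5)
print(round(sum(probs), 6))                     # probabilities sum to 1
print(max(range(41), key=lambda p: probs[p]))   # modal choice is near the forecast
```

With r = 0 the forecast collapses to the last observed price (Cournot), and larger λ concentrates the choice distribution on best responses, as described above.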

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs, π(pi, p̃−i), replaced by strategy attractions. In both variants, a strategy attraction, Ai^j(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) Ai^j(t) = [φ·N(t − 1)·Ai^j(t − 1) + [δ + (1 − δ)·I(pi^j, pi(t))]·πi(pi^j, p−i(t))] / N(t),

where the indicator function I(pi^j, pi(t)) equals one if pi^j = pi(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function, φi(t), is given by

(9) φi(t) = 1 − 0.5·Σ_{j=1}^{m−i} [I(p−i^j, p−i(t)) − Σ_{τ=1}^{t} I(p−i^j, p−i(τ))/t]².

This function will be close to 1 when recent history resembles previous history. The imagination weight, δi(t), equals φi(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, κ = 0, φ = 1), and Cournot best reply (δ = 1, φ = 0).
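The attraction and experience-weight updates in equation (8) can be sketched as follows, with φ, δ, and κ held fixed for clarity (in fEWA they are determined endogenously, as described above); the strategy grid and payoffs are illustrative:

```python
def ewa_update(A, N, played, payoffs, phi, delta, kappa):
    # A: attractions A^j(t-1) over the strategy grid
    # payoffs[j]: payoff pi(p^j, p_-i(t)) each strategy j would have earned
    N_new = phi * (1.0 - kappa) * N + 1.0  # experience-weight recursion
    A_new = []
    for j, (a, pay) in enumerate(zip(A, payoffs)):
        # played strategy gets full payoff weight; unplayed get delta * payoff
        weight = delta + (1.0 - delta) * (1.0 if j == played else 0.0)
        A_new.append((phi * N * a + weight * pay) / N_new)  # eq. (8)
    return A_new, N_new

A, N = [0.0, 0.0, 0.0], 1.0
payoffs = [1.0, 3.0, 2.0]   # hypothetical payoffs for strategies 0, 1, 2
A, N = ewa_update(A, N, played=1, payoffs=payoffs, phi=0.9, delta=0.5, kappa=0.2)
print(N)
print(A)
```

After one update, the played strategy's attraction reflects its full payoff, while unplayed strategies are reinforced only at the imagination weight δ, the mechanism that nests the reinforcement and belief-learning special cases listed above.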

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let πj(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agent use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, fjk(h, t), to denote the similarity between the played strategy k and an unplayed strategy j at period t:

fjk(h, t) = 1 − |j − k|/h if |j − k| < h; 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, fjk(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs and does not explicitly take into account the likelihood of alternate states. Let uj(t) denote the subjective assessment of strategy pj at time t, and r the discount factor. Payoff assessments are updated through a weighted average of the player's previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10) uj(t + 1) = (1 − r·fjk(h, t))·uj(t) + r·fjk(h, t)·πk(t), ∀j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, zj(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ūj(t) = uj(t) + zj(t). At time t, he chooses strategy pk if

(11) ūk(t) − ūj(t) ≥ 0, ∀pj ≠ pk.

Note that mood shocks affect only his choices and not the manner in which assessments are updated.
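A compact sketch of the PA components (the Bartlett similarity function and equations (10) and (11)), with illustrative parameter values:

```python
import random

def bartlett(j, k, h):
    # similarity between unplayed strategy j and played strategy k
    return 1.0 - abs(j - k) / h if abs(j - k) < h else 0.0

def pa_update(u, played, realized_payoff, r, h):
    # eq. (10): u_j(t+1) = (1 - r f_jk) u_j(t) + r f_jk * pi_k(t), for all j
    return [(1.0 - r * bartlett(j, played, h)) * uj
            + r * bartlett(j, played, h) * realized_payoff
            for j, uj in enumerate(u)]

def pa_choose(u, a, rng):
    # eq. (11): pick the strategy with the highest shock-distorted assessment,
    # with mood shocks drawn uniformly from [-a, a]
    shocked = [uj + rng.uniform(-a, a) for uj in u]
    return max(range(len(u)), key=lambda j: shocked[j])

rng = random.Random(0)
u = [0.0] * 41                       # assessments over the price grid 0..40
u = pa_update(u, played=20, realized_payoff=10.0, r=0.5, h=3)
print(u[20], u[21], u[22])           # spillover fades with distance from 20
print(pa_choose(u, a=0.1, rng=rng))  # with small shocks, the choice is 20
```

The similarity window spreads the realized payoff to nearby prices, which is how the model captures strategy spillover on this ordered grid; the shocks generate the experimentation discussed in the calibration results below.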

B Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters in later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model, in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.
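The MSD criterion can be sketched as follows: compare simulated and observed strategy frequency distributions round by round and average the squared deviations (the toy distributions below are illustrative; the paper's exact scoring may differ in detail):

```python
def msd(sim_freqs, obs_freqs):
    # mean squared deviation between simulated and observed strategy
    # frequencies, averaged over rounds and strategies
    total, n = 0.0, 0
    for sim_round, obs_round in zip(sim_freqs, obs_freqs):
        for s, o in zip(sim_round, obs_round):
            total += (s - o) ** 2
            n += 1
    return total / n

# two rounds, three strategies (toy frequency distributions)
sim = [[0.2, 0.5, 0.3], [0.1, 0.7, 0.2]]
obs = [[0.25, 0.45, 0.3], [0.1, 0.6, 0.3]]
print(msd(sim, obs))
print(msd(obs, obs))   # a perfect fit scores 0
```

Lower scores mean a better fit, which is why the calibration search below minimizes this quantity.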

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements:

(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and a normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and via equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:

(a) Determine each player 1's best response to p2.

(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.

(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor, r ∈ [0, 1], is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks, z, are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
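A generic version of this grid search can be sketched as follows. The objective function is a stand-in with a known minimum rather than the Monte Carlo MSD score; the grids follow the text:

```python
import itertools

def grid_search(score, r_grid, lam_grid):
    # exhaustively score each (r, lambda) pair and keep the minimizer,
    # mirroring the MSD-minimizing search over the parameter grids
    best, best_params = float("inf"), None
    for r, lam in itertools.product(r_grid, lam_grid):
        s = score(r, lam)
        if s < best:
            best, best_params = s, (r, lam)
    return best_params, best

# stand-in objective with a known minimum at r = 0.9, lambda = 4.5
score = lambda r, lam: (r - 0.9) ** 2 + (lam - 4.5) ** 2

r_grid = [round(0.1 * i, 1) for i in range(11)]           # 0.0 .. 1.0, step 0.1
lam_grid = [round(1.5 + 0.1 * i, 1) for i in range(91)]   # 1.5 .. 10.5, step 0.1
params, s = grid_search(score, r_grid, lam_grid)
print(params)   # (0.9, 4.5)
```

In the paper's procedure, the scored quantity for each grid point is the MSD between simulated and observed play, so each evaluation involves a full Monte Carlo run rather than a closed-form objective.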

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of the PA model, where shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA is statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Model:      sFP                       fEWA                      PA
Block:      1     2     3     4       1     2     3     4       1     2     3     4

            Discount rate (r)                                   Discount rate (r)
α10β00      1.0   0.7   0.9   1.0                               0.1   0.0   0.9   1.0
α20β00      0.7   0.6   0.1   0.1                               0.5   1.0   0.2   0.1
α20β18      1.0   1.0   0.9   0.9                               0.2   1.0   0.3   0.2
α10β20      1.0   1.0   1.0   0.9                               0.1   0.3   0.6   0.3
α20β20      1.0   1.0   1.0   0.9                               0.2   0.1   0.5   0.2
α20β40      1.0   1.0   1.0   0.7                               0.1   1.0   0.4   0.3

            Sensitivity (λ)           Sensitivity (λ)           Shock interval (a)
α10β00      1.5   4.4   4.5   2.8     1.6   4.5   4.4   3.7     500   500   250   50
α20β00      1.5   4.3   14.0  18.0    1.7   4.0   5.1   7.4     50    100   150   50
α20β18      1.6   8.0   11.5  18.3    1.5   5.3   6.2   9.5     50    300   500   150
α10β20      1.9   5.9   8.6   13.8    1.8   4.7   6.2   8.2     100   500   450   300
α20β20      2.8   5.6   8.1   11.2    2.4   3.8   6.1   6.7     200   300   350   350
α20β40      3.4   6.5   14.0  20.5    3.2   4.9   6.7   9.9     150   50    150   50

                                                                Similarity window (h)
α10β00                                                          1     1     10    9
α20β00                                                          10    10    9     6
α20β18                                                          5     4     6     1
α10β20                                                          1     1     8     3
α20β20                                                          2     2     7     1
α20β40                                                          1     3     2     2


indicating that players keep track of all past play when forming beliefs about opponents' next moves.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame perfect Nash equilibrium in every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black boxes) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the simulated reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the parameters calibrated on the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated on rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Prk = max{Pr4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3          —        0.948    —        0.883    0.873    —
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3          —        0.946    —        0.879    0.871    —
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3          —        0.947    —        0.894    0.903    —
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3          —        1.917    —        1.447    1.539    —
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall    0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ≻ α20β40 ∼ α20β20 ∼ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ∼ α20β20 ∼ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ∼ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA

1526    THE AMERICAN ECONOMIC REVIEW    DECEMBER 2004

(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β = 0 treatments. In addition, an increase in α significantly improves convergence by a large margin (40 to 80 percent) for both players when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2, each plotting the percentage of ε-equilibrium play, from 0 to 100, against rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the proportion of players playing the ε-equilibrium strategy. Each entry indicates the round in which a treatment went over the threshold for good, where "—" indicates that the treatment never achieved the threshold.)

                     Player 1                    Player 2
Threshold    0.4   0.5   0.6   0.7       0.4   0.5   0.6   0.7   0.8   0.9

α20β18        54    83   126   316        38    52    62    87   127    —
α10β20       129   221    —     —         36    48    63    91   148    —
α20β20        61    91   149    —         39    51    61    84   137    —
α20β40       101   166   263   496        30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

             Player 1 equilibrium price           Player 2 equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value

α10β00        0.073                                     0.082
α20β00        0.188       α20β00 > α10β00   0.000       0.213       α20β00 > α10β00   0.000
α20β18        0.265       α20β18 > α20β40   0.187       0.381       α20β18 > α10β20   0.264
α10β20        0.211       α10β20 > α20β00   0.000       0.376       α10β20 > α20β20   0.001
α20β20        0.243       α20β20 > α10β20   0.000       0.353       α20β20 > α20β00   0.000
α20β40        0.259       α20β40 > α20β20   0.007       0.442       α20β40 > α20β18   0.000

             Player 1 ε-equilibrium price         Player 2 ε-equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value

α10β00        0.216                                     0.232
α20β00        0.519       α20β00 > α10β00   0.000       0.573       α20β00 > α10β00   0.000
α20β18        0.713       α20β18 > α20β40   0.045       0.870       α20β18 > α10β20   0.008
α10β20        0.584       α10β20 > α20β00   0.000       0.857       α10β20 > α20β20   0.000
α20β20        0.662       α20β20 > α10β20   0.000       0.834       α20β20 > α20β00   0.000
α20β40        0.701       α20β40 > α20β20   0.000       0.925       α20β40 > α20β18   0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Two panels plotting the fraction of maximum welfare by round for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40: simulation data over rounds 50 to 450, and experimental data over rounds 0 to 60.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε̂, and one of the other 40 prices with probability ε̂/40. We estimate ε̂ using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using the Wilcoxon signed-rank test to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). That is, best response to a predicted price explains behavior significantly better than matching.
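The model comparison can be made concrete. The paper's exact MSD metric is defined in its Section V, which is not reproduced in this excerpt; the sketch below assumes a standard quadratic (Brier-style) scoring rule over the 41-point price grid, and the function names are ours:

```python
def matching_model_probs(anticipated, eps, grid=41):
    """Stochastic matching model: play the anticipated price with probability
    1 - eps, and each of the other 40 grid prices with probability eps/40."""
    return [1 - eps if p == anticipated else eps / 40 for p in range(grid)]

def msd(predictions, choices):
    """Mean squared deviation of predicted price distributions from the
    observed choices, coded as one-hot vectors over the price grid."""
    total = 0.0
    for probs, choice in zip(predictions, choices):
        total += sum((q - (1.0 if p == choice else 0.0))**2
                     for p, q in enumerate(probs))
    return total / len(predictions)
```

A lower MSD on the hold-out sample indicates a better out-of-sample fit, which is the criterion applied to the sFP and matching models above.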

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β and also depends on the price to be matched.
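The opportunity-cost calculation can be sketched as follows. The payoff function below is our reconstruction of player 2's problem, in which player 1 best-responds in the production stage with x = (r − p2)/(2c); the parameter values (r = 24, c = 1/80, e = 1/40) are inferred from the magnitudes reported in the design section, and the restriction to the price grid {0, ..., 40} mirrors the experiment:

```python
def x_br(p2, r=24.0, c=1/80):
    """Player 1's production-stage best response to player 2's price."""
    return (r - p2) / (2*c)

def pi2(p1, p2, beta, r=24.0, c=1/80, e=1/40):
    """Player 2's payoff when player 1 best-responds in the production stage:
    compensation received, minus externality cost, minus the mismatch penalty."""
    x = x_br(p2, r, c)
    return p1*x - e*x**2 - beta*(p1 - p2)**2

def match_cost(p1_hat, beta, grid=range(41)):
    """Opportunity cost of matching the anticipated price p1_hat:
    best-response profit minus matching profit, over the experimental grid."""
    best = max(pi2(p1_hat, p2, beta) for p2 in grid)
    return best - pi2(p1_hat, p1_hat, beta)
```

Under this reconstruction, the cost is zero at the equilibrium price and, for anticipated prices away from equilibrium, falls as β rises, which is precisely the payoff-relevant channel the regression controls for.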

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε̂ in each block are as follows: α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1.)

Dependent variable: probability player 2 price equals anticipated player 1 price

                          (1)          (2)
D10                      0.014        0.017
                        (0.043)      (0.041)
D00                     −0.077       −0.043
                        (0.035)      (0.036)
D18                      0.046        0.032
                        (0.053)      (0.056)
D40                      0.008        0.041
                        (0.061)      (0.042)
ln(Round)                0.041        0.028
                        (0.011)      (0.010)
Match Cost                           −0.001
                                     (0.000)
ln(Round) × D10          0.014        0.012
                        (0.012)      (0.012)
ln(Round) × D00          0.003        0.000
                        (0.013)      (0.012)
ln(Round) × D18          0.006        0.002
                        (0.021)      (0.020)
ln(Round) × D40          0.001        0.015
                        (0.018)      (0.016)

Observations             9,588        9,588
Log pseudo-likelihood   −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Dy is the dummy variable for treatment y; the excluded dummy is that for treatment α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors unrelated to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential equation system is

(12)  dp1/dt = p2 − p1,

      dp2/dt = m p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫_{p0}^{p1} (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is the Lyapunov function of system (12).

Define G(p1) ≡ ∫_{p0}^{p1} (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − np1 − [(1/2)(1 − m)p0² − np0].

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that dV/dt ≤ 0:

dV/dt = (∂V/∂p1)(dp1/dt) + (∂V/∂p2)(dp2/dt)

      = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)

      = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), dV/dt < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
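The discrete best-reply system above can also be checked numerically. The sketch below is ours, not the authors'; it assumes parameter values consistent with the magnitudes reported in the design section (r = 24, c = 1/80, e = 1/40, which give p* = er/(e + c) = 16 and the supermodularity threshold β = 20):

```python
def best_reply_path(beta, r=24.0, c=1/80, e=1/40, p1=0.0, p2=0.0, rounds=200):
    """Iterate the Cournot best-reply map p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n
    and return the final price pair."""
    m = (beta - 1/(4*c)) / (beta + e/(4*c**2))
    n = (e*r / (4*c**2)) / (beta + e/(4*c**2))
    for _ in range(rounds):
        p1, p2 = p2, m*p1 + n
    return p1, p2

# With these parameters m = (beta - 20)/(beta + 40), so |m| < 1 for every
# beta >= 0 and the iteration settles at p* = er/(e + c) = 16, even in the
# non-supermodular beta = 0 treatments.
p1, p2 = best_reply_path(beta=0.0)
```

This mirrors part (iii) of the proposition: stability under Cournot dynamics holds for all β ≥ 0, not only above the supermodularity threshold.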

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



In the experiment, each player chooses pi ∈ {0, 1, ..., 40}. Without the compensation mechanism, the profit-maximizing production level is x = r/(2c) = 960, three times higher than the efficient level. To reduce the complexity of player 1's problem, we use a grid size of 10 for the quantity and truncate the strategy space; i.e., player 1 chooses X ∈ {0, 1, ..., 50}, where X = x/10. The payoff functions presented to the subjects are adjusted accordingly. Truncating the strategy space to X ≤ 50 also reduces the possibility of player 2's bankruptcy. To reduce payoff asymmetry, we give player 1 a lump-sum payment of 250 points each round. Therefore, the equilibrium payoffs under the mechanism for the two players are π1 = 1530 and π2 = 2560.

The functional forms and specific parameter values are chosen for several reasons. First, equilibrium solutions are integers. Second, equilibrium prices and quantities do not lie in the center of the strategy space, thus avoiding equilibrium convergence as a result of focal points. Third, there is a salient gap between efficiency with and without the mechanism. Without the mechanism, the profit-maximizing production level is X = 50, resulting in an efficiency level of 68.4 percent.5 With the mechanism, the system achieves 100-percent efficiency in equilibrium. Finally, since the threshold for strategic complementarity is β = 20, there is a large number of integer values to choose from both below and above the threshold.
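The efficiency figures in this paragraph and in footnote 5 can be reproduced from the payoff functions. The parameter values below (r = 24, c = 1/80, e = 1/40) are our inference from the reported magnitudes, not values quoted in this excerpt:

```python
r, c, e = 24.0, 1/80, 1/40

def joint_profit(x):
    """Joint earnings: player 1's production profit net of player 2's
    externality cost, (r*x - c*x^2) - e*x^2."""
    return (r*x - c*x**2) - e*x**2

x_private = r / (2*c)            # profit-maximizing output without the mechanism
x_efficient = r / (2*(c + e))    # joint-profit-maximizing (efficient) output

# Efficiency without the mechanism at the truncated maximum x = 500 (X = 50),
# relative to equilibrium earnings 1280 + 2560 under the mechanism (footnote 5):
efficiency = joint_profit(500) / (1280 + 2560)
```

Under these assumed parameters, x_private = 960 is indeed three times x_efficient = 320, and the efficiency ratio comes out to roughly 0.684, matching the reported 68.4 percent.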

To study how strategic complementarity affects equilibrium convergence, we keep α = 20 and vary β ∈ {0, 18, 20, 40}. To study whether α affects convergence, we also keep α = 10 and vary β ∈ {0, 20}.

B. Experimental Procedures

Our experiment involves 12 players per session: six player 1s (called Red players in the instructions) and six player 2s (Blue players). Each player remains the same type throughout the experiment. At the beginning of each session, subjects randomly draw a PC terminal number. Each then sits in front of the corresponding terminal and is given printed instructions. After the instructions are read aloud, subjects are encouraged to ask questions. The instruction period lasts between 15 and 30 minutes.

Each round, a player 1 is randomly matched with a player 2. Subjects are randomly rematched each round to minimize repeated-game effects. The random rematching protocol also minimizes the possibility that players collude on a high-subsidy and low-tax outcome.6 Each session consists of 60 rounds. As we are interested in learning, there are no practice rounds. Each round consists of two stages:

(i) Announcement Stage: Each player simultaneously and independently chooses a price pi ∈ {0, 1, ..., 40}.

(ii) Production Stage: After (p1, p2) are chosen, player 1's computer displays player 2's price and a payoff table showing her payoff for each X ∈ {0, 1, ..., 50}. Player 1 then chooses a quantity X. The server calculates payoffs and sends each player his payoff, the quantity chosen, and the prices submitted by him and his match.

To summarize, each subject knows both payoff functions, the choices made each round by himself and his match, as well as his per-period and cumulative payoffs. At any point, subjects have ready access to all of this information. The mechanism is thus implemented as a game of complete information. We do not know, however, how subjects processed this information, nor do we know their beliefs about the rational-

5 We compute the efficiency level as the ratio of total earnings without the mechanism to total earnings with the mechanism. Without the mechanism, at the production level of X = 50 (or x = 500), total earning is π1 + π2 = (rx − cx²) − ex² = 2625. Therefore, the efficiency level is 2625/(1280 + 2560) = 0.684.

6 If the players could commit to maximizing joint profits by choosing p1 and p2 cooperatively and x noncooperatively, they would choose, by equation (1), (p1, p2, x) = (4(α + β)er/[4(α + β)(e + c) − 1], r(4e(α + β) − 1)/[4(α + β)(e + c) − 1], 2r(α + β)/[4(α + β)(e + c) − 1]). With our choice of parameters, we get p1 = 24, p2 = 12 for α = 20 and β = 0; p1 = 19.2, p2 = 14.4 for α = 20 and β = 20; and p1 = 18, p2 = 15 for α = 20 and β = 40; etc. If, in addition, they could choose (p1, p2, x) cooperatively, then for our parameter values we get (p1 − p2, x) = (min{480/[3(α + β) − 20], 40}, min{960(α + β)/[3(α + β) − 20], 500}).


ity of others. Both of these factors introduce uncertainty into the environment.

Table 1 presents features of the experimental sessions, including parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used zTree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one and a half hours. The exchange rate for all treatments was one dollar for 4,250 points. The average earnings were $22.82. Data are available from the authors upon request.

III. Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: the level and the speed.7 In theory, convergence implies that all players play the stage-game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, Lb(t1, t2), measures the average proportion of Nash equilibrium play between rounds t1 and t2; i.e., Lb(t1, t2) = Σ_{t=t1}^{t2} L(t)/(t2 − t1 + 1), where 0 < t1 ≤ t2 ≤ T and T is the total number of rounds.

We define the level of convergence for both a round and a block of rounds. The block convergence measure smooths out inter-round variation.

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition for the speed of convergence.

DEFINITION 2: For a given level of convergence L̄ ∈ (0, 1], the speed of convergence is measured by the first round t̄ in which the level of convergence reaches L̄ and does not subsequently drop below this level; i.e., t̄ such that L(t) ≥ L̄ for any t ≥ t̄.

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in the experimental data, Definition 2 is not practical there. We use the following observation, which enables us to measure convergence speed using estimates of the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let Ly(t) be the convergence level of treatment y at

7 We thank anonymous referees for suggesting this separation and the appropriate measures.

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

Parameters and treatments                       Properties       Equilibrium
            α = 10           α = 20             Supermodular?    (p1, p2, X)

β = 0       α10β00           α20β00             No               (16, 16, 32)
            (4 sessions)     (5 sessions)
β = 18      —                α20β18             No               (16, 16, 32)
                             (4 sessions)
β = 20      α10β20           α20β20             Yes              (16, 16, 32)
            (5 sessions)     (5 sessions)
β = 40      —                α20β40             Yes              (16, 16, 32)
                             (4 sessions)

time t. We now relate the slope of L(t) and the initial level of convergence L(1) to the speed of convergence.

OBSERVATION 1: If L1(1) ≥ L2(1) and ΔL1(t) ≥ ΔL2(t) for all t ∈ [1, T − 1], then, given any L̄ ∈ (0, 1], the first treatment converges more quickly than the second treatment.
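These definitions translate directly into code. A minimal sketch (ours, using a hypothetical data format: one list of 0/1 equilibrium-play indicators per round):

```python
def convergence_level(play):
    """L(t) for one round: the proportion of equilibrium play,
    where play is a list of 0/1 indicators, one per subject."""
    return sum(play) / len(play)

def block_level(rounds, t1, t2):
    """Lb(t1, t2): average L(t) over rounds t1..t2 (1-indexed, inclusive)."""
    return sum(convergence_level(p) for p in rounds[t1 - 1:t2]) / (t2 - t1 + 1)

def speed_of_convergence(rounds, target):
    """Definition 2: the first round t such that L(t') >= target for all
    t' >= t; returns None if the target level is never reached for good."""
    first = None
    for t, play in enumerate(rounds, start=1):
        if convergence_level(play) >= target:
            if first is None:
                first = t
        else:
            first = None   # the level dropped back below the target
    return first
```

The reset inside the loop implements the "does not subsequently drop below this level" clause, which is what distinguishes Definition 2 from a simple first-passage time.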

Based on the theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with that of games near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies comparing games within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or β = 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis; Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from the convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV. Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and the speed of convergence to the subgame-perfect equilibrium, and to near-equilibrium play, in each of the six treatments. At the aggregate level, we examine the efficiency and the budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2, and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that in all of our analyses the results for quantities largely mirror the results for player 2's price.8

Recall that the subgame-perfect Nash equilibrium of the stage game is (p1, p2, X) = (16, 16, 32). Since the strategy space in this experiment is rather large, and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from the equilibrium prices costs player 1 $0.005 and player 2 $0.014. We therefore also check for ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price, and of quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit change in the best-response quantity. The ε-equilibrium prediction is thus (ε-p1, ε-p2, ε-x) = ({15, 16, 17}, {15, 16, 17}, {28, 32, 36}).

8 Tables are available from the authors upon request.

1512 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

Figures 1 and 2 contain box-and-whisker plots of the prices announced by players 1 and 2, respectively, in each treatment over all 60 rounds. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the two β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence levels achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence, L(41, 60), to subgame-perfect Nash equilibrium (panels A and B) and to ε-Nash equilibrium (panels C and D) for each session under each of the six treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column 8) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence levels. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects); parts (iv) and (v) present the effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly10 increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effect on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and the permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than games far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than games far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result to show that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20),

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of the difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as extreme as the one observed when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.
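For concreteness, the permutation test described in footnote 9 can be sketched as follows. The function name is ours, and with tied observations the exact p-value can depend on how equalities at the observed difference are counted:

```python
from itertools import combinations
import numpy as np

def permutation_test_one_tailed(x, y):
    """Exact one-tailed permutation test of H1: mean(y) > mean(x).
    Pools the session-level observations, enumerates every split that keeps
    the original group sizes, and returns the share of splits whose mean
    difference is at least as large as the observed one."""
    pooled = np.array(list(x) + list(y))
    n_x, total = len(x), sum(pooled)
    observed = np.mean(y) - np.mean(x)
    count = n_splits = 0
    for x_idx in combinations(range(len(pooled)), n_x):
        xs = pooled[list(x_idx)]
        diff = (total - xs.sum()) / (len(pooled) - n_x) - xs.mean()
        count += diff >= observed - 1e-12
        n_splits += 1
    return count / n_splits

# Session-level convergence levels from panel A of Table 2:
x = [0.000, 0.083, 0.083, 0.042, 0.058]   # alpha20beta00
y = [0.325, 0.267, 0.208, 0.058, 0.367]   # alpha20beta20
print(permutation_test_one_tailed(x, y))
```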


the improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold of 20 to 40. Therefore, the marginal returns for being

FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1

[Six box-and-whisker panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), plotting player 1's announced price (0–40) against round (1–60).]


"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β > 0. However, there is a significant difference in convergence level as β increases from 0 to 18, to 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and the subsequent discussion that at

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2

[Six box-and-whisker panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), plotting player 2's announced price (0–40) against round (1–60).]


the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are the treatment dummies D1000, D2000, D18, D1020, and D40, ln(Round), and a constant. We use dummies for different values of α and β, rather than the direct parameter values, to avoid assuming a linear relationship of parameter effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1)                          Permutation tests
Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                 p-value
α10β00      0.000       0.008       0.158       0.008       —           0.044     α10β00 < α20β00    0.397
α20β00      0.000       0.083       0.083       0.042       0.058       0.053     α20β00 < α20β20    0.008***
α20β18      0.125       0.117       0.100       0.192       —           0.133     α20β00 < α20β18    0.008***
α10β20      0.242       0.067       0.067       0.067       0.208       0.130     α10β20 < α20β20    0.064*
α20β20      0.325       0.267       0.208       0.058       0.367       0.245     α20β18 < α20β20    0.064*
α20β40      0.175       0.400       0.158       0.133       —           0.217     α20β20 < α20β40    0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2)                          Permutation tests
Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                 p-value
α10β00      0.025       0.042       0.025       0.067       —           0.040     α10β00 < α20β00    0.056*
α20β00      0.067       0.083       0.033       0.075       0.050       0.062     α20β00 < α20β20    0.032**
α20β18      0.133       0.292       0.108       0.258       —           0.198     α20β00 < α20β18    0.008***
α10β20      0.392       0.108       0.175       0.067       0.667       0.282     α10β20 < α20β20    0.504
α20β20      0.308       0.117       0.483       0.017       0.450       0.275     α20β18 < α20β20    0.238
α20β40      0.325       0.492       0.233       0.233       —           0.321     α20β20 < α20β40    0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1)                      Permutation tests
Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                 p-value
α10β00      0.108       0.200       0.350       0.158       —           0.204     α10β00 < α20β00    0.183
α20β00      0.067       0.458       0.358       0.183       0.425       0.298     α20β00 < α20β20    0.087*
α20β18      0.317       0.625       0.300       0.583       —           0.456     α20β00 < α20β18    0.103
α10β20      0.475       0.200       0.300       0.375       0.492       0.368     α10β20 < α20β20    0.194
α20β20      0.450       0.558       0.475       0.133       0.775       0.478     α20β18 < α20β20    0.444
α20β40      0.458       0.492       0.375       0.442       —           0.442     α20β20 < α20β40    0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2)                      Permutation tests
Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                 p-value
α10β00      0.133       0.200       0.242       0.225       —           0.200     α10β00 < α20β00    0.016**
α20β00      0.300       0.400       0.200       0.275       0.333       0.302     α20β00 < α20β20    0.016**
α20β18      0.717       0.667       0.342       0.700       —           0.606     α20β00 < α20β18    0.016**
α10β20      0.650       0.517       0.542       0.400       0.892       0.600     α10β20 < α20β20    0.564
α20β20      0.533       0.592       0.642       0.225       0.900       0.578     α20β18 < α20β20    0.556
α20β40      0.825       0.792       0.467       0.517       —           0.650     α20β20 < α20β40    0.294

Note: * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and slows over time. We examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the

coefficient of the interaction term Dy·ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), ΔL(t), between treatments. From the probit specification, L(t) = Φ(constant + Db·ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment-dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                         (1) ε-Nash    (2) ε-Nash    (3) ε-Nash    (4) ε-Nash
                         price 1       price 2       price 1       price 2
D1000                    −0.143        −0.268
                         (0.048)       (0.050)
D2000                    −0.090        −0.217
                         (0.060)       (0.057)
D18                      0.005         0.004
                         (0.071)       (0.069)
D1020                    −0.080        −0.025
                         (0.060)       (0.073)
D40                      0.000         0.078
                         (0.068)       (0.078)
ln(Round)                0.115         0.167         0.131         0.183
                         (0.014)       (0.017)       (0.021)       (0.022)
D1000·ln(Round)                                      −0.050        −0.095
                                                     (0.019)       (0.022)
D2000·ln(Round)                                      −0.031        −0.073
                                                     (0.020)       (0.021)
D18·ln(Round)                                        0.001         0.004
                                                     (0.020)       (0.021)
D1020·ln(Round)                                      −0.024        −0.008
                                                     (0.019)       (0.022)
D40·ln(Round)                                        0.002         0.023
                                                     (0.019)       (0.024)
Observations             9,720         9,720         9,720         9,720
Number of groups         162           162           162           162
Log pseudo-likelihood    −5562.890     −5824.055     −5557.424     −5793.013

Null hypothesis (Wald χ²(1) statistics for the ε-p1 and ε-p2 columns):
D1000 = D2000: 1.50, 1.31
D1000·ln(Round) = D2000·ln(Round): 1.33, 1.23
D1020 = D1000 − D2000: 0.05, 1.07
D1020·ln(Round) = D1000·ln(Round) − D2000·ln(Round): 0.05, 1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is D2020. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L′_y(t) = φ(constant + Db)·b_y/t, where φ(·) denotes the standard normal density. All coefficients reported in Table 3 are increments of the probability derivative, φ(constant + Db)·b_y, relative to the baseline case of α20β20.
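The slope calculation just described can be sketched numerically. The constants below are hypothetical, not the estimated coefficients; the sketch only shows how L(t) = Φ(constant + b·ln t) produces a slope that shrinks roughly like 1/t:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def level_and_slope(const, b, t):
    """L(t) = Phi(const + b*ln t) and its slope dL/dt = phi(const + b*ln t)*b/t."""
    z = const + b * math.log(t)
    return normal_cdf(z), normal_pdf(z) * b / t

# Hypothetical coefficients: rapid early learning, flattening later.
for t in (1, 5, 20, 60):
    L, slope = level_and_slope(-1.0, 0.4, t)
    print(f"t={t:2d}  L(t)={L:.3f}  dL/dt={slope:.4f}")
```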

RESULT 2 (Speed of Convergence)

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report the analyses of convergence speed. For each independent variable, the coefficient, its standard error (in parentheses), and its significance level are reported. The second Wald test examines whether the coefficient of D1000·ln(Round) equals that of D2000·ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1 we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start from, and converge to, similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α = 20 supermodular treatments. We use Round, ln(Round), and their interactions with D2018 as independent variables, to allow different convergence speeds toward the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play; there is no significant difference between α20β18 and the α = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play; there are significant differences between the near-supermodular and the α = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D2018·ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing α20β18 with the α = 20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                         (1) ε-Nash price 1    (2) ε-Nash price 2
ln(Round)                0.054 (0.040)         0.216 (0.043)
Round                    0.004 (0.002)         0.001 (0.002)
D2018·ln(Round)          0.013 (0.032)         −0.066 (0.035)
D2018·Round              0.001 (0.002)         0.006 (0.003)
Observations             4,680                 4,680
Log pseudo-likelihood    −2879.07              −2993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are D2020 and D2040.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D2018 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach given convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture the welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r·x(t) − c(x(t)) − e·x(t)] / [r·x* − c(x*) − e·x*],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = [Σ_{t = t1}^{t2} e(t)] / (t2 − t1 + 1).

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency, last 20 rounds
Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                 p-value
α10β00      0.600       0.671       0.804       0.858       —           0.733     α10β00 < α20β00    0.087*
α20β00      0.650       0.870       0.907       0.873       0.825       0.825     α20β00 < α20β20    0.095*
α20β18      0.963       0.952       0.889       0.959       —           0.941     α20β00 < α20β18    0.016**
α10β20      0.958       0.919       0.905       0.881       0.984       0.929     α10β20 < α20β20    0.714
α20β20      0.888       0.937       0.939       0.780       0.971       0.903     α20β18 < α20β20    0.833
α20β40      0.973       0.962       0.943       0.949       —           0.957     α20β20 < α20β40    0.064*

Note: * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms, plus tax, minus subsidy, in that round, i.e., s(t) = 2β(p1(t) − p2(t))² + p2(t)·x(t) − p1(t)·x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter: before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.

Overall, Results 1 through 3 suggest the following observations about games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead supermodular treatments to reach given convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm that, when calibrated, closely approximates the observed dynamic paths over the 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models that meet our criteria. We then examine how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. Finally, we report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model; interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̂_{-i}(t + 1), according to

(6)  p̂_{-i}(t + 1) = [p_{-i}(t) + Σ_{τ=1}^{t−1} r^τ p_{-i}(t − τ)] / [1 + Σ_{τ=1}^{t−1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̂_{-i}(t + 1) = p_{-i}(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i, given the predicted price p̂_{-i}, is

(7)  Pr(p_i | p̂_{-i}) = exp[λ·π(p_i, p̂_{-i})] / Σ_{j=0}^{40} exp[λ·π(p_j, p̂_{-i})].

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter: as λ increases, the probability of a best response to p̂_{-i} increases.
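Equations (6) and (7) can be sketched as follows. The payoff function and the λ value below are placeholders rather than the experiment's induced payoffs, and the function names are ours:

```python
import numpy as np

def sfp_prediction(opponent_prices, r):
    """Equation (6): discounted fictitious-play forecast of the opponent's
    next price; r = 0 is Cournot best reply, r = 1 is standard fictitious play."""
    t = len(opponent_prices)
    weights = np.array([r ** (t - 1 - i) for i in range(t)])  # newest gets r^0
    return weights @ np.array(opponent_prices, dtype=float) / weights.sum()

def sfp_choice_probs(payoff, predicted, lam, prices=range(41)):
    """Equation (7): logit choice probabilities over the price grid 0..40."""
    u = np.array([payoff(p, predicted) for p in prices], dtype=float)
    expu = np.exp(lam * (u - u.max()))   # subtract max for numerical stability
    return expu / expu.sum()

history = [20, 18, 17, 16]
assert sfp_prediction(history, 0.0) == 16            # Cournot: last price only
assert sfp_prediction(history, 1.0) == np.mean(history)  # plain average

# Hypothetical single-peaked payoff in own price given the forecast:
payoff = lambda p, q: -(p - q) ** 2
probs = sfp_choice_probs(payoff, sfp_prediction(history, 0.5), lam=0.5)
assert abs(probs.sum() - 1) < 1e-12
```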

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with the actual payoffs π(p_i, p̂_{-i}) replaced by strategy attractions. In both variants, a strategy attraction, A_i^j(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)·N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)  A_i^j(t) = {φ·N(t − 1)·A_i^j(t − 1) + [δ + (1 − δ)·I(p_i^j, p_i(t))]·π_i(p_i^j, p_{-i}(t))} / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function, φ_i(t), is given by

(9)  φ_i(t) = 1 − (1/2)·Σ_{j=1}^{m_i} [ I(p_i^j, p_i(t)) − (1/t)·Σ_{τ=1}^{t} I(p_i^j, p_i(τ)) ]².

This function will be close to one when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0, κ = 1).
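The attraction update in equation (8) can be sketched as follows; the payoff function is hypothetical, `ewa_update` is our own name, and fEWA's endogenous φ, δ, and κ functions are replaced here by fixed parameter values:

```python
import numpy as np

def ewa_update(A, N_prev, played, opp_price, payoff, phi, delta, kappa):
    """One round of the EWA attraction update (equation (8)) over the 0..40
    price grid. `A` holds attractions A_i^j(t-1); `played` is the index of
    the chosen price; `payoff(j, opp_price)` gives actual/forgone payoffs."""
    N = phi * (1 - kappa) * N_prev + 1                 # experience weight
    j = np.arange(len(A))
    weight = np.where(j == played, 1.0, delta)         # delta + (1-delta)*I
    pi = np.array([payoff(p, opp_price) for p in j], dtype=float)
    A_new = (phi * N_prev * A + weight * pi) / N
    return A_new, N

# Hypothetical payoff, single-peaked at the opponent's price.
payoff = lambda p, q: 100 - (p - q) ** 2
A = np.zeros(41)
A, N = ewa_update(A, N_prev=1.0, played=20, opp_price=16, payoff=payoff,
                  phi=1.0, delta=1.0, kappa=0.0)      # standard fictitious play
assert N == 2.0 and A.argmax() == 16
```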

Finally, we introduce the main components of the PA model. For simplicity, we omit the subscripts representing player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, f_jk(h, t), to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_jk(h, t) = 1 − |j − k|/h if |j − k| < h, and 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy that are updated. When h = 1, f_jk(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer; that is, he chooses strategies based on assessed payoffs and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and let r be the discount factor. Payoff assessments are updated through a weighted average of the previous assessments and the payoff actually obtained at time t. If strategy k is chosen at time t, then

(10)  u_j(t + 1) = [1 − r·f_jk(h, t)]·u_j(t) + r·f_jk(h, t)·π_k(t), ∀ j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11)  ũ_k(t) − ũ_j(t) ≥ 0, ∀ p_j ≠ p_k.

Note that mood shocks affect only his choices, not the manner in which assessments are updated.
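The Bartlett similarity function and equations (10)–(11) can be sketched together (function names, payoff value, and shock scale are ours; uniform shocks are one admissible zero-mean symmetric choice):

```python
import numpy as np

def bartlett(j, k, h):
    """Bartlett similarity f_jk(h): 1 - |j-k|/h inside the window, else 0."""
    d = abs(j - k)
    return 1 - d / h if d < h else 0.0

def pa_update(u, k, realized_payoff, r, h):
    """Equation (10): move each assessment u_j toward the realized payoff of
    the chosen strategy k, in proportion to the similarity f_jk(h)."""
    f = np.array([bartlett(j, k, h) for j in range(len(u))])
    return (1 - r * f) * u + r * f * realized_payoff

def pa_choose(u, shock_scale, rng):
    """Equation (11): pick the strategy maximizing the shock-distorted
    assessment u_j + z_j, with mood shocks z_j uniform on [-a, a]."""
    z = rng.uniform(-shock_scale, shock_scale, size=len(u))
    return int(np.argmax(u + z))

rng = np.random.default_rng(1)
u = np.zeros(41)
u = pa_update(u, k=20, realized_payoff=50.0, r=0.5, h=3)
assert u[20] == 25.0                          # chosen strategy: (1-r)*0 + r*50
assert u[19] == u[21] and 0 < u[19] < u[20]   # neighbors updated by similarity
assert u[17] == 0.0                           # outside the window h = 3
choice = pa_choose(u, shock_scale=1.0, rng=rng)
assert 0 <= choice <= 40
```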

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates it on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those of earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find the parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting long-run mechanism performance. In doing so, we choose k ∈ {10, 15, 20, 30, 58}. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, their use of information may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by five.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value = 0.000) and normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and via equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by five.

1522 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p2.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.
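Steps (i)–(vi) amount to the following loop. This is a structural sketch only: the callable step_round stands in for steps (ii)–(v) (price choice via equations (7) or (11), the quantity choice, payoffs via equation (1), and assessment updating via equations (8) or (10)), none of which are reproduced here:

```python
import random

def run_simulation(n_players, n_rounds, step_round, rng=random):
    """Skeleton of one simulated session, following steps (i)-(vi)."""
    for t in range(1, n_rounds + 1):
        # (i) randomly match the simulated players into pairs each round
        ids = list(range(n_players))
        rng.shuffle(ids)
        pairs = [(ids[i], ids[i + 1]) for i in range(0, n_players, 2)]
        for p1_id, p2_id in pairs:
            # (ii)-(v) announcements, quantity choice, payoffs, updating
            step_round(t, p1_id, p2_id)
        # (vi) proceed to the next round
```

Separating the matching loop from step_round makes it straightforward to swap in sFP, fEWA, or PA behavior while holding the simulation protocol fixed.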

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The sensitivity parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
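The search itself is a brute-force minimization of the MSD objective over the Cartesian product of the parameter grids. A sketch, with illustrative grids matching the step sizes just described; the objective passed in here is a stand-in, not the paper's simulator:

```python
import itertools

def grid_search(objective, grids):
    """Return the parameter combination minimizing objective over a full grid.

    grids: dict mapping parameter name -> list of candidate values.
    objective: callable taking keyword arguments named after the grids.
    """
    names = list(grids)
    best = min(itertools.product(*(grids[n] for n in names)),
               key=lambda combo: objective(**dict(zip(names, combo))))
    return dict(zip(names, best))

# Illustrative grids: r in [0, 1] at step 0.1; shock bound a in [0, 500] at step 50.
grids = {"r": [round(0.1 * i, 1) for i in range(11)],
         "a": list(range(0, 501, 50))}
```

With the coarse grids used here the full product is small (11 × 11 points), so exhaustive enumeration is cheap even when each evaluation requires a batch of simulated games.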

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β00 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of the PA model, in which shocks are drawn from uniform, logistic, and double-exponential distributions, on two sets of experimental data, and find that the performance of the three versions is statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r):
              sFP, block               PA, block
Treatment    1     2     3     4      1     2     3     4
α10β00      1.0   0.7   0.9   1.0    0.1   0.0   0.9   1.0
α20β00      0.7   0.6   0.1   0.1    0.5   1.0   0.2   0.1
α20β18      1.0   1.0   0.9   0.9    0.2   1.0   0.3   0.2
α10β20      1.0   1.0   1.0   0.9    0.1   0.3   0.6   0.3
α20β20      1.0   1.0   1.0   0.9    0.2   0.1   0.5   0.2
α20β40      1.0   1.0   1.0   0.7    0.1   1.0   0.4   0.3

Sensitivity (λ) and shock interval (a):
              sFP λ, block             fEWA λ, block           PA a, block
Treatment    1     2     3     4      1     2     3     4     1     2     3     4
α10β00      1.5   4.4   4.5   2.8    1.6   4.5   4.4   3.7   500   500   250    50
α20β00      1.5   4.3  14.0  18.0    1.7   4.0   5.1   7.4    50   100   150    50
α20β18      1.6   8.0  11.5  18.3    1.5   5.3   6.2   9.5    50   300   500   150
α10β20      1.9   5.9   8.6  13.8    1.8   4.7   6.2   8.2   100   500   450   300
α20β20      2.8   5.6   8.1  11.2    2.4   3.8   6.1   6.7   200   300   350   350
α20β40      3.4   6.5  14.0  20.5    3.2   4.9   6.7   9.9   150    50   150    50

Similarity window (h), PA:
Treatment    1     2     3     4
α10β00       1     1    10     9
α20β00      10    10     9     6
α20β18       5     4     6     1
α10β20       1     1     8     3
α20β20       2     2     7     1
α20β40       1     3     2     2

VOL. 94 NO. 5  CHEN AND GAZZALE: LEARNING IN GAMES  1523

indicating that players keep track of all past play when forming beliefs about opponents' next moves.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibrated parameters of the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model is a, the upper bound of the interval from which shocks are drawn. Experimentation should decrease in the final rounds; indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).
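With only a handful of hold-out sessions per treatment, a one-sided comparison of session-level MSD scores can be run as an exact paired permutation test, enumerating all sign flips of the paired differences. This is a generic test under the null of exchangeability, offered as a sketch rather than the authors' exact procedure:

```python
import itertools

def paired_perm_pvalue(x, y):
    """Exact one-sided paired permutation test of H1: mean(x) < mean(y).

    x, y: equal-length score lists (e.g., per-session MSDs of two models).
    Returns the proportion of sign assignments whose summed difference is
    at least as small as the observed sum.
    """
    diffs = [a - b for a, b in zip(x, y)]
    observed = sum(diffs)
    hits = total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) <= observed:
            hits += 1
    return hits / total
```

With five paired sessions there are only 2^5 = 32 sign assignments, so the smallest attainable one-sided p-value is 1/32 ≈ 0.031.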

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimates of the discount factors.

19 We omit a table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. We therefore use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic-complementarity predictions concern long-run performance, we use the parameters calibrated on the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency over long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Prk = max{Pr4/(k − 4)^2, 10^−8}. We set a lower bound of 10^−8 to avoid the division-by-zero problem.
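Footnote 20's decay schedule can be written out as follows; reading the printed formula as a floor at 10^−8 (the text calls it a lower bound) is an assumption worth flagging:

```python
def deviation_prob(pr4, k):
    """Deviation probability in post-calibration block k.

    pr4: the calibrated block-4 probability of deviating from best response.
    For k > 4 the probability decays as pr4 / (k - 4)**2, floored at 1e-8
    so that later computations never divide by zero.
    """
    if k <= 4:
        return pr4
    return max(pr4 / (k - 4) ** 2, 1e-8)
```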

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play
Session    α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1           0.955   0.934   0.940   0.930   0.888   0.940
2           0.960   0.946   0.890   0.917   0.973   0.878
3                   0.948           0.883   0.873
Overall     0.957   0.943   0.915   0.910   0.912   0.909

fEWA
1           0.956   0.934   0.942   0.928   0.887   0.948
2           0.960   0.946   0.890   0.922   0.978   0.886
3                   0.946           0.879   0.871
Overall     0.958   0.942   0.916   0.910   0.912   0.917

Payoff assessment
1           0.952   0.933   0.914   0.941   0.888   0.916
2           0.957   0.944   0.899   0.922   0.917   0.898
3                   0.947           0.894   0.903
Overall     0.955   0.941   0.907   0.919   0.902   0.907

Equilibrium play
1           1.867   1.853   1.833   1.833   1.489   1.792
2           1.889   1.919   1.606   1.839   1.953   1.669
3                   1.917           1.447   1.539
Overall     1.878   1.896   1.719   1.706   1.660   1.731

Random play
Overall     0.976   0.976   0.976   0.976   0.976   0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings in:

(i) p1: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest convergence level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β00 treatments. In addition, an increase in β significantly improves convergence by a large margin (40 to 80 percent) for both players when α = 20.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the β-effect depends on α). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2: percent of ε-equilibrium play, 0 to 100, over rounds 0 to 500, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the speed of convergence in the long run. Table 9 presents results for the first time a treatment reaches the convergence level L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2's convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
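The "for good" convention behind Table 9 (the first round after which the share of ε-equilibrium play never again falls below the threshold) can be computed as:

```python
def first_round_for_good(shares, threshold):
    """First round from which shares stay at or above threshold for good.

    shares: per-round proportions of epsilon-equilibrium play, round 1 first.
    Returns None if the threshold is never achieved for good.
    """
    last_below = 0
    for t, share in enumerate(shares, start=1):
        if share < threshold:
            last_below = t
    if last_below == len(shares):
        return None  # still below the threshold in the final round
    return last_below + 1
```

Scanning for the last round below the threshold, rather than the first round above it, is what makes the measure robust to temporary dips back below the threshold.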

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure of Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, the supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20β00 superior to α10β00 among the latter.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percent of players playing the ε-equilibrium strategy. Each entry is the round in which a treatment went over the threshold for good, where "—" indicates that the treatment never achieved the threshold.)

                     Player 1                          Player 2
Threshold    0.4    0.5    0.6    0.7        0.4   0.5   0.6   0.7   0.8   0.9
α20β18        54     83    126    316         38    52    62    87   127    —
α10β20       129    221     —      —          36    48    63    91   148    —
α20β20        61     91    149     —          39    51    61    84   137    —
α20β40       101    166    263    496         30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

             Player 1 equilibrium price              Player 2 equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00        0.073                                     0.082
α20β00        0.188    α20β00 > α10β00      0.000       0.213    α20β00 > α10β00      0.000
α20β18        0.265    α20β18 > α20β40      0.187       0.381    α20β18 > α10β20      0.264
α10β20        0.211    α10β20 > α20β00      0.000       0.376    α10β20 > α20β20      0.001
α20β20        0.243    α20β20 > α10β20      0.000       0.353    α20β20 > α20β00      0.000
α20β40        0.259    α20β40 > α20β20      0.007       0.442    α20β40 > α20β18      0.000

             Player 1 ε-equilibrium price            Player 2 ε-equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00        0.216                                     0.232
α20β00        0.519    α20β00 > α10β00      0.000       0.573    α20β00 > α10β00      0.000
α20β18        0.713    α20β18 > α20β40      0.045       0.870    α20β18 > α10β20      0.008
α10β20        0.584    α10β20 > α20β00      0.000       0.857    α10β20 > α20β20      0.000
α20β20        0.662    α20β20 > α10β20      0.000       0.834    α20β20 > α20β00      0.000
α20β40        0.701    α20β40 > α20β20      0.000       0.925    α20β40 > α20β18      0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in the supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel, simulation data: fraction of maximum welfare, 75 to 100 percent, over rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare, 40 to 90 percent, over rounds 0 to 60. Series in both panels: α10β00, α20β00, α20β18, α10β20, α20β20, α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in the supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with that of the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we find that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price, p̂1, for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε in each block are as follows. α10β00: 0.10, 0.07, 0.06, 0.08; α20β00: 0.07, 0.06, 0.05, 0.04; α20β18: 0.07, 0.02, 0.03, 0.03; α10β20: 0.05, 0.03, 0.03, 0.02; α20β20: 0.01, 0.00, 0.00, 0.00; α20β40: 0.02, 0.00, 0.00, 0.01.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient on D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. When we include the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient on the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is negative and significant. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 happens not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games with the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarity have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient on D18 is positive and its interaction with ln(Round) is negative, a finding that is not consistent with a focal-point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to player 2 of matching the anticipated price of player 1. Dependent variable: probability that player 2's price equals the anticipated player 1 price.)

                           (1)           (2)
D10                      0.014         0.017
                        (0.043)       (0.041)
D00                     −0.077**      −0.043
                        (0.035)       (0.036)
D18                      0.046         0.032
                        (0.053)       (0.056)
D40                      0.008         0.041
                        (0.061)       (0.042)
ln(Round)                0.041***      0.028***
                        (0.011)       (0.010)
Match Cost                            −0.001***
                                      (0.000)
ln(Round) × D10          0.014         0.012
                        (0.012)       (0.012)
ln(Round) × D00          0.003         0.000
                        (0.013)       (0.012)
ln(Round) × D18          0.006         0.002
                        (0.021)       (0.020)
ln(Round) × D40          0.001         0.015
                        (0.018)       (0.016)
Observations             9,588         9,588
Log pseudo-likelihood   −3,243.21     −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated player 1 price.


degree of strategic complementarity leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarity. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are consistent with ours, which is encouraging.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential-equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m·p1 − p2 + n.
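Before the continuous-time argument, the discrete best-reply system is easy to check numerically: iterating it contracts to the equilibrium price p* = er/(e + c). A sketch, using the reconstructed m = (β − 1/(4c))/(β + e/(4c²)) and n = (er/(4c²))/(β + e/(4c²)); the parameter values below are illustrative, not the experimental ones:

```python
def best_reply_path(beta, c, e, r, p1=20.0, p2=20.0, rounds=500):
    """Iterate p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n; return final prices."""
    m = (beta - 1.0 / (4 * c)) / (beta + e / (4 * c ** 2))
    n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2

# Illustrative parameters: beta = 2.0, c = e = 1, r = 30 give
# p* = e*r/(e + c) = 15, inside the epsilon-equilibrium range [15, 17].
```

Since |m| < 1 under these assumptions, the two-step map has spectral radius sqrt(|m|) < 1, so the path converges from any starting prices.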

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function for system (12).

Define G(p1) ≡ ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − np1 − [½(1 − m)p0² − np0].

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum of G. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)ṗ1 + (∂V/∂p2)ṗ2

  = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + [−(p1 − p2)](m·p1 − p2 + n)

  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).

REFERENCES

Amir Rabah ldquoCournot Oligopoly and the The-ory of Supermodular Gamesrdquo Games andEconomic Behavior 1996 15(2) pp 132ndash48

Andreoni James and Varian Hal R ldquoPre-PlayContracting in the Prisonersrsquo Dilemmardquo Pro-ceedings of the National Academy of Science1999 96(19) pp 10933ndash38

Arifovic Jasmina and Ledyard John O ldquoCom-puter Testbeds and Mechanism Design Ap-plication to the Class of Groves-LedyardMechanisms for Provision of Public GoodsrdquoUnpublished Paper 2001

Boylan Richard T and El-Gamal Mahmoud AldquoFictitious Play A Statistical Study of Mul-tiple Economic Experimentsrdquo Games andEconomic Behavior 1993 5(2) pp 205ndash22

Camerer Colin F Behavioral game theory Ex-periments in strategic interaction (Round-table Series in Behavioral Economics)Princeton Princeton University Press 2003

Camerer Colin F and Ho Teck-Hua ldquoExperience-Weighted Attraction Learning in NormalForm Gamesrdquo Econometrica 1999 67(4)pp 827ndash74

Chen Yan ldquoA Family of Supermodular NashMechanisms Implementing Lindahl Alloca-tionsrdquo Economic Theory 2002 19(4) pp773ndash90

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335-64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633-62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1-25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46-76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441-63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141-61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401-19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881-94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231-47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413-22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247-64.

Fudenberg, Drew and Levine, David K. The theory of learning in games. MIT Press Series on Economic Learning and Social Evolution, Vol. 2. Cambridge, MA: MIT Press, 1998.

1534 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85-96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367-83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157-80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255-77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82-100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83-102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485-91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294-309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198-215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305-21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773-87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234-48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278-93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper No. 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305-21.

1535 VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES


ity of others. Both of these factors introduce uncertainty in the environment.

Table 1 presents features of the experimental sessions, including parameters, the number of independent sessions in each treatment, whether the mechanism is supermodular in that treatment, and equilibrium prices and quantities. Overall, 27 independent computerized sessions were conducted in the Research Center for Group Dynamics (RCGD) lab at the University of Michigan from April to July 2001 and in April 2003. We used z-Tree to program our experiments. Our subjects were students from the University of Michigan. No subject was used in more than one session, yielding 324 subjects. Each session lasted approximately one and a half hours. The exchange rate for all treatments was one dollar for 42.50 points, and average earnings were $22.82. Data are available from the authors upon request.

III. Hypotheses

Given the design above, we next identify our hypotheses. To do so, we first define and discuss two measures of convergence: the level and the speed.7 In theory, convergence implies that all players play the stage game equilibrium and no deviation is observed. This is not realistic, however, in an experimental setting. Therefore, we define the following measures.

DEFINITION 1: The level of convergence at round t, L(t), is measured by the proportion of Nash equilibrium play in that round. The level of convergence for a block of rounds, Lb(t1, t2), measures the average proportion of Nash equilibrium play between rounds t1 and t2, i.e., Lb(t1, t2) = Σ_{t=t1}^{t2} L(t)/(t2 − t1 + 1), where 0 < t1 ≤ t2 ≤ T, and T is the total number of rounds.

We define the level of convergence for both a round and a block of rounds. The block convergence measure smooths out inter-round variation.
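In code, the two measures can be sketched as follows (a minimal illustration; the data layout and function names are ours, not the paper's — each round is recorded as a list of (p1, p2) announcements):

```python
# Illustrative sketch of Definition 1 (not from the paper).
NASH_PRICES = (16, 16)  # stage-game equilibrium prices quoted in Section IV

def level_of_convergence(round_plays, equilibrium=NASH_PRICES):
    """L(t): proportion of announcements in a round equal to the Nash equilibrium."""
    hits = sum(1 for play in round_plays if tuple(play) == equilibrium)
    return hits / len(round_plays)

def block_level(all_rounds, t1, t2):
    """Lb(t1, t2) = sum_{t=t1}^{t2} L(t) / (t2 - t1 + 1); rounds are 1-indexed."""
    levels = [level_of_convergence(r) for r in all_rounds[t1 - 1:t2]]
    return sum(levels) / (t2 - t1 + 1)
```

For example, with two rounds in which half and then all pairs play (16, 16), the block level over rounds 1-2 is 0.75.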

Ideally, the speed of convergence should measure how quickly all players converge to equilibrium strategies. In our experimental setting, however, we never observe perfect convergence. We therefore use a more general definition for the speed of convergence.

DEFINITION 2: For a given level of convergence L̄ ∈ (0, 1], the speed of convergence is measured by the first round t̄ in which the level of convergence reaches L̄ and does not subsequently drop below this level, i.e., t̄ such that L(t) ≥ L̄ for any t ≥ t̄.
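Definition 2 can be sketched as a small routine over a per-round series of convergence levels (an illustrative implementation, not from the paper):

```python
# Sketch of Definition 2: the first round t_bar with L(t) >= L_bar for every
# t >= t_bar; returns None if the series never stays at or above L_bar.
def speed_of_convergence(levels, L_bar):
    """levels[t-1] holds L(t); returns the 1-indexed round t_bar, or None."""
    t_bar = None
    for t, L in enumerate(levels, start=1):
        if L >= L_bar:
            if t_bar is None:
                t_bar = t      # candidate first round at or above L_bar
        else:
            t_bar = None       # level dropped below L_bar: reset the candidate
    return t_bar
```

Note the reset when the level dips: a treatment that reaches L̄ and then falls back does not count as having converged at the earlier round.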

We use Definition 2 for the analysis of convergence speed in simulations. Due to oscillation in the experimental data, however, Definition 2 is not practical there. We instead use the following observation, which enables us to measure convergence speed using estimates of the slope of L(t), ΔL(t) ≡ L(t + 1) − L(t), obtained in regression analysis. Let Ly(t) be the convergence level of treatment y at

7 We thank anonymous referees for suggesting this separation and appropriate measures.

TABLE 1—FEATURES OF EXPERIMENTAL SESSIONS

Treatment (α, β)    Independent sessions   Supermodular?   Equilibrium (p1, p2, X)
α10β00 (10, 0)      4                      No              (16, 16, 32)
α20β00 (20, 0)      5                      No              (16, 16, 32)
α20β18 (20, 18)     4                      No              (16, 16, 32)
α10β20 (10, 20)     5                      Yes             (16, 16, 32)
α20β20 (20, 20)     5                      Yes             (16, 16, 32)
α20β40 (20, 40)     4                      Yes             (16, 16, 32)


time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.

OBSERVATION 1: If L1(1) = L2(1) and ΔL1(t) ≥ ΔL2(t) for all t ∈ [1, T − 1], then, given any L̄ ∈ (0, 1], the first treatment converges more quickly than the second treatment.

Based on theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with those near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies comparing games within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or β = 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis. Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV. Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and speed of convergence to subgame-perfect equilibrium and near equilibrium in each of the six different treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2, and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that in all of our analyses the results for quantities largely mirror the results for player 2's price.8

Recall that the subgame-perfect Nash equilibrium for a stage game is (p1, p2, X) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check for ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price and of quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. The ε-equilibrium prediction is therefore (ε-p1, ε-p2, ε-x) ∈ {15, 16, 17} × {15, 16, 17} × {28, ..., 36}.

8 Tables are available from the authors upon request.
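The ε-equilibrium band can be checked mechanically, as in the following sketch (illustrative code; the constants are the equilibrium values quoted above, and the function name is ours):

```python
# Sketch of the eps-equilibrium check: prices within 1 of the equilibrium
# price 16, quantity within 4 of the equilibrium quantity 32.
EQ_P1, EQ_P2, EQ_X = 16, 16, 32

def is_eps_equilibrium(p1, p2, x, eps_price=1, eps_qty=4):
    return (abs(p1 - EQ_P1) <= eps_price
            and abs(p2 - EQ_P2) <= eps_price
            and abs(x - EQ_X) <= eps_qty)
```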

Figures 1 and 2 contain box-and-whiskers plots of the prices announced by players 1 and 2, respectively, in each treatment for all 60 rounds. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence (Lb(41, 60)) to subgame-perfect Nash equilibrium (panels A and B) and to ε-Nash equilibrium (panels C and D) for each session under each of the six different treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column 8) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects); parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly10 increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effects on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result to show that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20), the

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: by pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments at least as large as the one observed when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.
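The test described in this footnote can be sketched as follows (illustrative code, not from the paper; it enumerates all reassignments exactly and also handles the unequal group sizes that arise with four- versus five-session treatments):

```python
from itertools import combinations

def permutation_test_p(group_a, group_b):
    """Exact one-tailed p-value for H1: mean(group_a) < mean(group_b).

    Pools the session-level observations and counts the fraction of
    reassignments whose difference in means is at least as large as the
    observed one (a sketch of the Fisher randomization test).
    """
    pooled = list(group_a) + list(group_b)
    n, n_a = len(pooled), len(group_a)
    observed = sum(group_b) / len(group_b) - sum(group_a) / n_a
    extreme = total = 0
    for idx in combinations(range(n), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(n) if i not in chosen]
        diff = sum(b) / len(b) - sum(a) / len(a)
        if diff >= observed - 1e-12:   # tolerance for float ties
            extreme += 1
        total += 1
    return extreme / total
```

For instance, with session values [1, 2] versus [3, 4], only one of the C(4, 2) = 6 reassignments separates the groups as much as the data do, so p = 1/6.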

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.


improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold of 20 to 40. Therefore, the marginal returns for being

FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1

[Figure: six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each a box-and-whiskers plot of player 1's announced price (0-40) by round (1-60).]

"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that at

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2

[Figure: six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each a box-and-whiskers plot of player 2's announced price (0-40) by round (1-60).]

the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies Dy, where y ∈ {α10β00, α20β00, β18, α10β20, β40}, ln(Round), and a constant. We use dummies for different values of α and β, rather than direct parameter values, to avoid assuming a linear relationship of parameter effects, and omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in the convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1)           Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00     0.000      0.008      0.158      0.008      —          0.044     α10β00 < α20β00   0.397
α20β00     0.000      0.083      0.083      0.042      0.058      0.053     α20β00 < α20β20   0.008
α20β18     0.125      0.117      0.100      0.192      —          0.133     α20β00 < α20β18   0.008
α10β20     0.242      0.067      0.067      0.067      0.208      0.130     α10β20 < α20β20   0.064
α20β20     0.325      0.267      0.208      0.058      0.367      0.245     α20β18 < α20β20   0.064
α20β40     0.175      0.400      0.158      0.133      —          0.217     α20β20 < α20β40   0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2)           Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00     0.025      0.042      0.025      0.067      —          0.040     α10β00 < α20β00   0.056
α20β00     0.067      0.083      0.033      0.075      0.050      0.062     α20β00 < α20β20   0.032
α20β18     0.133      0.292      0.108      0.258      —          0.198     α20β00 < α20β18   0.008
α10β20     0.392      0.108      0.175      0.067      0.667      0.282     α10β20 < α20β20   0.504
α20β20     0.308      0.117      0.483      0.017      0.450      0.275     α20β18 < α20β20   0.238
α20β40     0.325      0.492      0.233      0.233      —          0.321     α20β20 < α20β40   0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1)       Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00     0.108      0.200      0.350      0.158      —          0.204     α10β00 < α20β00   0.183
α20β00     0.067      0.458      0.358      0.183      0.425      0.298     α20β00 < α20β20   0.087
α20β18     0.317      0.625      0.300      0.583      —          0.456     α20β00 < α20β18   0.103
α10β20     0.475      0.200      0.300      0.375      0.492      0.368     α10β20 < α20β20   0.194
α20β20     0.450      0.558      0.475      0.133      0.775      0.478     α20β18 < α20β20   0.444
α20β40     0.458      0.492      0.375      0.442      —          0.442     α20β20 < α20β40   0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2)       Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00     0.133      0.200      0.242      0.225      —          0.200     α10β00 < α20β00   0.016
α20β00     0.300      0.400      0.200      0.275      0.333      0.302     α20β00 < α20β20   0.016
α20β18     0.717      0.667      0.342      0.700      —          0.606     α20β00 < α20β18   0.016
α10β20     0.650      0.517      0.542      0.400      0.892      0.600     α10β20 < α20β20   0.564
α20β20     0.533      0.592      0.642      0.225      0.900      0.578     α20β18 < α20β20   0.556
α20β40     0.825      0.792      0.467      0.517      —          0.650     α20β20 < α20β40   0.294

Note: * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We will examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), interactions of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the coefficient of the interaction term Dy × ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), ΔL(t), between different treatments. From the probit specification, L(t) = Φ(constant + Db · ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                          price 1      price 2      price 1      price 2
D_α10β00                  −0.143       −0.268
                          (0.048)      (0.050)
D_α20β00                  −0.090       −0.217
                          (0.060)      (0.057)
D_β18                     0.005        0.004
                          (0.071)      (0.069)
D_α10β20                  −0.080       0.025
                          (0.060)      (0.073)
D_β40                     0.000        0.078
                          (0.068)      (0.078)
ln(Round)                 0.115        0.167        0.131        0.183
                          (0.014)      (0.017)      (0.021)      (0.022)
D_α10β00 × ln(Round)                                −0.050       −0.095
                                                    (0.019)      (0.022)
D_α20β00 × ln(Round)                                −0.031       −0.073
                                                    (0.020)      (0.021)
D_β18 × ln(Round)                                   0.001        0.004
                                                    (0.020)      (0.021)
D_α10β20 × ln(Round)                                −0.024       0.008
                                                    (0.019)      (0.022)
D_β40 × ln(Round)                                   0.002        0.023
                                                    (0.019)      (0.024)

Observations              9,720        9,720        9,720        9,720
Number of groups          162          162          162          162
Log pseudo-likelihood     −5,562.890   −5,824.055   −5,557.424   −5,793.013

Wald tests:
H0: D_α10β00 = D_α20β00                                            1.50    1.31
H0: D_α10β00 × ln(Round) = D_α20β00 × ln(Round)                    1.33    1.23
H0: D_α10β20 = D_α10β00 − D_α20β00                                 0.05    1.07
H0: D_α10β20 × ln(Round) = D_α10β00 × ln(Round) − D_α20β00 × ln(Round)   0.05    1.03

Notes: Coefficients are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is D_α20β20. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L′y(t) = φ(constant + Db) · by/t. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + Db) · by, compared to the baseline case of α20β20.
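As a numeric illustration of this derivative (with made-up coefficient values; the actual estimates are in Table 3):

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def treatment_slope(constant, Db, b_y, t):
    """Slope of L_y(t) with respect to t: phi(constant + Db) * b_y / t."""
    return phi(constant + Db) * b_y / t
```

Note that the slope shrinks like 1/t, which matches the rapid-early, slow-late learning pattern implied by the concave ln(Round) specification.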

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficient, standard error (in parentheses), and significance level are reported. The second Wald test examines whether the coefficient of D_α10β00 × ln(Round) equals that of D_α20β00 × ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start from and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α = 20 supermodular treatments. We use Round, ln(Round), and their interactions with D_α20β18 as independent variables to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the α = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and the α = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D_α20β18 × ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing α20β18 with the α = 20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash price 1   (2) ε-Nash price 2
ln(Round)                 0.054                0.216
                          (0.040)              (0.043)
Round                     0.004                0.001
                          (0.002)              (0.002)
D_α20β18 × ln(Round)      0.013                −0.066
                          (0.032)              (0.035)
D_α20β18 × Round          0.001                0.006
                          (0.002)              (0.003)

Observations              4,680                4,680
Log pseudo-likelihood     −2,879.07            −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level. Dy is the dummy variable for treatment y; the excluded dummies are D_α20β20 and D_α20β40.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D_α20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes as α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r(x_t) − c(x_t) − e(x_t)] / [r(x*) − c(x*) − e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = Σ_{t=t1}^{t2} e(t) / (t2 − t1 + 1).
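The two efficiency measures can be sketched as follows (illustrative code; r, c, and e stand for the revenue, cost, and externality functions, whose functional forms are not specified in this excerpt, and the toy functions in the usage note are ours):

```python
# Sketch of the per-round and block efficiency measures.
def per_round_efficiency(r, c, e, x_t, x_star):
    """e(t) = [r(x_t) - c(x_t) - e(x_t)] / [r(x*) - c(x*) - e(x*)]."""
    net = lambda x: r(x) - c(x) - e(x)
    return net(x_t) / net(x_star)

def block_efficiency(eff_by_round, t1, t2):
    """Average of e(t) over rounds t1..t2 (1-indexed, inclusive)."""
    return sum(eff_by_round[t1 - 1:t2]) / (t2 - t1 + 1)
```

For example, with the toy functions r(x) = 10x, c(x) = x², and no externality, net surplus 10x − x² is maximized at x* = 5 with value 25, so a round at x = 4 scores 24/25 = 0.96.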

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency: Last 20 rounds                                            Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                p-value
α10β00     0.600      0.671      0.804      0.858      —          0.733     α10β00 < α20β00   0.087
α20β00     0.650      0.870      0.907      0.873      0.825      0.825     α20β00 < α20β20   0.095
α20β18     0.963      0.952      0.889      0.959      —          0.941     α20β00 < α20β18   0.016
α10β20     0.958      0.919      0.905      0.881      0.984      0.929     α10β20 < α20β20   0.714
α20β20     0.888      0.937      0.939      0.780      0.971      0.903     α20β18 < α20β20   0.833
α20β40     0.973      0.962      0.943      0.949      —          0.957     α20β20 < α20β40   0.064

Note: * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus the tax minus the subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter: before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
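As a sketch of this surplus formula (the parameter values in the usage note are illustrative, not estimates from the paper):

```python
# Per-round budget surplus: s(t) = (alpha + beta)*(p1 - p2)**2 + p2*x - p1*x.
# Note s(t) = 0 whenever both players announce the same price, as in equilibrium.
def budget_surplus(p1, p2, x, alpha, beta):
    return (alpha + beta) * (p1 - p2) ** 2 + p2 * x - p1 * x
```

With α = β = 20, the surplus is zero at the equilibrium announcements (16, 16, 32), while any price disagreement generates a surplus through the quadratic penalty terms.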

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds in several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter shortened as sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model; interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃i(t + 1), according to

1520 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(6)   p̃i(t + 1) = [ pi(t) + Σ_{τ=1}^{t−1} r^τ pi(t − τ) ] / [ 1 + Σ_{τ=1}^{t−1} r^τ ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃i(t + 1) = pi(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.
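The forecast in equation (6) is a discounted average of the opponent's past prices. A minimal sketch (the function name is ours):

```python
def predicted_price(history, r):
    """Discounted fictitious-play forecast of the match's next price.

    history: opponent's prices [p(1), ..., p(t)], most recent last.
    r = 0 gives Cournot best reply (forecast = last price);
    r = 1 gives standard fictitious play (simple average)."""
    t = len(history)
    num = sum(r ** tau * history[t - 1 - tau] for tau in range(t))
    den = sum(r ** tau for tau in range(t))
    return num / den

assert predicted_price([10, 20, 30], 0.0) == 30      # Cournot best reply
assert predicted_price([10, 20, 30], 1.0) == 20      # plain average
assert 20 < predicted_price([10, 20, 30], 0.5) < 30  # recency-weighted
```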

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pi is given by

(7)   Pr(pi | p̃i) = exp(λ · π(pi, p̃i)) / Σ_{j=0}^{40} exp(λ · π(pj, p̃i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̃i increases.
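Equation (7) is a standard logit choice rule over the price grid; a small sketch (names ours, with a toy three-price payoff vector) shows how λ governs the sharpness of the response:

```python
import math

def logit_choice_probs(payoffs, lam):
    """Logit response: payoffs[j] is the payoff of announcing price j
    against the predicted opponent price; lam is the sensitivity."""
    exps = [math.exp(lam * u) for u in payoffs]
    total = sum(exps)
    return [e / total for e in exps]

payoffs = [0.0, 1.0, 0.5]           # toy payoffs for three prices
low = logit_choice_probs(payoffs, 0.1)
high = logit_choice_probs(payoffs, 5.0)
assert abs(sum(high) - 1.0) < 1e-9  # a proper probability distribution
assert high[1] > low[1]             # higher lambda concentrates on the best response
```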

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs π(pi, p̃i) replaced by strategy attractions. In both variants, the strategy attraction A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)   A_i^j(t) = [ φ N(t − 1) A_i^j(t − 1) + (δ + (1 − δ) I(p_i^j, pi(t))) πi(p_i^j, p−i(t)) ] / N(t),

where the indicator function I(p_i^j, pi(t)) equals one if p_i^j = pi(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φi(t) is given by

(9)   φi(t) = 1 − 0.5 Σ_{j=1}^{mi} [ I(p_i^j, pi(t)) − (1/t) Σ_{τ=1}^{t} I(p_i^j, pi(τ)) ]².

This function will be close to 1 when recent history resembles previous history. The imagination weight δi(t) equals φi(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, κ = 0, φ = 1), and Cournot best reply (δ = 1, κ = 1, φ = 0).
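One period of the attraction updating machinery can be sketched as follows. This is our own simplification: φ, δ, and κ are passed in as fixed numbers, whereas fEWA derives them endogenously from history via the change detector and the Gini coefficient.

```python
def fewa_update(A_prev, N_prev, phi, delta, kappa, payoffs, chosen):
    """One-period EWA-style update: attraction rule of equation (8)
    with experience weight N(t) = phi*(1 - kappa)*N(t-1) + 1.

    A_prev[j]: prior attraction of strategy j; payoffs[j]: payoff strategy j
    would have earned this round; chosen: index of the strategy played."""
    N = phi * (1 - kappa) * N_prev + 1
    A = []
    for j, a in enumerate(A_prev):
        # The chosen strategy's payoff enters fully; forgone payoffs at weight delta.
        weight = delta + (1 - delta) * (1.0 if j == chosen else 0.0)
        A.append((phi * N_prev * a + weight * payoffs[j]) / N)
    return A, N

A, N = fewa_update([0.0, 0.0], 1.0, phi=0.9, delta=0.5, kappa=0.1,
                   payoffs=[4.0, 2.0], chosen=0)
assert A[0] > A[1] > 0
```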

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let πj(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function fj,k(h, t) to denote the similarity between the played strategy k and an unplayed strategy j at period t:

fj,k(h, t) = 1 − |j − k| / h   if |j − k| < h,   and 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy that are updated. When h = 1, fj,k(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

VOL. 94 NO. 5   CHEN AND GAZZALE: LEARNING IN GAMES   1521

The PA model assumes that a player is a myopic subjective maximizer; that is, he chooses strategies based on assessed payoffs and does not explicitly take into account the likelihood of alternate states. Let uj(t) denote the subjective assessment of strategy pj at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)   uj(t + 1) = [1 − r fj,k(h, t)] uj(t) + r fj,k(h, t) πk(t),   ∀ j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks zj(t). The decision maker chooses on the basis of his shock-distorted subjective assessments ũj(t) = uj(t) + zj(t). At time t, he chooses strategy pk if

(11)   ũk(t) − ũj(t) ≥ 0,   ∀ pj ≠ pk.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
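A compact sketch of the PA machinery, equations (10) and (11) with the Bartlett similarity function (function names ours; the shock distribution is uniform, as in the calibration below):

```python
import random

def bartlett_similarity(j, k, h):
    """Similarity between played strategy k and strategy j."""
    return max(0.0, 1.0 - abs(j - k) / h)

def pa_update(u, k, payoff, r, h):
    """Equation (10): pull assessments of strategies near the played one
    toward the realized payoff; distant strategies are untouched."""
    return [(1 - r * bartlett_similarity(j, k, h)) * uj
            + r * bartlett_similarity(j, k, h) * payoff
            for j, uj in enumerate(u)]

def pa_choose(u, a, rng=None):
    """Equation (11): pick the strategy with the highest shock-distorted
    assessment; mood shocks are uniform on [-a, a]."""
    rng = rng or random.Random(0)
    return max(range(len(u)), key=lambda j: u[j] + rng.uniform(-a, a))

u = pa_update([0.0] * 5, k=2, payoff=10.0, r=0.5, h=2)
assert u[2] == 5.0             # played strategy moves halfway to the payoff
assert u[1] == u[3] == 2.5     # neighbors move by the similarity-weighted step
assert u[0] == u[4] == 0.0     # strategies outside the window are unchanged
assert pa_choose(u, a=0) == 2  # with no shocks, the best assessment is chosen
```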

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach14 because we are interested in forecasting long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and proceeds in the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
   (a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and of a normal distribution (d = 0.381, p-value = 0.000). We thus follow convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

   (b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and via equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:
   (a) Determine each player 1's best response to p2.
   (b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
   (c) If yes, deviate via the actual mean and standard deviation for the block; otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ {1, ..., 10}, is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
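The calibration loop itself reduces to a grid search that minimizes MSD between simulated and observed paths. A toy sketch (names ours; the real procedure simulates 1,500 games per grid point):

```python
def msd(sim, obs):
    """Mean-squared deviation between a simulated and an observed series."""
    return sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)

def grid_search(simulate, grid, obs):
    """Return the grid point whose simulated path minimizes MSD;
    `simulate` stands in for a full Monte Carlo run at one parameter value."""
    return min(grid, key=lambda theta: msd(simulate(theta), obs))

obs = [0.7] * 10                               # stand-in 'observed' path
grid = [round(0.1 * i, 1) for i in range(11)]  # r searched at grid size 0.1
assert grid_search(lambda th: [th] * 10, grid, obs) == 0.7
```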

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β00 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, where shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA was statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report results due to space limitations.

TABLE 6: CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r), sFP           Discount rate (r), PA
Block:      1    2    3    4       1    2    3    4
α10β00    1.0  0.7  0.9  1.0     0.1  0.0  0.9  1.0
α20β00    0.7  0.6  0.1  0.1     0.5  1.0  0.2  0.1
α20β18    1.0  1.0  0.9  0.9     0.2  1.0  0.3  0.2
α10β20    1.0  1.0  1.0  0.9     0.1  0.3  0.6  0.3
α20β20    1.0  1.0  1.0  0.9     0.2  0.1  0.5  0.2
α20β40    1.0  1.0  1.0  0.7     0.1  1.0  0.4  0.3

Sensitivity (λ), sFP             Sensitivity (λ), fEWA      Shock interval (a), PA
Block:      1    2     3     4     1    2    3    4           1    2    3    4
α10β00    1.5  4.4   4.5   2.8   1.6  4.5  4.4  3.7         500  500  250   50
α20β00    1.5  4.3  14.0  18.0   1.7  4.0  5.1  7.4          50  100  150   50
α20β18    1.6  8.0  11.5  18.3   1.5  5.3  6.2  9.5          50  300  500  150
α10β20    1.9  5.9   8.6  13.8   1.8  4.7  6.2  8.2         100  500  450  300
α20β20    2.8  5.6   8.1  11.2   2.4  3.8  6.1  6.7         200  300  350  350
α20β40    3.4  6.5  14.0  20.5   3.2  4.9  6.7  9.9         150   50  150   50

Similarity window (h), PA
Block:      1    2    3    4
α10β00      1    1   10    9
α20β00     10   10    9    6
α20β18      5    4    6    1
α10β20      1    1    8    3
α20β20      2    2    7    1
α20β40      1    3    2    2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds; indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency over long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr₄/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7: VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play
Session   α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1          0.955   0.934   0.940   0.930   0.888   0.940
2          0.960   0.946   0.890   0.917   0.973   0.878
3                  0.948           0.883   0.873
Overall    0.957   0.943   0.915   0.910   0.912   0.909

fEWA
Session   α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1          0.956   0.934   0.942   0.928   0.887   0.948
2          0.960   0.946   0.890   0.922   0.978   0.886
3                  0.946           0.879   0.871
Overall    0.958   0.942   0.916   0.910   0.912   0.917

Payoff assessment
Session   α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1          0.952   0.933   0.914   0.941   0.888   0.916
2          0.957   0.944   0.899   0.922   0.917   0.898
3                  0.947           0.894   0.903
Overall    0.955   0.941   0.907   0.919   0.902   0.907

Equilibrium play
Session   α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1          1.867   1.853   1.833   1.833   1.489   1.792
2          1.889   1.919   1.606   1.839   1.953   1.669
3                  1.917           1.447   1.539
Overall    1.878   1.896   1.719   1.706   1.660   1.731

Random play
Overall    0.976   0.976   0.976   0.976   0.976   0.976


results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses, for players 1 and 2 respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β00 treatments. In addition, starting from β = 0, an increase in β significantly improves convergence by a large margin (40 to 80 percent) for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.



In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect): an increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, one for player 1 and one for player 2, plot the percentage of ε-equilibrium play over rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels of L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
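The speed measure behind Table 9 (the first round after which play stays above the threshold "for good") can be sketched as:

```python
def first_round_above(series, threshold):
    """First round after which the series stays at or above the threshold
    for good; None corresponds to the dashes in Table 9."""
    for t in range(len(series)):
        if all(x >= threshold for x in series[t:]):
            return t + 1          # rounds are 1-indexed
    return None

path = [0.2, 0.5, 0.4, 0.6, 0.7, 0.8]
assert first_round_above(path, 0.6) == 4   # the dip at round 3 resets the clock
assert first_round_above(path, 0.9) is None
```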

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure of Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α = 20 superior to α = 10 among the β00 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by the players. In

TABLE 9: SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                    Player 1                       Player 2
Threshold    0.4   0.5   0.6   0.7       0.4   0.5   0.6   0.7   0.8   0.9
α20β18        54    83   126   316        38    52    62    87   127    —
α10β20       129   221    —     —         36    48    63    91   148    —
α20β20        61    91   149    —         39    51    61    84   137    —
α20β40       101   166   263   496        30    38    56    94   166   339

TABLE 8: RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

             Player 1 equilibrium price                Player 2 equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00        0.073                                     0.082
α20β00        0.188       α20β00 > α10β00    0.000      0.213       α20β00 > α10β00    0.000
α20β18        0.265       α20β18 > α20β40    0.187      0.381       α20β18 > α10β20    0.264
α10β20        0.211       α10β20 > α20β00    0.000      0.376       α10β20 > α20β20    0.001
α20β20        0.243       α20β20 > α10β20    0.000      0.353       α20β20 > α20β00    0.000
α20β40        0.259       α20β40 > α20β20    0.007      0.442       α20β40 > α20β18    0.000

             Player 1 ε-equilibrium price              Player 2 ε-equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00        0.216                                     0.232
α20β00        0.519       α20β00 > α10β00    0.000      0.573       α20β00 > α10β00    0.000
α20β18        0.713       α20β18 > α20β40    0.045      0.870       α20β18 > α10β20    0.008
α10β20        0.584       α10β20 > α20β00    0.000      0.857       α10β20 > α20β20    0.000
α20β20        0.662       α20β20 > α10β20    0.000      0.834       α20β20 > α20β00    0.000
α20β40        0.701       α20β40 > α20β20    0.000      0.925       α20β40 > α20β18    0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel, simulation data: fraction of maximum welfare over rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare over rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − γ, and one of the other 40 prices with probability γ/40. We estimate γ using the first two sessions of each treatment.24 We then compare the performance of the matching model with that of the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̃1 for each player 2. We also calculate the opportunity cost of matching p̃1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̃1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated γ in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 thus decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: Probability player 2 price equals anticipated player 1 price

Model                      (1)          (2)
D10                        0.014        0.017
                           (0.043)      (0.041)
D00                        -0.077**     -0.043
                           (0.035)      (0.036)
D18                        0.046        0.032
                           (0.053)      (0.056)
D40                        0.008        0.041
                           (0.061)      (0.042)
ln(Round)                  0.041***     0.028***
                           (0.011)      (0.010)
Match Cost                              -0.001***
                                        (0.000)
ln(Round)·D10              0.014        0.012
                           (0.012)      (0.012)
ln(Round)·D00              0.003        0.000
                           (0.013)      (0.012)
ln(Round)·D18              0.006        0.002
                           (0.021)      (0.020)
ln(Round)·D40              0.001        0.015
                           (0.018)      (0.016)
Observations               9,588        9,588
Log pseudo-likelihood      -3,243.21    -3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.

1532 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter (α) unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),
p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)] / [β + e/(4c²)] and n = [er/(4c²)] / [β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,
      ṗ2 = m·p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is a Lyapunov function of system (12).

Define G(p1) ≡ ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − n·p1 − (1/2)(1 − m)p0² + n·p0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2
   = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m·p1 − p2 + n)
   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0. Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).

1533 VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES
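The rest point of the best-reply dynamics above can be checked numerically. The sketch below iterates the discrete Cournot best-reply map and confirms convergence to p* = er/(e + c). The parameter values (r, c, e, β) are illustrative assumptions only; note that the discrete map requires |m| < 1, a stronger condition than the m < 1 used for the continuous-time system (12).

```python
# Illustrative (assumed) parameter values chosen so that |m| < 1.
r, c, e, beta = 4.0, 5.0, 10.0, 1.0

m = (beta - 1.0 / (4 * c)) / (beta + e / (4 * c ** 2))
n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))
assert abs(m) < 1  # the discrete best-reply map converges only in this case

# Cournot best-reply dynamics: p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n.
p1, p2 = 30.0, 0.0  # arbitrary starting prices
for _ in range(500):
    p1, p2 = p2, m * p1 + n

p_star = e * r / (e + c)  # predicted rest point, n / (1 - m)
print(round(p1, 4), round(p2, 4), round(p_star, 4))  # → 2.6667 2.6667 2.6667
```

With these values, m ≈ 0.864, so the discrete iteration and the continuous-time system share the same rest point.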

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gachter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games. MIT Press Series on Economic Learning and Social Evolution, Vol. 2. Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Herve. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



time t. We now relate the slope of L(t) and the initial level of convergence, L(1), to the speed of convergence.

OBSERVATION 1: If L1(1) ≥ L2(1) and L1′(t) ≥ L2′(t) for all t ∈ [1, T − 1], then, given any L ∈ (0, 1], the first treatment converges more quickly than the second treatment.
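Observation 1 can be illustrated with a short sketch. Here L(t) is the proportion of equilibrium play in round t; the two treatment series below are invented for illustration only, not experimental data.

```python
# L(t): proportion of equilibrium play in round t (illustrative data only).
L1 = [0.20, 0.35, 0.50, 0.65, 0.80]  # hypothetical treatment 1
L2 = [0.15, 0.25, 0.35, 0.45, 0.55]  # hypothetical treatment 2

# Premises of Observation 1: L1(1) >= L2(1) and L1'(t) >= L2'(t) for all t.
assert L1[0] >= L2[0]
assert all(a2 - a1 >= b2 - b1
           for a1, a2, b1, b2 in zip(L1, L1[1:], L2, L2[1:]))

def rounds_to_reach(L, target):
    """First round t with L(t) >= target, or None if the level is never reached."""
    return next((t + 1 for t, v in enumerate(L) if v >= target), None)

# Conclusion: treatment 1 reaches any convergence level no later than treatment 2.
for target in (0.3, 0.5, 0.8):
    t1, t2 = rounds_to_reach(L1, target), rounds_to_reach(L2, target)
    assert t2 is None or (t1 is not None and t1 <= t2)
```

The assertions mirror the statement of the observation: a higher starting level plus a weakly steeper slope implies every target level L is hit no later.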

Based on theories presented in Section I, we now form our hypotheses about the level and speed of convergence. While theories of strategic complementarities do not make any predictions about the speed of convergence, we form our hypotheses based on previous experiments that incidentally address games with strategic complementarities.

HYPOTHESIS 1: When α = 20, increasing β from 0 to 20 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 1 is based on the theoretical prediction that games with strategic complementarities converge to the unique Nash equilibrium, as well as on previous experimental findings that supermodular games perform robustly better than their non-supermodular counterparts.

HYPOTHESIS 2: When α = 20, increasing β from 0 to 18 significantly increases (a) the level and (b) the speed of convergence.

Hypothesis 2 is based on the findings of Falkinger et al. (2000) that average play is close to equilibrium when the free parameter is slightly below the supermodular threshold.

HYPOTHESIS 3: When α = 20, increasing β from 18 to 20 significantly increases (a) the level and (b) the speed of convergence.

Since we have not found any previous experimental studies that compare the performance of games with strategic complementarities with those near the threshold, Hypothesis 3 is pure speculation.

HYPOTHESIS 4: When α = 20, increasing β from 20 to 40 does not significantly increase either (a) the level or (b) the speed of convergence.

Since we have not found any previous experimental studies within the class of games with strategic complementarities, Hypothesis 4 is again our speculation.

HYPOTHESIS 5: When β = 0 or β = 20, increasing α from 10 to 20 significantly increases (a) the level and (b) the speed of convergence.

HYPOTHESIS 6: Changing α from 10 to 20 significantly increases the improvement in (a) the level and (b) the speed of convergence that results from increasing β from 0 to 20.

Hypothesis 5 is based on experimental findings supporting the General Incentive Hypothesis. Hypothesis 6 is our speculation.

Hypotheses 1 through 6 are concerned with only one measure of performance: convergence to equilibrium. Other measures, such as efficiency and budget balance, can be largely derived from convergence patterns. Therefore, although we omit the formal hypotheses, we present results regarding these measures in Sections IV and V.

IV Experimental Results

In this section, we compare the performance of the mechanism as we vary α and β. At the individual level, we look at both the level and speed of convergence to subgame-perfect equilibrium and near equilibrium in each of the six different treatments. At the aggregate level, we examine the efficiency and budget imbalance generated by each treatment. In the following discussion, we focus on prices rather than on quantity. Recall that player 1's best response in the production stage is uniquely determined by p2, and that player 1 has all of the information needed to select this best response. In our experiment, deviations in the production stage tend to be small (the average absolute deviation is less than 1 in 23 of 27 sessions) and do not differ significantly among treatments. Therefore, it is not surprising that in all of our analyses the results for quantities largely mirror the results for player 2's price.8

Recall that the subgame-perfect Nash equilibrium for a stage game is (p1, p2, X) = (16, 16, 32). Since the strategy space in this experiment is rather large and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price and quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. Therefore, the ε-equilibrium prediction is (ε-p1, ε-p2, ε-x) = ({15, 16, 17}, {15, 16, 17}, {28, 32, 36}).

8 Tables are available from the authors upon request.
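The ε-equilibrium check described above can be written out directly. The helper below is hypothetical illustration code, not code from the original analysis; it classifies an announcement triple using the ±1 price band and ±4 quantity band around the equilibrium (16, 16, 32).

```python
P_EQ, X_EQ = 16, 32  # subgame-perfect equilibrium prices and quantity

def is_eps_equilibrium(p1, p2, x, p_band=1, x_band=4):
    """True if both prices are within +/-1 of 16 and quantity within +/-4 of 32."""
    return (abs(p1 - P_EQ) <= p_band and
            abs(p2 - P_EQ) <= p_band and
            abs(x - X_EQ) <= x_band)

print(is_eps_equilibrium(16, 16, 32))  # exact equilibrium → True
print(is_eps_equilibrium(15, 17, 28))  # inside the epsilon bands → True
print(is_eps_equilibrium(14, 16, 32))  # p1 deviates by more than 1 → False
```

Averaging this indicator over announcements in a block of rounds gives the proportion of ε-equilibrium play used in the tables that follow.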

Figures 1 and 2 contain box-and-whiskers plots for the prices of each treatment for all 60 rounds, by players 1 and 2, respectively. The box represents the range between the twenty-fifth and seventy-fifth percentiles of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β = 0 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence (Lb(41, 60)) to subgame-perfect Nash equilibrium (panels A and B) and ε-Nash equilibrium (panels C and D) for each session under each of the six different treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column [8]) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects). Parts (iv) and (v) present effects due to changes in α (α-effects).
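The one-tailed permutation test used throughout Table 2 (see footnote 9) can be implemented exactly for session-level data. As an illustration, the session proportions of player 1 equilibrium play for treatments α20β00 and α20β18 from panel A give p = 1/126 ≈ 0.008, matching the value reported in the table.

```python
from itertools import combinations

def permutation_test(group_a, group_b):
    """Exact one-tailed permutation test of H1: mean(group_b) > mean(group_a)."""
    pooled = list(group_a) + list(group_b)
    observed = sum(group_b) / len(group_b) - sum(group_a) / len(group_a)
    count, total = 0, 0
    # Enumerate every way to relabel the pooled sessions into the two groups.
    for idx in combinations(range(len(pooled)), len(group_b)):
        b = [pooled[i] for i in idx]
        a = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = sum(b) / len(b) - sum(a) / len(a)
        total += 1
        if diff >= observed - 1e-12:  # separations at least as extreme
            count += 1
    return count / total

# Player 1 equilibrium-price proportions by session (Table 2, panel A).
a2000 = [0.000, 0.083, 0.083, 0.042, 0.058]  # treatment alpha20 beta00
a2018 = [0.125, 0.117, 0.100, 0.192]         # treatment alpha20 beta18
print(round(permutation_test(a2000, a2018), 3))  # → 0.008
```

Because every α20β18 session exceeds every α20β00 session, exactly one of the C(9, 4) = 126 relabelings is as extreme as the observed one, hence p = 1/126.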

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly10 increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effects on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result that shows that from a little below the supermodular threshold (β = 18) to the threshold (β = 20), the

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test (see, e.g., Sidney Siegel and N. John Castellan, 1988). The idea is simple and intuitive: by pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as extreme as the one observed when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.


improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold of 20 to 40. Therefore, the marginal returns for being

FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1

[Box-and-whiskers plots of player 1's announced price (0–40) by round (1–60), one panel per treatment: α10β00, α20β00, α10β20, α20β20, α20β18, α20β40.]

"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β ≥ 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that at

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2

[Box-and-whiskers plots of player 2's announced price (0–40) by round (1–60), one panel per treatment: α10β00, α20β00, α10β20, α20β20, α20β18, α20β40.]

the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies Dy, where y ∈ {1000, 2000, 18, 1020, 40}, ln(Round), and a constant. We use dummies for different values of α and β, rather than direct parameter values, to avoid assuming a linear relationship of parameter effects, and omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in the convergence level between treatments y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1)                 Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1             p-value
1000       0.000      0.008      0.158      0.008      –          0.044     1000 < 2000    0.397
2000       0.000      0.083      0.083      0.042      0.058      0.053     2000 < 2020    0.008***
2018       0.125      0.117      0.100      0.192      –          0.133     2000 < 2018    0.008***
1020       0.242      0.067      0.067      0.067      0.208      0.130     1020 < 2020    0.064*
2020       0.325      0.267      0.208      0.058      0.367      0.245     2018 < 2020    0.064*
2040       0.175      0.400      0.158      0.133      –          0.217     2020 < 2040    0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2)                 Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1             p-value
1000       0.025      0.042      0.025      0.067      –          0.040     1000 < 2000    0.056*
2000       0.067      0.083      0.033      0.075      0.050      0.062     2000 < 2020    0.032**
2018       0.133      0.292      0.108      0.258      –          0.198     2000 < 2018    0.008***
1020       0.392      0.108      0.175      0.067      0.667      0.282     1020 < 2020    0.504
2020       0.308      0.117      0.483      0.017      0.450      0.275     2018 < 2020    0.238
2040       0.325      0.492      0.233      0.233      –          0.321     2020 < 2040    0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1)             Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1             p-value
1000       0.108      0.200      0.350      0.158      –          0.204     1000 < 2000    0.183
2000       0.067      0.458      0.358      0.183      0.425      0.298     2000 < 2020    0.087*
2018       0.317      0.625      0.300      0.583      –          0.456     2000 < 2018    0.103
1020       0.475      0.200      0.300      0.375      0.492      0.368     1020 < 2020    0.194
2020       0.450      0.558      0.475      0.133      0.775      0.478     2018 < 2020    0.444
2040       0.458      0.492      0.375      0.442      –          0.442     2020 < 2040    0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2)             Permutation tests
Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1             p-value
1000       0.133      0.200      0.242      0.225      –          0.200     1000 < 2000    0.016**
2000       0.300      0.400      0.200      0.275      0.333      0.302     2000 < 2020    0.016**
2018       0.717      0.667      0.342      0.700      –          0.606     2000 < 2018    0.016**
1020       0.650      0.517      0.542      0.400      0.892      0.600     1020 < 2020    0.564
2020       0.533      0.592      0.642      0.225      0.900      0.578     2018 < 2020    0.556
2040       0.825      0.792      0.467      0.517      –          0.650     2020 < 2040    0.294

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We will examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction term allows different slopes for different treatments. Compared with the coefficient of ln(Round), the

coefficient for the interaction term Dy·ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), L′(t), between different treatments. From the probit specification, L(t) = Φ(constant + Db·ln(t)), where Φ is the standard normal distribution function, D is the 1 × 5 vector of treatment dummies, and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                       (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                       price 1      price 2      price 1      price 2
D1000                  -0.143***    -0.268***
                       (0.048)      (0.050)
D2000                  -0.090       -0.217***
                       (0.060)      (0.057)
D18                    0.005        0.004
                       (0.071)      (0.069)
D1020                  -0.080       -0.025
                       (0.060)      (0.073)
D40                    0.000        0.078
                       (0.068)      (0.078)
ln(Round)              0.115***     0.167***     0.131***     0.183***
                       (0.014)      (0.017)      (0.021)      (0.022)
D1000·ln(Round)                                  -0.050***    -0.095***
                                                 (0.019)      (0.022)
D2000·ln(Round)                                  -0.031       -0.073***
                                                 (0.020)      (0.021)
D18·ln(Round)                                    0.001        0.004
                                                 (0.020)      (0.021)
D1020·ln(Round)                                  -0.024       0.008
                                                 (0.019)      (0.022)
D40·ln(Round)                                    0.002        0.023
                                                 (0.019)      (0.024)
Observations           9,720        9,720        9,720        9,720
Number of groups       162          162          162          162
Log pseudo-likelihood  -5,562.890   -5,824.055   -5,557.424   -5,793.013

Wald tests, χ²(1) statistics:
H0: D1000 = D2000: 1.50 (model 1), 1.31 (model 2)
H0: D1000·ln(Round) = D2000·ln(Round): 1.33 (model 3), 1.23 (model 4)
H0: D1020 = D1000 − D2000: 0.05 (model 1), 1.07 (model 2)
H0: D1020·ln(Round) = D1000·ln(Round) − D2000·ln(Round): 0.05 (model 3), 1.03 (model 4)

Notes: Coefficients are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; excluded dummy is D2020. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L′y(t) = φ(constant + Db)·by/t, where φ is the standard normal density. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + Db)·by, compared to the baseline case of α20β20.
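The level and slope calculations above can be sketched numerically. The coefficient values below are assumptions for illustration only, not the paper's estimates; Φ and φ denote the standard normal CDF and density.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

const, b_y = -0.8, 0.15  # assumed probit constant and ln(Round) coefficient

def level(t):
    """L_y(t) = Phi(const + b_y * ln(t)): predicted convergence level in round t."""
    return Phi(const + b_y * math.log(t))

def slope(t):
    """L_y'(t) = phi(const + b_y * ln(t)) * b_y / t: slope of the level in round t."""
    return phi(const + b_y * math.log(t)) * b_y / t

# The slope is positive but falls with t: learning is fastest in early rounds.
print(slope(1) > slope(10) > slope(60) > 0)  # → True
```

The 1/t factor in the slope is what makes the concave ln(Round) specification capture rapid early learning that flattens out over the 60 rounds.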

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficient, standard error (in parentheses), and significance level are reported. The second Wald test examines whether the coefficient of D1000·ln(Round) equals that of D2000·ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution for each treatment.13 As the treatments all start from and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two β = 20 supermodular treatments. We use Round, ln(Round), and their interactions with D2018 as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the β = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and β = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D2018·ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

TABLE 4—CONVERGENCE SPEED

(Probit models with individual-level clustering, comparing α20β18 with the β = 20 supermodular treatments achieving similar convergence levels)

Dependent variable: ε-Nash equilibrium play

                      (1) ε-Nash    (2) ε-Nash
                      price 1       price 2
ln(Round)             0.054         0.216***
                      (0.040)       (0.043)
Round                 0.004         0.001
                      (0.002)       (0.002)
D2018·ln(Round)       0.013         -0.066*
                      (0.032)       (0.035)
D2018·Round           0.001         0.006**
                      (0.002)       (0.003)
Observations          4,680         4,680
Log pseudo-likelihood -2,879.07     -2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; excluded dummies are D2020 and D2040.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D2018 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in

equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r(x(t)) − c(x(t)) − e(x(t))] / [r(x*) − c(x*) − e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = [Σ_{t=t1}^{t2} e(t)] / (t2 − t1 + 1).
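These two measures translate directly into code. The sketch below is our illustration (function names are ours): r, c, and e are passed in as the revenue, cost, and externality functions, and the quadratic example in the usage is hypothetical, not the paper's parameterization.

```python
def per_round_efficiency(x_t, x_star, r, c, e):
    """Per-round efficiency e(t): realized net surplus over the efficient
    net surplus, excluding the tax/subsidy and penalty terms."""
    return (r(x_t) - c(x_t) - e(x_t)) / (r(x_star) - c(x_star) - e(x_star))

def block_efficiency(x_path, t1, t2, x_star, r, c, e):
    """Average efficiency over rounds t1..t2 (1-indexed, inclusive);
    x_path[t-1] is the quantity produced in round t."""
    return sum(per_round_efficiency(x_path[t - 1], x_star, r, c, e)
               for t in range(t1, t2 + 1)) / (t2 - t1 + 1)
```

For instance, with r(x) = 10x, c(x) = x², and e(x) = 2x, the efficient quantity is x* = 4, and a round producing x = 2 attains efficiency 12/16 = 0.75.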

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency, last 20 rounds

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                 p-value
α10β00      0.600       0.671       0.804       0.858       —           0.733     α10β00 < α20β00    0.087
α20β00      0.650       0.870       0.907       0.873       0.825       0.825     α20β00 < α20β20    0.095
α20β18      0.963       0.952       0.889       0.959       —           0.941     α20β00 < α20β18    0.016
α10β20      0.958       0.919       0.905       0.881       0.984       0.929     α10β20 < α20β20    0.714
α20β20      0.888       0.937       0.939       0.780       0.971       0.903     α20β18 < α20β20    0.833
α20β40      0.973       0.962       0.943       0.949       —           0.957     α20β20 < α20β40    0.064

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.

VOL. 94 NO. 5  CHEN AND GAZZALE: LEARNING IN GAMES  1519

in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.
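With session counts this small, a one-tailed permutation test can be run exactly rather than by sampling. The sketch below is our illustration (not the authors' code): it enumerates every reassignment of the pooled session-level efficiencies to two groups of the original sizes.

```python
from itertools import combinations

def one_tailed_permutation_test(sample_a, sample_b):
    """Exact one-tailed permutation test of H1: mean(sample_b) > mean(sample_a).
    Returns the fraction of reassignments whose difference in group means
    is at least as large as the observed difference."""
    pooled = list(sample_a) + list(sample_b)
    n_a, n = len(sample_a), len(sample_a) + len(sample_b)
    observed = sum(sample_b) / len(sample_b) - sum(sample_a) / n_a
    count = total = 0
    for idx in combinations(range(n), n_a):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(n) if i not in idx]
        diff = sum(group_b) / len(group_b) - sum(group_a) / n_a
        if diff >= observed - 1e-12:   # tolerance for float comparison
            count += 1
        total += 1
    return count / total
```

With four sessions per treatment there are only C(8, 4) = 70 reassignments, so exact enumeration is trivial.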

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter: before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
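The round-t surplus can be sketched as follows (our illustration, with α and β as the two penalty parameters; the prices in the usage example are hypothetical):

```python
def budget_surplus(p1, p2, x, alpha, beta):
    """Round-t budget surplus s(t): both players' penalty terms,
    plus the tax p2 * x, minus the subsidy p1 * x."""
    return (alpha + beta) * (p1 - p2) ** 2 + p2 * x - p1 * x
```

In equilibrium the announced prices match, so the penalty terms vanish and the tax exactly offsets the subsidy: the budget balances at s = 0.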

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data, by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model; interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̂−i(t + 1), according to


(6) p̂−i(t + 1) = [p−i(t) + Σ_{τ=1}^{t−1} r^τ p−i(t − τ)] / [1 + Σ_{τ=1}^{t−1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̂−i(t + 1) = p−i(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pi is given by

(7) Pr(pi | p̂−i) = exp(λ·π(pi, p̂−i)) / Σ_{j=0}^{40} exp(λ·π(pj, p̂−i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter: as λ increases, the probability of a best response to p̂−i increases.
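Equations (6) and (7) can be sketched as follows (our illustration; `history` holds the match's observed prices, oldest first, and the strategy space is assumed to be the 41 integer prices 0 to 40):

```python
import math

def predicted_price(history, r):
    """Eq. (6): discounted fictitious-play forecast of the match's next price.
    r = 0 gives Cournot best reply; r = 1 gives standard fictitious play."""
    num = den = 0.0
    weight = 1.0
    for p in reversed(history):   # most recent observation gets weight 1
        num += weight * p
        den += weight
        weight *= r               # older observations get weights r, r^2, ...
    return num / den

def logit_choice_probs(payoffs, lam):
    """Eq. (7): logit choice probabilities over prices, where payoffs[j] is
    the payoff of price j against the forecast and lam is the sensitivity."""
    scaled = [math.exp(lam * u) for u in payoffs]
    total = sum(scaled)
    return [s / total for s in scaled]
```

With r = 0 the forecast is simply the last observed price, and with r = 1 it is the unweighted average of all observed prices, matching the two limiting cases in the text.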

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs π(pi, p̂−i) replaced by strategy attractions. In both variants, strategy attractions A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ·(1 − κ)·N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) A_i^j(t) = {φ·N(t − 1)·A_i^j(t − 1) + [δ + (1 − δ)·I(p_i^j, p_i(t))]·π_i(p_i^j, p−i(t))} / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φ_i(t) is given by

(9) φ_i(t) = 1 − (1/2)·Σ_{j=1}^{m_i} [ I(p−i^j, p−i(t)) − (1/t)·Σ_{τ=1}^{t} I(p−i^j, p−i(τ)) ]².

This function will be close to 1 when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0, κ = 1).
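One fEWA updating step, equation (8) together with the experience-weight recursion, can be sketched as follows (our illustration; `payoffs[j]` is assumed to be the payoff strategy j would have earned against the opponent's realized play):

```python
def fewa_update(A_prev, N_prev, payoffs, chosen, phi, delta, kappa):
    """One fEWA period. Updates N(t) = phi*(1-kappa)*N(t-1) + 1, then each
    attraction A_j per eq. (8): the chosen strategy's payoff enters with
    weight 1 (= delta + (1 - delta)), unchosen strategies' foregone payoffs
    enter with the imagination weight delta."""
    N = phi * (1 - kappa) * N_prev + 1
    A = [(phi * N_prev * A_prev[j]
          + (1.0 if j == chosen else delta) * payoffs[j]) / N
         for j in range(len(A_prev))]
    return A, N
```

Setting phi = 1, delta = 1, kappa = 0 recovers weighted fictitious play with full memory, one of the special cases listed above.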

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, f_jk(h, t), to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_jk(h, t) = 1 − |j − k|/h if |j − k| < h; 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_jk(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer; that is, he chooses strategies based on assessed payoffs


and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10) u_j(t + 1) = [1 − r·f_jk(h, t)]·u_j(t) + r·f_jk(h, t)·π_k(t), ∀j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11) ũ_k(t) − ũ_j(t) ≥ 0, ∀p_j ≠ p_k.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
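The PA components can be sketched as follows (our illustration; function names are ours, and the uniform mood shock matches the distribution used in the calibrations below):

```python
import random

def bartlett_similarity(j, k, h):
    """f_jk(h): similarity between strategy j and the played strategy k,
    declining linearly to zero within a window of width h."""
    return 1 - abs(j - k) / h if abs(j - k) < h else 0.0

def pa_update(u, k, payoff_k, r, h):
    """Eq. (10): move each assessment toward the realized payoff of the
    played strategy k, in proportion to similarity and discount rate r."""
    return [(1 - r * bartlett_similarity(j, k, h)) * u_j
            + r * bartlett_similarity(j, k, h) * payoff_k
            for j, u_j in enumerate(u)]

def pa_choose(u, a, rng=random):
    """Eq. (11): perturb each assessment with a uniform mood shock on
    [-a, a] and play the strategy with the highest shocked assessment."""
    shocked = [u_j + rng.uniform(-a, a) for u_j in u]
    return max(range(len(u)), key=shocked.__getitem__)
```

With h = 1 only the played strategy is updated, while larger windows spill the realized payoff over to neighboring prices, as described above.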

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.
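One common way to operationalize the MSD score is the per-round average below. MSD normalization conventions vary across this literature, so the exact averaging here is our assumption, not necessarily the paper's:

```python
def msd_score(sim_freqs, actual_choices, n_strategies):
    """Mean-squared deviation between simulated choice frequencies and the
    empirical choices. sim_freqs[t][j] is the simulated probability of
    strategy j in round t; actual_choices[t] is the strategy played."""
    total = 0.0
    for t, choice in enumerate(actual_choices):
        for j in range(n_strategies):
            observed = 1.0 if j == choice else 0.0
            total += (sim_freqs[t][j] - observed) ** 2
    return total / len(actual_choices)
```

A perfect point prediction scores 0; a model spreading probability evenly performs worse, which is why random choice provides the minimum standard discussed in the validation section.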

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and a normal distribution (d = 0.381, p-value = 0.000). We thus follow convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p2.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.
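The loop in steps (i) through (vi) has the following skeleton (a schematic of ours, with the learning model's choice and update rules and the mechanism's payoff function passed in as callbacks rather than the calibrated versions used in the paper):

```python
import random

def simulate(n_games, n_rounds, choose, update, payoff, rng):
    """Schematic of steps (i)-(vi): random matching each round, price
    choices via the learning model's `choose`, payoffs via the mechanism's
    `payoff`, and assessment updates via `update`. Each player's learning
    state is a dict in `states`."""
    states = [{"id": i} for i in range(2 * n_games)]
    for _ in range(n_rounds):
        order = list(range(2 * n_games))
        rng.shuffle(order)                        # (i) random matching
        for a, b in zip(order[::2], order[1::2]):
            p_a = choose(states[a], rng)          # (ii) price announcements
            p_b = choose(states[b], rng)
            u_a, u_b = payoff(p_a, p_b)           # (iii)-(iv) quantity and payoffs
            update(states[a], p_a, u_a)           # (v) update assessments
            update(states[b], p_b, u_b)
    return states                                  # (vi) repeat until done
```

Plugging in the sFP, fEWA, or PA rules sketched earlier as `choose`/`update` yields the corresponding simulated dynamic path.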

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of the PA model, in which shocks were drawn from uniform, logistic, and double-exponential distributions, on two sets of experimental data, and find that the performance of the three versions was statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

             sFP: discount rate (r)       PA: discount rate (r)
Block        1     2     3     4          1     2     3     4
α10β00       1.0   0.7   0.9   1.0        0.1   0.0   0.9   1.0
α20β00       0.7   0.6   0.1   0.1        0.5   1.0   0.2   0.1
α20β18       1.0   1.0   0.9   0.9        0.2   1.0   0.3   0.2
α10β20       1.0   1.0   1.0   0.9        0.1   0.3   0.6   0.3
α20β20       1.0   1.0   1.0   0.9        0.2   0.1   0.5   0.2
α20β40       1.0   1.0   1.0   0.7        0.1   1.0   0.4   0.3

             sFP: sensitivity (λ)         fEWA: sensitivity (λ)      PA: shock interval (a)
Block        1     2     3     4          1     2     3     4        1     2     3     4
α10β00       1.5   4.4   4.5   2.8        1.6   4.5   4.4   3.7      500   500   250   50
α20β00       1.5   4.3   14.0  18.0       1.7   4.0   5.1   7.4      50    100   150   50
α20β18       1.6   8.0   11.5  18.3       1.5   5.3   6.2   9.5      50    300   500   150
α10β20       1.9   5.9   8.6   13.8       1.8   4.7   6.2   8.2      100   500   450   300
α20β20       2.8   5.6   8.1   11.2       2.4   3.8   6.1   6.7      200   300   350   350
α20β40       3.4   6.5   14.0  20.5       3.2   4.9   6.7   9.9      150   50    150   50

             PA: similarity window (h)
Block        1     2     3     4
α10β00       1     1     10    9
α20β00       10    10    9     6
α20β18       5     4     6     1
α10β20       1     1     8     3
α20β20       2     2     7     1
α20β40       1     3     2     2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds; this model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round; its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-ranks tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the parameters calibrated on the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20

Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3          —        0.948    —        0.883    0.873    —
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3          —        0.946    —        0.879    0.871    —
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3          —        0.947    —        0.894    0.903    —
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3          —        1.917    —        1.447    1.539    —
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall    0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β = 0 treatments. In addition, an increase in α significantly improves convergence by a large margin (40 to 80 percent) for both players when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2, plot the percentage of ε-equilibrium play over rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure of Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percentage of players playing the ε-equilibrium strategy. Each entry indicates the round in which a treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                 Player 1                        Player 2
Threshold   0.4   0.5   0.6   0.7        0.4   0.5   0.6   0.7   0.8   0.9
α20β18      54    83    126   316        38    52    62    87    127   —
α10β20      129   221   —     —          36    48    63    91    148   —
α20β20      61    91    149   —          39    51    61    84    137   —
α20β40      101   166   263   496        30    38    56    94    166   339
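The "over the threshold for good" entries in Table 9 amount to finding the round after the last violation of the threshold. The helper below is our illustration (round numbering is 1-based, matching the table):

```python
def first_round_over_threshold(eq_play_share, threshold):
    """First round from which the share of epsilon-equilibrium play stays
    at or above `threshold` permanently; None if the threshold is never
    held for good. eq_play_share[t] is the share in round t + 1."""
    last_bad = -1                       # index of the last round below threshold
    for t, share in enumerate(eq_play_share):
        if share < threshold:
            last_bad = t
    if last_bad + 1 < len(eq_play_share):
        return last_bad + 2             # convert index after last_bad to 1-based round
    return None                         # never (permanently) above threshold
```

For example, the share path [0.3, 0.5, 0.4, 0.6, 0.7] crosses 0.5 for good in round 4.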

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

            Player 1 equilibrium price              Player 2 equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00      0.073                                     0.082
α20β00      0.188         α20β00 > α10β00   0.000     0.213         α20β00 > α10β00   0.000
α20β18      0.265         α20β18 > α20β40   0.187     0.381         α20β18 > α10β20   0.264
α10β20      0.211         α10β20 > α20β00   0.000     0.376         α10β20 > α20β20   0.001
α20β20      0.243         α20β20 > α10β20   0.000     0.353         α20β20 > α20β00   0.000
α20β40      0.259         α20β40 > α20β20   0.007     0.442         α20β40 > α20β18   0.000

            Player 1 ε-equilibrium price            Player 2 ε-equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value
α10β00      0.216                                     0.232
α20β00      0.519         α20β00 > α10β00   0.000     0.573         α20β00 > α10β00   0.000
α20β18      0.713         α20β18 > α20β40   0.045     0.870         α20β18 > α10β20   0.008
α10β20      0.584         α10β20 > α20β00   0.000     0.857         α10β20 > α20β20   0.000
α20β20      0.662         α20β20 > α10β20   0.000     0.834         α20β20 > α20β00   0.000
α20β40      0.701         α20β40 > α20β20   0.000     0.925         α20β40 > α20β18   0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(The top panel plots the simulated fraction of maximum welfare over rounds 50 to 450; the bottom panel plots the fraction of maximum welfare in the experimental data over rounds 0 to 60; both for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-ranks tests to compare the MSD scores in the hold-out sample, we find that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.
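The stochastic matching model can be sketched as follows (our illustration; the strategy space is assumed to be the 41 prices 0 to 40, with the tremble probability ε spread uniformly over the 40 non-matching prices):

```python
def matching_model_probs(predicted_price, epsilon, n_prices=41):
    """Stochastic matching model: player 2 plays the forecast of player 1's
    price with probability 1 - epsilon, and each of the remaining
    n_prices - 1 prices with probability epsilon / (n_prices - 1)."""
    tremble = epsilon / (n_prices - 1)
    return [1 - epsilon if p == predicted_price else tremble
            for p in range(n_prices)]
```

Feeding these probabilities into the same MSD scoring as the learning models allows the head-to-head comparison with sFP described above.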

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.
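The calibration above chooses a discount rate so that recent prices carry more weight. A sketch of one plausible discounted-history predictor (a geometrically weighted average; the exact form of equation (6) is defined in the paper, so treat this as illustrative):

```python
def anticipated_price(history, r):
    """Discounted weighted average of previously observed prices.
    The price seen k rounds ago gets weight proportional to r**k.
    Hypothetical form; the paper's equation (6) defines the actual predictor."""
    weights = [r ** k for k in range(len(history))]   # most recent first
    recent_first = list(reversed(history))
    return sum(w * p for w, p in zip(weights, recent_first)) / sum(weights)
```

With a calibrated discount rate near r = 0.1 (footnote 25), almost all weight falls on the most recently observed price, so the anticipated price is close to a Cournot best-reply forecast.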

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 The estimated ε in each block are as follows. α10β00: 0.10, 0.07, 0.06, 0.08; α20β00: 0.07, 0.06, 0.05, 0.04; α20β18: 0.07, 0.02, 0.03, 0.03; α10β20: 0.05, 0.03, 0.03, 0.02; α20β20: 0.01, 0, 0, 0; α20β40: 0.02, 0, 0, 0.01.

25 In all treatments, the calibrated discount rate is r = 0.1.

VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1531

Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant; the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal-point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability that the player 2 price equals the anticipated player 1 price

                            (1)              (2)
D10                       0.014            0.017
                         (0.043)          (0.041)
D00                      −0.077           −0.043
                         (0.035)          (0.036)
D18                       0.046            0.032
                         (0.053)          (0.056)
D40                       0.008            0.041
                         (0.061)          (0.042)
ln(Round)                 0.041            0.028
                         (0.011)          (0.010)
Match Cost                               −0.001
                                          (0.000)
ln(Round) × D10           0.014            0.012
                         (0.012)          (0.012)
ln(Round) × D00           0.003            0.000
                         (0.013)          (0.012)
ln(Round) × D18           0.006            0.002
                         (0.021)          (0.020)
ln(Round) × D40           0.001            0.015
                         (0.018)          (0.016)
Observations              9,588            9,588
Log pseudo-likelihood   −3,243.21        −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. *Significant at 10-percent level; **5-percent level; ***1-percent level. Dy is the dummy variable for treatment y. The excluded dummy is D2020. Match Cost is the difference, in cents, of matching and best-responding to the anticipated Player 1 price.

1532 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential equation system is

(12) dp1/dt = p2 − p1,

dp2/dt = m·p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫_{p0}^{p1} (p − mp − n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is a Lyapunov function of system (12).

Define G(p1) ≡ ∫_{p0}^{p1} (p − mp − n) dp. We get

G(p1) = (1 − m)p1²/2 − np1 − (1 − m)p0²/2 + np0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, the term (p1 − p2)²/2 also reaches its global minimum. Therefore V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that dV/dt ≤ 0 along trajectories of (12):

dV/dt = (∂V/∂p1)(dp1/dt) + (∂V/∂p2)(dp2/dt)

= [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m·p1 − p2 + n)

= −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), dV/dt < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
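As a numerical check on part (iii), the discrete best-reply system can be iterated directly. This is our own sketch with hypothetical parameter values (c = 1, e = 1, r = 32, chosen so that p* = er/(e + c) = 16); it is not the experiment's calibration:

```python
def cournot_path(beta, c, e, r, start=(0.0, 40.0), rounds=2000):
    """Iterate the Cournot best-reply system
        p1(t+1) = p2(t),  p2(t+1) = m*p1(t) + n,
    with m = (beta - 1/(4c)) / (beta + e/(4c^2))
    and  n = (e*r/(4c^2)) / (beta + e/(4c^2))."""
    denom = beta + e / (4 * c ** 2)
    m = (beta - 1 / (4 * c)) / denom
    n = (e * r / (4 * c ** 2)) / denom
    p1, p2 = start
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2

# From widely separated starting prices, both announcements approach
# the stage-game equilibrium price p* = e*r/(e + c) when |m| < 1.
p1, p2 = cournot_path(beta=20, c=1, e=1, r=32)
```

Since the eigenvalues of the best-reply map are ±√m, the discrete iteration contracts whenever |m| < 1, mirroring the continuous-time stability argument.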

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



rium for a stage game is (p1, p2, x) = (16, 16, 32). Since the strategy space in this experiment is rather large, and the payoff function is relatively flat near equilibrium, a small deviation from equilibrium is not very costly. For example, in the α20β20 treatment, a one-unit unilateral deviation from equilibrium prices costs player 1 $0.005 and player 2 $0.014. Therefore, we check ε-equilibrium play by looking at the proportion of price announcements within 1 of the equilibrium price, and quantity announcements within 4 of the equilibrium quantity, since a one-unit change in p2 results in a four-unit best-response quantity change. Therefore, the ε-equilibrium prediction is (ε-p1, ε-p2, ε-x) ∈ {15, 16, 17} × {15, 16, 17} × {28, 32, 36}.
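Counting ε-equilibrium play then amounts to checking announcements against these bands. A trivial sketch (our own helper, not the authors' code):

```python
def eps_share(announcements, target, band):
    """Proportion of announcements within +/- band of the target value."""
    hits = sum(1 for a in announcements if abs(a - target) <= band)
    return hits / len(announcements)

# Prices are checked within 1 of p* = 16; quantities within 4 of x* = 32.
```

For example, the share of ε-equilibrium prices in a round is `eps_share(prices, 16, 1)`.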

Figures 1 and 2 contain box-and-whiskers plots of the prices in each treatment for all 60 rounds, for players 1 and 2, respectively. The box represents the range from the twenty-fifth to the seventy-fifth percentile of prices, while the whiskers extend to the minimum and maximum prices in each round. The horizontal line within each box represents the median price. Compared with the β00 treatments, equilibrium price convergence is clearly more pronounced in the supermodular and near-supermodular treatments.

To analyze the performance of the compensation mechanism, we first compare the convergence level achieved in the last 20 rounds of each treatment. Table 2 reports the level of convergence (Lb(41, 60)) to subgame-perfect Nash equilibrium (panels A and B) and ε-Nash equilibrium (panels C and D) for each session under each of the six treatments, as well as the alternative hypotheses and the corresponding p-values of one-tailed permutation tests9 under the null hypothesis that the convergence levels in the two treatments (column

[8]) are the same. While the proportion of ε-equilibrium play is much higher than the proportion of equilibrium play, the results of the permutation tests largely follow similar patterns. We now formally test our hypotheses regarding convergence level. Parts (i) to (iii) of Result 1 present the effects of the degree of strategic complementarity (β-effects). Parts (iv) and (v) present effects due to changes in α (α-effects).

RESULT 1 (Level of Convergence in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18, and from 0 to 20, significantly10 increases the level of convergence in p1, p2, and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 does not change the level of convergence significantly.

(iii) When α = 20, increasing β from 20 to 40 does not change the level of convergence significantly.

(iv) When β = 0, increasing α from 10 to 20 weakly increases the level of convergence in p2 and significantly increases the level of convergence in ε-p2, but has no significant effects on p1 or ε-p1.

(v) When β = 20, increasing α from 10 to 20 weakly increases the level of convergence in p1, but not in ε-p1, p2, or ε-p2.

SUPPORT: The last two columns of Table 2 report the corresponding alternative hypotheses and permutation test results.

By part (i) of Result 1, we accept Hypotheses 1(a) and 2(a). Part (i) also confirms previous experimental findings that supermodular games perform significantly better than those far from the supermodular threshold. Furthermore, near-supermodular games also perform significantly better than those far from the threshold.

By part (ii), however, we reject Hypothesis 3(a). This is the first experimental result to show that, from a little below the supermodular threshold (β = 18) to the threshold (β = 20),

9 The permutation test, also known as the Fisher randomization test, is a nonparametric version of a difference-of-two-means t-test. (See, e.g., Sidney Siegel and N. John Castellan, 1988.) The idea is simple and intuitive: by pooling all independent observations, the p-value is the exact probability of observing a separation between the two treatments as large as the one observed, when the pooled observations are randomly divided into two equal-sized groups. This test uses all of the information in the sample, and thus has 100-percent power-efficiency. It is among the most powerful of all statistical tests.
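The randomization logic of footnote 9 can be implemented in a few lines. A sketch for the general case of two session-level samples (function name ours):

```python
from itertools import combinations

def permutation_pvalue(a, b):
    """Exact one-tailed permutation (Fisher randomization) test of
    H1: mean(b) > mean(a). The p-value is the fraction of relabelings
    of the pooled observations producing a mean separation at least
    as large as the observed one."""
    pooled = a + b
    observed = sum(b) / len(b) - sum(a) / len(a)
    at_least_as_large = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        grp_a = [pooled[i] for i in chosen]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = sum(grp_b) / len(grp_b) - sum(grp_a) / len(grp_a)
        if diff >= observed - 1e-12:
            at_least_as_large += 1
        total += 1
    return at_least_as_large / total
```

With only four or five sessions per treatment, the number of relabelings is small enough to enumerate exactly.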

10 When presenting results throughout the paper, we follow the convention that a significance level of 5 percent or less is significant, while a significance level between 5 percent and 10 percent is weakly significant.


improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold, 20, to 40. Therefore, the marginal returns for being

[FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1. Six box-and-whiskers panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 1's announced price (0 to 40) against round (1 to 60).]


"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β ≥ 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that, at

[FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2. Six box-and-whiskers panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40), each plotting player 2's announced price (0 to 40) against round (1 to 60).]


the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies Dy, where y ∈ {1000, 2000, 18, 1020, 40}, ln(Round), and a constant. We use dummies for different values of α and β, rather than direct parameter values, to avoid assuming a linear relationship of parameter effects, and omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1), with permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.000      0.008      0.158      0.008                 0.044     α10β00 < α20β00    0.397
α20β00     0.000      0.083      0.083      0.042      0.058      0.053     α20β00 < α20β20    0.008
α20β18     0.125      0.117      0.100      0.192                 0.133     α20β00 < α20β18    0.008
α10β20     0.242      0.067      0.067      0.067      0.208      0.130     α10β20 < α20β20    0.064
α20β20     0.325      0.267      0.208      0.058      0.367      0.245     α20β18 < α20β20    0.064
α20β40     0.175      0.400      0.158      0.133                 0.217     α20β20 < α20β40    0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2), with permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.025      0.042      0.025      0.067                 0.040     α10β00 < α20β00    0.056
α20β00     0.067      0.083      0.033      0.075      0.050      0.062     α20β00 < α20β20    0.032
α20β18     0.133      0.292      0.108      0.258                 0.198     α20β00 < α20β18    0.008
α10β20     0.392      0.108      0.175      0.067      0.667      0.282     α10β20 < α20β20    0.504
α20β20     0.308      0.117      0.483      0.017      0.450      0.275     α20β18 < α20β20    0.238
α20β40     0.325      0.492      0.233      0.233                 0.321     α20β20 < α20β40    0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1), with permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.108      0.200      0.350      0.158                 0.204     α10β00 < α20β00    0.183
α20β00     0.067      0.458      0.358      0.183      0.425      0.298     α20β00 < α20β20    0.087
α20β18     0.317      0.625      0.300      0.583                 0.456     α20β00 < α20β18    0.103
α10β20     0.475      0.200      0.300      0.375      0.492      0.368     α10β20 < α20β20    0.194
α20β20     0.450      0.558      0.475      0.133      0.775      0.478     α20β18 < α20β20    0.444
α20β40     0.458      0.492      0.375      0.442                 0.442     α20β20 < α20β40    0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2), with permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.133      0.200      0.242      0.225                 0.200     α10β00 < α20β00    0.016
α20β00     0.300      0.400      0.200      0.275      0.333      0.302     α20β00 < α20β20    0.016
α20β18     0.717      0.667      0.342      0.700                 0.606     α20β00 < α20β18    0.016
α10β20     0.650      0.517      0.542      0.400      0.892      0.600     α10β20 < α20β20    0.564
α20β20     0.533      0.592      0.642      0.225      0.900      0.578     α20β18 < α20β20    0.556
α20β40     0.825      0.792      0.467      0.517                 0.650     α20β20 < α20β40    0.294

Note: *Significant at 10-percent level; **5-percent level; ***1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form, ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and slows over time. We examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each treatment dummy with ln(Round), and a constant as independent variables. The interaction term allows different slopes for different treatments. Compared with the coefficient of ln(Round), the coefficient for the interaction term Dy × ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t) between treatments. From the probit specification L(t) = Φ(constant + Db ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                         (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                         price 1      price 2      price 1      price 2
D1000                    −0.143       −0.268
                         (0.048)      (0.050)
D2000                    −0.090       −0.217
                         (0.060)      (0.057)
D18                       0.005        0.004
                         (0.071)      (0.069)
D1020                    −0.080        0.025
                         (0.060)      (0.073)
D40                       0.000        0.078
                         (0.068)      (0.078)
ln(Round)                 0.115        0.167        0.131        0.183
                         (0.014)      (0.017)      (0.021)      (0.022)
D1000 × ln(Round)                                  −0.050       −0.095
                                                   (0.019)      (0.022)
D2000 × ln(Round)                                  −0.031       −0.073
                                                   (0.020)      (0.021)
D18 × ln(Round)                                     0.001        0.004
                                                   (0.020)      (0.021)
D1020 × ln(Round)                                   0.024        0.008
                                                   (0.019)      (0.022)
D40 × ln(Round)                                     0.002        0.023
                                                   (0.019)      (0.024)
Observations              9,720        9,720        9,720        9,720
Number of groups            162          162          162          162
Log pseudo-likelihood  −5,562.890   −5,824.055   −5,557.424   −5,793.013

H0: D1000 = D2000                                              1.50    1.31
H0: D1000 × ln(Round) = D2000 × ln(Round)                      1.33    1.23
H0: D1020 = D1000 − D2000                                      0.05    1.07
H0: D1020 × ln(Round) = D1000 × ln(Round) − D2000 × ln(Round)  0.05    1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. *Significant at 10-percent level; **5-percent level; ***1-percent level. Dy is the dummy variable for treatment y. The excluded dummy is D2020. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: Ly(t) = φ(constant + Db)·by/t. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + Db)·by, relative to the baseline case of α20β20.
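The derivative above is just the chain rule applied to the probit learning curve, and can be checked numerically (the constants below are illustrative, not estimates from Table 3):

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def learning_curve(t, const, b_y):
    """L(t) = Phi(const + b_y * ln t): probability of equilibrium play in round t."""
    return Phi(const + b_y * math.log(t))

def learning_slope(t, const, b_y):
    """dL/dt = phi(const + b_y * ln t) * b_y / t."""
    return phi(const + b_y * math.log(t)) * b_y / t
```

A positive b_y produces a concave learning curve: rapid early gains whose slope shrinks like 1/t, matching the discussion of the ln(Round) functional form.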

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18, and from 0 to 20, significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test examines whether the coefficient of D1000 × ln(Round) equals that of D2000 × ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between β18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start at, and converge to, similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two β20 supermodular treatments. We use Round, ln(Round), and their interactions with the α20β18 dummy as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the β20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and β20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D2018 × ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available upon request.

TABLE 4—CONVERGENCE SPEED

(Probit models with individual-level clustering, comparing α20β18 with the β20 supermodular treatments achieving similar convergence levels)

Dependent variable: ε-Nash equilibrium play

                        (1) ε-Nash price 1   (2) ε-Nash price 2
ln(Round)                0.054 (0.040)        0.216 (0.043)
Round                    0.004 (0.002)        0.001 (0.002)
D2018 × ln(Round)        0.013 (0.032)       −0.066 (0.035)
D2018 × Round            0.001 (0.002)        0.006 (0.003)
Observations             4,680                4,680
Log pseudo-likelihood   −2,879.07            −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. *Significant at 10-percent level; **5-percent level; ***1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are D2020 and D2040.


α20β18 relative to the supermodular treatments. The β18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D2018 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds, and thus might reach given convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters, α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r(x(t)) − c(x(t)) − e(x(t))] / [r(x*) − c(x*) − e(x*)],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = Σ_{t=t1}^{t2} e(t) / (t2 − t1 + 1).
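As a concrete illustration of these two definitions, the measures can be computed as follows. This is an illustrative sketch, not the authors' code; the revenue, cost, and externality functions r, c, and e are passed in as arbitrary callables.

```python
def round_efficiency(x_t, x_star, r, c, e):
    """Per-round efficiency e(t): realized net surplus at quantity x_t
    divided by the net surplus at the efficient quantity x_star.
    Tax/subsidy and penalty terms are deliberately excluded."""
    return (r(x_t) - c(x_t) - e(x_t)) / (r(x_star) - c(x_star) - e(x_star))

def block_efficiency(per_round, t1, t2):
    """Average per-round efficiency over the block of rounds t1..t2
    (1-indexed, inclusive), i.e., e(t1, t2)."""
    return sum(per_round[t1 - 1:t2]) / (t2 - t1 + 1)
```

With quadratic costs, for example, producing half the efficient quantity yields an efficiency strictly between zero and one rather than one-half, since the measure is defined over net surplus.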

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE
(Efficiency in the last 20 rounds)

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                  p-value
α10β00      0.600       0.671       0.804       0.858       —           0.733     α10β00 < α20β00     0.087*
α20β00      0.650       0.870       0.907       0.873       0.825       0.825     α20β00 < α20β20     0.095*
α20β18      0.963       0.952       0.889       0.959       —           0.941     α20β00 < α20β18     0.016**
α10β20      0.958       0.919       0.905       0.881       0.984       0.929     α10β20 < α20β20     0.714
α20β20      0.888       0.937       0.939       0.780       0.971       0.903     α20β18 < α20β20     0.833
α20β40      0.973       0.962       0.943       0.949       —           0.957     α20β20 < α20β40     0.064*

Note: * Significant at 10-percent level; ** significant at 5-percent level; *** significant at 1-percent level.

1519VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold, 20, to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
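For reference, the per-round surplus formula just given can be sketched as a one-line function (an illustration of s(t) as reconstructed above, not the authors' code):

```python
def budget_surplus(p1, p2, x, alpha, beta):
    """Round-t budget surplus of the compensation mechanism:
    collected penalties plus player 1's tax minus player 2's subsidy.
    When prices match (p1 == p2), the budget balances exactly."""
    return (alpha + beta) * (p1 - p2) ** 2 + p2 * x - p1 * x
```

In equilibrium the penalty term vanishes and the tax equals the subsidy, so s(t) = 0; off the equilibrium path, larger α and β inflate the surplus quadratically in the price gap.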

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far under the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. We look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃_i(t + 1), according to


(6)  p̃_i(t + 1) = ( p_i(t) + Σ_{τ=1}^{t−1} r^τ · p_i(t − τ) ) / ( 1 + Σ_{τ=1}^{t−1} r^τ )

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃_i(t + 1) = p_i(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by

(7)  Pr(p_i | p̃_i) = exp(λ · π(p_i, p̃_i)) / Σ_{j=0}^{40} exp(λ · π(p_j, p̃_i))

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̃_i increases.
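A minimal sketch of the two sFP building blocks, equations (6) and (7), might look as follows. This is illustrative only, not the authors' code: the payoff function π is passed in as a callable, and prices are assumed to lie on the grid 0, 1, ..., 40 used in the experiment.

```python
import math

def forecast_price(history, r):
    """Equation (6): discounted fictitious-play forecast of the match's
    next price. history lists past prices in chronological order, so
    history[-1] gets weight r**0, history[-2] weight r**1, and so on."""
    num = sum(r ** tau * p for tau, p in enumerate(reversed(history)))
    den = sum(r ** tau for tau in range(len(history)))
    return num / den

def logit_choice_probs(forecast, payoff, lam, prices):
    """Equation (7): logit choice probabilities given the forecast.
    Higher lam concentrates probability on better responses."""
    weights = [math.exp(lam * payoff(p, forecast)) for p in prices]
    total = sum(weights)
    return [w / total for w in weights]
```

As in the text, r = 0 reproduces the Cournot assessment (only the last observed price matters) and r = 1 the unweighted fictitious-play average.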

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs π(p_i, p̃_i) replaced by strategy attractions. In both variants, strategy attraction A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ) · N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)  A_i^j(t) = [ φ · N(t − 1) · A_i^j(t − 1) + (δ + (1 − δ) · I(p_i^j, p_i(t))) · π_i(p_i^j, p_{−i}(t)) ] / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φ_i(t) is given by

(9)  φ_i(t) = 1 − 0.5 · Σ_{j=1}^{m_i} [ I(p_i^j, p_i(t)) − (1/t) · Σ_{τ=1}^{t} I(p_i^j, p_i(τ)) ]².

This function will be close to one when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0).
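The attraction update in equation (8) can be sketched as follows. This illustration treats φ, δ, and κ as given scalars; in fEWA proper, φ and δ come from the change-detector function (9) and κ from the Gini coefficient of past choice frequencies.

```python
def update_attractions(A, N_prev, phi, delta, kappa, chosen, payoff, strategies):
    """One period of EWA attraction updating (equation (8)).
    A maps each strategy to its current attraction; payoff(s) is the
    payoff strategy s would have earned against the opponent's realized
    play. The chosen strategy gets full reinforcement; unchosen ones are
    reinforced by their foregone payoff, down-weighted by delta."""
    N = phi * (1 - kappa) * N_prev + 1  # experience-weight update
    new_A = {}
    for s in strategies:
        ind = 1.0 if s == chosen else 0.0
        new_A[s] = (phi * N_prev * A[s]
                    + (delta + (1 - delta) * ind) * payoff(s)) / N
    return new_A, N
```

With δ = 1 and φ = 1, for instance, every strategy is reinforced by its foregone payoff, which is the fictitious-play limit noted above.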

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agent use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function f_jk(h, t) to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_jk(h, t) = 1 − |j − k|/h  if |j − k| < h;  0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_jk(1, t) degenerates into an indicator function, equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer: that is, he chooses strategies based on assessed payoffs


and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)  u_j(t + 1) = (1 − r · f_jk(h, t)) · u_j(t) + r · f_jk(h, t) · π_k(t),  ∀ j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11)  ũ_k(t) − ũ_j(t) ≥ 0,  ∀ p_j ≠ p_k.

Note that mood shocks affect only his choices and not the manner in which assessments are updated.
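Putting the pieces together, a minimal PA sketch (the Bartlett similarity function, the assessment update of equation (10), and the shock-distorted choice of equation (11)) could read as follows. This is an illustration, not the authors' code; function and variable names are our own.

```python
import random

def bartlett(j, k, h):
    """Bartlett similarity between strategy j and the played strategy k:
    1 - |j - k|/h inside the window, zero outside. h = 1 reduces to an
    indicator of the chosen strategy."""
    d = abs(j - k)
    return 1.0 - d / h if d < h else 0.0

def update_assessments(u, chosen, realized_payoff, r, h):
    """Equation (10): move each assessment toward the realized payoff,
    weighted by the discount factor r and similarity to the chosen strategy."""
    new_u = {}
    for j, uj in u.items():
        w = r * bartlett(j, chosen, h)
        new_u[j] = (1 - w) * uj + w * realized_payoff
    return new_u

def choose(u, shock_width, rng):
    """Equation (11): pick the strategy maximizing the shock-distorted
    assessment u_j + z_j, with z_j uniform on [-shock_width, shock_width].
    Shocks distort choice only; updating uses the undistorted u."""
    return max(u, key=lambda j: u[j] + rng.uniform(-shock_width, shock_width))
```

A wider similarity window h spills the realized payoff onto neighboring prices, which is how the model captures strategy similarity in the 41-price grid.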

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements:
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by five.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value = 0.000) and normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.
(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p2.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from

a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
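The calibration procedure amounts to a grid search minimizing MSD between simulated and observed play. A stripped-down sketch with a hypothetical interface (the real procedure simulates 1,500 games per parameter vector and searches over all four parameters, not just two):

```python
import itertools

def msd(sim, obs):
    """Mean-squared deviation between simulated and observed choice
    frequencies (lower is better)."""
    return sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(sim)

def calibrate(simulate, observed, r_grid, lam_grid):
    """Return the (r, lambda) pair whose simulated frequencies are
    closest to the observed ones in MSD. `simulate(r, lam)` is assumed
    to return frequencies of the same shape as `observed`."""
    return min(itertools.product(r_grid, lam_grid),
               key=lambda params: msd(simulate(*params), observed))
```

The grids mirror the text: r in steps of 0.1 on [0, 1], λ in steps of 0.1 on the model-specific interval.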

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β00 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, in which shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performances of the three versions of PA were statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r):
              sFP, blocks 1-4          PA, blocks 1-4
α10β00        1.0  0.7  0.9  1.0       0.1  0.0  0.9  1.0
α20β00        0.7  0.6  0.1  0.1       0.5  1.0  0.2  0.1
α20β18        1.0  1.0  0.9  0.9       0.2  1.0  0.3  0.2
α10β20        1.0  1.0  1.0  0.9       0.1  0.3  0.6  0.3
α20β20        1.0  1.0  1.0  0.9       0.2  0.1  0.5  0.2
α20β40        1.0  1.0  1.0  0.7       0.1  1.0  0.4  0.3

Sensitivity (λ) and shock interval (a):
              sFP λ, blocks 1-4        fEWA λ, blocks 1-4       PA a, blocks 1-4
α10β00        1.5  4.4   4.5   2.8     1.6  4.5  4.4  3.7       500  500  250   50
α20β00        1.5  4.3  14.0  18.0     1.7  4.0  5.1  7.4        50  100  150   50
α20β18        1.6  8.0  11.5  18.3     1.5  5.3  6.2  9.5        50  300  500  150
α10β20        1.9  5.9   8.6  13.8     1.8  4.7  6.2  8.2       100  500  450  300
α20β20        2.8  5.6   8.1  11.2     2.4  3.8  6.1  6.7       200  300  350  350
α20β40        3.4  6.5  14.0  20.5     3.2  4.9  6.7  9.9       150   50  150   50

Similarity window (h), PA, blocks 1-4:
α10β00         1   1  10   9
α20β00        10  10   9   6
α20β18         5   4   6   1
α10β20         1   1   8   3
α20β20         2   2   7   1
α20β40         1   3   2   2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model, a, represents the upper bound of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. The Wilcoxon signed-ranks tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay in subsequent blocks the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3          —        0.948    —        0.883    0.873    —
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3          —        0.946    —        0.879    0.871    —
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3          —        0.947    —        0.894    0.903    —
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3          —        1.917    —        1.447    1.539    —
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall    0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following convergence-level rankings:

(i) p1: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β00 treatments. In addition, an increase in β significantly improves convergence by a large margin (40 to 80 percent) for both players when α = 20.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (α-effect on β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

[Two panels, Player 1 and Player 2, plot the percentage of ε-equilibrium play (0 to 100) against rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels of L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 among the β00 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA
(Threshold is the percent of players playing the ε-equilibrium strategy. An entry indicates the round in which play went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                 Player 1                      Player 2
Threshold   40    50    60    70         40   50   60   70   80    90

α20β18      54    83    126   316        38   52   62   87   127   —
α10β20      129   221   —     —          36   48   63   91   148   —
α20β20      61    91    149   —          39   51   61   84   137   —
α20β40      101   166   263   496        30   38   56   94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS
(Values are for round 500 and based on 1,500 simulated games)

             Player 1 equilibrium price              Player 2 equilibrium price
Treatment    Probability   H1                p-value   Probability   H1                p-value
α10β00       0.073                                     0.082
α20β00       0.188         α20β00 > α10β00   0.000     0.213         α20β00 > α10β00   0.000
α20β18       0.265         α20β18 > α20β40   0.187     0.381         α20β18 > α10β20   0.264
α10β20       0.211         α10β20 > α20β00   0.000     0.376         α10β20 > α20β20   0.001
α20β20       0.243         α20β20 > α10β20   0.000     0.353         α20β20 > α20β00   0.000
α20β40       0.259         α20β40 > α20β20   0.007     0.442         α20β40 > α20β18   0.000

             Player 1 ε-equilibrium price            Player 2 ε-equilibrium price
Treatment    Probability   H1                p-value   Probability   H1                p-value
α10β00       0.216                                     0.232
α20β00       0.519         α20β00 > α10β00   0.000     0.573         α20β00 > α10β00   0.000
α20β18       0.713         α20β18 > α20β40   0.045     0.870         α20β18 > α10β20   0.008
α10β20       0.584         α10β20 > α20β00   0.000     0.857         α10β20 > α20β20   0.000
α20β20       0.662         α20β20 > α10β20   0.000     0.834         α20β20 > α20β00   0.000
α20β40       0.701         α20β40 > α20β20   0.000     0.925         α20β40 > α20β18   0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

[Top panel (simulation data): fraction of maximum welfare, 75 to 100 percent, over rounds 50 to 450. Bottom panel (experimental data): fraction of maximum welfare, 40 to 90 percent, over rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we find that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.
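The stochastic matching model just described can be sketched in a few lines. This is a minimal illustration, not the authors' estimation code: the 41-point price grid {0, ..., 40} and all function names are our assumptions, and the MSD score shown is the standard mean-squared-deviation criterion computed against the indicator of the chosen price.

```python
# Sketch of the stochastic matching model: probability 1 - eps on the
# predicted player 1 price, eps/40 on each of the other 40 prices,
# assuming a 41-point price grid {0, ..., 40}.

PRICES = range(41)

def matching_probs(predicted, eps):
    """Predicted choice distribution of the stochastic matching model."""
    return [1 - eps if p == predicted else eps / 40 for p in PRICES]

def msd(prob_rows, choices):
    """Mean squared deviation between each round's predicted probabilities
    and the indicator of the price actually chosen, averaged over rounds."""
    total = 0.0
    for probs, choice in zip(prob_rows, choices):
        total += sum((q - (1.0 if p == choice else 0.0)) ** 2
                     for p, q in zip(PRICES, probs))
    return total / len(choices)
```

A lower MSD on the hold-out sample favors a model; the comparison in the text pits this matching model against stochastic fictitious play.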

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε in each block are as follows. α10β00: 0.10, 0.07, 0.06, 0.08; α20β00: 0.07, 0.06, 0.05, 0.04; α20β18: 0.07, 0.02, 0.03, 0.03; α10β20: 0.05, 0.03, 0.03, 0.02; α20β20: 0.01, 0.00, 0.00, 0.00; α20β40: 0.02, 0.00, 0.00, 0.01.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for Dβ00 is negative and significant: reducing β from 20 to 0 thus decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the β dummies nor their interactions are significant: the coefficient for the previously significant Dβ00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are nonsupermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for Dβ18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability player 2 price equals anticipated player 1 price

                          (1)          (2)
Dα10                     0.014        0.017
                        (0.043)      (0.041)
Dβ00                    −0.077       −0.043
                        (0.035)      (0.036)
Dβ18                     0.046        0.032
                        (0.053)      (0.056)
Dβ40                     0.008        0.041
                        (0.061)      (0.042)
ln(Round)                0.041        0.028
                        (0.011)      (0.010)
Match Cost                           −0.001
                                     (0.000)
ln(Round) × Dα10         0.014        0.012
                        (0.012)      (0.012)
ln(Round) × Dβ00         0.003        0.000
                        (0.013)      (0.012)
ln(Round) × Dβ18         0.006        0.002
                        (0.021)      (0.020)
ln(Round) × Dβ40         0.001        0.015
                        (0.018)      (0.016)
Observations             9,588        9,588
Log pseudo-likelihood   −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significant at 10-percent level; 5-percent level; 1-percent level. Dy is the dummy variable for treatment y; the excluded (baseline) treatment is α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)] / [β + e/(4c²)] and n = [er/(4c²)] / [β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m·p1 − p2 + n.

We now use Lyapunov's second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function for system (12).

Define G(p1) ≡ ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − np1 − [½(1 − m)p0² − np0].

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) attains its global minimum at p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2

  = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m·p1 − p2 + n)

  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
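As a numerical check on part (iii), the discrete best-reply system can be iterated directly and its rest point compared with er/(e + c). This is a sketch under our own illustrative parameter values, not the experimental ones; the function name is ours.

```python
# Iterate the Cournot best-reply map p1(t+1) = p2(t),
# p2(t+1) = m*p1(t) + n, with m and n as in the Appendix.
# Parameter values below are illustrative only.

def best_reply_path(r, c, e, beta, p1=0.0, p2=0.0, rounds=500):
    """Return the price pair after the given number of rounds."""
    denom = beta + e / (4 * c ** 2)
    m = (beta - 1 / (4 * c)) / denom
    n = (e * r / (4 * c ** 2)) / denom
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n  # simultaneous best replies
    return p1, p2

r, c, e, beta = 40.0, 1.0, 2.0, 0.5
p1, p2 = best_reply_path(r, c, e, beta)
p_star = e * r / (e + c)  # the rest point er/(e + c)
```

With these values m = 0.25, so prices converge geometrically to p* regardless of the starting pair, as the Lyapunov argument implies.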

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.


improvement in convergence level is statistically insignificant. In other words, we do not see a dramatic improvement at the threshold. This implies that the performance of near-supermodular games, such as the Falkinger mechanism, ought to be comparable to that of supermodular games.

By contrast, we accept Hypothesis 4(a) by part (iii). This is the first experimental result systematically comparing convergence levels of supermodular games, where theory is silent. The convergence level does not significantly improve as β increases from the threshold, 20, to 40. Therefore, the marginal returns for being

[FIGURE 1. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 1. Six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40) plot player 1's announced price (0 to 40) by round (1 to 60).]

"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β > 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that at

[FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2. Six panels (α10β00, α20β00, α10β20, α20β20, α20β18, α20β40) plot player 2's announced price (0 to 40) by round (1 to 60).]

the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies Dy, where y ∈ {α10β00, α20β00, β18, α10β20, β40}, ln(Round), and a constant. We use dummies for different values of α and β, rather than direct parameter values, to avoid assuming a linear relationship of parameter effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium price (p1); one-tailed permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.000      0.008      0.158      0.008                 0.044     α10β00 < α20β00    0.397
α20β00     0.000      0.083      0.083      0.042      0.058      0.053     α20β00 < α20β20    0.008
α20β18     0.125      0.117      0.100      0.192                 0.133     α20β00 < α20β18    0.008
α10β20     0.242      0.067      0.067      0.067      0.208      0.130     α10β20 < α20β20    0.064
α20β20     0.325      0.267      0.208      0.058      0.367      0.245     α20β18 < α20β20    0.064
α20β40     0.175      0.400      0.158      0.133                 0.217     α20β20 < α20β40    0.643

Panel B. Proportion of player 2 Nash equilibrium price (p2); one-tailed permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.025      0.042      0.025      0.067                 0.040     α10β00 < α20β00    0.056
α20β00     0.067      0.083      0.033      0.075      0.050      0.062     α20β00 < α20β20    0.032
α20β18     0.133      0.292      0.108      0.258                 0.198     α20β00 < α20β18    0.008
α10β20     0.392      0.108      0.175      0.067      0.667      0.282     α10β20 < α20β20    0.504
α20β20     0.308      0.117      0.483      0.017      0.450      0.275     α20β18 < α20β20    0.238
α20β40     0.325      0.492      0.233      0.233                 0.321     α20β20 < α20β40    0.365

Panel C. Proportion of player 1 ε-Nash equilibrium price (ε-p1); one-tailed permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.108      0.200      0.350      0.158                 0.204     α10β00 < α20β00    0.183
α20β00     0.067      0.458      0.358      0.183      0.425      0.298     α20β00 < α20β20    0.087
α20β18     0.317      0.625      0.300      0.583                 0.456     α20β00 < α20β18    0.103
α10β20     0.475      0.200      0.300      0.375      0.492      0.368     α10β20 < α20β20    0.194
α20β20     0.450      0.558      0.475      0.133      0.775      0.478     α20β18 < α20β20    0.444
α20β40     0.458      0.492      0.375      0.442                 0.442     α20β20 < α20β40    0.643

Panel D. Proportion of player 2 ε-Nash equilibrium price (ε-p2); one-tailed permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.133      0.200      0.242      0.225                 0.200     α10β00 < α20β00    0.016
α20β00     0.300      0.400      0.200      0.275      0.333      0.302     α20β00 < α20β20    0.016
α20β18     0.717      0.667      0.342      0.700                 0.606     α20β00 < α20β18    0.016
α10β20     0.650      0.517      0.542      0.400      0.892      0.600     α10β20 < α20β20    0.564
α20β20     0.533      0.592      0.642      0.225      0.900      0.578     α20β18 < α20β20    0.556
α20β40     0.825      0.792      0.467      0.517                 0.650     α20β20 < α20β40    0.294

Note: Significant at 10-percent level; 5-percent level; 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We will examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), interactions of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the

coefficient for the interaction term Dy × ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), L̇(t), between treatments. From the probit specification L(t) = Φ(constant + Db ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests on the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                           (1) ε-Nash   (2) ε-Nash   (3) ε-Nash   (4) ε-Nash
                           price 1      price 2      price 1      price 2
Dα10β00                    −0.143       −0.268
                           (0.048)      (0.050)
Dα20β00                    −0.090       −0.217
                           (0.060)      (0.057)
Dβ18                        0.005        0.004
                           (0.071)      (0.069)
Dα10β20                     0.080        0.025
                           (0.060)      (0.073)
Dβ40                        0.000        0.078
                           (0.068)      (0.078)
ln(Round)                   0.115        0.167        0.131        0.183
                           (0.014)      (0.017)      (0.021)      (0.022)
Dα10β00 × ln(Round)                                  −0.050       −0.095
                                                     (0.019)      (0.022)
Dα20β00 × ln(Round)                                  −0.031       −0.073
                                                     (0.020)      (0.021)
Dβ18 × ln(Round)                                      0.001        0.004
                                                     (0.020)      (0.021)
Dα10β20 × ln(Round)                                   0.024        0.008
                                                     (0.019)      (0.022)
Dβ40 × ln(Round)                                      0.002        0.023
                                                     (0.019)      (0.024)
Observations                9,720        9,720        9,720        9,720
Number of groups            162          162          162          162
Log pseudo-likelihood      −5,562.890   −5,824.055   −5,557.424   −5,793.013

H0: Dα10β00 = Dα20β00                                           1.50    1.31
H0: Dα10β00 × ln(Round) = Dα20β00 × ln(Round)                   1.33    1.23
H0: Dα10β20 = Dα10β00 − Dα20β00                                 0.05    1.07
H0: Dα10β20 × ln(Round) = Dα10β00 × ln(Round) − Dα20β00 × ln(Round)   0.05    1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significant at 10-percent level; 5-percent level; 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is Dα20β20. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L̇y(t) = φ(constant + Db ln(t))·by/t, where φ is the standard normal density. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + Db ln(t))·by, compared to the baseline case of α20β20.
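The slope formula follows from the chain rule applied to the probit specification, and can be checked numerically. In this sketch, the constant and coefficient values are placeholders of ours, not estimates from Table 3:

```python
# L(t) = Phi(const + b_y * ln(t)) implies, by the chain rule,
# dL/dt = phi(const + b_y * ln(t)) * b_y / t,
# where Phi and phi are the standard normal CDF and density.
from math import erf, exp, log, pi, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def phi(z):
    """Standard normal density."""
    return exp(-z * z / 2) / sqrt(2 * pi)

def L(t, const, b_y):
    """Probability of equilibrium play in round t under the probit model."""
    return Phi(const + b_y * log(t))

def slope(t, const, b_y):
    """Analytic derivative dL/dt."""
    return phi(const + b_y * log(t)) * b_y / t
```

Comparing `slope` with a central finite difference of `L` confirms the derivative; the placeholder values mimic a positive ln(Round) coefficient, so the slope is positive and shrinking in t.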

RESULT 2 (Speed of Convergence)

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test examines whether the coefficient of Dα10β00 × ln(Round) equals that of Dα20β00 × ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the relationship between strategic complementarity and the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution for each treatment.13 As the treatments all start and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two β = 20 supermodular treatments. We use Round, ln(Round), and their interactions with the α20β18 dummy as independent variables, allowing different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the β = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and β = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on Dα20β18 × ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available upon request.

TABLE 4—CONVERGENCE SPEED

(Probit models with individual-level clustering, comparing α20β18 with the β = 20 supermodular treatments achieving similar convergence levels)

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash   (2) ε-Nash
                          price 1      price 2
ln(Round)                 0.054        0.216
                         (0.040)      (0.043)
Round                     0.004        0.001
                         (0.002)      (0.002)
Dα20β18 × ln(Round)       0.013       −0.066
                         (0.032)      (0.035)
Dα20β18 × Round           0.001        0.006
                         (0.002)      (0.003)
Observations              4,680        4,680
Log pseudo-likelihood    −2,879.07    −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. Significant at 10-percent level; 5-percent level; 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are Dα20β20 and Dα20β40.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on Dα20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds, and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture the welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r·x(t) − c·x(t)² − e·x(t)²] / [r·x* − c·(x*)² − e·(x*)²],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = [Σ from t = t1 to t2 of e(t)] / (t2 − t1 + 1).
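The two measures can be computed directly from the sequence of chosen quantities. This is a minimal sketch assuming the quadratic forms rx − cx² for production surplus and ex² for the externality cost, under which x* = r/(2(c + e)); the function names and parameter values are ours:

```python
# Per-round efficiency e(t) and block efficiency e(t1, t2).
# Assumes welfare w(x) = r*x - c*x**2 - e*x**2 (quadratic cost and
# externality), so the efficient quantity is x* = r / (2*(c + e)).

def welfare(x, r, c, e):
    """Surplus excluding the tax/subsidy and penalty terms."""
    return r * x - c * x ** 2 - e * x ** 2

def block_efficiency(xs, r, c, e, t1, t2):
    """Average of e(t) = w(x(t)) / w(x*) over rounds t1..t2
    (1-indexed, inclusive), given the sequence xs of quantities."""
    x_star = r / (2 * (c + e))
    w_star = welfare(x_star, r, c, e)
    block = xs[t1 - 1:t2]
    return sum(welfare(x, r, c, e) for x in block) / (w_star * len(block))
```

By construction the measure equals 1 when the efficient quantity is chosen in every round of the block, and falls below 1 otherwise.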

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency, last 20 rounds; one-tailed permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall   H1                 p-value
α10β00     0.600      0.671      0.804      0.858                 0.733     α10β00 < α20β00    0.087
α20β00     0.650      0.870      0.907      0.873      0.825      0.825     α20β00 < α20β20    0.095
α20β18     0.963      0.952      0.889      0.959                 0.941     α20β00 < α20β18    0.016
α10β20     0.958      0.919      0.905      0.881      0.984      0.929     α10β20 < α20β20    0.714
α20β20     0.888      0.937      0.939      0.780      0.971      0.903     α20β18 < α20β20    0.833
α20β40     0.973      0.962      0.943      0.949                 0.957     α20β20 < α20β40    0.064

Note: Significant at 10-percent level; 5-percent level; 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold, 20, to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)[p1(t) − p2(t)]² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. We look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter shortened as sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃_i(t + 1), according to

THE AMERICAN ECONOMIC REVIEW, DECEMBER 2004

(6) p̃_i(t + 1) = [p_i(t) + Σ_{u=1}^{t−1} r^u p_i(t − u)] / [1 + Σ_{u=1}^{t−1} r^u],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃_i(t + 1) = p_i(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.
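The belief in equation (6) can be sketched in a few lines (our own illustration, not the authors' code):

```python
def predicted_price(history, r):
    """Discounted fictitious-play forecast, equation (6): the match's price
    from u rounds ago gets weight r**u, so r = 0 reduces to Cournot best
    reply (only the last price counts) and r = 1 to standard fictitious play.
    history = [p(1), ..., p(t)] in chronological order."""
    t = len(history)
    num = sum(r ** u * history[t - 1 - u] for u in range(t))
    den = sum(r ** u for u in range(t))
    return num / den
```

For example, after observing prices 10 then 20, the forecast is 20 under r = 0, the simple average 15 under r = 1, and an intermediate weighted average for 0 < r < 1.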

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by

(7) Pr(p_i | p̃_i) = exp(λ π(p_i, p̃_i)) / Σ_{j=0}^{40} exp(λ π(p_j, p̃_i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̃_i increases.
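The logit rule in equation (7) can be sketched as follows (a hand-rolled illustration; the payoff function used in the test is a hypothetical stand-in for the mechanism's payoffs):

```python
import math

def logit_choice_probs(payoffs, lam):
    """Logit response, equation (7): choice probabilities proportional to
    exp(lam * payoff), where payoffs[j] is the payoff of announcing price j
    against the predicted opponent price. Scores are shifted by their
    maximum before exponentiating, for numerical stability."""
    scores = [lam * u for u in payoffs]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]
```

With a payoff peaked at the predicted price, the modal announcement is that price, and raising lam concentrates more of the probability mass on it.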

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs (π(p_i, p̃_i)) replaced by strategy attractions. In both variants, strategy attractions A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = (1 − κ)φN(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) A_i^j(t) = {φN(t − 1)A_i^j(t − 1) + [δ + (1 − δ)I(p_i^j, p_i(t))] π_i(p_i^j, p_−i(t))} / N(t),

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φ_i(t) is given by

(9) φ_i(t) = 1 − (1/2) Σ_{j=1}^{m_i} [ I(p_i^j, p_i(t)) − (1/t) Σ_{u=1}^{t} I(p_i^j, p_i(u)) ]².

This function will be close to 1 when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0, κ = 1).
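One EWA step from equation (8) can be sketched as follows. This is our illustration, with φ, δ, and κ passed in explicitly; in fEWA they would instead be computed from history as just described:

```python
def ewa_step(attractions, n_prev, played, phi, delta, kappa, payoff_of):
    """One EWA update, equation (8): decay old attractions by phi * N(t-1),
    add the realized payoff for the played strategy and the delta-weighted
    foregone payoff for unplayed ones, then normalize by the new experience
    weight N(t) = (1 - kappa) * phi * N(t-1) + 1."""
    n_new = (1 - kappa) * phi * n_prev + 1
    updated = []
    for j, a in enumerate(attractions):
        w = 1.0 if j == played else delta  # delta + (1 - delta) * I(j, played)
        updated.append((phi * n_prev * a + w * payoff_of(j)) / n_new)
    return updated, n_new
```

With δ = 1 the foregone payoffs of unplayed strategies count fully (belief learning); with δ = 0 only the played strategy is reinforced.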

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agent use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function f_jk(h, t) to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_jk(h, t) = 1 − |j − k|/h if |j − k| < h, and 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_jk(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs


and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10) u_j(t + 1) = [1 − r f_jk(h, t)] u_j(t) + r f_jk(h, t) π_k(t), ∀ j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11) ũ_k(t) − ũ_j(t) ≥ 0, ∀ p_j ≠ p_k.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
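The three PA components (similarity window, assessment update, shocked choice) can be sketched together; again this is our illustration under the equations above, not the authors' code:

```python
import random

def bartlett(j, k, h):
    """Bartlett similarity: 1 - |j - k|/h inside the window, 0 outside."""
    return max(0.0, 1 - abs(j - k) / h)

def pa_update(assessments, played, realized, r, h):
    """Equation (10): every strategy j moves toward the realized payoff of
    the played strategy, in proportion to r times its similarity to it."""
    return [(1 - r * bartlett(j, played, h)) * u
            + r * bartlett(j, played, h) * realized
            for j, u in enumerate(assessments)]

def pa_choose(assessments, a, rng=random):
    """Equation (11): maximize mood-shock-distorted assessments, with
    shocks drawn uniformly from [-a, a]."""
    distorted = [u + rng.uniform(-a, a) for u in assessments]
    return max(range(len(distorted)), key=lambda j: distorted[j])
```

With h = 1 only the played strategy is updated; with h = 2 its immediate neighbors also move halfway toward the realized payoff.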

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate, and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.
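One common MSD convention can be sketched as follows (a sketch, not the authors' exact scoring code):

```python
def msd(predicted, actual):
    """Mean-squared deviation between a model's predicted choice
    probabilities and actual play: for each round, sum the squared
    differences between the probability vector and the indicator vector
    of the chosen price, then average over rounds."""
    total = 0.0
    for probs, choice in zip(predicted, actual):
        total += sum((p - (1.0 if j == choice else 0.0)) ** 2
                     for j, p in enumerate(probs))
    return total / len(actual)
```

Under this convention, a uniform-random model over the 41 prices scores 1640/1681 ≈ 0.976 in every round, which matches the random-play row of Table 7.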

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.

(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value = 0.000) and normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:

(a) Determine each player 1's best response to p2.

(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.

(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
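The search itself is an exhaustive sweep of these grids. A generic sketch (our illustration; the `score` callback stands in for running the Monte Carlo simulation and computing MSD):

```python
import itertools

def grid_search(score, grids):
    """Exhaustive grid search over the Cartesian product of parameter
    grids, keeping the parameter combination with the lowest score.
    `grids` maps parameter names to lists of candidate values."""
    names = list(grids)
    best, best_val = None, float("inf")
    for combo in itertools.product(*(grids[n] for n in names)):
        params = dict(zip(names, combo))
        val = score(params)
        if val < best_val:
            best, best_val = params, val
    return best, best_val
```

For instance, sweeping r over {0.0, 0.1, ..., 1.0} and a over {0, 50, ..., 500} against a toy scoring function returns the grid point minimizing that score.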

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, in which shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA was statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r), by block:

Treatment | sFP: 1 | 2 | 3 | 4 | PA: 1 | 2 | 3 | 4
α10β00 | 1.0 | 0.7 | 0.9 | 1.0 | 0.1 | 0.0 | 0.9 | 1.0
α20β00 | 0.7 | 0.6 | 0.1 | 0.1 | 0.5 | 1.0 | 0.2 | 0.1
α20β18 | 1.0 | 1.0 | 0.9 | 0.9 | 0.2 | 1.0 | 0.3 | 0.2
α10β20 | 1.0 | 1.0 | 1.0 | 0.9 | 0.1 | 0.3 | 0.6 | 0.3
α20β20 | 1.0 | 1.0 | 1.0 | 0.9 | 0.2 | 0.1 | 0.5 | 0.2
α20β40 | 1.0 | 1.0 | 1.0 | 0.7 | 0.1 | 1.0 | 0.4 | 0.3

Sensitivity (λ) for sFP and fEWA, and shock interval (a) for PA, by block:

Treatment | sFP λ: 1 | 2 | 3 | 4 | fEWA λ: 1 | 2 | 3 | 4 | PA a: 1 | 2 | 3 | 4
α10β00 | 1.5 | 4.4 | 4.5 | 2.8 | 1.6 | 4.5 | 4.4 | 3.7 | 500 | 500 | 250 | 50
α20β00 | 1.5 | 4.3 | 14.0 | 18.0 | 1.7 | 4.0 | 5.1 | 7.4 | 50 | 100 | 150 | 50
α20β18 | 1.6 | 8.0 | 11.5 | 18.3 | 1.5 | 5.3 | 6.2 | 9.5 | 50 | 300 | 500 | 150
α10β20 | 1.9 | 5.9 | 8.6 | 13.8 | 1.8 | 4.7 | 6.2 | 8.2 | 100 | 500 | 450 | 300
α20β20 | 2.8 | 5.6 | 8.1 | 11.2 | 2.4 | 3.8 | 6.1 | 6.7 | 200 | 300 | 350 | 350
α20β40 | 3.4 | 6.5 | 14.0 | 20.5 | 3.2 | 4.9 | 6.7 | 9.9 | 150 | 50 | 150 | 50

Similarity window (h) for PA, by block:

Treatment | 1 | 2 | 3 | 4
α10β00 | 1 | 1 | 10 | 9
α20β00 | 10 | 10 | 9 | 6
α20β18 | 5 | 4 | 6 | 1
α10β20 | 1 | 1 | 8 | 3
α20β20 | 2 | 2 | 7 | 1
α20β40 | 1 | 3 | 2 | 2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play:

Session | α10β00 | α20β00 | α20β18 | α10β20 | α20β20 | α20β40
1 | 0.955 | 0.934 | 0.940 | 0.930 | 0.888 | 0.940
2 | 0.960 | 0.946 | 0.890 | 0.917 | 0.973 | 0.878
3 | | 0.948 | | 0.883 | 0.873 |
Overall | 0.957 | 0.943 | 0.915 | 0.910 | 0.912 | 0.909

fEWA:

Session | α10β00 | α20β00 | α20β18 | α10β20 | α20β20 | α20β40
1 | 0.956 | 0.934 | 0.942 | 0.928 | 0.887 | 0.948
2 | 0.960 | 0.946 | 0.890 | 0.922 | 0.978 | 0.886
3 | | 0.946 | | 0.879 | 0.871 |
Overall | 0.958 | 0.942 | 0.916 | 0.910 | 0.912 | 0.917

Payoff assessment:

Session | α10β00 | α20β00 | α20β18 | α10β20 | α20β20 | α20β40
1 | 0.952 | 0.933 | 0.914 | 0.941 | 0.888 | 0.916
2 | 0.957 | 0.944 | 0.899 | 0.922 | 0.917 | 0.898
3 | | 0.947 | | 0.894 | 0.903 |
Overall | 0.955 | 0.941 | 0.907 | 0.919 | 0.902 | 0.907

Equilibrium play:

Session | α10β00 | α20β00 | α20β18 | α10β20 | α20β20 | α20β40
1 | 1.867 | 1.853 | 1.833 | 1.833 | 1.489 | 1.792
2 | 1.889 | 1.919 | 1.606 | 1.839 | 1.953 | 1.669
3 | | 1.917 | | 1.447 | 1.539 |
Overall | 1.878 | 1.896 | 1.719 | 1.706 | 1.660 | 1.731

Random play:

Overall | 0.976 | 0.976 | 0.976 | 0.976 | 0.976 | 0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β = 0 treatments. In addition, starting from β = 0, an increase in β significantly improves convergence by a large margin (40 to 80 percent) for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.



In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect): an increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2, plot the percentage of ε-equilibrium play, from 0 to 100, over rounds 0 to 500, for the six treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels of L. The picture for player 2 convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure of Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

Treatment | Player 1: 0.4 | 0.5 | 0.6 | 0.7 | Player 2: 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9
α20β18 | 54 | 83 | 126 | 316 | 38 | 52 | 62 | 87 | 127 | —
α10β20 | 129 | 221 | — | — | 36 | 48 | 63 | 91 | 148 | —
α20β20 | 61 | 91 | 149 | — | 39 | 51 | 61 | 84 | 137 | —
α20β40 | 101 | 166 | 263 | 496 | 30 | 38 | 56 | 94 | 166 | 339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

Player 1 equilibrium price / Player 2 equilibrium price:

Treatment | Probability | H1 | p-value | Probability | H1 | p-value
α10β00 | 0.073 | | | 0.082 | |
α20β00 | 0.188 | α20β00 > α10β00 | 0.000 | 0.213 | α20β00 > α10β00 | 0.000
α20β18 | 0.265 | α20β18 > α20β40 | 0.187 | 0.381 | α20β18 > α10β20 | 0.264
α10β20 | 0.211 | α10β20 > α20β00 | 0.000 | 0.376 | α10β20 > α20β20 | 0.001
α20β20 | 0.243 | α20β20 > α10β20 | 0.000 | 0.353 | α20β20 > α20β00 | 0.000
α20β40 | 0.259 | α20β40 > α20β20 | 0.007 | 0.442 | α20β40 > α20β18 | 0.000

Player 1 ε-equilibrium price / Player 2 ε-equilibrium price:

Treatment | Probability | H1 | p-value | Probability | H1 | p-value
α10β00 | 0.216 | | | 0.232 | |
α20β00 | 0.519 | α20β00 > α10β00 | 0.000 | 0.573 | α20β00 > α10β00 | 0.000
α20β18 | 0.713 | α20β18 > α20β40 | 0.045 | 0.870 | α20β18 > α10β20 | 0.008
α10β20 | 0.584 | α10β20 > α20β00 | 0.000 | 0.857 | α10β20 > α20β20 | 0.000
α20β20 | 0.662 | α20β20 > α10β20 | 0.000 | 0.834 | α20β20 > α20β00 | 0.000
α20β40 | 0.701 | α20β40 > α20β20 | 0.000 | 0.925 | α20β40 > α20β18 | 0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(The top panel plots the fraction of maximum welfare in the simulation data over rounds 50 to 450; the bottom panel plots the fraction of maximum welfare in the experimental data over rounds 0 to 60; both panels cover the six treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ω, and one of the other 40 prices each with probability ω/40. We estimate ω using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). That is, best response to a predicted price explains behavior significantly better than matching.

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V, subsection B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price, p̂1, for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 The estimates for each of the four blocks are as follows. α10β00: 1.0, 0.7, 0.6, 0.8; α20β00: 0.7, 0.6, 0.5, 0.4; α20β18: 0.7, 0.2, 0.3, 0.3; α10β20: 0.5, 0.3, 0.3, 0.2; α20β20: 0.1, 0.0, 0.0, 0.0; α20β40: 0.2, 0.0, 0.0, 0.1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase from 0 to 20 is notbecause matching is more focal but ratherbecause an increase of decreases the cost ofmatching26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for Dβ18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal-point hypothesis.

TABLE 10. PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability that player 2's price equals the anticipated player 1 price

                          (1)              (2)
Dα10                      0.014            0.017
                         (0.043)          (0.041)
Dβ00                     −0.077           −0.043
                         (0.035)          (0.036)
Dβ18                      0.046            0.032
                         (0.053)          (0.056)
Dβ40                      0.008            0.041
                         (0.061)          (0.042)
ln(Round)                 0.041            0.028
                         (0.011)          (0.010)
Match Cost                                −0.001
                                          (0.000)
ln(Round)·Dα10            0.014            0.012
                         (0.012)          (0.012)
ln(Round)·Dβ00            0.003            0.000
                         (0.013)          (0.012)
ln(Round)·Dβ18            0.006            0.002
                         (0.021)          (0.020)
ln(Round)·Dβ40            0.001            0.015
                         (0.018)          (0.016)
Observations              9588             9588
Log pseudo-likelihood    −3243.21         −3177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is Dα20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.

1532 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, that is unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = (β − 1/(4c)) / (β + e/(4c²)) and n = (er/(4c²)) / (β + e/(4c²)). The analogous differential equation system is

(12) ṗ1 = p2 − p1,

     ṗ2 = m·p1 − p2 + n.

We now use Lyapunov's second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫_{p0}^{p1} (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function for system (12).

Define G(p1) ≡ ∫_{p0}^{p1} (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − n·p1 − (1/2)(1 − m)p0² + n·p0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)·p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum of G. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) attains its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2

  = [(p1 − p2) + (1 − m)p1 − n]·(p2 − p1) + (p2 − p1)·(m·p1 − p2 + n)

  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
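As a numerical check on part (iii), one can iterate the discrete best-reply system directly. The parameter values below (r = 30, c = 2, e = 4, β = 20) are illustrative stand-ins, not the experimental parameters; for them the rest point is p* = er/(e + c) = 20.

```python
def cournot_path(beta, r, c, e, p1_0, p2_0, rounds):
    """Iterate the best-reply system p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n,
    with m and n as defined in the Appendix."""
    m = (beta - 1.0 / (4 * c)) / (beta + e / (4 * c**2))
    n = (e * r / (4 * c**2)) / (beta + e / (4 * c**2))
    p1, p2 = p1_0, p2_0
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2

# Rest point for the illustrative parameters: p* = e*r / (e + c) = 20.
p1, p2 = cournot_path(beta=20.0, r=30.0, c=2.0, e=4.0,
                      p1_0=0.0, p2_0=40.0, rounds=5000)
```

Because |m| < 1 at these parameter values, both announced prices contract toward p* regardless of the starting point.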

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper No. 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



"more supermodular" diminish once the payoffs become supermodular.

Proposition 2 predicts convergence under Cournot best reply for any β > 0. However, there is a significant difference in convergence level as β increases from 0 to 18, 20, and beyond. We investigate in Section V whether this difference persists in the long run.

While parts (i) to (iii) present the β-effects, parts (iv) and (v) examine the α-effects, and we partially accept Hypothesis 5(a). Recall from equation (5) and subsequent discussions that at

[Figure 2 shows six panels, one per treatment (α10β00, α20β00, α10β20, α20β20, α20β18, and α20β40), each plotting player 2's announced prices (0 to 40) against round (1 to 60).]

FIGURE 2. DISTRIBUTION OF ANNOUNCED PRICES IN EXPERIMENTAL TREATMENTS: PLAYER 2


the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are the treatment dummies Dy, where y ∈ {α10β00, α20β00, β18, α10β20, β40}; ln(Round); and a constant. We use dummies for the different values of α and β, rather than the direct parameter values, to avoid assuming a linear relationship between parameters and their effects, and we omit the dummy for α20β20. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatment y and α20β20. Results from these specifications are

TABLE 2. LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium prices (p1*): permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1               p-value
α10β00     0.000      0.008      0.158      0.008                 0.044    α10β00 < α20β00  0.397
α20β00     0.000      0.083      0.083      0.042      0.058      0.053    α20β00 < α20β20  0.008***
α20β18     0.125      0.117      0.100      0.192                 0.133    α20β00 < α20β18  0.008***
α10β20     0.242      0.067      0.067      0.067      0.208      0.130    α10β20 < α20β20  0.064*
α20β20     0.325      0.267      0.208      0.058      0.367      0.245    α20β18 < α20β20  0.064*
α20β40     0.175      0.400      0.158      0.133                 0.217    α20β20 < α20β40  0.643

Panel B. Proportion of player 2 Nash equilibrium prices (p2*): permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1               p-value
α10β00     0.025      0.042      0.025      0.067                 0.040    α10β00 < α20β00  0.056*
α20β00     0.067      0.083      0.033      0.075      0.050      0.062    α20β00 < α20β20  0.032**
α20β18     0.133      0.292      0.108      0.258                 0.198    α20β00 < α20β18  0.008***
α10β20     0.392      0.108      0.175      0.067      0.667      0.282    α10β20 < α20β20  0.504
α20β20     0.308      0.117      0.483      0.017      0.450      0.275    α20β18 < α20β20  0.238
α20β40     0.325      0.492      0.233      0.233                 0.321    α20β20 < α20β40  0.365

Panel C. Proportion of player 1 ε-Nash equilibrium prices (ε-p1): permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1               p-value
α10β00     0.108      0.200      0.350      0.158                 0.204    α10β00 < α20β00  0.183
α20β00     0.067      0.458      0.358      0.183      0.425      0.298    α20β00 < α20β20  0.087*
α20β18     0.317      0.625      0.300      0.583                 0.456    α20β00 < α20β18  0.103
α10β20     0.475      0.200      0.300      0.375      0.492      0.368    α10β20 < α20β20  0.194
α20β20     0.450      0.558      0.475      0.133      0.775      0.478    α20β18 < α20β20  0.444
α20β40     0.458      0.492      0.375      0.442                 0.442    α20β20 < α20β40  0.643

Panel D. Proportion of player 2 ε-Nash equilibrium prices (ε-p2): permutation tests

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1               p-value
α10β00     0.133      0.200      0.242      0.225                 0.200    α10β00 < α20β00  0.016**
α20β00     0.300      0.400      0.200      0.275      0.333      0.302    α20β00 < α20β20  0.016**
α20β18     0.717      0.667      0.342      0.700                 0.606    α20β00 < α20β18  0.016**
α10β20     0.650      0.517      0.542      0.400      0.892      0.600    α10β20 < α20β20  0.564
α20β20     0.533      0.592      0.642      0.225      0.900      0.578    α20β18 < α20β20  0.556
α20β40     0.825      0.792      0.467      0.517                 0.650    α20β20 < α20β40  0.294

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form, ln(Round), which yields a better log-likelihood than either the linear or the quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We will examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the coefficient for the interaction term Dy·ln(Round) captures the slope difference between treatment y and α20β20. By Observation 1, as we cannot reject the hypotheses that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t), L̇(t), between different treatments. From the probit specification, L(t) = Φ(constant + Db·ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests on the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3. CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                       (1) ε-Nash  (2) ε-Nash  (3) ε-Nash  (4) ε-Nash
                       price 1     price 2     price 1     price 2
Dα10β00                −0.143      −0.268
                       (0.048)     (0.050)
Dα20β00                −0.090      −0.217
                       (0.060)     (0.057)
Dβ18                   −0.005       0.004
                       (0.071)     (0.069)
Dα10β20                −0.080       0.025
                       (0.060)     (0.073)
Dβ40                    0.000       0.078
                       (0.068)     (0.078)
ln(Round)               0.115       0.167       0.131       0.183
                       (0.014)     (0.017)     (0.021)     (0.022)
Dα10β00·ln(Round)                              −0.050      −0.095
                                               (0.019)     (0.022)
Dα20β00·ln(Round)                              −0.031      −0.073
                                               (0.020)     (0.021)
Dβ18·ln(Round)                                  0.001       0.004
                                               (0.020)     (0.021)
Dα10β20·ln(Round)                              −0.024      −0.008
                                               (0.019)     (0.022)
Dβ40·ln(Round)                                  0.002       0.023
                                               (0.019)     (0.024)
Observations            9720        9720        9720        9720
Number of groups        162         162         162         162
Log pseudo-likelihood  −5562.890   −5824.055   −5557.424   −5793.013

H0: Dα10β00 = Dα20β00                                            1.50    1.31
H0: Dα10β00·ln(Round) = Dα20β00·ln(Round)                        1.33    1.23
H0: Dα10β20 = Dα10β00 − Dα20β00                                  0.05    1.07
H0: Dα10β20·ln(Round) = Dα10β00·ln(Round) − Dα20β00·ln(Round)    0.05    1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is Dα20β20. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L̇y(t) = φ(constant + Db·ln t)·by/t. All coefficients reported in Table 3 are increments of probability derivatives, φ(·)·by, compared to the baseline case of α20β20.
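The slope expression is just the chain rule applied to the probit form; for a scalar index a + b·ln t,

```latex
L(t) = \Phi(a + b \ln t)
\quad\Longrightarrow\quad
\dot{L}(t) = \phi(a + b \ln t)\,\frac{b}{t},
```

where Φ and φ denote the standard normal distribution and density functions, so the slope for treatment y scales with its estimated coefficient by and decays like 1/t.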

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, we report the coefficient, standard error (in parentheses), and significance level. The second Wald test examines whether the coefficient of Dα10β00·ln(Round) equals that of Dα20β00·ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start at and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4 we report the results of probit regressions comparing α20β18 with the two β = 20 supermodular treatments. We use Round, ln(Round), and their interactions with the α20β18 dummy as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the β = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and the β = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on Dα20β18·ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available upon request.

TABLE 4. CONVERGENCE SPEED

(Probit models with individual-level clustering, comparing α20β18 with the β = 20 supermodular treatments achieving similar convergence levels)

Dependent variable: ε-Nash equilibrium play

                       (1) ε-Nash price 1   (2) ε-Nash price 2
ln(Round)              0.054 (0.040)        0.216 (0.043)
Round                  0.004 (0.002)        0.001 (0.002)
Dα20β18·ln(Round)      0.013 (0.032)       −0.066 (0.035)
Dα20β18·Round          0.001 (0.002)        0.006 (0.003)
Observations           4680                 4680
Log pseudo-likelihood  −2879.07            −2993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; the excluded dummies are Dα20β20 and Dα20β40.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on Dα20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters, α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r·x(t) − c·x(t)² − e·x(t)] / [r·x* − c·x*² − e·x*],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 < t2 ≤ 60, is then defined as

e(t1, t2) = [Σ_{t=t1}^{t2} e(t)] / (t2 − t1 + 1).
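Both measures transcribe directly into code. The sketch below assumes the quadratic-cost reading of the surplus, r·x − c·x² − e·x, so that the efficient quantity x* = (r − e)/(2c) follows from the first-order condition; the function names and parameter values are ours, not the experiment's.

```python
def per_round_efficiency(x_t, r, c, e):
    """e(t): realized surplus r*x - c*x**2 - e*x relative to the surplus at
    the efficient quantity x* = (r - e) / (2*c)  (quadratic-cost assumption)."""
    surplus = lambda x: r * x - c * x**2 - e * x
    x_star = (r - e) / (2.0 * c)
    return surplus(x_t) / surplus(x_star)

def block_efficiency(x_path, t1, t2, r, c, e):
    """e(t1, t2): average per-round efficiency over rounds t1..t2, inclusive
    (rounds are 1-indexed, as in the text)."""
    block = x_path[t1 - 1 : t2]
    return sum(per_round_efficiency(x, r, c, e) for x in block) / (t2 - t1 + 1)
```

Playing the efficient quantity every round gives a block efficiency of exactly 1, which is the benchmark against which the last-20-rounds numbers in Table 5 are read.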

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5. EFFICIENCY MEASURE

Efficiency, last 20 rounds

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1               p-value
α10β00     0.600      0.671      0.804      0.858                 0.733    α10β00 < α20β00  0.087*
α20β00     0.650      0.870      0.907      0.873      0.825      0.825    α20β00 < α20β20  0.095*
α20β18     0.963      0.952      0.889      0.959                 0.941    α20β00 < α20β18  0.016**
α10β20     0.958      0.919      0.905      0.881      0.984      0.929    α10β20 < α20β20  0.714
α20β20     0.888      0.937      0.939      0.780      0.971      0.903    α20β18 < α20β20  0.833
α20β40     0.973      0.962      0.943      0.949                 0.957    α20β20 < α20β40  0.064*

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)·(p1(t) − p2(t))² + p2(t)·x(t) − p1(t)·x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
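The surplus formula is a one-liner; note that it vanishes whenever the announced prices agree, which is exactly the budget-balance-in-equilibrium property just described. The function name is ours.

```python
def budget_surplus(p1, p2, x, alpha, beta):
    """s(t) = (alpha + beta)*(p1 - p2)**2 + p2*x - p1*x: penalty terms plus
    tax minus subsidy; zero whenever p1 == p2 (budget balance in equilibrium)."""
    return (alpha + beta) * (p1 - p2) ** 2 + (p2 - p1) * x
```

Off the equilibrium path, the quadratic penalty term grows with (α + β), which is why a larger punishment parameter can raise the surplus before play has converged.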

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but it has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. We look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998); functional EWA (fEWA) (Teck-Hua Ho et al., 2001); and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the original papers for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̂_{−i}(t + 1), according to


(6) p̂_{−i}(t + 1) = [p_{−i}(t) + Σ_{τ=1}^{t−1} r^τ·p_{−i}(t − τ)] / [1 + Σ_{τ=1}^{t−1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̂_{−i}(t + 1) = p_{−i}(t), while r = 1 yields the standard fictitious play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by

(7) Pr(p_i | p̂_{−i}) = exp(λ·π_i(p_i, p̂_{−i})) / Σ_{j=0}^{40} exp(λ·π_i(p_j, p̂_{−i})).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter: as λ increases, the probability of a best response to p̂_{−i} increases.
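Equations (6) and (7) combine into a single choice rule. The sketch below is ours, not the authors' implementation, and uses a placeholder payoff function in place of the experimental payoffs.

```python
import numpy as np

def sfp_probabilities(opponent_history, payoff, r, lam, prices=range(41)):
    """sFP choice rule: form the discounted-average anticipated opponent
    price (eq. (6)), then pick own prices with logit probabilities (eq. (7))."""
    weights = np.array([r**k for k in range(len(opponent_history))])
    most_recent_first = list(opponent_history)[::-1]
    anticipated = np.dot(weights, most_recent_first) / weights.sum()
    utilities = np.array([payoff(p, anticipated) for p in prices])
    exp_u = np.exp(lam * (utilities - utilities.max()))  # stabilized logit
    return exp_u / exp_u.sum()
```

With r = 1 the anticipated price is the plain average of the opponent's history (fictitious play), and with a large λ almost all probability mass sits on the best response to it.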

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with the actual payoffs π_i(p_i, p̂_{−i}) replaced by strategy attractions. In both variants, the strategy attractions A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ(t)·(1 − κ)·N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)   A_i^j(t) = \frac{\phi N(t-1) A_i^j(t-1) + \big[\delta + (1-\delta) I(p_i^j, p_i(t))\big]\,\pi_i(p_i^j, p_{-i}(t))}{N(t)},

where the indicator function $I(p_i^j, p_i(t))$ equals one if $p_i^j = p_i(t)$ and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function $\phi_i(t)$ is given by

(9)   \phi_i(t) = 1 - \frac{1}{2} \sum_{j=1}^{m_i} \left[ I(p_i^j, p_i(t)) - \frac{\sum_{\tau=1}^{t} I(p_i^j, p_i(\tau))}{t} \right]^2.

This function will be close to one when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0, κ = 1).
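A minimal sketch of the fEWA updating rules in equations (8) and (9). In fEWA proper, φ, δ, and κ are themselves computed from history each period; here they are passed in directly, and all numeric values in the usage lines are illustrative.

```python
def update_fewa(attractions, N, phi, kappa, delta, played_idx, payoffs):
    """One round of fEWA updating, eq. (8). payoffs[j] is the payoff strategy j
    would have earned against the opponent's realized play; played_idx is the
    strategy actually used. Returns updated attractions and experience weight."""
    N_new = phi * (1 - kappa) * N + 1           # experience-weight update
    new_attr = []
    for j, A in enumerate(attractions):
        I = 1.0 if j == played_idx else 0.0     # indicator I(p^j, p(t))
        A_new = (phi * N * A + (delta + (1 - delta) * I) * payoffs[j]) / N_new
        new_attr.append(A_new)
    return new_attr, N_new

def change_detector(strategy_counts, t, played_idx, n_strategies):
    """Change detector phi_i(t), eq. (9). strategy_counts[j] counts plays of
    strategy j in rounds 1..t (including the current round)."""
    total = 0.0
    for j in range(n_strategies):
        I = 1.0 if j == played_idx else 0.0
        total += (I - strategy_counts[j] / t) ** 2
    return 1 - 0.5 * total

# Illustrative use (parameter values hypothetical):
attractions, N = update_fewa([0.0, 0.0], N=1.0, phi=0.9, kappa=0.3,
                             delta=0.8, played_idx=0, payoffs=[1.0, 0.5])
phi_now = change_detector([3, 0], t=3, played_idx=0, n_strategies=2)
```

When the same strategy has been played every round, the change detector returns exactly one, matching the text's observation that φ is close to one when recent history resembles previous history.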

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agent use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function $f_{jk}(h, t)$ to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_{jk}(h, t) = \begin{cases} 1 - \dfrac{|j - k|}{h} & \text{if } |j - k| < h, \\ 0 & \text{otherwise.} \end{cases}

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, $f_{jk}(1, t)$ degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs


and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)   u_j(t+1) = \big[1 - r f_{jk}(h, t)\big] u_j(t) + r f_{jk}(h, t)\,\pi_k(t), \quad \forall j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, $\tilde{u}_j(t) = u_j(t) + z_j(t)$. At time t, he chooses strategy p_k if

(11)   \tilde{u}_k(t) - \tilde{u}_j(t) \geq 0, \quad \forall p_j \neq p_k.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
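The three PA components, the Bartlett similarity function, the assessment update in equation (10), and the shock-distorted choice in equation (11), can be sketched as follows. A uniform mood shock is used, consistent with the calibration below; all numeric values are illustrative.

```python
import random

def bartlett(j, k, h):
    """Similarity between unplayed strategy j and played strategy k:
    1 - |j - k|/h when |j - k| < h, and 0 otherwise."""
    return max(0.0, 1.0 - abs(j - k) / h)

def update_assessments(u, k, payoff_k, r, h):
    """Eq. (10): spill the realized payoff of played strategy k over to
    similar strategies, with weight r * f_jk."""
    return [(1 - r * bartlett(j, k, h)) * uj + r * bartlett(j, k, h) * payoff_k
            for j, uj in enumerate(u)]

def choose(u, shock_halfwidth, rng=random):
    """Eq. (11): pick the strategy with the highest shock-distorted
    assessment; shocks are uniform on [-a, a] and do not feed back
    into the stored assessments."""
    distorted = [uj + rng.uniform(-shock_halfwidth, shock_halfwidth) for uj in u]
    return max(range(len(u)), key=lambda j: distorted[j])
```

With h = 1 the Bartlett weight is one for the played strategy and zero elsewhere, so only the played strategy's assessment moves, matching the indicator-function special case in the text.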

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,¹⁴ because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements:
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.¹⁵ Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value = 0.000) and normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

¹⁴ Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

¹⁵ For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p₂.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution¹⁶ on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.¹⁷ Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β00 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

¹⁶ Chen and Yuri Khoroshilov (2003) compare three versions of PA models, where shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA were statistically indistinguishable.

¹⁷ We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r):
                  sFP                        PA
Block        1    2    3    4          1    2    3    4
α10β00     1.0  0.7  0.9  1.0        0.1  0.0  0.9  1.0
α20β00     0.7  0.6  0.1  0.1        0.5  1.0  0.2  0.1
α20β18     1.0  1.0  0.9  0.9        0.2  1.0  0.3  0.2
α10β20     1.0  1.0  1.0  0.9        0.1  0.3  0.6  0.3
α20β20     1.0  1.0  1.0  0.9        0.2  0.1  0.5  0.2
α20β40     1.0  1.0  1.0  0.7        0.1  1.0  0.4  0.3

Sensitivity (λ) for sFP and fEWA; shock interval (a) for PA:
                  sFP                        fEWA                    PA: a
Block        1    2     3     4        1    2    3    4        1    2    3    4
α10β00     1.5  4.4   4.5   2.8      1.6  4.5  4.4  3.7      500  500  250   50
α20β00     1.5  4.3  14.0  18.0      1.7  4.0  5.1  7.4       50  100  150   50
α20β18     1.6  8.0  11.5  18.3      1.5  5.3  6.2  9.5       50  300  500  150
α10β20     1.9  5.9   8.6  13.8      1.8  4.7  6.2  8.2      100  500  450  300
α20β20     2.8  5.6   8.1  11.2      2.4  3.8  6.1  6.7      200  300  350  350
α20β40     3.4  6.5  14.0  20.5      3.2  4.9  6.7  9.9      150   50  150   50

Similarity window (h) for PA:
Block        1    2    3    4
α10β00       1    1   10    9
α20β00      10   10    9    6
α20β18       5    4    6    1
α10β20       1    1    8    3
α20β20       2    2    7    1
α20β40       1    3    2    2


indicating that players keep track of all past play when forming beliefs about opponents' next move.¹⁸ Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model is the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time¹⁹ justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (p-value = 0.068) and strictly lower than those under sFP (p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

¹⁸ As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

¹⁹ We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, by superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.²⁰ Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.²¹ In presenting the

²⁰ In block k > 4, we use the probability of deviation Pr_k = max{Pr₄/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

²¹ Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play:
Session    α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1           0.955   0.934   0.940   0.930   0.888   0.940
2           0.960   0.946   0.890   0.917   0.973   0.878
3                   0.948           0.883   0.873
Overall     0.957   0.943   0.915   0.910   0.912   0.909

fEWA:
Session    α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1           0.956   0.934   0.942   0.928   0.887   0.948
2           0.960   0.946   0.890   0.922   0.978   0.886
3                   0.946           0.879   0.871
Overall     0.958   0.942   0.916   0.910   0.912   0.917

Payoff assessment:
Session    α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1           0.952   0.933   0.914   0.941   0.888   0.916
2           0.957   0.944   0.899   0.922   0.917   0.898
3                   0.947           0.894   0.903
Overall     0.955   0.941   0.907   0.919   0.902   0.907

Equilibrium play:
Session    α10β00  α20β00  α20β18  α10β20  α20β20  α20β40
1           1.867   1.853   1.833   1.833   1.489   1.792
2           1.889   1.919   1.606   1.839   1.953   1.669
3                   1.917           1.447   1.539
Overall     1.878   1.896   1.719   1.706   1.660   1.731

Random play:
Overall     0.976   0.976   0.976   0.976   0.976   0.976


results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p₁: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p₁: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p₂: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p₂: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β00 treatments. In addition, an increase in β significantly improves convergence, by a large margin (40 to 80 percent) for both players, when α = 20.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p₂, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.



In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α. An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2, plot the percentage of ε-equilibrium play against round, 0 to 500, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
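The "for good" convergence time reported in Table 9 can be computed as the first round after which the measure never again falls below the threshold; a minimal sketch:

```python
def first_reaches_for_good(series, threshold):
    """Round (1-indexed) from which `series` stays at or above `threshold`
    in every later round; None if the threshold is never achieved for good."""
    crossed = None
    for t, value in enumerate(series, start=1):
        if value >= threshold:
            if crossed is None:
                crossed = t        # candidate crossing round
        else:
            crossed = None         # dropped back below: reset the candidate
    return crossed
```

Returning None corresponds to the "—" entries in Table 9 for thresholds a treatment never achieves.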

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, the supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 among the β00 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percentage of players playing the ε-equilibrium strategy. Each entry indicates the round in which a treatment went over the threshold for good, where "—" indicates that the treatment never achieved the threshold.)

                   Player 1                       Player 2
Threshold    0.4   0.5   0.6   0.7       0.4   0.5   0.6   0.7   0.8   0.9

α20β18        54    83   126   316        38    52    62    87   127    —
α10β20       129   221    —     —         36    48    63    91   148    —
α20β20        61    91   149    —         39    51    61    84   137    —
α20β40       101   166   263   496        30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

              Player 1 equilibrium price             Player 2 equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value

α10β00         0.073                                     0.082
α20β00         0.188   α20β00 > α10β00      0.000        0.213   α20β00 > α10β00      0.000
α20β18         0.265   α20β18 > α20β40      0.187        0.381   α20β18 > α10β20      0.264
α10β20         0.211   α10β20 > α20β00      0.000        0.376   α10β20 > α20β20      0.001
α20β20         0.243   α20β20 > α10β20      0.000        0.353   α20β20 > α20β00      0.000
α20β40         0.259   α20β40 > α20β20      0.007        0.442   α20β40 > α20β18      0.000

              Player 1 ε-equilibrium price           Player 2 ε-equilibrium price
Treatment   Probability   H1                p-value   Probability   H1                p-value

α10β00         0.216                                     0.232
α20β00         0.519   α20β00 > α10β00      0.000        0.573   α20β00 > α10β00      0.000
α20β18         0.713   α20β18 > α20β40      0.045        0.870   α20β18 > α10β20      0.008
α10β20         0.584   α10β20 > α20β00      0.000        0.857   α10β20 > α20β20      0.000
α20β20         0.662   α20β20 > α10β20      0.000        0.834   α20β20 > α20β00      0.000
α20β40         0.701   α20β40 > α20β20      0.000        0.925   α20β40 > α20β18      0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by α. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel, simulation data: fraction of maximum welfare over rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare over rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.²² We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p₂. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as β increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.²³ In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − μ, and one of the other 40 prices each with probability μ/40. We estimate μ using the first two sessions of each treatment.²⁴ We then

compare the performance of the matching model with that of the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.²⁵ This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price, p̂₁, for each player 2. We also calculate the opportunity cost of matching p̂₁, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p₂ = p̂₁), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

²² We thank an anonymous referee and the co-editor for pointing this out.

²³ We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

²⁴ The estimated values of μ in each block are as follows. α10β00: 1.0, 0.7, 0.6, 0.8; α20β00: 0.7, 0.6, 0.5, 0.4; α20β18: 0.7, 0.2, 0.3, 0.3; α10β20: 0.5, 0.3, 0.3, 0.2; α20β20: 0.1, 0.0, 0.0, 0.0; α20β40: 0.2, 0.0, 0.0, 0.1.

²⁵ In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant; the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.²⁶

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

²⁶ We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1.)

Dependent variable: probability that player 2's price equals the anticipated player 1 price.

                            (1)          (2)

D10                        0.014        0.017
                          (0.043)      (0.041)
D00                       -0.077       -0.043
                          (0.035)      (0.036)
D18                        0.046        0.032
                          (0.053)      (0.056)
D40                        0.008        0.041
                          (0.061)      (0.042)
ln(Round)                  0.041        0.028
                          (0.011)      (0.010)
Match Cost                             -0.001
                                       (0.000)
ln(Round) × D10            0.014        0.012
                          (0.012)      (0.012)
ln(Round) × D00            0.003        0.000
                          (0.013)      (0.012)
ln(Round) × D18            0.006        0.002
                          (0.021)      (0.020)
ln(Round) × D40            0.001        0.015
                          (0.018)      (0.016)

Observations               9,588        9,588
Log pseudo-likelihood   -3,243.21    -3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level. D_y is the dummy variable for treatment y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, β, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have some effect on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m ≡ [β − 1/(4c)] / [β + e/(4c²)] and n ≡ [er/(4c²)] / [β + e/(4c²)]. The analogous differential equation system is

(12) ṗ1 = p2 − p1,

     ṗ2 = m·p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 − ∫_{p0}^{p1} (−p + mp + n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is a Lyapunov function of system (12).

Define G(p1) ≡ −∫_{p0}^{p1} (−p + mp + n) dp. We get

G(p1) = ½(1 − m)p1² − np1 − ½(1 − m)p0² + np0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2
   = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m·p1 − p2 + n)
   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ ≤ 0.

VOL. 94, NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1533

Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
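The argument can be checked numerically. The sketch below (ours, not from the paper) iterates the discrete best-reply map with illustrative values of c, e, r, and β and confirms that the price path approaches the rest point p* = er/(e + c):

```python
# Iterate the Cournot best-reply map p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n.
# The parameter values below are illustrative only, not the experimental ones.
def best_reply_path(c, e, r, beta, p1=0.0, p2=0.0, rounds=2000):
    m = (beta - 1.0 / (4 * c)) / (beta + e / (4 * c ** 2))
    n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2

c, e, r, beta = 1.0, 2.0, 3.0, 20.0
p_star = e * r / (e + c)   # rest point er/(e + c) derived in the proof
```

Because m < 1 (here m ≈ 0.963), the two-step composition of the map is a contraction and both announced prices converge to p*.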

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well-Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper No. 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



the threshold of strategic complementarity, β = 20, player 2's Nash equilibrium strategy is also a dominant strategy. The finding of no α-effect on player 2's equilibrium play when β = 20 is consistent with this observation.

To determine the effects of strategic complementarities and other factors on convergence speed, and to investigate further their effects on convergence level, we use probit models with clustering at the individual level. The results from these models are presented in Table 3. The dependent variable is ε-p1 in specifications (1) and (3), and ε-p2 in specifications (2) and (4).

In specifications (1) and (2), the independent variables are treatment dummies, Dy, where y ∈ {1000, 2000, 2018, 1020, 2040}; ln(Round); and a constant. We use dummies for different values of α and β, rather than the direct parameter values, to avoid assuming a linear relationship of parameter effects, and we omit the dummy for 2020. Therefore, in these two specifications, which restrict learning speed to be the same across treatments, the estimated coefficient of Dy captures the difference in convergence level between treatments y and 2020. Results from these specifications are

TABLE 2—LEVEL OF CONVERGENCE IN THE LAST 20 ROUNDS

Panel A. Proportion of player 1 Nash equilibrium prices (p1), with permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall     H1            p-value
1000        0.000       0.008       0.158       0.008                   0.044       1000 < 2000   0.397
2000        0.000       0.083       0.083       0.042       0.058       0.053       2000 < 2020   0.008
2018        0.125       0.117       0.100       0.192                   0.133       2000 < 2018   0.008
1020        0.242       0.067       0.067       0.067       0.208       0.130       1020 < 2020   0.064
2020        0.325       0.267       0.208       0.058       0.367       0.245       2018 < 2020   0.064
2040        0.175       0.400       0.158       0.133                   0.217       2020 < 2040   0.643

Panel B. Proportion of player 2 Nash equilibrium prices (p2), with permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall     H1            p-value
1000        0.025       0.042       0.025       0.067                   0.040       1000 < 2000   0.056
2000        0.067       0.083       0.033       0.075       0.050       0.062       2000 < 2020   0.032
2018        0.133       0.292       0.108       0.258                   0.198       2000 < 2018   0.008
1020        0.392       0.108       0.175       0.067       0.667       0.282       1020 < 2020   0.504
2020        0.308       0.117       0.483       0.017       0.450       0.275       2018 < 2020   0.238
2040        0.325       0.492       0.233       0.233                   0.321       2020 < 2040   0.365

Panel C. Proportion of player 1 ε-Nash equilibrium prices (ε-p1), with permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall     H1            p-value
1000        0.108       0.200       0.350       0.158                   0.204       1000 < 2000   0.183
2000        0.067       0.458       0.358       0.183       0.425       0.298       2000 < 2020   0.087
2018        0.317       0.625       0.300       0.583                   0.456       2000 < 2018   0.103
1020        0.475       0.200       0.300       0.375       0.492       0.368       1020 < 2020   0.194
2020        0.450       0.558       0.475       0.133       0.775       0.478       2018 < 2020   0.444
2040        0.458       0.492       0.375       0.442                   0.442       2020 < 2040   0.643

Panel D. Proportion of player 2 ε-Nash equilibrium prices (ε-p2), with permutation tests

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall     H1            p-value
1000        0.133       0.200       0.242       0.225                   0.200       1000 < 2000   0.016
2000        0.300       0.400       0.200       0.275       0.333       0.302       2000 < 2020   0.016
2018        0.717       0.667       0.342       0.700                   0.606       2000 < 2018   0.016
1020        0.650       0.517       0.542       0.400       0.892       0.600       1020 < 2020   0.564
2020        0.533       0.592       0.642       0.225       0.900       0.578       2018 < 2020   0.556
2040        0.825       0.792       0.467       0.517                   0.650       2020 < 2040   0.294

Note: * Significant at the 10-percent level; ** significant at the 5-percent level; *** significant at the 1-percent level.


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form, ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and slows over time. We examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction terms allow different slopes for different treatments. Compared with the coefficient of ln(Round), the coefficient for the interaction term Dy × ln(Round) captures the slope difference between treatment y and 2020. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of the probability of equilibrium play, L(t), between different treatments. From the probit specification, L(t) = Φ(constant + D·b·ln(t)), where Φ is the standard normal distribution function, D is the 1 × 5 vector of treatment dummies, and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests of the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT THE INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                          (1)           (2)           (3)           (4)
                          ε-Nash        ε-Nash        ε-Nash        ε-Nash
                          price 1       price 2       price 1       price 2

D1000                     0.143         0.268
                          (0.048)       (0.050)
D2000                     0.090         0.217
                          (0.060)       (0.057)
D18                       0.005         0.004
                          (0.071)       (0.069)
D1020                     0.080         0.025
                          (0.060)       (0.073)
D40                       0.000         0.078
                          (0.068)       (0.078)
ln(Round)                 0.115         0.167         0.131         0.183
                          (0.014)       (0.017)       (0.021)       (0.022)
D1000 × ln(Round)                                     0.050         0.095
                                                      (0.019)       (0.022)
D2000 × ln(Round)                                     0.031         0.073
                                                      (0.020)       (0.021)
D18 × ln(Round)                                       0.001         0.004
                                                      (0.020)       (0.021)
D1020 × ln(Round)                                     0.024         0.008
                                                      (0.019)       (0.022)
D40 × ln(Round)                                       0.002         0.023
                                                      (0.019)       (0.024)

Observations              9,720         9,720         9,720         9,720
Number of groups          162           162           162           162
Log pseudo-likelihood     −5,562.890    −5,824.055    −5,557.424    −5,793.013

Null hypotheses and Wald χ²(1) test statistics:
D1000 = D2000                                                        1.50      1.31
D1000 × ln(Round) = D2000 × ln(Round)                                1.33      1.23
D1020 − D1000 = D2020 − D2000                                        0.05      1.07
D1020 × ln(Round) − D1000 × ln(Round) = D2020 × ln(Round) − D2000 × ln(Round)   0.05      1.03

Notes: Coefficients are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at the 10-percent level; ** significant at the 5-percent level; *** significant at the 1-percent level. Dy is the dummy variable for treatment y. The excluded dummy is D2020. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L̇y(t) = φ(constant + D·b)·by/t, where φ is the standard normal density. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + D·b)·by, relative to the baseline case of 2020.
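This slope formula can be checked against a finite-difference derivative; the constant and slope coefficient below are illustrative values of ours, not the Table 3 estimates:

```python
import math
from statistics import NormalDist

# Check dL/dt = phi(const + b*ln t) * b / t for the probit probability
# L(t) = Phi(const + b*ln t); const, b, and t are illustrative only.
nd = NormalDist()
const, b, t = -1.0, 0.15, 30.0
L = lambda x: nd.cdf(const + b * math.log(x))
analytic = nd.pdf(const + b * math.log(t)) * b / t
numeric = (L(t + 1e-6) - L(t - 1e-6)) / 2e-6   # central difference
```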

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test examines whether the coefficient of D1000 × ln(Round) equals that of D2000 × ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between β = 18 and the supermodular treatments. In particular, we compare 2018 with 2020 and 2040.12 In Result 1, we show that these treatments achieve the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start at, and converge to, similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing 2018 with the two α = 20 supermodular treatments. We use Round, ln(Round), and their interactions with D2018 as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between 2018 and the α = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and the α = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D2018 × ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit 1020 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available upon request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing 2018 with the α = 20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                          (1) ε-Nash price 1    (2) ε-Nash price 2
ln(Round)                 0.054 (0.040)         0.216 (0.043)
Round                     0.004 (0.002)         0.001 (0.002)
D2018 × ln(Round)         0.013 (0.032)         −0.066 (0.035)
D2018 × Round             0.001 (0.002)         0.006 (0.003)

Observations              4,680                 4,680
Log pseudo-likelihood     −2,879.07             −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at the 10-percent level; ** significant at the 5-percent level; *** significant at the 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are D2020 and D2040.


2018 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D2018 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture the welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r·x(t) − c·x(t)² − e·x(t)²] / [r·x* − c·x*² − e·x*²],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 ≤ t1 < t2 ≤ 60, is then defined as

e(t1, t2) = [Σ_{t=t1}^{t2} e(t)] / (t2 − t1 + 1).
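A minimal sketch of the two measures, assuming the quadratic surplus r·x − c·x² − e·x² consistent with the efficient quantity and equilibrium price in the Appendix (this functional form is our assumption, since the payoff function itself appears elsewhere in the paper):

```python
# Per-round efficiency e(t) and block efficiency e(t1, t2).
# Assumed surplus: revenue r*x, cost c*x**2, externality e*x**2.
def surplus(x, r, c, e):
    return r * x - c * x ** 2 - e * x ** 2

def per_round_efficiency(x_t, r, c, e):
    x_star = r / (2 * (c + e))        # quantity maximizing the assumed surplus
    return surplus(x_t, r, c, e) / surplus(x_star, r, c, e)

def block_efficiency(xs, r, c, e):
    # simple average of per-round efficiencies over a block of rounds
    return sum(per_round_efficiency(x, r, c, e) for x in xs) / len(xs)
```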

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency, last 20 rounds

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1            p-value
1000        0.600       0.671       0.804       0.858                   0.733     1000 < 2000   0.087
2000        0.650       0.870       0.907       0.873       0.825       0.825     2000 < 2020   0.095
2018        0.963       0.952       0.889       0.959                   0.941     2000 < 2018   0.016
1020        0.958       0.919       0.905       0.881       0.984       0.929     1020 < 2020   0.714
2020        0.888       0.937       0.939       0.780       0.971       0.903     2018 < 2020   0.833
2040        0.973       0.962       0.943       0.949                   0.957     2020 < 2040   0.064

Note: * Significant at the 10-percent level; ** significant at the 5-percent level; *** significant at the 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus the tax minus the subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under 2018 and 2020 are significantly lower than those under 2040 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter: before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
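A one-line sketch of this surplus measure (using the formula as reconstructed above; the function and parameter names are ours). Note that the surplus vanishes whenever the announced prices agree, which is the budget-balance-in-equilibrium property just described:

```python
# Per-round budget surplus s(t) = (alpha + beta)*(p1 - p2)**2 + p2*x - p1*x.
def budget_surplus(p1, p2, x, alpha, beta):
    return (alpha + beta) * (p1 - p2) ** 2 + (p2 - p1) * x
```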

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking a higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Instead, we look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̂−i(t + 1), according to

(6) p̂−i(t + 1) = [p−i(t) + Σ_{τ=1}^{t−1} r^τ p−i(t − τ)] / [1 + Σ_{τ=1}^{t−1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̂−i(t + 1) = p−i(t), while r = 1 yields the standard fictitious play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pi is given by

(7) Pr(pi | p̂−i) = exp(λ·π(pi, p̂−i)) / Σ_{j=0}^{40} exp(λ·π(pj, p̂−i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̂−i increases.
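A compact sketch of equations (6) and (7), with a generic payoff function standing in for the mechanism payoffs (function and argument names are ours):

```python
import math

# Equation (6): geometrically discounted average of the opponent's past prices.
def predict_price(history, r):
    # history: opponent's prices p(1), ..., p(t), most recent LAST;
    # weight r**tau is applied to the observation tau periods old.
    weights = [r ** tau for tau in range(len(history))]
    num = sum(w * p for w, p in zip(weights, reversed(history)))
    return num / sum(weights)

# Equation (7): logit choice over the 41 price announcements 0, ..., 40.
def choice_probs(predicted, payoff, lam, prices=range(41)):
    scores = [math.exp(lam * payoff(p, predicted)) for p in prices]
    total = sum(scores)
    return [s / total for s in scores]
```

With r = 0 the prediction is simply the last observed price (Cournot), and with r = 1 it is the unweighted historical average (standard fictitious play), matching the special cases noted above.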

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with the actual payoffs π(pi, p̂−i) replaced by strategy attractions. In both variants, a strategy attraction, A_j^i(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) A_j^i(t) = {φN(t − 1)A_j^i(t − 1) + [δ + (1 − δ)I(p_j^i, p_i(t))]·πi(p_j^i, p−i(t))} / N(t),

where the indicator function I(p_j^i, p_i(t)) equals one if p_j^i = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function, φi(t), is given by

(9) φi(t) = 1 − 0.5 Σ_{j=1}^{mi} [I(p_j^i, p_i(t)) − (1/t) Σ_{τ=1}^{t} I(p_j^i, p_i(τ))]².

This function will be close to 1 when recent history resembles previous history. The imagination weight, δi(t), equals φi(t), while κ equals the Gini coefficient of previous choice frequencies. EWA encompasses a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, κ = 1).
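A sketch of the attraction update (8). Here φ, δ, and κ are passed in as fixed numbers for illustration; in fEWA proper, φ and δ come from the change-detector (9) and κ from the Gini coefficient of past choices:

```python
# One period of the EWA attraction update, equation (8).
def ewa_update(attractions, N_prev, played, opp_price, payoff, phi, delta, kappa):
    N = phi * (1 - kappa) * N_prev + 1              # experience weight N(t)
    updated = []
    for j, A in enumerate(attractions):
        hit = 1.0 if j == played else 0.0           # indicator I(p_j, p_i(t))
        weight = delta + (1 - delta) * hit          # imagination weight on foregone payoffs
        updated.append((phi * N_prev * A + weight * payoff(j, opp_price)) / N)
    return updated, N
```

With δ = 1 every strategy's attraction is reinforced by its (hypothetical) payoff against the observed opponent price; with δ = 0 only the played strategy is reinforced, as in reinforcement learning.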

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let πj(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, f_jk(h, t), to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_jk(h, t) = 1 − |j − k|/h, if |j − k| < h; 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_jk(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs and does not explicitly take into account the likelihood of alternate states. Let uj(t) denote the subjective assessment of strategy pj at time t, and r the discount factor. Payoff assessments are updated through a weighted average of the player's previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10) uj(t + 1) = [1 − r·f_jk(h, t)]·uj(t) + r·f_jk(h, t)·πk(t), ∀j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, zj(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũj(t) = uj(t) + zj(t). At time t, he chooses strategy pk if

(11) ũk(t) − ũj(t) ≥ 0, ∀ pj ≠ pk.

Note that mood shocks affect only his choices and not the manner in which assessments are updated.
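A sketch of the PA update (10) and choice rule (11) with the Bartlett similarity window; function and parameter names are ours:

```python
import random

# Bartlett similarity between unplayed strategy j and played strategy k.
def bartlett(j, k, h):
    return max(0.0, 1.0 - abs(j - k) / h)

# Equation (10): blend each assessment toward the realized payoff, with
# weight proportional to the strategy's similarity to the played one.
def pa_update(u, k_played, realized_payoff, r, h):
    return [(1 - r * bartlett(j, k_played, h)) * uj
            + r * bartlett(j, k_played, h) * realized_payoff
            for j, uj in enumerate(u)]

# Equation (11): add uniform mood shocks on [-a, a], pick the largest
# distorted assessment; shocks affect the choice only, not the update.
def pa_choose(u, a, rng=random):
    distorted = [uj + rng.uniform(-a, a) for uj in u]
    return max(range(len(u)), key=distorted.__getitem__)
```

With h = 1 only the played strategy is updated; with h = 2 the two neighboring strategies receive half-weight updates, and so on.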

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matchedinto pairs at the beginning of each round

(ii) Simulated players select price announce-ments(a) Initial round Almost half of all play-

ers selected a first-round price of 20(44 percent of player 1s and 43 percentof player 2s) There are also consider-able spikes in the frequency of select-ing other prices divisible by 515 Basedon Kolmogorov-Smirnov tests on theactual round-one price distribution wereject the null hypotheses of uniformdistribution (d 0250 p-value 0000) and normal distribution (d 0381 p-value 0000) We thus fol-low the convention (eg Ho et al2001) and use the actual first-roundempirical distribution of choices togenerate the first-round choices

(b) Subsequent rounds Simulated playerstrategies are determined via equation(7) for sFP and fEWA and equation(11) for PA

14 Ido Erev and Haruvy (2000) discuss the tradeoffs ofthe two approaches

15 For example the seven most frequently selected pricesare divisible by 5

1522 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(iii) Simulated player 1's quantity choice is based on the following steps:

(a) Determine each player 1's best response to p2.

(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.

(c) If yes, deviate via the actual mean and standard deviation for the block; otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.
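The round structure above can be sketched as a loop, with the learning-model pieces (equations (7)-(11), not reproduced here) abstracted behind callbacks. All names are illustrative, not the authors' code:

```python
import random

def play_round(players, choose_price, best_response, payoff, update):
    """One simulated round, following steps (i)-(vi).

    choose_price, best_response, payoff, and update stand in for the
    model-specific rules (price announcement, production-stage choice,
    mechanism payoffs, and assessment updating).
    """
    random.shuffle(players)                            # (i) random matching
    for p1, p2 in zip(players[0::2], players[1::2]):
        a1, a2 = choose_price(p1), choose_price(p2)    # (ii) price announcements
        q = best_response(p1, a2)                      # (iii) player 1's quantity
        u1, u2 = payoff(a1, a2, q)                     # (iv) mechanism payoffs
        update(p1, u1)                                 # (v) assessment updates
        update(p2, u2)
    # (vi) proceed to the next round

# Tiny demo with stub rules: four players form two pairs per round.
players = list(range(4))
log = []
play_round(players,
           choose_price=lambda p: 20,
           best_response=lambda p, a: a,
           payoff=lambda a1, a2, q: (a1 + q, a2),
           update=lambda p, u: log.append(u))
print(len(log))  # 4 payoff updates: two pairs, two players each
```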

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The sensitivity parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ {1, ... , 10}, is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
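The exhaustive grid search described here can be sketched as follows, assuming an msd_score function that simulates games at the given parameters and returns their MSD on the calibration sessions (the λ grid is omitted for brevity, and the toy objective below stands in for a real simulation; all names are ours):

```python
import itertools

def frange(lo, hi, step):
    """Inclusive numeric grid from lo to hi with the given step."""
    n = int(round((hi - lo) / step))
    return [lo + i * step for i in range(n + 1)]

def calibrate_pa(msd_score):
    """Pick the PA parameters minimizing the MSD score over the grid."""
    grid = itertools.product(
        frange(0.0, 1.0, 0.1),   # discount factor r
        frange(1, 10, 1),        # similarity window h
        frange(0, 500, 50),      # mood-shock bound a
    )
    return min(grid, key=msd_score)

# Toy objective with a known minimum at r=0.5, h=3, a=100:
best = calibrate_pa(lambda p: (p[0] - 0.5) ** 2 + (p[1] - 3) ** 2 + (p[2] - 100) ** 2)
print(best)  # (0.5, 3, 100)
```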

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β00 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, where shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performances of the three versions of PA were statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r)           sFP                         PA
Block                 1     2     3     4        1     2     3     4
α10β00              1.0   0.7   0.9   1.0      0.1   0.0   0.9   1.0
α20β00              0.7   0.6   0.1   0.1      0.5   1.0   0.2   0.1
α20β18              1.0   1.0   0.9   0.9      0.2   1.0   0.3   0.2
α10β20              1.0   1.0   1.0   0.9      0.1   0.3   0.6   0.3
α20β20              1.0   1.0   1.0   0.9      0.2   0.1   0.5   0.2
α20β40              1.0   1.0   1.0   0.7      0.1   1.0   0.4   0.3

Sensitivity (λ)             sFP                        fEWA                Shock interval (a), PA
Block                 1     2     3     4        1     2     3     4        1     2     3     4
α10β00              1.5   4.4   4.5   2.8      1.6   4.5   4.4   3.7      500   500   250    50
α20β00              1.5   4.3  14.0  18.0      1.7   4.0   5.1   7.4       50   100   150    50
α20β18              1.6   8.0  11.5  18.3      1.5   5.3   6.2   9.5       50   300   500   150
α10β20              1.9   5.9   8.6  13.8      1.8   4.7   6.2   8.2      100   500   450   300
α20β20              2.8   5.6   8.1  11.2      2.4   3.8   6.1   6.7      200   300   350   350
α20β40              3.4   6.5  14.0  20.5      3.2   4.9   6.7   9.9      150    50   150    50

Similarity window (h), PA
Block                 1     2     3     4
α10β00                1     1    10     9
α20β00               10    10     9     6
α20β18                5     4     6     1
α10β20                1     1     8     3
α20β20                2     2     7     1
α20β40                1     3     2     2

VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1523

indicating that players keep track of all past play when forming beliefs about opponents' next moves.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model, a, represents the upper bound of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).
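A one-sided paired permutation test of the kind referred to here can be sketched as follows (an illustration, not the authors' code; with n paired sessions it enumerates all 2^n sign assignments of the paired differences):

```python
import itertools

def paired_permutation_test(scores_a, scores_b):
    """Exact p-value for H1: scores_a < scores_b (lower MSD = better fit)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = sum(diffs)
    count = total = 0
    for signs in itertools.product([1, -1], repeat=len(diffs)):
        # Each sign assignment is an equally likely relabeling under H0.
        stat = sum(s * d for s, d in zip(signs, diffs))
        count += stat <= observed
        total += 1
    return count / total

# Model A fits uniformly better than model B in every session:
p = paired_permutation_test([0.91, 0.90, 0.92], [0.97, 0.98, 0.96])
print(p)  # 0.125, the smallest p-value possible with 3 pairs
```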

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-ranks tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10^-8}. We set a lower bound of 10^-8 to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session           α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1                  0.955    0.934    0.940    0.930    0.888    0.940
2                  0.960    0.946    0.890    0.917    0.973    0.878
3                    —      0.948      —      0.883    0.873      —
Overall            0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1                  0.956    0.934    0.942    0.928    0.887    0.948
2                  0.960    0.946    0.890    0.922    0.978    0.886
3                    —      0.946      —      0.879    0.871      —
Overall            0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1                  0.952    0.933    0.914    0.941    0.888    0.916
2                  0.957    0.944    0.899    0.922    0.917    0.898
3                    —      0.947      —      0.894    0.903      —
Overall            0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1                  1.867    1.853    1.833    1.833    1.489    1.792
2                  1.889    1.919    1.606    1.839    1.953    1.669
3                    —      1.917      —      1.447    1.539      —
Overall            1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall            0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ: in particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β00 treatments. In addition, an increase in β significantly improves convergence by a large margin (40 to 80 percent) for both players when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.



In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the β-effect interacts with the α-effect): an increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2: percent of ε-equilibrium play, 0 to 100, against simulation rounds 0 to 500, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the speed of convergence in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels of L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
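Table 9's notion of speed, the first round after which a treatment stays above the threshold L "for good," can be computed with a small helper (illustrative, not the authors' code):

```python
def first_round_above_for_good(proportions, threshold):
    """First round t such that the series stays >= threshold from t onward.

    Returns None if the threshold is never held for good.
    """
    round_hit = None
    for t, share in enumerate(proportions, start=1):
        if share >= threshold:
            if round_hit is None:
                round_hit = t
        else:
            round_hit = None   # dipped back below: reset
    return round_hit

series = [0.3, 0.5, 0.4, 0.6, 0.7, 0.8]
print(first_round_above_for_good(series, 0.5))  # 4: round 2 hits 0.5 but dips at round 3
```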

Apart from convergence to equilibrium, it is also important to examine long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, the supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 among the β00 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy; each entry indicates the round in which a treatment went over the threshold for good, where "—" indicates that the treatment never achieved the threshold.)

                         Player 1                      Player 2
Threshold         0.4   0.5   0.6   0.7      0.4   0.5   0.6   0.7   0.8   0.9
α20β18             54    83   126   316       38    52    62    87   127    —
α10β20            129   221    —     —        36    48    63    91   148    —
α20β20             61    91   149    —        39    51    61    84   137    —
α20β40            101   166   263   496       30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games)

             Player 1 equilibrium price                 Player 2 equilibrium price
Treatment   Probability   H1               p-value    Probability   H1               p-value
α10β00         0.073                                     0.082
α20β00         0.188    α20β00 > α10β00     0.000        0.213    α20β00 > α10β00     0.000
α20β18         0.265    α20β18 > α20β40     0.187        0.381    α20β18 > α10β20     0.264
α10β20         0.211    α10β20 > α20β00     0.000        0.376    α10β20 > α20β20     0.001
α20β20         0.243    α20β20 > α10β20     0.000        0.353    α20β20 > α20β00     0.000
α20β40         0.259    α20β40 > α20β20     0.007        0.442    α20β40 > α20β18     0.000

             Player 1 ε-equilibrium price               Player 2 ε-equilibrium price
Treatment   Probability   H1               p-value    Probability   H1               p-value
α10β00         0.216                                     0.232
α20β00         0.519    α20β00 > α10β00     0.000        0.573    α20β00 > α10β00     0.000
α20β18         0.713    α20β18 > α20β40     0.045        0.870    α20β18 > α10β20     0.008
α10β20         0.584    α10β20 > α20β00     0.000        0.857    α10β20 > α20β20     0.000
α20β20         0.662    α20β20 > α10β20     0.000        0.834    α20β20 > α20β00     0.000
α20β40         0.701    α20β40 > α20β20     0.000        0.925    α20β40 > α20β18     0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel, simulation data: fraction of maximum welfare over rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare over rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and each of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then compare the performance of the matching model with that of the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006): best response to a predicted price explains behavior significantly better than matching.
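The stochastic matching model admits a direct sketch, assuming a 41-point price grid (0 through 40, consistent with "the other 40 prices"; the exact grid endpoints and the function names are our assumptions):

```python
import random

def matching_choice_probs(anticipated, eps, prices=tuple(range(41))):
    """Choice probabilities for player 2 under the stochastic matching model.

    The anticipated player-1 price is played with probability 1 - eps;
    each of the remaining prices gets probability eps / (n - 1).
    """
    others = len(prices) - 1
    return {p: (1 - eps) if p == anticipated else eps / others for p in prices}

def matching_draw(anticipated, eps, prices=tuple(range(41))):
    """Draw one price from the matching-model distribution."""
    probs = matching_choice_probs(anticipated, eps, prices)
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

probs = matching_choice_probs(anticipated=16, eps=0.2)
print(probs[16], probs[0])  # 0.8 0.005
```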

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V, subsection B: we assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price for each player 2. We also calculate the opportunity cost of matching this price, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε in each block are as follows: α10β00: 1.0, 0.7, 0.6, 0.8; α20β00: 0.7, 0.6, 0.5, 0.4; α20β18: 0.7, 0.2, 0.3, 0.3; α10β20: 0.5, 0.3, 0.3, 0.2; α20β20: 0.1, 0.0, 0.0, 0.0; α20β40: 0.2, 0.0, 0.0, 0.1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games with the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: Probability player 2 price equals anticipated player 1 price

Model                    (1)         (2)
D10                    0.014       0.017
                      (0.043)     (0.041)
D00                   −0.077      −0.043
                      (0.035)     (0.036)
D18                    0.046       0.032
                      (0.053)     (0.056)
D40                    0.008       0.041
                      (0.061)     (0.042)
ln(Round)              0.041       0.028
                      (0.011)     (0.010)
Match Cost                        −0.001
                                  (0.000)
ln(Round) × D10        0.014       0.012
                      (0.012)     (0.012)
ln(Round) × D00        0.003       0.000
                      (0.013)     (0.012)
ln(Round) × D18        0.006       0.002
                      (0.021)     (0.020)
ln(Round) × D40        0.001       0.015
                      (0.018)     (0.016)

Observations           9,588       9,588
Log pseudo-likelihood −3,243.21   −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. Significant at the 10-percent, 5-percent, and 1-percent levels. Dy is the dummy variable for treatment parameter y; the excluded dummy is that for treatment α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors unrelated to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential-equation system is

(12)  dp1/dt = p2 − p1,

      dp2/dt = m p1 − p2 + n.

We now use Lyapunov's second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is a Lyapunov function for system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − n p1 − [(1/2)(1 − m)p0² − n p0].

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the first-order condition

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum of G. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that dV/dt ≤ 0:

dV/dt = (∂V/∂p1)(dp1/dt) + (∂V/∂p2)(dp2/dt)

      = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)

      = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), dV/dt < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
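The discrete best-reply system can also be checked numerically. The sketch below uses illustrative parameter values (not the experimental ones) and confirms that iterated play approaches p* = er/(e + c); the iteration converges here because |m| < 1:

```python
def simulate(e, c, r, beta, p1=0.0, p2=40.0, rounds=2000):
    """Iterate p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n from given start prices."""
    m = (beta - 1 / (4 * c)) / (beta + e / (4 * c ** 2))
    n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2

# Illustrative parameters: m = 0.25, n = 5, so the fixed point is 5/0.75.
e, c, r, beta = 2.0, 1.0, 10.0, 0.5
p1, p2 = simulate(e, c, r, beta)
p_star = e * r / (e + c)
print(round(p1, 6), round(p2, 6), round(p_star, 6))  # all three equal 6.666667
```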

REFERENCES

Amir Rabah ldquoCournot Oligopoly and the The-ory of Supermodular Gamesrdquo Games andEconomic Behavior 1996 15(2) pp 132ndash48

Andreoni James and Varian Hal R ldquoPre-PlayContracting in the Prisonersrsquo Dilemmardquo Pro-ceedings of the National Academy of Science1999 96(19) pp 10933ndash38

Arifovic Jasmina and Ledyard John O ldquoCom-puter Testbeds and Mechanism Design Ap-plication to the Class of Groves-LedyardMechanisms for Provision of Public GoodsrdquoUnpublished Paper 2001

Boylan Richard T and El-Gamal Mahmoud AldquoFictitious Play A Statistical Study of Mul-tiple Economic Experimentsrdquo Games andEconomic Behavior 1993 5(2) pp 205ndash22

Camerer Colin F Behavioral game theory Ex-periments in strategic interaction (Round-table Series in Behavioral Economics)Princeton Princeton University Press 2003

Camerer Colin F and Ho Teck-Hua ldquoExperience-Weighted Attraction Learning in NormalForm Gamesrdquo Econometrica 1999 67(4)pp 827ndash74

Chen Yan ldquoA Family of Supermodular NashMechanisms Implementing Lindahl Alloca-tionsrdquo Economic Theory 2002 19(4) pp773ndash90

Chen Yan ldquoIncentive-Compatible Mechanismsfor Pure Public Goods A Survey of Experi-mental Researchrdquo in C R Plott and V LSmith eds The handbook of experimentaleconomics results Amsterdam Elsevierforthcoming

Chen Yan ldquoDynamic Stability of Nash-Efficient Public Goods Mechanisms Recon-ciling Theory and Experimentsrdquo in RamiZwick and Amnon Rapoport eds Experi-mental business research Vol 2 Norwelland Dordrecht Kluwer Academic Publishersin press

Chen Yan and Plott Charles R ldquoThe Groves-Ledyard Mechanism An Experimental Studyof Institutional Designrdquo Journal of PublicEconomics 1996 59(3) pp 335ndash64

Chen Yan and Tang Fang-Fang ldquoLearning andIncentive-Compatible Mechanisms for PublicGoods Provision An Experimental StudyrdquoJournal of Political Economy 1998 106(3)pp 633ndash62

Chen Yan and Khoroshilov Yuri ldquoLearning un-der Limited Informationrdquo Games and Eco-nomic Behavior 2003 44(1) pp 1ndash25

Cheng Qiang John ldquoEssays on Designing Eco-nomic Mechanismsrdquo Unpublished Paper1998

Cheung Yin-Wong and Friedman Daniel ldquoIndi-vidual Learning in Normal Form GamesSome Laboratory Resultsrdquo Games and Eco-nomic Behavior 1997 19(1) pp 46ndash76

Cooper Russell and John Andrew ldquoCoordinat-ing Coordination Failures in Keynesian Mod-elsrdquo Quarterly Journal of Economics 1988103(3) pp 441ndash63

Cox James C and Walker Mark ldquoLearning toPlay Cournot Duopoly Strategiesrdquo Journal ofEconomic Behavior and Organization 199836(2) pp 141ndash61

Diamond Douglas W and Dybvig Philip HldquoBank Runs Deposit Insurance and Liquid-ityrdquo Journal of Political Economy 198391(3) pp 401ndash19

Diamond Peter A ldquoAggregate Demand Man-agement in Search Equilibriumrdquo The Journalof Political Economy 1982 90(5) pp 881ndash94

Dybvig Philip H and Spatt Chester S ldquoAdop-tion Externalities as Public Goodsrdquo Journalof Public Economics 1983 20(2) pp 231ndash47

Erev Ido and Haruvy Ernan ldquoOn the PotentialValue and Current Limitations of Data-Driven Learning Modelsrdquo Unpublished Pa-per 2000

Falkinger Josef ldquoEfficient Private Provision ofPublic Goods by Rewarding Deviations fromAveragerdquo Journal of Public Economics1996 62(3) pp 413ndash22

Falkinger Josef Fehr Ernst Gachter Simonand Winter-Ebmer Rudlof ldquoA Simple Mech-anism for the Efficient Provision of Pub-lic Goods Experimental Evidencerdquo Ameri-can Economic Review 2000 90(1) pp 247ndash64

Fudenberg Drew and Levine David K ldquoTheTheory of Learning in Gamesrdquo in MIT Pressseries on economic learning and social evo-

1534 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

lution Vol 2 Cambridge MA MIT Press1998

Hamaguchi Yasuyo Maitani Satoshi and SaijoTatsuyoshi ldquoDoes the Varian MechanismWork Emissions Trading as an ExamplerdquoInternational Journal of Business and Eco-nomics 2003 2(2) pp 85ndash96

Harstad Ronald M and Marrese Michael ldquoBe-havioral Explanations of Efficient PublicGood Allocationsrdquo Journal of Public Eco-nomics 1982 19(3) pp 367ndash83

Haruvy Ernan and Stahl Dale ldquoInitial Con-ditions Adaptive Dynamics and FinalOutcomes An Approach to Equilibrium Se-lectionrdquo Unpublished Paper 2000

Ho Teck-Hua Camerer Colin F and ChongJuin-Kuan Economic value of EWA Lite Afunctional theory of learning in games Cal-ifornia Institute of Technology Division ofthe Humanities and Social Sciences WorkingPaper No 1122-2001

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.

VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1535


largely consistent with Result 1, which uses a more conservative test. The coefficients of ln(Round) are both positive and highly significant, indicating that players learn to play equilibrium strategies over time. The concave functional form ln(Round), which yields a better log-likelihood than either the linear or quadratic functional form, indicates that learning is rapid at the beginning and decreases over time. We will examine learning in more detail in Section V.

In specifications (3) and (4), we use ln(Round), the interaction of each of the treatment dummies with ln(Round), and a constant as independent variables. The interaction term allows different slopes for different treatments. Compared with the coefficient of ln(Round), the

coefficient for the interaction term Dy · ln(Round) captures the slope difference between treatment y and 2020. By Observation 1, as we cannot reject the hypothesis that the initial-round probability of equilibrium play is the same across all treatments,11 we can compare the slope of L(t) between different treatments. From the probit specification, L(t) = Φ(constant + Db ln(t)), where D is the 1 × 5 vector of treatment dummies and b is the 5 × 1 vector of estimated coefficients, we can derive

11 When including treatment dummies in this specification, Wald tests on the hypothesis that the treatment dummy coefficients are all zero yield p-values of 0.2703 for ε-p1 and 0.3383 for ε-p2.

TABLE 3—CONVERGENCE SPEED: PROBIT MODELS WITH CLUSTERING AT INDIVIDUAL LEVEL

Dependent variable: ε-Nash equilibrium play

                             (1)          (2)          (3)          (4)
                          ε-Nash       ε-Nash       ε-Nash       ε-Nash
                         price 1      price 2      price 1      price 2
D1000                      0.143        0.268
                          (0.048)      (0.050)
D2000                      0.090        0.217
                          (0.060)      (0.057)
D18                        0.005        0.004
                          (0.071)      (0.069)
D1020                      0.080        0.025
                          (0.060)      (0.073)
D40                        0.000        0.078
                          (0.068)      (0.078)
ln(Round)                  0.115        0.167        0.131        0.183
                          (0.014)      (0.017)      (0.021)      (0.022)
D1000 · ln(Round)                                    0.050        0.095
                                                    (0.019)      (0.022)
D2000 · ln(Round)                                    0.031        0.073
                                                    (0.020)      (0.021)
D18 · ln(Round)                                      0.001        0.004
                                                    (0.020)      (0.021)
D1020 · ln(Round)                                    0.024        0.008
                                                    (0.019)      (0.022)
D40 · ln(Round)                                      0.002        0.023
                                                    (0.019)      (0.024)

Observations               9,720        9,720        9,720        9,720
Number of groups             162          162          162          162
Log pseudo-likelihood  −5,562.890   −5,824.055   −5,557.424   −5,793.013

H0: D1000 = D2000                                       1.50    1.31
H0: D1000 · ln(Round) = D2000 · ln(Round)               1.33    1.23
H0: D1020 − D1000 + D2000 = 0                           0.05    1.07
H0: D1020 · ln(Round) − D1000 · ln(Round)
    + D2000 · ln(Round) = 0                             0.05    1.03

Notes: Coefficients are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummy is D2020. The bottom panel presents null hypotheses and Wald χ²(1) test statistics.


the slope of the probability function for treatment y with respect to t: L′y(t) = φ(constant + Db) · by/t, where φ is the standard normal density. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + Db) · by, compared to the baseline case of 2020.
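The marginal-effect calculation above can be sketched numerically. The constant and slope coefficient below are illustrative stand-ins, not the estimates from Table 3; the sketch only verifies that L′(t) = φ(·) · b/t is the derivative of L(t) = Φ(constant + b ln t) and that the per-round gain shrinks as t grows.

```python
import math

def pdf(x):
    """Standard normal density (the lower-case phi in the text)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(x):
    """Standard normal CDF (the upper-case Phi in the text)."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def L(t, const, b):
    """Probability of epsilon-equilibrium play in round t: Phi(const + b ln t)."""
    return cdf(const + b * math.log(t))

def L_slope(t, const, b):
    """Chain rule: dL/dt = phi(const + b ln t) * b / t."""
    return pdf(const + b * math.log(t)) * b / t

const, b = -0.5, 0.3   # illustrative values only, not estimated coefficients
p10, p50 = L(10, const, b), L(50, const, b)
s10, s50 = L_slope(10, const, b), L_slope(50, const, b)
```

With a positive slope coefficient, the probability of equilibrium play rises with the round number while the per-round improvement declines, the rapid-early, slow-late pattern described in the text.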

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test looks at whether the coefficient of D1000 · ln(Round) equals that of D2000 · ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity in the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between β = 18 and the supermodular treatments. In particular, we compare 2018 with 2020 and 2040.12 In Result 1, we show that these treatments achieve

the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution in each treatment.13 As the treatments all start at and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing 2018 with the two α = 20 supermodular treatments. We use Round, ln(Round), and their interactions with D2018 as independent variables, to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1's ε-equilibrium play. There is no significant difference between 2018 and the α = 20 supermodular treatments. In model (2), the dependent variable is player 2's ε-equilibrium play. There are significant differences between the near-supermodular and the α = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on D2018 · ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit 1020 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

TABLE 4—CONVERGENCE SPEED

Probit models with individual-level clustering, comparing 2018 with the α = 20 supermodular treatments achieving similar convergence levels

Dependent variable: ε-Nash equilibrium play

                          (1)         (2)
                       ε-Nash      ε-Nash
                      price 1     price 2
ln(Round)               0.054       0.216
                       (0.040)     (0.043)
Round                   0.004       0.001
                       (0.002)     (0.002)
D2018 · ln(Round)       0.013      −0.066
                       (0.032)     (0.035)
D2018 · Round           0.001       0.006
                       (0.002)     (0.003)

Observations            4,680       4,680
Log pseudo-likelihood  −2,879.07  −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are D2020 and D2040.


2018 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D2018 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in

equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [ r(x(t)) − c(x(t)) − e(x(t)) ] / [ r(x*) − c(x*) − e(x*) ],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 < t2 ≤ 60, is then defined as

e(t1, t2) = Σ_{t = t1}^{t2} e(t) / (t2 − t1 + 1).
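As a concrete illustration, the two efficiency measures can be computed as follows. The revenue, cost, and externality functions used here are simple stand-ins chosen so that the efficient quantity is easy to verify; they are not the experiment's actual parameters.

```python
# Illustrative stand-in functions (not the experiment's): surplus is
# r(x) - c(x) - e(x) = 10x - 0.5x^2 - 2x, maximized at x* = 8.
def r(x): return 10 * x        # revenue
def c(x): return 0.5 * x ** 2  # production cost
def e(x): return 2 * x         # externality cost

def efficiency(x_t, x_star):
    """Per-round measure e(t): realized surplus over efficient surplus,
    excluding the tax/subsidy and penalty terms."""
    surplus = lambda x: r(x) - c(x) - e(x)
    return surplus(x_t) / surplus(x_star)

def block_efficiency(quantities, t1, t2, x_star):
    """e(t1, t2): average per-round efficiency over rounds t1..t2 (1-indexed)."""
    block = quantities[t1 - 1:t2]
    return sum(efficiency(x, x_star) for x in block) / (t2 - t1 + 1)

x_star = 8
full = efficiency(8, x_star)                     # equals 1 at the efficient quantity
avg = block_efficiency([8, 8, 4], 1, 3, x_star)  # average over a 3-round block
```

The measure equals one exactly when the chosen quantity is efficient and falls below one otherwise, which is why session-level numbers in Table 5 are bounded by one.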

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency: Last 20 rounds

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  H1            p-value
1000         0.600      0.671      0.804      0.858               0.733    1000 < 2000    0.087*
2000         0.650      0.870      0.907      0.873      0.825    0.825    2000 < 2020    0.095*
2018         0.963      0.952      0.889      0.959               0.941    2000 < 2018    0.016**
1020         0.958      0.919      0.905      0.881      0.984    0.929    1020 < 2020    0.714
2020         0.888      0.937      0.939      0.780      0.971    0.903    2018 < 2020    0.833
2040         0.973      0.962      0.943      0.949               0.957    2020 < 2040    0.064*

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.


in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus the tax minus the subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under 2018 and 2020 are significantly lower than those under 2040 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. We look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data, by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter shortened as sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̂−i(t + 1), according to


(6)  p̂−i(t + 1) = [ p−i(t) + Σ_{τ=1}^{t−1} r^τ p−i(t − τ) ] / [ 1 + Σ_{τ=1}^{t−1} r^τ ]

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̂−i(t + 1) = p−i(t), while r = 1 yields the standard fictitious play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price p_i is given by

(7)  Pr(p_i | p̂−i) = exp(λ π(p_i, p̂−i)) / Σ_{j=0}^{40} exp(λ π(p_j, p̂−i))

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̂−i increases.
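A minimal sketch of the two sFP pieces, the discounted belief of equation (6) and the logit choice rule of equation (7), might look like the following. The quadratic payoff function is a placeholder, not the compensation-mechanism payoff.

```python
import math

PRICES = range(41)  # the announcement space {0, 1, ..., 40}

def predicted_price(history, r):
    """Eq. (6): discounted average of the match's past prices; history[-1]
    is the most recent observation. r = 0 reduces to Cournot best reply,
    r = 1 to standard fictitious play."""
    weights = [r ** tau for tau in range(len(history))]
    num = sum(w * p for w, p in zip(weights, reversed(history)))
    return num / sum(weights)

def choice_probs(payoff, p_hat, lam):
    """Eq. (7): logit probabilities over the 41 prices, given predicted
    price p_hat and sensitivity parameter lambda."""
    scores = [math.exp(lam * payoff(p, p_hat)) for p in PRICES]
    total = sum(scores)
    return [s / total for s in scores]

# Placeholder payoff: the announcer prefers prices near the prediction.
payoff = lambda p, p_hat: -(p - p_hat) ** 2
p_hat = predicted_price([20, 22, 21], r=0.9)
probs = choice_probs(payoff, p_hat, lam=0.5)
```

Higher λ concentrates probability on the best response to the predicted price; λ = 0 would make all 41 prices equally likely.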

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs π(p_i, p̂−i) replaced by strategy attractions. In both variants, the strategy attraction A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)  A_i^j(t) = { φ N(t − 1) A_i^j(t − 1) + [δ + (1 − δ) I(p_i^j, p_i(t))] π_i(p_i^j, p−i(t)) } / N(t)

where the indicator function I(p_i^j, p_i(t)) equals one if p_i^j = p_i(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φ_i(t) is given by

(9)  φ_i(t) = 1 − 0.5 Σ_{j=1}^{m_i} [ I(p−i^j, p−i(t)) − Σ_{τ=1}^{t} I(p−i^j, p−i(τ)) / t ]²

This function will be close to 1 when recent history resembles previous history. The imagination weight δ_i(t) equals φ_i(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, κ = 0, φ = 1), and Cournot best reply (δ = 1, φ = 0).
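The experience-weight and attraction updates can be sketched as follows. The payoff numbers are arbitrary, and the change-detector and Gini pieces of fEWA are left as fixed inputs rather than computed from history.

```python
def ewa_update(A_prev, N_prev, payoff_j, was_played, phi, delta, kappa):
    """One period of the updates in the text: N(t) = phi*(1-kappa)*N(t-1) + 1,
    then eq. (8) for strategy j's attraction. `was_played` is the indicator
    I(p_i^j, p_i(t)); payoff_j is the (possibly foregone) payoff of j."""
    N = phi * (1 - kappa) * N_prev + 1
    weight = delta + (1 - delta) * (1.0 if was_played else 0.0)
    A = (phi * N_prev * A_prev + weight * payoff_j) / N
    return A, N

# delta = 1: foregone payoffs update unplayed strategies just like played ones.
A_hit, N1 = ewa_update(0.0, 1.0, 5.0, True, phi=0.9, delta=1.0, kappa=0.0)
A_miss, _ = ewa_update(0.0, 1.0, 5.0, False, phi=0.9, delta=1.0, kappa=0.0)

# delta = 0 (pure reinforcement): only the played strategy is credited.
R_hit, _ = ewa_update(0.0, 1.0, 5.0, True, phi=0.9, delta=0.0, kappa=0.0)
R_miss, _ = ewa_update(0.0, 1.0, 5.0, False, phi=0.9, delta=0.0, kappa=0.0)
```

In fEWA proper, φ would come from the change-detector function (9), δ would be set equal to φ, and κ to the Gini coefficient of past choice frequencies; here they are passed in directly to keep the sketch self-contained.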

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let π_j(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function f_{jk}(h, t) to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_{jk}(h, t) = 1 − |j − k|/h,  if |j − k| < h;  0, otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_{jk}(1, t) degenerates into an indicator function, equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs


and does not explicitly take into account the likelihood of alternate states. Let u_j(t) denote the subjective assessment of strategy p_j at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)  u_j(t + 1) = [1 − r f_{jk}(h, t)] u_j(t) + r f_{jk}(h, t) π_k(t),  ∀ j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks z_j(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũ_j(t) = u_j(t) + z_j(t). At time t, he chooses strategy p_k if

(11)  ũ_k(t) − ũ_j(t) ≥ 0,  ∀ p_j ≠ p_k.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
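Putting the similarity function, the assessment update (10), and the shock-distorted choice (11) together gives the following sketch; the payoff value and strategy space are illustrative.

```python
import random

def bartlett(j, k, h):
    """Similarity f_jk between unplayed strategy j and played strategy k."""
    return 1 - abs(j - k) / h if abs(j - k) < h else 0.0

def update_assessments(u, k, payoff_k, r, h):
    """Eq. (10): blend each strategy's assessment toward the realized payoff
    of the played strategy k, in proportion to similarity."""
    return [(1 - r * bartlett(j, k, h)) * uj + r * bartlett(j, k, h) * payoff_k
            for j, uj in enumerate(u)]

def choose(u, a, rng):
    """Eq. (11): maximize assessments distorted by uniform shocks on [-a, a]."""
    distorted = [uj + rng.uniform(-a, a) for uj in u]
    return max(range(len(u)), key=lambda j: distorted[j])

rng = random.Random(0)
u = [0.0] * 41                                 # assessments over prices 0..40
u = update_assessments(u, k=20, payoff_k=10.0, r=0.5, h=3)
pick = choose(u, a=0.0, rng=rng)               # no shocks: the spillover peak
```

With h = 3 the realized payoff spills over to the two neighbors on each side of the played price, which is the local-experimentation channel discussed in the calibration results; a positive shock width a would occasionally push the choice away from the assessment maximizer without changing how assessments are updated.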

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In

all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements:

(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by five.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and a normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by five.


(iii) Simulated player 1's quantity choice is based on the following steps:

(a) Determine each player 1's best response to p2.

(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.

(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.
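The steps above can be wired together in a skeleton like the one below. Every component (initial-price draw, price-choice rule, quantity rule, payoffs, update) is a placeholder standing in for the calibrated pieces described in the text, and only a single pair is simulated, so the random rematching of step (i) is omitted.

```python
import random

def simulate_pair(n_rounds, init_price, choose_price, quantity, payoffs,
                  update, rng):
    """Skeleton of steps (ii)-(vi) for one simulated pair of players."""
    state1, state2 = {}, {}
    history = []
    for t in range(1, n_rounds + 1):
        if t == 1:                                # (ii)(a): empirical draw
            p1, p2 = init_price(rng), init_price(rng)
        else:                                     # (ii)(b): learning model
            p1, p2 = choose_price(state1, rng), choose_price(state2, rng)
        x = quantity(p2, rng)                     # (iii): player 1's quantity
        u1, u2 = payoffs(p1, p2, x)               # (iv): mechanism payoffs
        update(state1, p1, u1)                    # (v): assessment updates
        update(state2, p2, u2)
        history.append((p1, p2, x))               # (vi): next round
    return history

# Toy components: prices repeat the previous announcement, quantity is a
# simple decreasing function of p2, payoffs penalize price disagreement.
rng = random.Random(7)
hist = simulate_pair(
    n_rounds=5,
    init_price=lambda rng: rng.choice([15, 20, 25]),
    choose_price=lambda state, rng: state["last"],
    quantity=lambda p2, rng: max(0, 10 - p2 // 4),
    payoffs=lambda p1, p2, x: (p2 * x - (p1 - p2) ** 2, -(p1 - p2) ** 2),
    update=lambda state, p, u: state.update(last=p),
    rng=rng,
)
```

In the actual exercise, `choose_price` would be the logit rule (7) or the shock-distorted maximization (11), `update` the corresponding attraction or assessment update, and the loop would run over 1,500 randomly rematched games.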

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment 2000, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, where shocks were drawn from uniform, logistic, and double-exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA was statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Model                    sFP                     fEWA                      PA
Block            1     2     3     4     1     2     3     4      1     2     3     4

            Discount rate (r)                               Discount rate (r)
1000           1.0   0.7   0.9   1.0                            0.1   0.0   0.9   1.0
2000           0.7   0.6   0.1   0.1                            0.5   1.0   0.2   0.1
2018           1.0   1.0   0.9   0.9                            0.2   1.0   0.3   0.2
1020           1.0   1.0   1.0   0.9                            0.1   0.3   0.6   0.3
2020           1.0   1.0   1.0   0.9                            0.2   0.1   0.5   0.2
2040           1.0   1.0   1.0   0.7                            0.1   1.0   0.4   0.3

            Sensitivity (λ)          Sensitivity (λ)        Shock interval (a)
1000           1.5   4.4   4.5   2.8   1.6   4.5   4.4   3.7    500   500   250    50
2000           1.5   4.3  14.0  18.0   1.7   4.0   5.1   7.4     50   100   150    50
2018           1.6   8.0  11.5  18.3   1.5   5.3   6.2   9.5     50   300   500   150
1020           1.9   5.9   8.6  13.8   1.8   4.7   6.2   8.2    100   500   450   300
2020           2.8   5.6   8.1  11.2   2.4   3.8   6.1   6.7    200   300   350   350
2040           3.4   6.5  14.0  20.5   3.2   4.9   6.7   9.9    150    50   150    50

                                                            Similarity window (h)
1000                                                              1     1    10     9
2000                                                             10    10     9     6
2018                                                              5     4     6     1
1020                                                              1     1     8     3
2020                                                              2     2     7     1
2040                                                              1     3     2     2


indicating that players keep track of all past play when forming beliefs about opponents' next moves.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except 1000 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment 1000, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except 1000, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive of Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments 1000 and 2040, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20

Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play

Session     1000     2000     2018     1020     2020     2040
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3                   0.948             0.883    0.873
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA

1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3                   0.946             0.879    0.871
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment

1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3                   0.947             0.894    0.903
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play

1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3                   1.917             1.447    1.539
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play

Overall    0.976    0.976    0.976    0.976    0.976    0.976

VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1525

results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following rankings in the level of convergence:

(i) p1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA

1526 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.
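The ε-equilibrium measures (ε-p1, ε-p2) behind these rankings can be computed as a simple share, assuming the measure counts announced prices within ε of the equilibrium price (the price vector, p*, and ε below are illustrative):

```python
def eps_equilibrium_share(prices, p_star, eps):
    """Fraction of announced prices lying within eps of the
    equilibrium price p_star."""
    return sum(abs(p - p_star) <= eps for p in prices) / len(prices)

# Three of four illustrative prices fall in the band around p* = 16.
share = eps_equilibrium_share([15, 16, 17, 20], p_star=16, eps=1)
```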

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ: α20β18 performs better in terms of player 1's convergence and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β = 0 treatments. In addition, an increase in α significantly improves convergence by a large margin (40 to 80 percent) for both players when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3 CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2: percent of ε-equilibrium play by round, rounds 0 to 500, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9. SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which a treatment went over the threshold for good; "–" indicates that the treatment never achieved the threshold.)

                     Player 1                      Player 2
Threshold    0.4    0.5    0.6    0.7      0.4    0.5    0.6    0.7    0.8    0.9

α20β18        54     83    126    316       38     52     62     87    127     –
α10β20       129    221     –      –        36     48     63     91    148     –
α20β20        61     91    149     –        39     51     61     84    137     –
α20β40       101    166    263    496       30     38     56     94    166    339

TABLE 8. RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

             Player 1 equilibrium price              Player 2 equilibrium price
Treatment   Probability   H1               p-value   Probability   H1               p-value

α10β00        0.073                                    0.082
α20β00        0.188   α20β00 > α10β00      0.000       0.213   α20β00 > α10β00      0.000
α20β18        0.265   α20β18 > α20β40      0.187       0.381   α20β18 > α10β20      0.264
α10β20        0.211   α10β20 > α20β00      0.000       0.376   α10β20 > α20β20      0.001
α20β20        0.243   α20β20 > α10β20      0.000       0.353   α20β20 > α20β00      0.000
α20β40        0.259   α20β40 > α20β20      0.007       0.442   α20β40 > α20β18      0.000

             Player 1 ε-equilibrium price            Player 2 ε-equilibrium price
Treatment   Probability   H1               p-value   Probability   H1               p-value

α10β00        0.216                                    0.232
α20β00        0.519   α20β00 > α10β00      0.000       0.573   α20β00 > α10β00      0.000
α20β18        0.713   α20β18 > α20β40      0.045       0.870   α20β18 > α10β20      0.008
α10β20        0.584   α10β20 > α20β00      0.000       0.857   α10β20 > α20β20      0.000
α20β20        0.662   α20β20 > α10β20      0.000       0.834   α20β20 > α20β00      0.000
α20β40        0.701   α20β40 > α20β20      0.000       0.925   α20β40 > α20β18      0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel, simulation data: fraction of maximum welfare by round, rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare by round, rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices, each with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than the matching model.
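The comparison above scores each model's predicted choice probabilities against observed choices. A minimal sketch of the stochastic matching model and an MSD score follows; the 41-point price grid, the ε value, and the data are illustrative assumptions, not the paper's estimates:

```python
def matching_probs(match_idx, eps, n_prices=41):
    """Stochastic matching model: probability 1 - eps on the predicted
    (matched) price, eps spread evenly over the remaining prices."""
    p = [eps / (n_prices - 1)] * n_prices
    p[match_idx] = 1.0 - eps
    return p

def msd(prob_rows, chosen):
    """Mean squared deviation between predicted choice probabilities
    and observed choices coded as unit vectors (lower is better)."""
    total = 0.0
    for probs, j in zip(prob_rows, chosen):
        obs = [1.0 if i == j else 0.0 for i in range(len(probs))]
        total += sum((q - o) ** 2 for q, o in zip(probs, obs)) / len(probs)
    return total / len(prob_rows)
```

A model that concentrates probability on the actually chosen price earns a low MSD, which is what makes the score usable for hold-out validation across competing learning models.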

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 The estimated ε in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10. PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1.)

                         Dependent variable: Probability player 2 price
                         equals anticipated player 1 price

Model                        (1)          (2)

D10                         0.014        0.017
                           (0.043)      (0.041)
D00                        −0.077       −0.043
                           (0.035)      (0.036)
D18                         0.046        0.032
                           (0.053)      (0.056)
D40                         0.008        0.041
                           (0.061)      (0.042)
ln(Round)                   0.041        0.028
                           (0.011)      (0.010)
Match Cost                              −0.001
                                        (0.000)
ln(Round) × D10             0.014        0.012
                           (0.012)      (0.012)
ln(Round) × D00             0.003        0.000
                           (0.013)      (0.012)
ln(Round) × D18             0.006        0.002
                           (0.021)      (0.020)
ln(Round) × D40             0.001        0.015
                           (0.018)      (0.016)

Observations                9,588        9,588
Log pseudo-likelihood    −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummy is D2020. Match Cost is the difference in cents of matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)] / [β + e/(4c^2)] and n = [er/(4c^2)] / [β + e/(4c^2)]. The analogous differential equation system is

(12)  dp1/dt = p2 − p1,

      dp2/dt = m p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)^2 / 2 + ∫_{p0}^{p1} (p − mp − n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is the Lyapunov function of system (12).

Define G(p1) = ∫_{p0}^{p1} (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m) p1^2 − n p1 − (1/2)(1 − m) p0^2 + n p0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m) p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)^2/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that dV/dt ≤ 0:

dV/dt = (∂V/∂p1)(dp1/dt) + (∂V/∂p2)(dp2/dt)
      = [(p1 − p2) + (1 − m) p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)
      = −2 (p1 − p2)^2 ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), dV/dt < 0. Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
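The discrete best-reply system in this proof can be iterated directly to watch both prices approach p* = er/(e + c). This is a sketch: the parameter values below are illustrative, chosen so that |m| < 1 and the discrete iteration therefore converges as well.

```python
def best_reply_path(beta, c, e, r, p1, p2, steps=500):
    """Cournot best-reply dynamics: p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n,
    with m and n as defined in the Appendix."""
    m = (beta - 1 / (4 * c)) / (beta + e / (4 * c ** 2))
    n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))
    for _ in range(steps):
        p1, p2 = p2, m * p1 + n
    return p1, p2

# With beta=0.5, c=1, e=2, r=1 we get m = 0.25 and n = 0.5, so the
# fixed point is n/(1 - m) = e*r/(e + c) = 2/3.
final_p1, final_p2 = best_reply_path(0.5, 1.0, 2.0, 1.0, p1=5.0, p2=-3.0)
```

Starting the iteration far from equilibrium still ends at p*, illustrating the global stability the Lyapunov argument establishes for the continuous-time analogue.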

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



the slope of the probability function for treatment y with respect to t: ∂Ly(t)/∂t = φ(constant + Db) by / t. All coefficients reported in Table 3 are increments of probability derivatives, φ(constant + Db) by, compared to the baseline case of α20β20.

RESULT 2 (Speed of Convergence):

(i) When α = 20, increasing β from 0 to 18 and from 0 to 20 significantly increases the speed of convergence in ε-p1 and ε-p2.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on convergence speed.

(iii) When α = 20, increasing β from 20 to 40 has no significant effect on convergence speed.

(iv) When β = 0, increasing α from 10 to 20 has no significant effect on convergence speed.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on convergence speed.

SUPPORT: Models (3) and (4) in Table 3 report analyses of convergence speed. For each independent variable, the coefficients, standard errors (in parentheses), and significance levels are reported. The second Wald test looks at whether the coefficient of Dα10β00 × ln(Round) equals that of Dα20β00 × ln(Round).

Part (i) of Result 2 supports Hypotheses 1(b) and 2(b), while by part (ii) we reject Hypothesis 3(b). Part (iii) supports Hypothesis 4(b). Parts (iv) and (v) reject Hypothesis 5(b). Result 2 provides the first empirical evidence on the role of strategic complementarity and the speed of convergence.

Although part (ii) indicates that increasing β from 18 to 20 does not significantly change convergence speed, we now investigate whether there are any differences between the β = 18 treatment and the supermodular treatments. In particular, we compare α20β18 with α20β20 and α20β40.12 In Result 1, we show that these treatments achieve the same convergence level. Also, we cannot reject the hypothesis that round-one prices for each player are drawn from the same distribution for each treatment.13 As the treatments all start and converge to similar levels of equilibrium play, we use a more flexible functional form to look for differences in speed.

In Table 4, we report the results of probit regressions comparing α20β18 with the two α = 20 supermodular treatments. We use Round, ln(Round), and their interactions with α20β18 as independent variables to allow different convergence speeds to the same level of convergence. In model (1), the dependent variable is player 1 ε-equilibrium play. There is no significant difference between α20β18 and the α = 20 supermodular treatments. In model (2), the dependent variable is player 2 ε-equilibrium play. There are significant differences between the near-supermodular and α = 20 supermodular treatments. In early rounds, ln(Round) is large relative to Round. The negative and weakly significant coefficient on Dα20β18 × ln(Round) implies a lower probability of early-round equilibrium play in

12 We omit α10β20 from this analysis. First, it does not achieve the same ε-p1 convergence level as the other supermodular treatments. Second, omitting it avoids the possibility of an α-effect.

13 Kolmogorov-Smirnov test result tables are available by request.

TABLE 4. CONVERGENCE SPEED

(Probit models with individual-level clustering, comparing α20β18 with the α = 20 supermodular treatments achieving similar convergence levels.)

                          Dependent variable: ε-Nash equilibrium play
                          (1) ε-Nash price 1    (2) ε-Nash price 2

ln(Round)                       0.054                 0.216
                               (0.040)               (0.043)
Round                           0.004                 0.001
                               (0.002)               (0.002)
Dα20β18 × ln(Round)             0.013                −0.066
                               (0.032)               (0.035)
Dα20β18 × Round                 0.001                 0.006
                               (0.002)               (0.003)

Observations                    4,680                 4,680
Log pseudo-likelihood        −2,879.07             −2,993.85

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummies are Dα20β20 and Dα20β40.


α20β18 relative to the supermodular treatments. The β = 18 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on Dα20β18 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and α (α-effects) on convergence. Varying α allows us, however, to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far, we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r x(t) − c x(t)^2 − e x(t)^2] / [r x* − c (x*)^2 − e (x*)^2],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 ≤ t1 ≤ t2 ≤ 60, is then defined as

e(t1, t2) = [Σ_{t = t1}^{t2} e(t)] / (t2 − t1 + 1).
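If the per-round measure divides the realized joint surplus r x − c x^2 − e x^2 by its value at the efficient quantity x* = r/(2(c + e)), a reading consistent with the equilibrium price er/(e + c) used in the Appendix, though the quadratic forms are an assumption here, the per-round and block measures can be sketched as follows (parameter values illustrative):

```python
def efficiency_per_round(x, r, c, e):
    """e(t): realized joint surplus relative to the surplus at the
    efficient quantity x* = r / (2 * (c + e)) (assumed quadratic
    cost c*x^2 and externality e*x^2)."""
    def surplus(q):
        return r * q - c * q ** 2 - e * q ** 2
    x_star = r / (2 * (c + e))
    return surplus(x) / surplus(x_star)

def block_efficiency(xs, r, c, e):
    """e(t1, t2): average per-round efficiency over a block of rounds."""
    vals = [efficiency_per_round(x, r, c, e) for x in xs]
    return sum(vals) / len(vals)
```

Producing the efficient quantity every round yields a block efficiency of 1; quantities away from x* pull the measure below 1.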

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session in each treatment

TABLE 5. EFFICIENCY MEASURE

Efficiency, last 20 rounds

Treatment   Session 1   Session 2   Session 3   Session 4   Session 5   Overall   H1                  p-value

α10β00        0.600       0.671       0.804       0.858        –         0.733    α20β00 > α10β00      0.087
α20β00        0.650       0.870       0.907       0.873       0.825      0.825    α20β20 > α20β00      0.095
α20β18        0.963       0.952       0.889       0.959        –         0.941    α20β18 > α20β00      0.016
α10β20        0.958       0.919       0.905       0.881       0.984      0.929    α20β20 > α10β20      0.714
α20β20        0.888       0.937       0.939       0.780       0.971      0.903    α20β20 > α20β18      0.833
α20β40        0.973       0.962       0.943       0.949        –         0.957    α20β40 > α20β20      0.064

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.

VOL. 94 NO. 5   CHEN AND GAZZALE: LEARNING IN GAMES   1519

in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)(p₁(t) − p₂(t))² + p₂(t)x(t) − p₁(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
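A minimal sketch of the per-round surplus, assuming the sign convention above (penalty terms collected from both players, plus the tax at player 2's announced price, minus the subsidy at player 1's announced price):

```python
def budget_surplus(alpha, beta, p1, p2, x):
    """s(t): budget surplus of the compensation mechanism in one round.
    Zero whenever announced prices match (as they do in equilibrium)."""
    return (alpha + beta) * (p1 - p2) ** 2 + p2 * x - p1 * x
```

Note that a high punishment parameter α + β magnifies the surplus quadratically in the price mismatch, which is the effect discussed above.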

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. We therefore rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. Finally, we report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts her match's price in the next round, p̃₋ᵢ(t + 1), according to

1520   THE AMERICAN ECONOMIC REVIEW   DECEMBER 2004

(6)   p̃₋ᵢ(t + 1) = [p₋ᵢ(t) + Σ_{u=1}^{t−1} rᵘ p₋ᵢ(t − u)] / [1 + Σ_{u=1}^{t−1} rᵘ]

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃₋ᵢ(t + 1) = p₋ᵢ(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pᵢ is given by

(7)   Pr(pᵢ | p̃₋ᵢ) = exp(λ π(pᵢ, p̃₋ᵢ)) / Σ_{j=0}^{40} exp(λ π(pⱼ, p̃₋ᵢ))

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̃₋ᵢ increases.
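Equations (6) and (7) can be sketched as follows; the 41-point price grid 0–40 follows the summation in equation (7), and `payoff` stands in for the mechanism payoff function π:

```python
import math
import random

def predict_price(history, r):
    """Equation (6): discounted empirical average of the opponent's past
    prices; history[-1] is the most recent price.  r = 0 gives Cournot
    best reply, r = 1 gives standard fictitious play."""
    weights = [r ** u for u in range(len(history))]          # u = 0 .. t-1
    num = sum(w * p for w, p in zip(weights, reversed(history)))
    return num / sum(weights)

def sfp_choice(predicted, payoff, lam, prices=range(41)):
    """Equation (7): logit choice over prices given the predicted
    opponent price; lam is the sensitivity parameter."""
    weights = [math.exp(lam * payoff(p, predicted)) for p in prices]
    u = random.random() * sum(weights)
    for p, w in zip(prices, weights):
        u -= w
        if u <= 0:
            return p
    return prices[-1]
```

As lam grows, `sfp_choice` concentrates probability on the best response to the predicted price, matching the discussion above.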

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs π(pᵢ, p̃₋ᵢ) replaced by strategy attractions. In both variants, strategy attractions Aᵢʲ(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = (1 − κ)φN(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)   Aᵢʲ(t) = [φ N(t − 1) Aᵢʲ(t − 1) + (δ + (1 − δ) I(pᵢʲ, pᵢ(t))) πᵢ(pᵢʲ, p₋ᵢ(t))] / N(t)

where the indicator function I(pᵢʲ, pᵢ(t)) equals one if pᵢʲ = pᵢ(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φᵢ(t) is given by

(9)   φᵢ(t) = 1 − ½ Σ_{j=1}^{mᵢ} [ I(pᵢʲ, pᵢ(t)) − (1/t) Σ_{τ=1}^{t} I(pᵢʲ, pᵢ(τ)) ]²

This function will be close to 1 when recent history resembles previous history. The imagination weight δᵢ(t) equals φᵢ(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1), and Cournot best reply (δ = 1, φ = 0).
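The attraction and experience-weight updates can be sketched as follows; for simplicity, φ, δ, and κ are passed in directly rather than derived each period from equation (9) and the Gini coefficient, and the logit choice step is the same as equation (7) with attractions in place of payoffs:

```python
def update_ewa(attractions, N_prev, played, opp_price, payoff,
               phi, delta, kappa):
    """One period of the EWA update (equation (8)).  attractions[j] is
    A_j(t-1); `played` is the strategy actually chosen this period."""
    N = (1 - kappa) * phi * N_prev + 1           # experience weight N(t)
    new = {}
    for j, A in attractions.items():
        indicator = 1.0 if j == played else 0.0  # I(p_j, p(t))
        weight = delta + (1 - delta) * indicator # imagination weight on payoff
        new[j] = (phi * N_prev * A + weight * payoff(j, opp_price)) / N
    return new, N
```

Setting delta = 1, phi = 1, kappa = 0 recovers fictitious play: each attraction becomes an average of the payoffs the strategy would have earned.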

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let πⱼ(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function f_{jk}(h, t) to denote the similarity between the played strategy k and an unplayed strategy j at period t:

f_{jk}(h, t) = 1 − |j − k|/h   if |j − k| < h;   0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, f_{jk}(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs

1521VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

and does not explicitly take into account the likelihood of alternate states. Let uⱼ(t) denote the subjective assessment of strategy pⱼ at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)   uⱼ(t + 1) = (1 − r f_{jk}(h, t)) uⱼ(t) + r f_{jk}(h, t) πₖ(t),   ∀ j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks zⱼ(t). The decision maker chooses on the basis of his shock-distorted subjective assessments ũⱼ(t) = uⱼ(t) + zⱼ(t). At time t, he chooses strategy pₖ if

(11)   ũₖ(t) − ũⱼ(t) ≥ 0,   ∀ pⱼ ≠ pₖ.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
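The Bartlett similarity function and the PA updating and choice rules (equations (10) and (11)) can be sketched as follows; the mood shocks here are drawn from the uniform [−a, a] interval used in our calibration:

```python
import random

def bartlett(j, k, h):
    """Similarity between unplayed strategy j and played strategy k."""
    return 1 - abs(j - k) / h if abs(j - k) < h else 0.0

def pa_update(u, k, realized_payoff, r, h):
    """Equation (10): move each assessment u[j] toward the realized
    payoff of the played strategy k, in proportion to similarity f_jk."""
    return [(1 - r * bartlett(j, k, h)) * u_j
            + r * bartlett(j, k, h) * realized_payoff
            for j, u_j in enumerate(u)]

def pa_choice(u, a):
    """Equation (11): add a zero-mean uniform mood shock to each
    assessment and play the strategy with the highest distorted value."""
    distorted = [u_j + random.uniform(-a, a) for u_j in u]
    return max(range(len(u)), key=lambda j: distorted[j])
```

With h = 1 only the played strategy is updated; larger h spills the realized payoff over neighboring prices, which is the strategy-similarity effect discussed above.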

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In

all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,¹⁴ because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.¹⁵ Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value = 0.000) and normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.
(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p₂.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

¹⁴ Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

¹⁵ For example, the seven most frequently selected prices are divisible by 5.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window h ∈ [1, 10] is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution¹⁶ on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
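The grid search can be organized as follows; `simulate` and `msd` stand in for the Monte Carlo simulation and the mean-squared-deviation scoring described above:

```python
import itertools

def calibrate(simulate, msd, data, grids):
    """Exhaustive grid search: try every parameter combination, simulate,
    and keep the combination with the lowest MSD against the calibration
    sessions."""
    best, best_score = None, float("inf")
    for params in itertools.product(*grids):
        score = msd(simulate(*params), data)
        if score < best_score:
            best, best_score = params, score
    return best, best_score

# Grids as described in the text (sFP case): r at step 0.1, lam at step
# 0.1 on [1.5, 25.0], h in 1..10, shock bound a on [0, 500] at step 50.
GRIDS = (
    [i / 10 for i in range(11)],       # discount factor r
    [i / 10 for i in range(15, 251)],  # sensitivity lam
    list(range(1, 11)),                # similarity window h
    list(range(0, 501, 50)),           # shock bound a
)
```

The same routine serves each model by passing only the grids that apply to it (e.g., PA has no λ, sFP has no similarity window).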

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.¹⁷ Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

¹⁶ Chen and Yuri Khoroshilov (2003) compare three versions of the PA model, in which shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA is statistically indistinguishable.

¹⁷ We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r)
             sFP:  1     2     3     4      PA:  1     2     3     4
α10β00            1.0   0.7   0.9   1.0         0.1   0.0   0.9   1.0
α20β00            0.7   0.6   0.1   0.1         0.5   1.0   0.2   0.1
α20β18            1.0   1.0   0.9   0.9         0.2   1.0   0.3   0.2
α10β20            1.0   1.0   1.0   0.9         0.1   0.3   0.6   0.3
α20β20            1.0   1.0   1.0   0.9         0.2   0.1   0.5   0.2
α20β40            1.0   1.0   1.0   0.7         0.1   1.0   0.4   0.3

Sensitivity (λ) and shock interval (a)
             sFP λ: 1     2      3      4     fEWA λ: 1     2     3     4     PA a:  1     2     3     4
α10β00             1.5   4.4    4.5    2.8           1.6   4.5   4.4   3.7          500   500   250    50
α20β00             1.5   4.3   14.0   18.0           1.7   4.0   5.1   7.4           50   100   150    50
α20β18             1.6   8.0   11.5   18.3           1.5   5.3   6.2   9.5           50   300   500   150
α10β20             1.9   5.9    8.6   13.8           1.8   4.7   6.2   8.2          100   500   450   300
α20β20             2.8   5.6    8.1   11.2           2.4   3.8   6.1   6.7          200   300   350   350
α20β40             3.4   6.5   14.0   20.5           3.2   4.9   6.7   9.9          150    50   150    50

Similarity window (h), PA
              1    2    3    4
α10β00        1    1   10    9
α20β00       10   10    9    6
α20β18        5    4    6    1
α10β20        1    1    8    3
α20β20        2    2    7    1
α20β40        1    3    2    2


indicating that players keep track of all past play when forming beliefs about their opponents' next moves.¹⁸ Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model is a, the upper bound of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium in every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time¹⁹ justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

¹⁸ As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

¹⁹ We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. We therefore use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, by superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the parameters calibrated on the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated on rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.²⁰ Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.²¹ In presenting the

²⁰ In block k > 4, we use the probability of deviation Pr_k = max{Pr₄ (k − 4)⁻², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

²¹ Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play
Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3          —        0.948    —        0.883    0.873    —
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3          —        0.946    —        0.879    0.871    —
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3          —        0.947    —        0.894    0.903    —
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3          —        1.917    —        1.447    1.539    —
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall    0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, the round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p₁: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p₁: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p₂: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment, for a given set of parameter estimates, is not affected by the sequence of random numbers; the convergence level is, however, somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p₂: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses, for players 1 and 2 respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ: in particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β = 0 treatments. In addition, an increase in β from 0 significantly improves convergence by a large margin (40 to 80 percent) for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p₂, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2's equilibrium play.

FIGURE 3. (CONTINUED)


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect): an increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2, plot the percentage of ε-equilibrium play over rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the convergence level L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2's convergence, however, is more subtle. First, for the β = 18 and β = 20 treatments, the speeds of convergence to all levels L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
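The Table 9 entries, i.e., the first round after which a series stays above the threshold for good, can be computed as:

```python
def first_round_above_for_good(series, threshold):
    """Return the 1-indexed round from which every subsequent value of
    `series` meets `threshold`; None if the series never stays above it."""
    crossing = None
    for t, value in enumerate(series, start=1):
        if value >= threshold:
            if crossing is None:
                crossing = t
        else:
            crossing = None  # fell back below the threshold: reset
    return crossing
```

A `None` return corresponds to the "—" entries in Table 9 for thresholds a treatment never achieves for good.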

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure of Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by the players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percentage of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went above the threshold for good; "—" indicates that the treatment never achieved the threshold.)

             Player 1                    Player 2
Threshold    0.4   0.5   0.6   0.7      0.4   0.5   0.6   0.7   0.8   0.9
α20β18        54    83   126   316       38    52    62    87   127    —
α10β20       129   221    —     —        36    48    63    91   148    —
α20β20        61    91   149    —        39    51    61    84   137    —
α20β40       101   166   263   496       30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE IN SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

             Player 1 equilibrium price             Player 2 equilibrium price
Treatment    Probability  H1                p-value   Probability  H1                p-value
α10β00       0.073        —                 —         0.082        —                 —
α20β00       0.188        α20β00 > α10β00   0.000     0.213        α20β00 > α10β00   0.000
α20β18       0.265        α20β18 > α20β40   0.187     0.381        α20β18 > α10β20   0.264
α10β20       0.211        α10β20 > α20β00   0.000     0.376        α10β20 > α20β20   0.001
α20β20       0.243        α20β20 > α10β20   0.000     0.353        α20β20 > α20β00   0.000
α20β40       0.259        α20β40 > α20β20   0.007     0.442        α20β40 > α20β18   0.000

             Player 1 ε-equilibrium price           Player 2 ε-equilibrium price
Treatment    Probability  H1                p-value   Probability  H1                p-value
α10β00       0.216        —                 —         0.232        —                 —
α20β00       0.519        α20β00 > α10β00   0.000     0.573        α20β00 > α10β00   0.000
α20β18       0.713        α20β18 > α20β40   0.045     0.870        α20β18 > α10β20   0.008
α10β20       0.584        α10β20 > α20β00   0.000     0.857        α10β20 > α20β20   0.000
α20β20       0.662        α20β20 > α10β20   0.000     0.834        α20β20 > α20β00   0.000
α20β40       0.701        α20β40 > α20β20   0.000     0.925        α20β40 > α20β18   0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for

mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(The top panel plots the fraction of maximum welfare in the simulated data over rounds 50 to 450; the bottom panel plots the fraction of maximum welfare in the experimental data over rounds 0 to 60. Series: α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.²² We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p₂. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.²³ In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and each of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.²⁴ We then compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we find that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.
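The stochastic matching model can be sketched as follows; `predicted_p1` is the anticipated price p̃₁ from equation (6), and `eps` is the estimated deviation probability ε:

```python
import random

def matching_choice(predicted_p1, eps, prices=range(41)):
    """Player 2 matches the anticipated player 1 price with probability
    1 - eps, and otherwise plays one of the other 40 prices uniformly."""
    if random.random() >= eps:
        return predicted_p1
    others = [p for p in prices if p != predicted_p1]
    return random.choice(others)
```

The single free parameter ε makes this model directly comparable to sFP under the same MSD metric.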

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.²⁵ This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̃₁ for each player 2. We also calculate the opportunity cost of matching p̃₁, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p₂ = p̃₁), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε in each of the four 15-round blocks, by treatment: 1000: 0.10, 0.07, 0.06, 0.08; 2000: 0.07, 0.06, 0.05, 0.04; 2018: 0.07, 0.02, 0.03, 0.03; 1020: 0.05, 0.03, 0.03, 0.02; 2020: 0.01, 0.00, 0.00, 0.00; 2040: 0.02, 0.00, 0.00, 0.01.

25 In all treatments, the calibrated discount rate is r = 0.1.

VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1531

Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Once we include the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 arises not because matching becomes more focal, but because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study we examine whether there are nonsupermodular games that have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarity have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability that the player 2 price equals the anticipated player 1 price

                          (1)                (2)
D10                     0.014 (0.043)      0.017 (0.041)
D00                    −0.077** (0.035)   −0.043 (0.036)
D18                     0.046 (0.053)      0.032 (0.056)
D40                     0.008 (0.061)      0.041 (0.042)
ln(Round)               0.041*** (0.011)   0.028*** (0.010)
Match Cost                                −0.001*** (0.000)
ln(Round) × D10         0.014 (0.012)      0.012 (0.012)
ln(Round) × D00         0.003 (0.013)      0.000 (0.012)
ln(Round) × D18         0.006 (0.021)      0.002 (0.020)
ln(Round) × D40         0.001 (0.018)      0.015 (0.016)
Observations            9,588              9,588
Log pseudo-likelihood  −3,243.21          −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. The excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.

1532 THE AMERICAN ECONOMIC REVIEW, DECEMBER 2004

degree of strategic complementarity leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarity. We use this parameter to study the effects of factors unrelated to strategic complementarity on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarity are largely robust to variations in these other factors. In the long run these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),
p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = (er/4c²)/[β + e/(4c²)]. The analogous differential equation system is

(12) ṗ1 = p2 − p1,
     ṗ2 = m·p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is a Lyapunov function of system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − n·p1 − ½(1 − m)p0² + n·p0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2
  = [(p1 − p2) + (1 − m)p1 − n]·(p2 − p1) − (p1 − p2)·(m·p1 − p2 + n)
  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
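As a numerical sanity check (ours, not part of the proof), one can Euler-integrate system (12) for hypothetical values of m < 1 and n and confirm that V falls while prices converge to p* = n/(1 − m):

```python
m, n = 0.6, 8.0                 # hypothetical slope (m < 1) and intercept
p_star = n / (1.0 - m)          # fixed point; 20.0 for these values

def V(p1, p2):
    # (p1 - p2)^2 / 2 plus G(p1), taking the constant p0 = 0
    return 0.5 * (p1 - p2) ** 2 + 0.5 * (1 - m) * p1 ** 2 - n * p1

p1, p2 = 35.0, 2.0              # arbitrary starting prices
v0 = V(p1, p2)
dt = 0.01
for _ in range(5000):           # integrate system (12) to t = 50
    p1, p2 = p1 + dt * (p2 - p1), p2 + dt * (m * p1 - p2 + n)
```

After integration, both prices sit within 10⁻² of p* and V has fallen strictly below its starting value.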

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gachter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games. MIT Press Series on Economic Learning and Social Evolution, Vol. 2. Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences, Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Herve. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



2018 relative to the supermodular treatments. The 2018 treatment does catch up to these treatments, which is reflected in the positive and significant coefficient on D2018 interacted with Round. This result suggests a difference between supermodular and near-supermodular treatments: while they achieve similar convergence levels, the supermodular treatments perform better in early rounds and thus might reach certain convergence levels faster than their near-supermodular counterparts.

The previous discussion examines the separate effects of strategic complementarity (β-effects) and of α (α-effects) on convergence. Varying α, however, also allows us to test the importance of strategic complementarity relative to other features of mechanism design. Having a full factorial design in the parameters α ∈ {10, 20} and β ∈ {0, 20}, we can study whether α affects the role of strategic complementarity. In particular, as β increases from 0 to 20, we expect improvement in convergence level and speed. We study whether this improvement changes when α increases from 10 to 20.

The last two Wald tests in the bottom panel of Table 3 separately examine the α-effects on the change in convergence level and speed resulting from increasing β from 0 to 20 (α-effects on β-effects). Changing α from 10 to 20 does not significantly change the improvement in either convergence level or speed. This is the first empirical result examining the effects of other factors on the role of strategic complementarity.

So far we have discussed the performance of the mechanism relative to equilibrium predictions and individual behavior. We now turn to group-level welfare results. Since the compensation mechanism balances the budget only in equilibrium, total payoffs off the equilibrium path can be only weakly related to efficient payoffs. Therefore, we use two separate measures to capture welfare implications: an efficiency measure and a measure of budget imbalance.

We first define the per-round efficiency measure, which includes neither the tax/subsidy nor the penalty terms, i.e.,

e(t) = [r·x(t) − c·x(t)² + e·x(t)] / [r·x* − c·x*² + e·x*],

where x* is the efficient quantity. The efficiency achieved in a block, e(t1, t2), where 0 < t1 < t2 ≤ 60, is then defined as

e(t1, t2) = [Σ from t = t1 to t2 of e(t)] / (t2 − t1 + 1).
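As a sketch of these two measures (ours; the parameter values and the sign conventions are assumptions, since r, c, and e are fixed by the experimental design):

```python
r_param, c, e = 40.0, 1.0, 4.0            # hypothetical payoff parameters

def surplus(x):
    # per-round surplus, excluding tax/subsidy and penalty terms
    return r_param * x - c * x ** 2 + e * x

x_star = (r_param + e) / (2 * c)          # efficient quantity (FOC of surplus)

def efficiency(x):
    return surplus(x) / surplus(x_star)

def block_efficiency(xs, t1, t2):
    # e(t1, t2): average of e(t) over rounds t1..t2 (1-indexed, inclusive)
    return sum(efficiency(x) for x in xs[t1 - 1:t2]) / (t2 - t1 + 1)
```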

RESULT 3 (Efficiency in the Last 20 Rounds):

(i) When α = 20, increasing β from 0 to 18 significantly improves efficiency, while increasing β from 0 to 20 weakly improves efficiency.

(ii) When α = 20, increasing β from 18 to 20 has no significant effect on efficiency.

(iii) When α = 20, increasing β from 20 to 40 weakly improves efficiency.

(iv) When β = 0, increasing α from 10 to 20 weakly improves efficiency.

(v) When β = 20, increasing α from 10 to 20 has no significant effect on efficiency.

SUPPORT: Table 5 presents the efficiency measure for each session of each treatment

TABLE 5—EFFICIENCY MEASURE

Efficiency, last 20 rounds

Treatment  Session 1  Session 2  Session 3  Session 4  Session 5  Overall  |  H1           p-value
1000       0.600      0.671      0.804      0.858      —          0.733    |  1000 < 2000  0.087*
2000       0.650      0.870      0.907      0.873      0.825      0.825    |  2000 < 2020  0.095*
2018       0.963      0.952      0.889      0.959      —          0.941    |  2000 < 2018  0.016**
1020       0.958      0.919      0.905      0.881      0.984      0.929    |  1020 < 2020  0.714
2020       0.888      0.937      0.939      0.780      0.971      0.903    |  2018 < 2020  0.833
2040       0.973      0.962      0.943      0.949      —          0.957    |  2020 < 2040  0.064*

Note: * Significant at 10-percent level; ** 5-percent level; *** 1-percent level.

1519VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.
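For the small session counts here, a one-tailed permutation test can be computed exactly by enumerating all reassignments of session-level efficiency scores; this implementation is ours:

```python
from itertools import combinations

def permutation_test(a, b):
    # Exact one-tailed test of mean(b) > mean(a): the p-value is the share
    # of relabelings whose mean difference is at least the observed one.
    pooled = a + b
    n_a, n = len(a), len(a) + len(b)
    observed = sum(b) / len(b) - sum(a) / n_a
    count = total = 0
    for idx in combinations(range(n), n_a):
        grp_a = [pooled[i] for i in idx]
        grp_b = [pooled[i] for i in range(n) if i not in idx]
        diff = sum(grp_b) / len(grp_b) - sum(grp_a) / n_a
        count += diff >= observed - 1e-12
        total += 1
    return count / total
```

With toy samples [1, 2] and [3, 4], only the original split yields a difference as large as the observed one, so p = 1/6.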

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold of 20 to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms plus tax minus subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under 2018 and 2020 are significantly lower than those under 2040 (p-values 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter: before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far below the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. Rather, we look for learning models that meet two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet these criteria. We then look at how well the learning models predict the experimental data, by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. Finally, we report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model; interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃−i(t + 1), according to

1520 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(6) p̃−i(t + 1) = [p−i(t) + Σ from τ = 1 to t − 1 of r^τ·p−i(t − τ)] / [1 + Σ from τ = 1 to t − 1 of r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃−i(t + 1) = p−i(t), while r = 1 yields the standard fictitious play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

Compared with standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pi is given by

(7) Pr(pi | p̃−i) = exp(λ·π(pi, p̃−i)) / Σ from j = 0 to 40 of exp(λ·π(pj, p̃−i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter: as λ increases, the probability of a best response to p̃−i increases.
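Equations (6) and (7) translate directly into code. This sketch is ours, and the payoff function below is a placeholder stand-in, not the compensation-mechanism payoff.

```python
import math

def predicted_price(history, r):
    # Equation (6): discounted empirical average of the opponent's past
    # prices; history lists p(1), ..., p(t) with the most recent last.
    num = den = 0.0
    for tau, price in enumerate(reversed(history)):
        num += (r ** tau) * price
        den += r ** tau
    return num / den

def choice_probs(pred, payoff, lam, n_prices=41):
    # Equation (7): logit choice over the price grid, with sensitivity lam.
    scores = [math.exp(lam * payoff(p, pred)) for p in range(n_prices)]
    total = sum(scores)
    return [s / total for s in scores]
```

As in the text, r = 0 reproduces Cournot best reply (only the last observed price matters) and r = 1 the fictitious-play average.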

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs π(pi, p̃−i) replaced by strategy attractions. In both variants, a strategy attraction A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) A_i^j(t) = {φ·N(t − 1)·A_i^j(t − 1) + [δ + (1 − δ)·I(p_i^j, pi(t))]·πi(p_i^j, p−i(t))} / N(t),

where the indicator function I(p_i^j, pi(t)) equals one if p_i^j = pi(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φi(t) is given by

(9) φi(t) = 1 − ½·Σ from j = 1 to mi of [I(p_i^j, pi(t)) − (1/t)·Σ from τ = 1 to t of I(p_i^j, pi(τ))]².

This function will be close to 1 when recent history resembles previous history. The imagination weight δi(t) equals φi(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1, κ = 0), and Cournot best reply (δ = 1, φ = 0, N(0) = 1).
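The attraction and experience-weight updates in equation (8) can be sketched as follows (our names; the payoff function is a hypothetical stand-in, and φ, δ, κ are supplied directly rather than computed from equation (9)):

```python
def ewa_update(attractions, N_prev, phi, delta, kappa, played, opp_price, payoff):
    # One period of EWA updating: N(t) = phi * (1 - kappa) * N(t-1) + 1, and
    # each strategy j's attraction is refreshed with its (forgone) payoff,
    # weighted by delta + (1 - delta) * I(j == played).
    N_new = phi * (1 - kappa) * N_prev + 1.0
    updated = []
    for j, A in enumerate(attractions):
        weight = delta + (1 - delta) * (1.0 if j == played else 0.0)
        updated.append((phi * N_prev * A + weight * payoff(j, opp_price)) / N_new)
    return updated, N_new

# delta = 0 updates only the played strategy (reinforcement-like);
# delta = 1 also credits forgone payoffs (fictitious-play-like).
pay = lambda j, q: 1.0 if j == q else 0.0   # toy payoff
A, N = ewa_update([0.0] * 41, 1.0, phi=1.0, delta=0.0, kappa=0.0,
                  played=5, opp_price=5, payoff=pay)
```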

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let πj(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, fjk(h, t), to denote the similarity between the played strategy k and an unplayed strategy j in period t:

fjk(h, t) = 1 − |j − k|/h if |j − k| < h; 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy that are updated. When h = 1, fjk(1, t) degenerates into an indicator function, equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer: he chooses strategies based on assessed payoffs and does not explicitly take into account the likelihood of alternate states. Let uj(t) denote the subjective assessment of strategy pj at time t, and r the discount factor. Payoff assessments are updated through a weighted average of the player's previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10) uj(t + 1) = [1 − r·fjk(h, t)]·uj(t) + r·fjk(h, t)·πk(t), ∀j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, zj(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũj(t) = uj(t) + zj(t). At time t he chooses strategy pk if

(11) ũk(t) − ũj(t) ≥ 0, ∀pj ≠ pk.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
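A compact sketch of the PA updating and choice rules, equations (10) and (11), with the Bartlett similarity window; the implementation and names are ours:

```python
import random

def bartlett(j, k, h):
    # Similarity between unplayed strategy j and played strategy k.
    return 1.0 - abs(j - k) / h if abs(j - k) < h else 0.0

def pa_update(assessments, played, realized_payoff, r, h):
    # Equation (10): u_j <- (1 - r f_jk) u_j + r f_jk * realized payoff.
    return [(1.0 - r * bartlett(j, played, h)) * u
            + r * bartlett(j, played, h) * realized_payoff
            for j, u in enumerate(assessments)]

def pa_choose(assessments, shock_bound, rng=random):
    # Equation (11): maximize mood-shock-distorted assessments.
    distorted = [u + rng.uniform(-shock_bound, shock_bound)
                 for u in assessments]
    return max(range(len(distorted)), key=distorted.__getitem__)
```

With h = 3, for example, a payoff realized by the played strategy also lifts the assessments of its two neighbors on each side, at 2/3 and 1/3 of the direct effect.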

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find the parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting long-run mechanism performance. In doing so, we consider k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.
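The calibration step is a one-dimensional grid search per parameter. This skeleton is ours, and `simulate_msd` is a hypothetical stand-in for the Monte Carlo procedure that returns an MSD score for a candidate discount rate:

```python
def calibrate_discount_rate(simulate_msd, grid_step=0.1):
    # Search r on [0, 1] at the stated grid size and keep the minimizer.
    best_r, best_score = None, float('inf')
    for i in range(int(round(1.0 / grid_step)) + 1):
        r = i * grid_step
        score = simulate_msd(r)
        if score < best_score:
            best_r, best_score = r, score
    return best_r, best_score

# Toy objective with its minimum at r = 0.3:
r_hat, _ = calibrate_discount_rate(lambda r: (r - 0.3) ** 2)
```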

Each simulation consists of 1,500 games (18,000 players) and proceeds in the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of a uniform distribution (d = 0.250, p-value = 0.000) and a normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.
(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are all divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p2.
(b) Determine whether player 1 deviates from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block; otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks zj are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment 2000, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of the PA model, in which shocks are drawn from uniform, logistic, and double-exponential distributions, on two sets of experimental data, and find that the performance of the three versions is statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report these results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r), by block (1-4):
Treatment   sFP                   PA
1000        1.0, 0.7, 0.9, 1.0    0.1, 0.0, 0.9, 1.0
2000        0.7, 0.6, 0.1, 0.1    0.5, 1.0, 0.2, 0.1
2018        1.0, 1.0, 0.9, 0.9    0.2, 1.0, 0.3, 0.2
1020        1.0, 1.0, 1.0, 0.9    0.1, 0.3, 0.6, 0.3
2020        1.0, 1.0, 1.0, 0.9    0.2, 0.1, 0.5, 0.2
2040        1.0, 1.0, 1.0, 0.7    0.1, 1.0, 0.4, 0.3

Sensitivity (λ) for sFP and fEWA, and shock interval (a) for PA, by block (1-4):
Treatment   sFP λ                    fEWA λ               PA shock interval (a)
1000        1.5, 4.4, 4.5, 2.8       1.6, 4.5, 4.4, 3.7   500, 500, 250, 50
2000        1.5, 4.3, 14.0, 18.0     1.7, 4.0, 5.1, 7.4   50, 100, 150, 50
2018        1.6, 8.0, 11.5, 18.3     1.5, 5.3, 6.2, 9.5   50, 300, 500, 150
1020        1.9, 5.9, 8.6, 13.8      1.8, 4.7, 6.2, 8.2   100, 500, 450, 300
2020        2.8, 5.6, 8.1, 11.2      2.4, 3.8, 6.1, 6.7   200, 300, 350, 350
2040        3.4, 6.5, 14.0, 20.5     3.2, 4.9, 6.7, 9.9   150, 50, 150, 50

Similarity window (h) for PA, by block (1-4):
1000        1, 1, 10, 9
2000        10, 10, 9, 6
2018        5, 4, 6, 1
1020        1, 1, 8, 3
2020        2, 2, 7, 1
2040        1, 3, 2, 2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except 1000 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment 1000, the parameter λ increases over time. The final column reports the calibrated parameters of the PA model. For each treatment except 1000, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model is the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds; indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).
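The MSD metric can be made concrete with a small sketch. The exact per-round normalization the authors use is not reproduced in this excerpt, so the quadratic scoring rule below (squared deviation between predicted choice probabilities and the realized choice, summed over strategies and averaged over observations) is an assumption; with the 41-point price grid implied by Section VI, it reproduces the 0.976 benchmark reported for the random-choice model in Table 7.

```python
def msd(pred_probs, choices):
    """Mean squared deviation: for each observation, sum the squared gap
    between the predicted probability of each strategy and an indicator
    of the strategy actually chosen, then average over observations."""
    total = 0.0
    for probs, chosen in zip(pred_probs, choices):
        total += sum((p - (1.0 if s == chosen else 0.0)) ** 2
                     for s, p in enumerate(probs))
    return total / len(choices)

# Random choice over a 41-price grid: the minimum standard for a model.
uniform = [[1.0 / 41] * 41]
print(round(msd(uniform, [20]), 3))  # 0.976
```

A perfect prediction (probability 1 on the chosen price) scores 0, so lower MSD means better fit.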

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).
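The one-sided permutation tests referred to here can be sketched as an exact sign-flip test on paired session-level MSD differences. This is a generic illustration of the idea, not the authors' exact procedure; the session values below are hypothetical.

```python
from itertools import product

def sign_flip_pvalue(model_msd, benchmark_msd):
    """One-sided paired permutation test: probability, under random sign
    flips of the paired differences, of a total difference at least as
    negative (model better) as the one observed."""
    diffs = [m - b for m, b in zip(model_msd, benchmark_msd)]
    observed = sum(diffs)
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) <= observed:
            hits += 1
    return hits / total

# Three hold-out sessions, each beating the 0.976 random benchmark:
print(sign_flip_pvalue([0.955, 0.943, 0.915], [0.976] * 3))  # 0.125
```

With only three pairs there are 2³ = 8 sign assignments, so the smallest attainable p-value is 1/8; more sessions give finer resolution.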

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.

1524 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay in subsequent blocks the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.
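This decay schedule can be sketched as follows, assuming (as reconstructed above) that the block-4 deviation probability is divided by the squared block distance and bounded below at 10⁻⁸:

```python
def deviation_prob(k, pr_4, floor=1e-8):
    """Deviation probability in simulation block k, decaying quadratically
    from the last calibrated block (block 4), with a small positive floor."""
    if k <= 4:
        return pr_4
    return max(pr_4 / (k - 4) ** 2, floor)

print(deviation_prob(4, 0.2))   # 0.2  (calibrated value)
print(deviation_prob(14, 0.2))  # 0.002
```

The floor keeps the probability strictly positive in arbitrarily late blocks.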

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1           0.955    0.934    0.940    0.930    0.888    0.940
2           0.960    0.946    0.890    0.917    0.973    0.878
3                    0.948             0.883    0.873
Overall     0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1           0.956    0.934    0.942    0.928    0.887    0.948
2           0.960    0.946    0.890    0.922    0.978    0.886
3                    0.946             0.879    0.871
Overall     0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1           0.952    0.933    0.914    0.941    0.888    0.916
2           0.957    0.944    0.899    0.922    0.917    0.898
3                    0.947             0.894    0.903
Overall     0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1           1.867    1.853    1.833    1.833    1.489    1.792
2           1.889    1.919    1.606    1.839    1.953    1.669
3                    1.917             1.447    1.539
Overall     1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall     0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β0 treatments. In addition, an increase in β from zero significantly improves convergence by a large margin (40 to 80 percent) for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all other treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, one for each player, plot the percent of ε-equilibrium play over simulation rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels of L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β0 treatments, with α20 superior to α10 among the β0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy. Entries indicate the round in which a treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                   Player 1                       Player 2
Threshold    0.4   0.5   0.6   0.7       0.4   0.5   0.6   0.7   0.8   0.9

α20β18        54    83   126   316        38    52    62    87   127    —
α10β20       129   221    —     —         36    48    63    91   148    —
α20β20        61    91   149    —         39    51    61    84   137    —
α20β40       101   166   263   496        30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games)

             Player 1 equilibrium price               Player 2 equilibrium price
Treatment    Probability   H1                p-value    Probability   H1                p-value

α10β00         0.073                                      0.082
α20β00         0.188       α20β00 > α10β00    0.000       0.213       α20β00 > α10β00    0.000
α20β18         0.265       α20β18 > α20β40    0.187       0.381       α20β18 > α10β20    0.264
α10β20         0.211       α10β20 > α20β00    0.000       0.376       α10β20 > α20β20    0.001
α20β20         0.243       α20β20 > α10β20    0.000       0.353       α20β20 > α20β00    0.000
α20β40         0.259       α20β40 > α20β20    0.007       0.442       α20β40 > α20β18    0.000

             Player 1 ε-equilibrium price             Player 2 ε-equilibrium price
Treatment    Probability   H1                p-value    Probability   H1                p-value

α10β00         0.216                                      0.232
α20β00         0.519       α20β00 > α10β00    0.000       0.573       α20β00 > α10β00    0.000
α20β18         0.713       α20β18 > α20β40    0.045       0.870       α20β18 > α10β20    0.008
α10β20         0.584       α10β20 > α20β00    0.000       0.857       α10β20 > α20β20    0.000
α20β20         0.662       α20β20 > α10β20    0.000       0.834       α20β20 > α20β00    0.000
α20β40         0.701       α20β40 > α20β20    0.000       0.925       α20β40 > α20β18    0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel: fraction of maximum welfare in the simulation data, rounds 50 to 450. Bottom panel: fraction of maximum welfare in the experimental data, rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using the Wilcoxon signed-rank test to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.
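Equation (6) is not reproduced in this excerpt; as an illustration, the sketch below assumes the anticipated price is a geometrically discounted average of the opponent's observed prices, which matches the "discount rate" description here, with r = 0.1 as calibrated in footnote 25. The function name and interface are hypothetical.

```python
def anticipated_price(history, r=0.1):
    """Discount-weighted average of past opponent prices: a price seen
    `age` rounds ago gets weight r**age, so r = 0.1 puts almost all
    weight on the most recent observation."""
    weights = [r ** age for age in range(len(history))]
    newest_first = list(reversed(history))
    return (sum(w * p for w, p in zip(weights, newest_first))
            / sum(weights))

print(round(anticipated_price([10, 12, 14]), 2))  # 13.78
```

With r this small, the anticipated price is close to Cournot-style tracking of the last observed price.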

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 The estimated ε in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal-point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability player 2 price equals anticipated player 1 price

                          (1)          (2)
D10                     0.014        0.017
                       (0.043)      (0.041)
D00                    −0.077       −0.043
                       (0.035)      (0.036)
D18                     0.046        0.032
                       (0.053)      (0.056)
D40                     0.008        0.041
                       (0.061)      (0.042)
ln(Round)               0.041        0.028
                       (0.011)      (0.010)
Match Cost                          −0.001
                                    (0.000)
ln(Round) × D10         0.014        0.012
                       (0.012)      (0.012)
ln(Round) × D00         0.003        0.000
                       (0.013)      (0.012)
ln(Round) × D18         0.006        0.002
                       (0.021)      (0.020)
ln(Round) × D40         0.001        0.015
                       (0.018)      (0.016)

Observations            9,588        9,588
Log pseudo-likelihood

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significance is reported at the 10-percent, 5-percent, and 1-percent levels. Dy is the dummy variable for treatment y; the excluded treatment is α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)] / [β + e/(4c²)] and n = (er/(4c²)) / [β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m·p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is the Lyapunov function of system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − n·p1 − (1/2)(1 − m)p0² + n·p0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2

   = [(p1 − p2) + (1 − m)p1 − n]·(p2 − p1) + (p2 − p1)·(m·p1 − p2 + n)

   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0. Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
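The stability claim can also be checked numerically. The sketch below Euler-integrates system (12), using the expressions for m and n in the proof; the parameter values (β, c, e, r) and starting prices are illustrative assumptions, not the experimental ones.

```python
def simulate_system_12(beta=0.5, c=1.0, e=1.0, r=15.0,
                       p1=0.0, p2=40.0, dt=0.01, steps=20000):
    """Euler integration of dp1/dt = p2 - p1, dp2/dt = m*p1 - p2 + n,
    with m = (beta - 1/(4c)) / (beta + e/(4c^2)) and
    n = (e*r/(4c^2)) / (beta + e/(4c^2))."""
    denom = beta + e / (4 * c ** 2)
    m = (beta - 1 / (4 * c)) / denom
    n = (e * r / (4 * c ** 2)) / denom
    for _ in range(steps):
        # Tuple assignment evaluates both derivatives at the old state.
        p1, p2 = p1 + dt * (p2 - p1), p2 + dt * (m * p1 - p2 + n)
    return p1, p2

p1, p2 = simulate_system_12()
p_star = 1.0 * 15.0 / (1.0 + 1.0)  # e*r/(e + c) = 7.5
print(round(p1, 4), round(p2, 4))  # both approach p_star
```

Note that p* = er/(e + c) does not depend on β, so varying β changes the speed but not the destination of the dynamics.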

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132-48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933-38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205-22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827-74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773-90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335-64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633-62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1-25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46-76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441-63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141-61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401-19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881-94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231-47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413-22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247-64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85-96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367-83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157-80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255-77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82-100.

Moulin, Herve. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83-102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485-91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294-309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198-215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305-21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773-87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234-48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278-93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305-21.



in the last 20 rounds, the alternative hypotheses, and the results of one-tailed permutation tests.

Result 3 is largely consistent with Result 1, indicating that supermodular and near-supermodular mechanisms induce a significantly higher proportion of equilibrium play than mechanisms far from the supermodular threshold. The new finding is that increasing β from the threshold, 20, to 40 weakly improves efficiency.

Second, the budget surplus at round t is the sum of the penalty terms, plus tax, minus subsidy in that round, i.e., s(t) = (α + β)(p1(t) − p2(t))² + p2(t)x(t) − p1(t)x(t). Examining the session-level budget surplus across all rounds, we find the following results. First, 22 out of 27 sessions result in a budget surplus. This comes from a combination of our choice of parameters and the dynamics of play. Second, the only significant difference across treatments is that budget surpluses under α20β18 and α20β20 are significantly lower than those under α20β40 (p-values = 0.029 and 0.024, respectively). This finding points to a potential cost of driving up the punishment parameter. In other words, before the system equilibrates, a high punishment parameter can worsen the budget imbalance problem inherent in the mechanism.
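The surplus expression can be sketched directly; the sign conventions below follow the reconstruction above (penalty terms plus player 1's tax minus player 2's subsidy) and are an assumption, since the original formula's operators were lost in reproduction.

```python
def budget_surplus(alpha, beta, p1, p2, x):
    """Per-round budget surplus: penalty terms plus tax minus subsidy,
    s(t) = (alpha + beta) * (p1 - p2)**2 + p2*x - p1*x."""
    return (alpha + beta) * (p1 - p2) ** 2 + (p2 - p1) * x

# At a matched-price equilibrium (p1 == p2), the surplus is zero:
print(budget_surplus(20, 40, 15, 15, 3))  # 0
```

Away from equilibrium, the quadratic penalty term dominates, which is why a large punishment parameter inflates the surplus before play settles down.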

Overall, Results 1 through 3 suggest the following observations regarding games with strategic complementarities. First, in terms of convergence level and speed, supermodular and near-supermodular games perform significantly better than those far under the threshold. Second, while there is no significant difference between supermodular and near-supermodular games in terms of convergence level, early performance differences may lead to supermodular treatments reaching certain convergence levels more quickly. Third, beyond the threshold, increasing β has no significant effect on either the level or the speed of convergence, but has the disadvantage of risking higher budget imbalance. Last, for a given β, increasing α has partial effects on convergence level but no significant effect on convergence speed. To check the persistence of these experimental results in the long run, we use simulations in Section V.

V. Simulation Results: Continued Dynamics

Our experiment examines the relationship between strategic complementarity and convergence to equilibrium. Learning theory predicts long-run convergence. Figures 1 and 2 show that convergence continues to improve in later rounds for several treatments, suggesting continued dynamics. Due to time, attention, and resource constraints, it was not feasible to run the experiment for much longer than 60 rounds. Therefore, we rely on simulations to study continued convergence beyond 60 rounds.

To do so, we look for a learning algorithm which, when calibrated, closely approximates the observed dynamic paths over 60 rounds. The large empirical literature on learning in games (see, e.g., Camerer, 2003, for a survey) suggests many models. Our interest here is not to compare the performance of various learning models. We look for learning models that match two criteria. First, we require a model that performs well in a variety of experimental games. Second, given that our experiment has complete information about the payoff structure, the model needs to incorporate this information. In the following subsections, we first introduce three learning models which meet our criteria. We then look at how well the learning models predict the experimental data, by calibrating each algorithm on a subset of the experimental data and validating the models on a hold-out sample. We then report the forecasting results using one of the calibrated algorithms.

A. Three Learning Models

The models we examine are stochastic fictitious play with discounting (hereafter shortened as sFP) (Yin-Wong Cheung and Daniel Friedman, 1997; Fudenberg and Levine, 1998), functional EWA (fEWA) (Teck-Hua Ho et al., 2001), and the payoff assessment learning model (PA) (Rajiv Sarin and Farshid Vahid, 1999). We now give a brief overview of each model. Interested readers are referred to the originals for complete descriptions.

The particular version of sFP that we use is logistic fictitious play (see, e.g., Fudenberg and Levine, 1998). A player predicts the match's price in the next round, p̃i(t + 1), according to

1520 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(6)   p̃i(t + 1) = [ pi(t) + Σ_{τ=1}^{t−1} r^τ pi(t − τ) ] / [ 1 + Σ_{τ=1}^{t−1} r^τ ]

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̃i(t + 1) = pi(t). When r = 1, it yields the standard fictitious play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.
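The discounted belief in equation (6) can be sketched as a short function; the implementation below is an illustrative reading of the formula, not code from the paper. It weights the price observed τ rounds ago by r^τ and normalizes.

```python
def predicted_price(history, r):
    """Discounted fictitious-play belief, as in equation (6).

    history: the match's observed prices p_i(1), ..., p_i(t);
    r: discount factor in [0, 1]. Returns the belief p~_i(t+1).
    """
    t = len(history)
    # The price observed tau rounds ago gets weight r**tau;
    # the most recent price has weight r**0 = 1.
    num = sum(r ** tau * history[t - 1 - tau] for tau in range(t))
    den = sum(r ** tau for tau in range(t))
    return num / den
```

With r = 0 this returns the last observed price (Cournot best reply); with r = 1 it returns the unweighted average (standard fictitious play), as the text notes.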

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pi is given by

(7)   Pr(pi | p̃i) = exp(λ · π(pi, p̃i)) / Σ_{j=0}^{40} exp(λ · π(pj, p̃i)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by the sensitivity parameter λ: as λ increases, the probability of a best response to p̃i increases.
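The logit choice rule in equation (7) can be sketched as follows; this is an illustrative implementation (with a standard max-subtraction trick for numerical stability, which is our addition and leaves the probabilities unchanged).

```python
import math

def logit_choice_probs(payoffs, lam):
    """Logit response over the prices 0..40, as in equation (7).

    payoffs[j]: payoff of announcing price j against the predicted
    opponent price; lam: the sensitivity parameter lambda.
    """
    m = max(payoffs)  # subtract the max before exponentiating (stability)
    weights = [math.exp(lam * (u - m)) for u in payoffs]
    total = sum(weights)
    return [w / total for w in weights]
```

As λ grows, the probability mass concentrates on the best response, matching the interpretation in the text.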

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with actual payoffs π(pi, p̃i) replaced by strategy attractions. In both variants, strategy attractions A_i^j(t) and an experience weight N(t) are updated after every period. The experience weight is updated according to N(t) = φ(1 − κ)N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8)   A_i^j(t) = [ φN(t − 1)A_i^j(t − 1) + (δ + (1 − δ)I(p_i^j, pi(t))) · πi(p_i^j, p−i(t)) ] / N(t),

where the indicator function I(p_i^j, pi(t)) equals one if p_i^j = pi(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function φi(t) is given by

(9)   φi(t) = 1 − (1/2) Σ_{j=1}^{mi} [ I(p_i^j, pi(t)) − (1/t) Σ_{τ=1}^{t} I(p_i^j, pi(τ)) ]².

This function will be close to one when recent history resembles previous history. The imagination weight δi(t) equals φi(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1), and Cournot best reply (δ = 1, φ = 0).
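One step of the attraction update in equation (8) can be sketched as below. This is an illustrative reading: `payoff_fn(j)` is a hypothetical callback returning the payoff price j would have earned against the opponent's realized play, and the parameters φ, δ, κ are passed in directly (in fEWA they would be derived from history rather than estimated).

```python
def ewa_update(attractions, N_prev, chosen, payoff_fn, phi, delta, kappa):
    """One EWA attraction update (equation (8)) for a single player.

    attractions[j] is A^j(t-1); `chosen` is the index of the price
    actually played at t. Returns the updated attractions and N(t).
    """
    N = phi * (1.0 - kappa) * N_prev + 1.0  # experience weight N(t)
    new_attr = []
    for j, a in enumerate(attractions):
        hit = 1.0 if j == chosen else 0.0   # indicator I(p^j, p(t))
        # Unchosen strategies are reinforced only by the fraction delta.
        reinforcement = (delta + (1.0 - delta) * hit) * payoff_fn(j)
        new_attr.append((phi * N_prev * a + reinforcement) / N)
    return new_attr, N
```

With δ = 0 only the chosen strategy is reinforced, recovering the reinforcement-learning special case listed above.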

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let πj(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agent use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function fjk(h, t) to denote the similarity between the played strategy k and an unplayed strategy j at period t:

fjk(h, t) = 1 − |j − k|/h   if |j − k| < h;   0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, fjk(1, t) degenerates into an indicator function equal to one if strategy j is chosen in round t and zero otherwise.
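The Bartlett similarity function is a simple triangular kernel over the ordered price labels; a minimal sketch:

```python
def bartlett_similarity(j, k, h):
    """Bartlett similarity f_jk(h) between unplayed price j and the
    played price k: 1 - |j - k|/h when |j - k| < h, else 0."""
    d = abs(j - k)
    return 1.0 - d / h if d < h else 0.0
```

As the text notes, h = 1 reduces this to an indicator of the played strategy itself, while larger h spreads the update over neighbouring prices.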

The PA model assumes that a player is a myopic subjective maximizer; that is, he chooses strategies based on assessed payoffs

1521VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

and does not explicitly take into account the likelihood of alternate states. Let uj(t) denote the subjective assessment of strategy pj at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)   uj(t + 1) = [1 − r·fjk(h, t)] uj(t) + r·fjk(h, t) πk(t),   ∀ j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks zj(t). The decision maker chooses on the basis of his shock-distorted subjective assessments ũj(t) = uj(t) + zj(t). At time t, he chooses strategy pk if

(11)   ũk(t) − ũj(t) ≥ 0,   ∀ pj ≠ pk.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
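Equations (10) and (11) can be combined into one illustrative round of the PA model. The sketch below is our reading of the updating and choice rules, with uniform mood shocks as in the calibration described later; the function names are ours.

```python
import random

def pa_step(u, chosen, realized_payoff, r, h, shock_bound):
    """One round of the payoff-assessment model.

    u[j] is the current assessment of price j; `chosen` was played and
    earned `realized_payoff`. Neighbouring assessments within the
    similarity window h are updated via equation (10); the next choice
    maximizes assessments distorted by uniform shocks on
    [-shock_bound, shock_bound], as in equation (11).
    """
    def sim(j):  # Bartlett similarity between j and the played price
        d = abs(j - chosen)
        return 1.0 - d / h if d < h else 0.0

    u = [(1.0 - r * sim(j)) * uj + r * sim(j) * realized_payoff
         for j, uj in enumerate(u)]
    shocked = [uj + random.uniform(-shock_bound, shock_bound) for uj in u]
    next_choice = max(range(len(u)), key=lambda j: shocked[j])
    return u, next_choice
```

Note that, consistent with the text, the shocks distort only the choice, not the stored assessments.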

B. Calibration

Literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach,14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value = 0.000) and normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p2.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
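The grid search described above can be sketched as an exhaustive loop over the stated grids for the PA model's parameters. This is an illustrative skeleton: `simulate(r, h, a)` is a hypothetical callback that is assumed to run the Monte Carlo simulation for one parameter combination and return its MSD score against the calibration data.

```python
import itertools

def grid_search(simulate):
    """Exhaustive search over the PA parameter grids described in the
    text: discount factor r on [0, 1] in steps of 0.1, similarity
    window h in 1..10, shock bound a on [0, 500] in steps of 50.
    Returns (best score, r, h, a)."""
    grid_r = [i / 10 for i in range(11)]
    grid_h = range(1, 11)
    grid_a = range(0, 501, 50)
    best = None
    for r, h, a in itertools.product(grid_r, grid_h, grid_a):
        score = simulate(r, h, a)
        if best is None or score < best[0]:
            best = (score, r, h, a)
    return best
```

The sFP and fEWA calibrations would use the same pattern with the λ grids given above.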

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β00 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, where shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA were statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r):
                   sFP                      PA
Block:          1    2    3    4         1    2    3    4
α10β00        1.0  0.7  0.9  1.0       0.1  0.0  0.9  1.0
α20β00        0.7  0.6  0.1  0.1       0.5  1.0  0.2  0.1
α20β18        1.0  1.0  0.9  0.9       0.2  1.0  0.3  0.2
α10β20        1.0  1.0  1.0  0.9       0.1  0.3  0.6  0.3
α20β20        1.0  1.0  1.0  0.9       0.2  0.1  0.5  0.2
α20β40        1.0  1.0  1.0  0.7       0.1  1.0  0.4  0.3

Sensitivity (λ) and shock interval (a):
                 sFP (λ)                  fEWA (λ)                  PA (a)
Block:          1    2     3     4      1    2    3    4        1    2    3    4
α10β00        1.5  4.4   4.5   2.8    1.6  4.5  4.4  3.7      500  500  250   50
α20β00        1.5  4.3  14.0  18.0    1.7  4.0  5.1  7.4       50  100  150   50
α20β18        1.6  8.0  11.5  18.3    1.5  5.3  6.2  9.5       50  300  500  150
α10β20        1.9  5.9   8.6  13.8    1.8  4.7  6.2  8.2      100  500  450  300
α20β20        2.8  5.6   8.1  11.2    2.4  3.8  6.1  6.7      200  300  350  350
α20β40        3.4  6.5  14.0  20.5    3.2  4.9  6.7  9.9      150   50  150   50

Similarity window (h), PA:
Block:          1    2    3    4
α10β00          1    1   10    9
α20β00         10   10    9    6
α20β18          5    4    6    1
α10β20          1    1    8    3
α20β20          2    2    7    1
α20β40          1    3    2    2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame-perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, by superimposing the simulated mean (black line) plus and minus one standard deviation (grey lines) on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session     α10β00  α20β00  α20β18  α10β20  α20β20  α20β40

Stochastic fictitious play
1            0.955   0.934   0.940   0.930   0.888   0.940
2            0.960   0.946   0.890   0.917   0.973   0.878
3                    0.948           0.883   0.873
Overall      0.957   0.943   0.915   0.910   0.912   0.909

fEWA
1            0.956   0.934   0.942   0.928   0.887   0.948
2            0.960   0.946   0.890   0.922   0.978   0.886
3                    0.946           0.879   0.871
Overall      0.958   0.942   0.916   0.910   0.912   0.917

Payoff assessment
1            0.952   0.933   0.914   0.941   0.888   0.916
2            0.957   0.944   0.899   0.922   0.917   0.898
3                    0.947           0.894   0.903
Overall      0.955   0.941   0.907   0.919   0.902   0.907

Equilibrium play
1            1.867   1.853   1.833   1.833   1.489   1.792
2            1.889   1.919   1.606   1.839   1.953   1.669
3                    1.917           1.447   1.539
Overall      1.878   1.896   1.719   1.706   1.660   1.731

Random play
Overall      0.976   0.976   0.976   0.976   0.976   0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings in:

(i) p1: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses, for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β00 treatments. In addition, an increase in β significantly improves convergence by a large margin (40 to 80 percent) for both players when starting from β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (an α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY
(Two panels, Player 1 and Player 2: percent of ε-equilibrium play over rounds 0 to 500, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 among the β00 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                    Player 1                         Player 2
Threshold   0.4   0.5   0.6   0.7        0.4   0.5   0.6   0.7   0.8   0.9
α20β18       54    83   126   316         38    52    62    87   127    —
α10β20      129   221    —     —          36    48    63    91   148    —
α20β20       61    91   149    —          39    51    61    84   137    —
α20β40      101   166   263   496         30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

             Player 1 equilibrium price              Player 2 equilibrium price
Treatment  Probability  H1               p-value   Probability  H1               p-value
α10β00       0.073                                   0.082
α20β00       0.188   α20β00 ≻ α10β00      0.000      0.213   α20β00 ≻ α10β00      0.000
α20β18       0.265   α20β18 ≻ α20β40      0.187      0.381   α20β18 ≻ α10β20      0.264
α10β20       0.211   α10β20 ≻ α20β00      0.000      0.376   α10β20 ≻ α20β20      0.001
α20β20       0.243   α20β20 ≻ α10β20      0.000      0.353   α20β20 ≻ α20β00      0.000
α20β40       0.259   α20β40 ≻ α20β20      0.007      0.442   α20β40 ≻ α20β18      0.000

             Player 1 ε-equilibrium price            Player 2 ε-equilibrium price
Treatment  Probability  H1               p-value   Probability  H1               p-value
α10β00       0.216                                   0.232
α20β00       0.519   α20β00 ≻ α10β00      0.000      0.573   α20β00 ≻ α10β00      0.000
α20β18       0.713   α20β18 ≻ α20β40      0.045      0.870   α20β18 ≻ α10β20      0.008
α10β20       0.584   α10β20 ≻ α20β00      0.000      0.857   α10β20 ≻ α20β20      0.000
α20β20       0.662   α20β20 ≻ α10β20      0.000      0.834   α20β20 ≻ α20β00      0.000
α20β40       0.701   α20β40 ≻ α20β20      0.000      0.925   α20β40 ≻ α20β18      0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA
(Top panel, simulation data: fraction of maximum welfare over rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare over rounds 0 to 60. Treatments: α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ω, and one of the other 40 prices with probability ω/40. We estimate ω using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). Best response to a predicted price thus explains behavior significantly better than matching.
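The stochastic matching model can be sketched in a few lines. This is an illustrative reading of the description above; the noise parameter's symbol was lost in extraction, so we call it omega here, and the price grid 0..40 follows the rest of the paper.

```python
import random

def matching_choice(anticipated_price, omega, n_prices=41):
    """Stochastic matching model for player 2: match the anticipated
    player-1 price with probability 1 - omega, otherwise play one of
    the other 40 prices uniformly at random."""
    if random.random() < 1.0 - omega:
        return anticipated_price
    others = [p for p in range(n_prices) if p != anticipated_price]
    return random.choice(others)
```

With omega = 0 the model matches deterministically; with omega = 1 it never matches, which brackets the estimated values reported in footnote 24.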

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B: we assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̃1 for each player 2. We also calculate the opportunity cost of matching p̃1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̃1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 The estimated ω in each block are as follows. α10β00: 1.0, 0.7, 0.6, 0.8; α20β00: 0.7, 0.6, 0.5, 0.4; α20β18: 0.7, 0.2, 0.3, 0.3; α10β20: 0.5, 0.3, 0.3, 0.2; α20β20: 0.1, 0.0, 0.0, 0.0; α20β40: 0.2, 0.0, 0.0, 0.1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the β dummies nor their interactions are significant; the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability that player 2's price equals the anticipated player 1 price

                          (1)                 (2)
D10                    0.014 (0.043)       0.017 (0.041)
D00                   −0.077 (0.035)      −0.043 (0.036)
D18                    0.046 (0.053)       0.032 (0.056)
D40                    0.008 (0.061)       0.041 (0.042)
ln(Round)              0.041 (0.011)       0.028 (0.010)
Match Cost                  —             −0.001 (0.000)
ln(Round) × D10        0.014 (0.012)       0.012 (0.012)
ln(Round) × D00        0.003 (0.013)       0.000 (0.012)
ln(Round) × D18        0.006 (0.021)       0.002 (0.020)
ln(Round) × D40        0.001 (0.018)       0.015 (0.016)
Observations           9,588               9,588
Log pseudo-likelihood  −3,243.21           −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. *Significant at 10-percent level; **5-percent level; ***1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.

1532 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p₁(t + 1) = p₂(t),

p₂(t + 1) = m·p₁(t) + n,

where m = [β − 1/(4c)] / [β + e/(4c²)] and n = [er/(4c²)] / [β + e/(4c²)]. The analogous differential equation system is

(12) ṗ₁ = p₂ − p₁,

     ṗ₂ = m·p₁ − p₂ + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p₁, p₂) = (p₁ − p₂)²/2 + ∫ from p₀ to p₁ of (p − mp − n) dp,

where p₀ > 0 is a constant. We now show that V(p₁, p₂) is the Lyapunov function of system (12).

Define G(p₁) = ∫ from p₀ to p₁ of (p − mp − n) dp. We get

G(p₁) = (1/2)(1 − m)p₁² − np₁ − [(1/2)(1 − m)p₀² − np₀].

Since c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p₁) has a global minimum, which is characterized by the following first-order condition:

dG(p₁)/dp₁ = (1 − m)p₁ − n = 0.

Therefore, p₁ = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p₁ = p₂ = p*, the term (p₁ − p₂)²/2 also reaches its global minimum. Therefore, V(p₁, p₂) is at its global minimum when p₁ = p₂ = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p₁)·ṗ₁ + (∂V/∂p₂)·ṗ₂

  = [(p₁ − p₂) + (1 − m)p₁ − n]·(p₂ − p₁) + (p₂ − p₁)·(m·p₁ − p₂ + n)

  = −2(p₁ − p₂)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p₁, p₂) ≠ (p*, p*), V̇ < 0.

VOL. 94 NO. 5  CHEN AND GAZZALE: LEARNING IN GAMES  1533

Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
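The discrete-time best-reply system above can also be checked numerically. The sketch below iterates p₁(t+1) = p₂(t), p₂(t+1) = m·p₁(t) + n and confirms convergence to the rest point p* = er/(e + c); the values of β, c, e, and r are illustrative stand-ins (not the experimental parameters), and m and n follow the reconstructed definitions in the text above.

```python
# Numerical check of the Cournot best-reply system in the proof of part (iii).
# beta, c, e, r are illustrative values, NOT the experimental parameters.
beta, c, e, r = 0.5, 1.0, 2.0, 10.0

m = (beta - 1.0 / (4 * c)) / (beta + e / (4 * c**2))
n = (e * r / (4 * c**2)) / (beta + e / (4 * c**2))
assert m < 1  # holds for any beta >= 0 when c > 0 and e > 0

p1, p2 = 0.0, 40.0            # arbitrary starting prices
for _ in range(200):          # iterate p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n
    p1, p2 = p2, m * p1 + n

p_star = e * r / (e + c)      # claimed rest point, equal to n / (1 - m)
assert abs(p1 - p_star) < 1e-9 and abs(p2 - p_star) < 1e-9
print(p_star)                 # → 6.666... for these illustrative values
```

Because |m| < 1, the two-step composition p₁(t+2) = m·p₁(t) + n is a contraction, so the iteration converges from any starting prices, mirroring the global stability shown analytically.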

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Mitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper No. 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



(6) p̄₋ᵢ(t + 1) = [p₋ᵢ(t) + Σ_{τ=1}^{t−1} r^τ · p₋ᵢ(t − τ)] / [1 + Σ_{τ=1}^{t−1} r^τ],

for some discount factor r ∈ [0, 1]. Note that r = 0 corresponds to the Cournot best-reply assessment, p̄₋ᵢ(t + 1) = p₋ᵢ(t), while r = 1 yields the standard fictitious-play assessment. The usual adaptive learning model assumes 0 < r < 1: all observations influence the expected state, but more recent observations have greater weight.

As opposed to standard fictitious play, stochastic fictitious play allows decision randomization and thus better captures the human learning process. Omitting time subscripts, the probability that a player announces price pᵢ is given by

(7) Pr(pᵢ | p̄₋ᵢ) = exp(λ · π(pᵢ, p̄₋ᵢ)) / Σ_{j=0}^{40} exp(λ · π(pⱼ, p̄₋ᵢ)).

Given a predicted price, a player is thus more likely to play strategies that yield higher payoffs. How much more likely is determined by λ, the sensitivity parameter. As λ increases, the probability of a best response to p̄₋ᵢ increases.
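The two pieces of stochastic fictitious play, the discounted belief in equation (6) and the logit choice rule in equation (7), can be sketched as follows. The payoff function here is a toy quadratic-loss stand-in, not the compensation-mechanism payoff, and the parameter values are illustrative.

```python
import numpy as np

def belief(opponent_history, r):
    """Discounted fictitious-play assessment of the opponent's next price, eq. (6).
    r=0 gives the Cournot best-reply assessment (last price only);
    r=1 gives standard fictitious play (all prices weighted equally)."""
    t = len(opponent_history)
    weights = np.array([r**tau for tau in range(t)])      # weight r^tau on the price tau periods ago
    prices = np.array(opponent_history[::-1], dtype=float)  # most recent price first
    return weights @ prices / weights.sum()

def choice_probs(payoff, anticipated, lam, n_prices=41):
    """Logit choice probabilities over the 41 prices 0..40, eq. (7)."""
    u = np.array([payoff(p, anticipated) for p in range(n_prices)])
    expu = np.exp(lam * (u - u.max()))                    # subtract max for numerical stability
    return expu / expu.sum()

# toy payoff (NOT the mechanism's): best response is to match the anticipated price
toy_payoff = lambda p, q: -(p - q) ** 2
probs = choice_probs(toy_payoff, belief([18, 20, 22], r=0.5), lam=2.8)
print(int(probs.argmax()))  # → 21, the integer price closest to the belief
```

As λ grows, the probability mass concentrates on the best response to the assessed price, which is the comparative static described in the text.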

The second model we consider, fEWA, is a one-parameter variant of the experience-weighted attraction (EWA) model (Camerer and Ho, 1999). In this model, strategy probabilities are determined by logit probabilities similar to equation (7), with the actual payoffs π(pᵢ, p̄₋ᵢ) replaced by strategy attractions. In both variants, a strategy attraction, Aᵢʲ(t), and an experience weight, N(t), are updated after every period. The experience weight is updated according to N(t) = (1 − κ)·φ·N(t − 1) + 1, where φ is the change-detection parameter and κ controls exploration (low κ) versus exploitation. Attractions are updated according to the following rule:

(8) Aᵢʲ(t) = {φ·N(t − 1)·Aᵢʲ(t − 1) + [δ + (1 − δ)·I(pᵢʲ, pᵢ(t))]·πᵢ(pᵢʲ, p₋ᵢ(t))} / N(t),

where the indicator function I(pᵢʲ, pᵢ(t)) equals one if pᵢʲ = pᵢ(t) and zero otherwise, and δ ∈ [0, 1] is the imagination weight. In EWA, all parameters are estimated, whereas in fEWA these parameters (except for λ) are endogenously determined by the following functions. The change-detector function, φᵢ(t), is given by

(9) φᵢ(t) = 1 − (1/2) Σ_{j=1}^{mᵢ} [ I(pᵢʲ, pᵢ(t)) − (1/t) Σ_{τ=1}^{t} I(pᵢʲ, pᵢ(τ)) ]².

This function will be close to 1 when recent history resembles previous history. The imagination weight δᵢ(t) equals φᵢ(t), while κ equals the Gini coefficient of previous choice frequencies. EWA models encompass a variety of familiar learning models: cumulative reinforcement learning (δ = 0, κ = 1, N(0) = 1), weighted reinforcement learning (δ = 0, κ = 0, N(0) = 1), weighted fictitious play (δ = 1, κ = 0), standard fictitious play (δ = 1, φ = 1), and Cournot best reply (δ = 1, φ = 0).

Finally, we introduce the main components of the PA model. For simplicity, we omit all subscripts that represent player i, and let πⱼ(t) represent the actual payoff of strategy j in round t. Since the game has a large strategy space, we incorporate similarity functions into the model to represent agents' use of strategy similarity. As strategies in this game are naturally ordered by their labels, we use the Bartlett similarity function, fⱼₖ(h, t), to denote the similarity between the played strategy k and an unplayed strategy j at period t:

fⱼₖ(h, t) = 1 − |j − k|/h if |j − k| < h; 0 otherwise.

In this function, the parameter h determines the h − 1 unplayed strategies on either side of the played strategy to be updated. When h = 1, fⱼₖ(1, t) degenerates into an indicator function, equal to one if strategy j is chosen in round t and zero otherwise.

The PA model assumes that a player is a myopic subjective maximizer. That is, he chooses strategies based on assessed payoffs and does not explicitly take into account the likelihood of alternate states. Let uⱼ(t) denote the subjective assessment of strategy pⱼ at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10) uⱼ(t + 1) = [1 − r·fⱼₖ(h, t)]·uⱼ(t) + r·fⱼₖ(h, t)·πₖ(t), ∀j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks, zⱼ(t). The decision maker chooses on the basis of his shock-distorted subjective assessments, ũⱼ(t) = uⱼ(t) + zⱼ(t). At time t, he chooses strategy pₖ if

(11) ũₖ(t) − ũⱼ(t) ≥ 0, ∀pⱼ ≠ pₖ.

Note that mood shocks affect only his choices and not the manner in which assessments are updated.

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean-squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.
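The calibration criterion, picking the grid point that minimizes the mean-squared deviation between simulated and observed play, can be sketched as follows. The frequencies, the one-parameter "model," and the grid are toy stand-ins chosen so the example is self-contained; the actual calibration searches the multi-parameter grids described below over simulated game paths.

```python
import numpy as np

def msd(sim_freq, actual_freq):
    """Mean-squared deviation between simulated and observed strategy
    frequencies; both arrays are blocks x strategies, rows summing to one."""
    return float(np.mean((sim_freq - actual_freq) ** 2))

def grid_search(simulate, actual_freq, grid):
    """Return the grid point minimizing MSD (the calibration criterion)."""
    scores = {theta: msd(simulate(theta), actual_freq) for theta in grid}
    return min(scores, key=scores.get)

# toy example: one block, three strategies; 'simulate' mixes a degenerate
# distribution toward uniform as its single parameter theta grows
actual = np.array([[0.6, 0.3, 0.1]])
simulate = lambda th: (1 - th) * np.array([[1.0, 0.0, 0.0]]) + th * np.full((1, 3), 1 / 3)
best = grid_search(simulate, actual, grid=np.arange(0.0, 1.01, 0.1))
print(round(best, 1))  # → 0.6, the grid point with the lowest MSD score
```

The same pattern extends to the paper's setting by letting `simulate` run the full Monte Carlo game paths and by iterating the grid search over each model's parameter vector.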

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by five.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value < 0.000) and normal distribution (d = 0.381, p-value < 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.
(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and via equation (11) for PA.

(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p₂.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by five.

The discount factor, r ∈ [0, 1], is searched at a grid size of 0.1. The parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood-shock interval, and similarity window size) for the first two sessions of each treatment, in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, with shocks drawn from uniform, logistic, and double-exponential distributions, on two sets of experimental data, and find that the performance of the three versions of PA is statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report the results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r), by 15-round block:
              sFP: 1    2    3    4        PA: 1    2    3    4
α10β00        1.0  0.7  0.9  1.0           0.1  0.0  0.9  1.0
α20β00        0.7  0.6  0.1  0.1           0.5  1.0  0.2  0.1
α20β18        1.0  1.0  0.9  0.9           0.2  1.0  0.3  0.2
α10β20        1.0  1.0  1.0  0.9           0.1  0.3  0.6  0.3
α20β20        1.0  1.0  1.0  0.9           0.2  0.1  0.5  0.2
α20β40        1.0  1.0  1.0  0.7           0.1  1.0  0.4  0.3

Sensitivity (λ) for sFP and fEWA; shock interval (a) for PA:
              sFP: 1    2     3     4      fEWA: 1   2    3    4      PA (a): 1    2    3    4
α10β00        1.5  4.4  4.5   2.8          1.6  4.5  4.4  3.7         500  500  250  50
α20β00        1.5  4.3  14.0  18.0         1.7  4.0  5.1  7.4         50   100  150  50
α20β18        1.6  8.0  11.5  18.3         1.5  5.3  6.2  9.5         50   300  500  150
α10β20        1.9  5.9  8.6   13.8         1.8  4.7  6.2  8.2         100  500  450  300
α20β20        2.8  5.6  8.1   11.2         2.4  3.8  6.1  6.7         200  300  350  350
α20β40        3.4  6.5  14.0  20.5         3.2  4.9  6.7  9.9         150  50   150  50

Similarity window (h) for PA:
              1    2    3    4
α10β00        1    1    10   9
α20β00        10   10   9    6
α20β18        5    4    6    1
α10β20        1    1    8    3
α20β20        2    2    7    1
α20β40        1    3    2    2


indicating that players keep track of all past play when forming beliefs about opponents' next moves.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model is the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds. Indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to the final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., the learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job of explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA in every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample, superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency over long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20 Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification, due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr₄/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Stochastic fictitious play
Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3                   0.948             0.883    0.873
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA
Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40
1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3                   0.946             0.879    0.871
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40
1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3                   0.947             0.894    0.903
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40
1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3                   1.917             1.447    1.539
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall    0.976    0.976    0.976    0.976    0.976    0.976


results, we use the shorthand notation x ≻ y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulations, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following convergence rankings:

(i) p₁: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p₁: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p₂: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p₂: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β = 0 treatments. In addition, an increase in β from 0 significantly improves convergence, by a large margin (40 to 80 percent), for both players.

and the 00 treatments In addition an increasein significantly improves convergence by alarge margin (40 to 80 percent) for both playerswhen 0

It is instructive to compare these results withthose concerning the speed of convergence inour experimental treatments In terms of con-vergence to -p2 our regression results (Table4) indicate that 2018rsquos performance in laterrounds enables it to achieve the same conver-gence level as the supermodular treatmentsThis trend continues in our simulations as2018 performs robustly well compared to thesupermodular treatments Likewise in our anal-ysis of the experimental results the conver-gence speed of 2040 was not significantlydifferent from those of the other dominant treat-ments In our simulations however this treat-ment dominates all treatments in player 2equilibrium play

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α. An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, for Player 1 and Player 2, plot the percentage of ε-equilibrium play over rounds 0 to 500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the speed of convergence in the long run. Table 9 presents, for each treatment, the first time it reaches the convergence level L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1 convergence, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percentage of players playing the ε-equilibrium strategy. Each entry indicates the round in which play went over the threshold for good, where "—" indicates that the treatment never achieved the threshold.)

                     Player 1                      Player 2
Threshold      0.4   0.5   0.6   0.7     0.4   0.5   0.6   0.7   0.8   0.9

α20β18          54    83   126   316      38    52    62    87   127    —
α10β20         129   221    —     —       36    48    63    91   148    —
α20β20          61    91   149    —       39    51    61    84   137    —
α20β40         101   166   263   496      30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

             Player 1 equilibrium price           Player 2 equilibrium price
Treatment   Probability  H1               p-value   Probability  H1               p-value

α10β00        0.073                                   0.082
α20β00        0.188      α20β00 > α10β00   0.000      0.213      α20β00 > α10β00   0.000
α20β18        0.265      α20β18 > α20β40   0.187      0.381      α20β18 > α10β20   0.264
α10β20        0.211      α10β20 > α20β00   0.000      0.376      α10β20 > α20β20   0.001
α20β20        0.243      α20β20 > α10β20   0.000      0.353      α20β20 > α20β00   0.000
α20β40        0.259      α20β40 > α20β20   0.007      0.442      α20β40 > α20β18   0.000

             Player 1 ε-equilibrium price          Player 2 ε-equilibrium price
Treatment   Probability  H1               p-value   Probability  H1               p-value

α10β00        0.216                                   0.232
α20β00        0.519      α20β00 > α10β00   0.000      0.573      α20β00 > α10β00   0.000
α20β18        0.713      α20β18 > α20β40   0.045      0.870      α20β18 > α10β20   0.008
α10β20        0.584      α10β20 > α20β00   0.000      0.857      α10β20 > α20β20   0.000
α20β20        0.662      α20β20 > α10β20   0.000      0.834      α20β20 > α20β00   0.000
α20β40        0.701      α20β40 > α20β20   0.000      0.925      α20β40 > α20β18   0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

[Two panels plotting the fraction of maximum welfare against rounds: Simulation Data (rounds 50–450, 75–100 percent) and Experimental Data (rounds 0–60, 40–90 percent), each for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis by the stochastic fictitious play model of Section V, and the matching hypothesis by a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − γ, and one of the other 40 prices with probability γ/40. We estimate γ using the first two sessions of each treatment.24 We then compare the performance of the matching model with the sFP model using the hold-out sample. Using the Wilcoxon signed-rank test to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than the matching model.
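The stochastic matching model and the MSD scoring used to compare it against sFP can be sketched as follows. The 41-point price grid and all function names are our assumptions for illustration; the paper's own MSD is computed over simulated and actual play, not a single choice:

```python
import numpy as np

N_PRICES = 41  # price grid assumed to contain 41 choices ("the other 40 prices")

def matching_model_probs(predicted_price, gamma):
    """Stochastic matching model: play the anticipated player-1 price
    with probability 1 - gamma, any other price with probability gamma/40."""
    probs = np.full(N_PRICES, gamma / (N_PRICES - 1))
    probs[predicted_price] = 1.0 - gamma
    return probs

def msd(model_probs, chosen_price):
    """Mean squared deviation between the model's predicted distribution
    and the observed choice (a degenerate distribution on one price)."""
    observed = np.zeros(N_PRICES)
    observed[chosen_price] = 1.0
    return np.mean((model_probs - observed) ** 2)

p = matching_model_probs(predicted_price=20, gamma=0.05)
print(msd(p, 20))  # small score when the subject matched the anticipated price
```

Calibrating γ then amounts to choosing the value that minimizes the average MSD over the calibration sessions.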

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price p1 for each player 2. We also calculate the opportunity cost of matching p1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated γ in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: Probability player 2 price equals anticipated player 1 price

                          (1)          (2)
D10                      0.014        0.017
                        (0.043)      (0.041)
D00                     -0.077       -0.043
                        (0.035)      (0.036)
D18                      0.046        0.032
                        (0.053)      (0.056)
D40                      0.008        0.041
                        (0.061)      (0.042)
ln(Round)                0.041        0.028
                        (0.011)      (0.010)
Match Cost                           -0.001
                                     (0.000)
ln(Round) x D10          0.014        0.012
                        (0.012)      (0.012)
ln(Round) x D00          0.003        0.000
                        (0.013)      (0.012)
ln(Round) x D18          0.006        0.002
                        (0.021)      (0.020)
ln(Round) x D40          0.001        0.015
                        (0.018)      (0.016)
Observations             9,588        9,588
Log pseudo-likelihood   -3,243.21    -3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. Significant at the 10-percent, 5-percent, and 1-percent levels. Dy is the dummy variable for treatment y; the excluded dummy is that for treatment α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential equation system is

(12)  dp1/dt = p2 − p1,

      dp2/dt = m p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is the Lyapunov function of system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − np1 − [(1/2)(1 − m)p0² − np0].

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, the term (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next we show that dV/dt ≤ 0:

dV/dt = (∂V/∂p1)(dp1/dt) + (∂V/∂p2)(dp2/dt)
      = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)
      = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), dV/dt ≤ 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
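As a numerical sanity check on part (iii), one can iterate the discrete best-reply system using the slope m and intercept n as reconstructed above, and confirm convergence to the rest point p* = er/(e + c). The parameter values below are arbitrary illustrations satisfying c > 0, e > 0, β ≥ 0:

```python
# Numerical check of the Cournot best-reply dynamics (illustrative parameters).
c, e, r, beta = 1.0, 2.0, 3.0, 1.0   # assumed values with c > 0, e > 0, beta >= 0

denom = beta + e / (4 * c**2)
m = (beta - 1 / (4 * c)) / denom     # best-reply slope (our reading of the proof)
n = (e * r / (4 * c**2)) / denom     # best-reply intercept
p_star = e * r / (e + c)             # predicted rest point

# Internal consistency with the proof: n/(1 - m) = er/(e + c).
assert abs(n / (1 - m) - p_star) < 1e-12

# Iterate p1(t+1) = p2(t), p2(t+1) = m p1(t) + n from an arbitrary start.
p1, p2 = 0.0, 0.0
for _ in range(200):
    p1, p2 = p2, m * p1 + n

print(abs(p1 - p_star) < 1e-9, abs(p2 - p_star) < 1e-9)  # -> True True
```

With these values m = 0.5 < 1, so the iteration contracts toward (p*, p*), mirroring the Lyapunov argument for the continuous system.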

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gachter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Herve. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.


and does not explicitly take into account the likelihood of alternate states. Let uj(t) denote the subjective assessment of strategy pj at time t, and r the discount factor. Payoff assessments are updated through a weighted average of his previous assessments and the payoff he actually obtains at time t. If strategy k is chosen at time t, then

(10)  uj(t + 1) = [1 − r fjk(h, t)] uj(t) + r fjk(h, t) πk(t),  for all j.

Each period, the assessed strategy payoffs are subject to zero-mean, symmetrically distributed shocks zj(t). The decision maker chooses on the basis of his shock-distorted subjective assessments ûj(t) = uj(t) + zj(t). At time t, he chooses strategy pk if

(11)  ûk(t) − ûj(t) > 0,  for all pj ≠ pk.

Note that mood shocks affect only his choices, and not the manner in which assessments are updated.
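A minimal sketch of one period of PA learning, following equations (10) and (11): the similarity weights fjk are not fully specified in this excerpt, so a simple 0/1 window of width h stands in for them, and the payoff function and parameter values are illustrative rather than the paper's:

```python
import random

def pa_choose_and_update(u, r, h, a, payoff_of):
    """One period of payoff-assessment (PA) learning:
    mood-shocked choice (eq. 11) followed by the update in eq. (10).
    A 0/1 window of width h stands in for the similarity weights f_jk."""
    prices = list(range(len(u)))
    # Mood shocks distort the choice only, not the stored assessments.
    shocked = [u[j] + random.uniform(-a, a) for j in prices]
    k = max(prices, key=lambda j: shocked[j])
    pay = payoff_of(k)
    for j in prices:
        f_jk = 1.0 if abs(j - k) < h else 0.0  # assumed similarity weight
        u[j] = (1 - r * f_jk) * u[j] + r * f_jk * pay
    return k

# Illustrative run: payoffs peak at price 17, so assessments should drift there.
random.seed(0)
u = [0.0] * 41
for _ in range(500):
    pa_choose_and_update(u, r=0.3, h=3, a=0.5, payoff_of=lambda p: 10 - abs(p - 17))

print(max(range(41), key=lambda j: u[j]))
```

Because neighbors of the chosen strategy are also updated, play hill-climbs toward the payoff peak, which is the "strategy spillover" role the similarity window plays in the calibration below.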

B. Calibration

The literature assessing the performance of learning models contains two approaches to calibration and validation. The first approach calibrates the model on the first t rounds and validates on the remaining rounds. The second approach uses half of the sessions in each treatment to calibrate and the other half to validate. We choose the latter approach for two reasons. First, this approach is feasible, as we have multiple independent sessions for each treatment. Second, we need not assume that the parameters of later rounds are the same as those in earlier rounds. We thus calibrate the parameters of each model in blocks of 15 rounds, using the experimental data from the first two sessions of each treatment. We then evaluate each model by measuring how well the parameterized model predicts play in the remaining two or three sessions.

For parameter estimation, we conduct Monte Carlo simulations designed to replicate the characteristics of the experimental settings. In all calibrations, we exclude the last two rounds (59 and 60) to avoid any end-of-game effects. We then compare the simulated paths with the experimental data to find those parameters that minimize the mean squared deviation (MSD) scores. Since the final distributions of our price data are unimodal, the simulated mean is an informative statistic and is well captured by MSD (Ernan Haruvy and Dale Stahl, 2000). In all simulations, we use the k-period-ahead rather than the one-period-ahead approach14 because we are interested in forecasting the long-run mechanism performance. In doing so, we choose k = 10, 15, 20, 30, and 58. We look at blocks of 10, 15, 20, and 30 rounds because, as players gain information and experience, information use may change over time. We use k = 15 rather than the other values because it best captures the actual dynamics in the experimental data.

Each simulation consists of 1,500 games (18,000 players) and the following steps:

(i) Simulated players are randomly matched into pairs at the beginning of each round.

(ii) Simulated players select price announcements.
(a) Initial round: Almost half of all players selected a first-round price of 20 (44 percent of player 1s and 43 percent of player 2s). There are also considerable spikes in the frequency of selecting other prices divisible by 5.15 Based on Kolmogorov-Smirnov tests on the actual round-one price distribution, we reject the null hypotheses of uniform distribution (d = 0.250, p-value = 0.000) and normal distribution (d = 0.381, p-value = 0.000). We thus follow the convention (e.g., Ho et al., 2001) and use the actual first-round empirical distribution of choices to generate the first-round choices.

(b) Subsequent rounds: Simulated player strategies are determined via equation (7) for sFP and fEWA, and equation (11) for PA.

14 Ido Erev and Haruvy (2000) discuss the tradeoffs of the two approaches.

15 For example, the seven most frequently selected prices are divisible by 5.


(iii) Simulated player 1's quantity choice is based on the following steps:
(a) Determine each player 1's best response to p2.
(b) Determine whether player 1 will deviate from the best response, via the actual probability of errors for each block.
(c) If yes, deviate via the actual mean and standard deviation for the block. Otherwise, play the best response.

(iv) Payoffs are determined by the payoff function of the compensation mechanism, equation (1), for each treatment.

(v) Assessments are updated according to equation (8) for fEWA and equation (10) for PA.

(vi) Proceed to the next round

The discount factor r ∈ [0, 1] is searched at a grid size of 0.1. The sensitivity parameter λ is searched at a grid size of 0.1, in the interval [1.5, 10.5] for fEWA and [1.5, 25.0] for sFP. The size of the similarity window, h ∈ [1, 10], is searched at a grid size of 1. Mood shocks z are drawn from a uniform distribution16 on an interval [−a, a], where a is searched on [0, 500] with a step size of 50. For all parameters, intervals and grid sizes are determined by payoff magnitude.
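The grid-search calibration can be sketched generically. Here `simulate`, the MSD objective over a simple series, and the toy recovery target are placeholders rather than the paper's actual Monte Carlo machinery:

```python
import itertools

def grid_search(simulate, actual, param_grids):
    """Exhaustive grid search for the parameter vector whose simulated
    play minimizes mean squared deviation (MSD) from the actual data.
    `simulate` and `actual` stand in for the study's simulator and data."""
    best_params, best_msd = None, float("inf")
    for params in itertools.product(*param_grids.values()):
        named = dict(zip(param_grids.keys(), params))
        predicted = simulate(**named)
        score = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
        if score < best_msd:
            best_params, best_msd = named, score
    return best_params, best_msd

# Toy illustration: recover r = 0.4 on a grid of step 0.1 over [0, 1].
grids = {"r": [round(0.1 * i, 1) for i in range(11)]}
target = [0.4, 0.4, 0.4]
params, best = grid_search(lambda r: [r, r, r], target, grids)
print(params)  # -> {'r': 0.4}
```

With several parameters (r, λ, h, a), the same loop simply iterates over the Cartesian product of the grids, which is why coarse step sizes keep the search tractable.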

Table 6 reports the calibrated parameters (discount factor, sensitivity parameter, mood shock interval, and similarity window size) for the first two sessions of each treatment in 15-round blocks.17 Estimated parameters for the supermodular and near-supermodular treatments are consistent with the increased level of convergence over time, while those for the β = 0 treatments are not. The second column (sFP) reports the best-fit parameters for the stochastic fictitious play model. With the exception of treatment α20β00, the discount factor is close to 1.0,

16 Chen and Yuri Khoroshilov (2003) compare three versions of PA models, where shocks were drawn from uniform, logistic, and double exponential distributions, on two sets of experimental data, and find that the performances of the three versions of PA were statistically indistinguishable.

17 We also calibrate all parameters for the entire 60 rounds, but do not report results due to space limitations.

TABLE 6—CALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Discount rate (r):
                sFP                        PA
Block:      1     2     3     4        1     2     3     4
α10β00    1.0   0.7   0.9   1.0      0.1   0.0   0.9   1.0
α20β00    0.7   0.6   0.1   0.1      0.5   1.0   0.2   0.1
α20β18    1.0   1.0   0.9   0.9      0.2   1.0   0.3   0.2
α10β20    1.0   1.0   1.0   0.9      0.1   0.3   0.6   0.3
α20β20    1.0   1.0   1.0   0.9      0.2   0.1   0.5   0.2
α20β40    1.0   1.0   1.0   0.7      0.1   1.0   0.4   0.3

Sensitivity (λ) and shock interval (a):
                sFP (λ)                    fEWA (λ)                  PA (a)
Block:      1     2     3     4        1     2     3     4        1     2     3     4
α10β00    1.5   4.4   4.5   2.8      1.6   4.5   4.4   3.7      500   500   250    50
α20β00    1.5   4.3  14.0  18.0      1.7   4.0   5.1   7.4       50   100   150    50
α20β18    1.6   8.0  11.5  18.3      1.5   5.3   6.2   9.5       50   300   500   150
α10β20    1.9   5.9   8.6  13.8      1.8   4.7   6.2   8.2      100   500   450   300
α20β20    2.8   5.6   8.1  11.2      2.4   3.8   6.1   6.7      200   300   350   350
α20β40    3.4   6.5  14.0  20.5      3.2   4.9   6.7   9.9      150    50   150    50

Similarity window (h), PA:
Block:      1     2     3     4
α10β00      1     1    10     9
α20β00     10    10     9     6
α20β18      5     4     6     1
α10β20      1     1     8     3
α20β20      2     2     7     1
α20β40      1     3     2     2


indicating that players keep track of all past play when forming beliefs about opponents' next move.18 Given these beliefs, the increasing sensitivity parameter λ for all treatments except α10β00 indicates that the likelihood of best response increases over time. The third column reports calibration results for the fEWA model. Again, with the exception of treatment α10β00, the parameter λ increases over time. The final column reports the calibration of parameters in the PA model. For each treatment except α10β00, a middle block has the highest discount factor, indicating more weight on new information about a strategy's performance. The second parameter in the PA model represents the upper bound, a, of the interval from which shocks are drawn. Experimentation should decrease in the final rounds; indeed, for all treatments, the estimated mood-shock ranges (weakly) decrease from the third to the last block. Finally, the decreasing similarity windows from the third to final block indicate decreasing strategy spillover. Relatively large discount factors in the third block, combined with relatively large similarity windows, flatten the payoff assessments around the most recently played strategies, consistent with the local experimentation (or relatively stabilized play) observed in the data.

C. Validation

Using the parameters calibrated on the first two sessions of each treatment, we next compare the performance of the three learning models in predicting play in the hold-out sample. For comparison, we also present the performance of two static models. The random choice model assumes that each player randomly chooses any strategy with equal probability in all rounds. This model incorporates only the number of strategies, and thus provides a minimum standard for a dynamic learning model. The equilibrium model assumes that each player plays the subgame perfect Nash equilibrium every round. Its fit conveys the same information as the proportion of equilibrium play presented in Section IV, but with a different metric (MSD).

Table 7 presents each model's MSD scores for each hold-out session. Recall that two of each treatment's four or five independent sessions are used for calibration, and the rest for validation. The results indicate that all three learning models perform significantly better than the random choice model (p-value < 0.01, one-sided permutation tests) and, by a larger margin, significantly better than the equilibrium model (p-value < 0.01, one-sided permutation tests). While the equilibrium model does a poor job of explaining the overall experimental data, its performance improvement over time19 justifies the use of learning models to explain the dynamics of play. Within each of the top three panels (i.e., learning models), session-level MSD scores are lower for the supermodular and near-supermodular treatments, indicating that each learning model does a better job explaining behavior in those treatments. Indeed, in the empirical learning literature, learning models fit better in experiments with better equilibrium convergence (see, e.g., Chen and Khoroshilov, 2003).

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. The Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.


fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises we base all post-58-round parameters on those from the lastblock (calibrated from rounds 46 to 58) Sincewe do not know how these parameters mightchange beyond 60 rounds we use three differ-ent specifications First we retain all final-blockparameters Second we exponentially decay insubsequent blocks the probability of deviationfrom the best response in the production stage20

Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.²¹ In presenting the

²⁰ In block k > 4, we use the probability of deviation Pr_k = max{Pr₄/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.
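As a concrete reading of footnote 20's decay scheme (the exact functional form is hard to recover from the print, so treat the decay rule below as our assumption), the block-4 deviation probability can be shrunk in later blocks while staying strictly positive:

```python
def decayed_deviation_prob(pr_block4, k, floor=1e-8):
    # Decay the block-4 deviation probability in subsequent blocks k > 4,
    # flooring it at a small positive value so later computations that
    # divide by it remain well defined.
    return max(pr_block4 / (k - 4) ** 2, floor)
```

For example, a block-4 deviation probability of 0.2 falls to 0.05 by block 6 and eventually hits the 10⁻⁸ floor.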

²¹ Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3            —      0.948      —      0.883    0.873      —
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3            —      0.946      —      0.879    0.871      —
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3            —      0.947      —      0.894    0.903      —
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3            —      1.917      —      1.447    1.539      —
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall    0.976    0.976    0.976    0.976    0.976    0.976

1525VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data for players 1 and 2 at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ: α20β18 performs better in terms of player 1's convergence and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β00 treatments. In addition, an increase in β from 0 significantly improves convergence by a large margin (40 to 80 percent) for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect): an increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

[Figure 4 comprises two panels, "Player 1" and "Player 2." Each panel plots the percentage of ε-equilibrium play (0 to 100) against the round (0 to 500) for the treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
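The "over the threshold for good" statistic in Table 9 is simple to operationalize. The sketch below (naming is ours) returns the first round after which a simulated convergence series never again falls below the threshold:

```python
def first_over_for_good(series, threshold):
    """First index t such that series[t:] stays at or above `threshold`;
    None if the series never stays above it (the "—" entries in Table 9)."""
    for t in range(len(series)):
        if all(x >= threshold for x in series[t:]):
            return t
    return None

path = [0.10, 0.52, 0.48, 0.61, 0.70, 0.73]
print(first_over_for_good(path, 0.5))  # -> 3
print(first_over_for_good(path, 0.9))  # -> None
```

Note that round 1 (index 1) does not qualify at threshold 0.5, because the series dips back below the threshold afterward; only the last crossing counts.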

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, the supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 among the β00 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy. An entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

             Player 1                    Player 2
Threshold    0.4   0.5   0.6   0.7      0.4   0.5   0.6   0.7   0.8   0.9

α20β18        54    83   126   316       38    52    62    87   127    —
α10β20       129   221    —     —        36    48    63    91   148    —
α20β20        61    91   149    —        39    51    61    84   137    —
α20β40       101   166   263   496       30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

              Player 1 equilibrium price            Player 2 equilibrium price
Treatment     Probability  H1               p-value  Probability  H1               p-value
α10β00        0.073        —                 —       0.082        —                 —
α20β00        0.188        α20β00 > α10β00  0.000    0.213        α20β00 > α10β00  0.000
α20β18        0.265        α20β18 > α20β40  0.187    0.381        α20β18 > α10β20  0.264
α10β20        0.211        α10β20 > α20β00  0.000    0.376        α10β20 > α20β20  0.001
α20β20        0.243        α20β20 > α10β20  0.000    0.353        α20β20 > α20β00  0.000
α20β40        0.259        α20β40 > α20β20  0.007    0.442        α20β40 > α20β18  0.000

              Player 1 ε-equilibrium price          Player 2 ε-equilibrium price
Treatment     Probability  H1               p-value  Probability  H1               p-value
α10β00        0.216        —                 —       0.232        —                 —
α20β00        0.519        α20β00 > α10β00  0.000    0.573        α20β00 > α10β00  0.000
α20β18        0.713        α20β18 > α20β40  0.045    0.870        α20β18 > α10β20  0.008
α10β20        0.584        α10β20 > α20β00  0.000    0.857        α10β20 > α20β20  0.000
α20β20        0.662        α20β20 > α10β20  0.000    0.834        α20β20 > α20β00  0.000
α20β40        0.701        α20β40 > α20β20  0.000    0.925        α20β40 > α20β18  0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for

mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

[Figure 5 comprises two panels. The top panel, "Simulation Data," plots the fraction of maximum welfare (75 to 100) against the round (50 to 450); the bottom panel, "Experimental Data," plots the fraction of maximum welfare (40 to 90) against the round (0 to 60). Both panels show the treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.²² We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.²³ In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.²⁴ We then

compare the performance of the matching model with that of the sFP model using the hold-out sample. Comparing the MSD scores in the hold-out sample with the Wilcoxon signed-rank test, we conclude that sFP, best response to a predicted price, explains player 2's behavior significantly better than the matching model (p-value = 0.006).
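The stochastic matching model just described is straightforward to simulate. In the sketch below (the function name is ours; the 41-price grid follows from the "other 40 prices" in the text), player 2 matches the predicted player 1 price with probability 1 − ε and trembles uniformly over the remaining prices otherwise:

```python
import random

def matching_model_price(predicted_p1, epsilon, prices=tuple(range(41))):
    # With probability 1 - epsilon, match the predicted player-1 price;
    # with probability epsilon, play one of the other 40 prices uniformly.
    if random.random() >= epsilon:
        return predicted_p1
    return random.choice([p for p in prices if p != predicted_p1])
```

Setting ε = 0 recovers deterministic matching; larger ε spreads probability ε/40 over each non-matching price.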

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V, part B: we assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.²⁵ This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price, p̂₁, for each player 2. We also calculate the opportunity cost of matching p̂₁, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.
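To fix ideas, a discounted forecast of player 1's price can be computed as below. This is only a sketch: equation (6) is stated earlier in the paper and is not reproduced here, so we assume the familiar geometric-discount form in which a discount rate r weights the price seen s rounds ago by r^s (r = 0 is the Cournot last-observation forecast; r = 1 weights all history equally, as in fictitious play):

```python
def anticipated_price(history, r):
    """Discounted forecast of player 1's price. `history` lists past
    observed prices, oldest first; weight r**age, with age 0 = most recent."""
    num = den = 0.0
    for age, price in enumerate(reversed(history)):
        w = r ** age
        num += w * price
        den += w
    return num / den

print(anticipated_price([10, 30], r=0.0))  # -> 30.0 (Cournot: last price only)
print(anticipated_price([10, 30], r=1.0))  # -> 20.0 (fictitious-play mean)
```

With the calibrated rate r = 0.1 from footnote 25, the forecast puts almost all weight on the most recently observed price.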

We next regress the probability that player 2 matches the anticipated price, Pr(p₂ = p̂₁), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

²² We thank an anonymous referee and the co-editor for pointing this out.

²³ We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

²⁴ Estimated ε in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

²⁵ In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant; thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the β dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.²⁶

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will in the limit converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games that have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

²⁶ We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to player 2 of matching the anticipated price of player 1.)

Dependent variable: probability that player 2's price equals the anticipated player 1 price.

Model                      (1)               (2)

D10                      0.014 (0.043)     0.017 (0.041)
D00                     −0.077 (0.035)    −0.043 (0.036)
D18                      0.046 (0.053)     0.032 (0.056)
D40                      0.008 (0.061)     0.041 (0.042)
ln(Round)                0.041 (0.011)     0.028 (0.010)
Match Cost                   —            −0.001 (0.000)
ln(Round) × D10          0.014 (0.012)     0.012 (0.012)
ln(Round) × D00          0.003 (0.013)     0.000 (0.012)
ln(Round) × D18          0.006 (0.021)     0.002 (0.020)
ln(Round) × D40          0.001 (0.018)     0.015 (0.016)
Observations             9,588             9,588
Log pseudo-likelihood   −3,243.21         −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significance is assessed at the 10-percent, 5-percent, and 1-percent levels. Dy is the dummy variable for treatment y; the excluded dummy is that for treatment α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging and consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential-equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m·p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 ≥ 0 is a constant. We now show that V(p1, p2) is the Lyapunov function of system (12).

Define G(p1) ≡ ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − np1 − ½(1 − m)p0² + np0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, the term (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2

   = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(mp1 − p2 + n)

   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ ≤ 0. Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
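The stability claim is easy to sanity-check numerically. The sketch below picks illustrative parameter values (the choices of β, c, e, and r are ours, not the experimental values), forms m and n as defined above, Euler-integrates system (12), and confirms that prices approach p* = er/(e + c):

```python
def simulate(beta, c, e, r, p1=0.0, p2=20.0, h=0.01, steps=5000):
    # m and n from the Cournot best-reply system; the rest point is
    # p* = e*r/(e + c), independent of beta.
    denom = beta + e / (4 * c * c)
    m = (beta - 1 / (4 * c)) / denom
    n = (e * r / (4 * c * c)) / denom
    for _ in range(steps):
        dp1 = p2 - p1              # system (12)
        dp2 = m * p1 - p2 + n
        p1, p2 = p1 + h * dp1, p2 + h * dp2
    return p1, p2

p1, p2 = simulate(beta=1.0, c=1.0, e=2.0, r=12.0)
p_star = 2.0 * 12.0 / (2.0 + 1.0)  # = 8.0
print(abs(p1 - p_star) < 1e-3, abs(p2 - p_star) < 1e-3)  # -> True True
```

With these values, m = 0.5 and n = 4, so the fixed point of p = mp + n is indeed 8, and the trajectory from (0, 20) settles there regardless of the starting point.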

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.

1535VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Page 19: When Does Learning in Games Generate Convergence to Nash ...yanchen.people.si.umich.edu/published/Chen_Gazzale_AER_2004.pdf · When Does Learning in Games Generate Convergence to

(iii) Simulated player 1rsquos quantity choice isbased on the following steps(a) Determine each player 1rsquos best re-

sponse to p2(b) Determine whether player 1 will devi-

ate from the best response via the ac-tual probability of errors for eachblock

(c) If yes deviate via the actual mean andstandard deviation for the block Oth-erwise play the best response

(iv) Payoffs are determined by the payoff func-tion of the compensation mechanismequation (1) for each treatment

(v) Assessments are updated according toequation (8) for fEWA and equation (10)for PA

(vi) Proceed to the next round

The discount factor r [0 1] is searched ata grid size of 01 The parameter 13 is searchedat a grid size of 01 in the interval [15 105] forfEWA and [15 250] for sFP The size of thesimilarity window h [1 10] is searched at agrid size of 1 Mood shocks z are drawn from

a uniform distribution16 on an interval [a a]where a is searched on [0 500] with a step sizeof 50 For all parameters intervals and gridsizes are determined by payoff magnitude

Table 6 reports the calibrated parameters(discount factor sensitivity parameter moodshock interval and similarity window size) forthe first two sessions of each treatment in 15-round blocks17 Estimated parameters for thesupermodular and near-supermodular treatmentsare consistent with the increased level of con-vergence over time while the 00 treatmentsare not The second column (sFP) reports thebest-fit parameters for the stochastic fictitiousplay model With the exception of treatment2000 the discount factor is close to 10

16 Chen and Yuri Khoroshilov (2003) compare threeversions of PA models where shocks were drawn fromuniform logistic and double exponential distributions ontwo sets of experimental data and find that the performanceof the three versions of PA were statistically indistinguish-able

17 We also calibrate all parameters for the entire 60rounds but do not report results due to space limitations

TABLE 6mdashCALIBRATION OF THREE LEARNING MODELS IN FIFTEEN-ROUND BLOCKS

Model sFP fEWA PA

Block 1 2 3 4 1 2 3 4 1 2 3 4

Discount rate (r) Discount rate (r)1000 10 07 09 10 01 00 09 102000 07 06 01 01 05 10 02 012018 10 10 09 09 02 10 03 021020 10 10 10 09 01 03 06 032020 10 10 10 09 02 01 05 022040 10 10 10 07 01 10 04 03

Sensitivity (13) Sensitivity (13) Shock interval (a)1000 15 44 45 28 16 45 44 37 500 500 250 502000 15 43 140 180 17 40 51 74 50 100 150 502018 16 80 115 183 15 53 62 95 50 300 500 1501020 19 59 86 138 18 47 62 82 100 500 450 3002020 28 56 81 112 24 38 61 67 200 300 350 3502040 34 65 140 205 32 49 67 99 150 50 150 50

Similarity window (h)1000 1 1 10 92000 10 10 9 62018 5 4 6 11020 1 1 8 32020 2 2 7 12040 1 3 2 2

1523VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

indicating that players keep track of all past playwhen forming beliefs about opponentsrsquo nextmove18 Given these beliefs the increasing sen-sitivity parameter 13 for all treatments except1000 indicates that the likelihood of bestresponse increases over time The third columnreports calibration results for the fEWA modelAgain with the exception of treatment 1000the parameter 13 increases over time The finalcolumn reports calibration of parameters in thePA model For each treatment except 1000 amiddle block has the highest discount factorindicating more weight on new informationabout a strategyrsquos performance The second pa-rameter in the PA model represents the upperbound of the interval from which shocks aredrawn a Experimentation should decrease inthe final rounds Indeed for all treatments theestimated mood-shock ranges (weakly) de-crease from the third to the last block Finallythe decreasing similarity windows from thethird to final block indicate decreasing strategyspillover Relatively large discount factors inthe third block combined with relatively largesimilarity windows flatten the payoff assess-ments around the most recently played strate-gies consistent with the local experimentation(or relatively stabilized play) observed in thedata

C Validation

Using the parameters calibrated on the firsttwo sessions of each treatment we next com-pare the performance of the three learning mod-els in predicting play in the hold-out sampleFor comparison we also present the perfor-mance of two static models The random choicemodel assumes that each player randomlychooses any strategy with equal probability forall rounds This model incorporates only num-ber of strategies and thus provides a minimumstandard for a dynamic learning model Theequilibrium model assumes that each playerplays the subgame perfect Nash equilibriumevery round Its fit conveys the same informa-tion as the proportion of equilibrium play pre-

sented in Section IV but with a different metric(MSD)

Table 7 presents each modelrsquos MSD scoresfor each hold-out session Recall that two ofeach treatmentrsquos four or five independent ses-sions are used for calibration and the rest forvalidation The results indicate that all threelearning models perform significantly betterthan the random choice model (p-value 001one-sided permutation tests) and by a largermargin significantly better than the equilibriummodel (p-value 001 one-sided permutationtests) While the equilibrium model does a poorjob of explaining overall experimental data itsperformance improvement over time19 justifiesthe use of learning models to explain the dy-namics of play Within each of the top threepanels (ie learning models) session-levelMSD scores are lower for the supermodular andnear-supermodular treatments indicating thateach learning model does a better job explainingbehavior in those treatments Indeed in the em-pirical learning literature learning models fitbetter in experiments with better equilibriumconvergence (see eg Chen and Khoroshilov2003)

RESULT 4 (Comparison of Learning Models): The performance of PA is weakly better than that of fEWA, and strictly better than that of sFP. The performances of fEWA and sFP are not significantly different.

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. The Wilcoxon signed-ranks tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).
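A paired, session-level model comparison of this kind can be sketched with a sign-flip permutation test (the MSD values below are hypothetical, and the permutation test is a stand-in for the Wilcoxon signed-rank test used in the text):

```python
import numpy as np

def paired_permutation_pvalue(a, b, n_perm=5000, seed=0):
    """One-sided test of H1: mean(a - b) < 0, by randomly flipping
    the signs of the paired differences under the null."""
    rng = np.random.default_rng(seed)
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    null = (rng.choice([-1.0, 1.0], size=(n_perm, d.size)) * d).mean(axis=1)
    return float(np.mean(null <= d.mean()))

# Hypothetical session-level MSD scores for two learning models.
model_a = [0.952, 0.933, 0.914, 0.941, 0.888, 0.916]
model_b = [0.955, 0.943, 0.940, 0.930, 0.912, 0.940]
print(paired_permutation_pvalue(model_a, model_b))
```

The test keeps the session pairing intact, which is what makes it appropriate for comparing two models fit to the same hold-out sessions.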

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.

1524 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20

Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.
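The block-wise decay in specifications 2 and 3 can be sketched as follows (the base values and the decay factor for the shock interval are hypothetical; the floor on the deviation probability follows the footnote's stated lower bound of 10⁻⁸):

```python
def deviation_prob(block, base_prob):
    """Spec 2: decay the production-stage deviation probability for
    blocks k > 4, with a 1e-8 floor so it never reaches zero."""
    if block <= 4:
        return base_prob
    return max(base_prob / (block - 4) ** 2, 1e-8)

def shock_interval_upper(block, base_a, decay=0.5):
    """Spec 3: exponentially shrink the upper bound a of the
    mood-shock interval in blocks after the fourth (decay rate assumed)."""
    if block <= 4:
        return base_a
    return base_a * decay ** (block - 4)

print([round(deviation_prob(k, 0.2), 4) for k in range(4, 9)])  # [0.2, 0.2, 0.05, 0.0222, 0.0125]
```

Both schedules leave the calibrated final-block behavior unchanged through block 4 and only alter the extrapolated blocks.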

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

(MSD scores by hold-out session; "—" indicates no third hold-out session for that treatment.)

Session      α10β00  α20β00  α20β18  α10β20  α20β20  α20β40

Stochastic fictitious play
1            0.955   0.934   0.940   0.930   0.888   0.940
2            0.960   0.946   0.890   0.917   0.973   0.878
3            —       0.948   —       0.883   0.873   —
Overall      0.957   0.943   0.915   0.910   0.912   0.909

fEWA
1            0.956   0.934   0.942   0.928   0.887   0.948
2            0.960   0.946   0.890   0.922   0.978   0.886
3            —       0.946   —       0.879   0.871   —
Overall      0.958   0.942   0.916   0.910   0.912   0.917

Payoff assessment
1            0.952   0.933   0.914   0.941   0.888   0.916
2            0.957   0.944   0.899   0.922   0.917   0.898
3            —       0.947   —       0.894   0.903   —
Overall      0.955   0.941   0.907   0.919   0.902   0.907

Equilibrium play
1            1.867   1.853   1.833   1.833   1.489   1.792
2            1.889   1.919   1.606   1.839   1.953   1.669
3            —       1.917   —       1.447   1.539   —
Overall      1.878   1.896   1.719   1.706   1.660   1.731

Random play
Overall      0.976   0.976   0.976   0.976   0.976   0.976


results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ~ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the

experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following convergence rankings:

(i) p1: α20β18 ~ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ~ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3 SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.
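The convergence measure behind these rankings, the share of announced prices within ε of the equilibrium price, can be computed as below (the equilibrium price of 16 and ε of 1, giving a range of [15, 17], are illustrative values):

```python
def eps_equilibrium_share(prices, p_star, eps):
    """Fraction of announced prices within eps of the equilibrium price."""
    return sum(abs(p - p_star) <= eps for p in prices) / len(prices)

# Illustrative round of announced prices on an integer grid.
round_prices = [14, 15, 16, 17, 18, 20]
print(eps_equilibrium_share(round_prices, p_star=16, eps=1))  # 0.5
```

Tracking this share round by round, per treatment and per player role, yields the simulated paths summarized in Figure 4.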

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β = 0 treatments. In addition, increasing β from zero significantly improves convergence, by a large margin (40 to 80 percent), for both players.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.



In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2: proportion of ε-equilibrium play, 0 to 100 percent, against rounds 0 to 500, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine the convergence speed in the long run. Table 9 presents, for each treatment, the first time it reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels of L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
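The "over the threshold for good" entries of Table 9 can be computed from a per-round series of ε-equilibrium shares as the round following the last violation of the threshold (a sketch with hypothetical data):

```python
def first_round_over_for_good(shares, threshold):
    """First (1-based) round from which the share of epsilon-equilibrium
    play stays strictly above the threshold; None if it never does."""
    last_bad = -1
    for t, share in enumerate(shares):
        if share <= threshold:
            last_bad = t
    if last_bad == len(shares) - 1:
        return None  # still at or below the threshold in the final round
    return last_bad + 2  # 1-based round after the last violation

shares = [0.20, 0.45, 0.41, 0.55, 0.62, 0.60, 0.71, 0.74]
print(first_round_over_for_good(shares, 0.5),
      first_round_over_for_good(shares, 0.7),
      first_round_over_for_good(shares, 0.8))  # 4 7 None
```

Requiring the share to stay above the threshold, rather than merely cross it once, is what distinguishes this speed measure from a first-passage time.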

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure of Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(The threshold is the percent of players playing an ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good, where "—" indicates that the treatment never achieved the threshold.)

             Player 1                   Player 2
Threshold   0.4   0.5   0.6   0.7     0.4   0.5   0.6   0.7   0.8   0.9

α20β18       54    83   126   316      38    52    62    87   127    —
α10β20      129   221    —     —       36    48    63    91   148    —
α20β20       61    91   149    —       39    51    61    84   137    —
α20β40      101   166   263   496      30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

            Player 1 equilibrium price           Player 2 equilibrium price
Treatment   Probability  H1              p-value   Probability  H1              p-value

α10β00        0.073                                  0.082
α20β00        0.188   α20β00 > α10β00    0.000       0.213   α20β00 > α10β00    0.000
α20β18        0.265   α20β18 > α20β40    0.187       0.381   α20β18 > α10β20    0.264
α10β20        0.211   α10β20 > α20β00    0.000       0.376   α10β20 > α20β20    0.001
α20β20        0.243   α20β20 > α10β20    0.000       0.353   α20β20 > α20β00    0.000
α20β40        0.259   α20β40 > α20β20    0.007       0.442   α20β40 > α20β18    0.000

            Player 1 ε-equilibrium price         Player 2 ε-equilibrium price
Treatment   Probability  H1              p-value   Probability  H1              p-value

α10β00        0.216                                  0.232
α20β00        0.519   α20β00 > α10β00    0.000       0.573   α20β00 > α10β00    0.000
α20β18        0.713   α20β18 > α20β40    0.045       0.870   α20β18 > α10β20    0.008
α10β20        0.584   α10β20 > α20β00    0.000       0.857   α10β20 > α20β20    0.000
α20β20        0.662   α20β20 > α10β20    0.000       0.834   α20β20 > α20β00    0.000
α20β40        0.701   α20β40 > α20β20    0.000       0.925   α20β40 > α20β18    0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel: fraction of maximum welfare in the simulated data, rounds 50 to 450. Bottom panel: fraction of maximum welfare in the experimental data, rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with that of the sFP model on the hold-out sample. Using the Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we find that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.
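The stochastic matching model described above can be sketched as a predicted distribution over the price grid (the 41-point grid is an assumption; ε is the estimated tremble probability):

```python
import numpy as np

def matching_model_probs(predicted_price_index, eps, n_prices=41):
    """Player 2 matches the predicted player-1 price with probability
    1 - eps, and trembles uniformly over the other prices with total
    probability eps."""
    probs = np.full(n_prices, eps / (n_prices - 1))
    probs[predicted_price_index] = 1.0 - eps
    return probs

probs = matching_model_probs(predicted_price_index=15, eps=0.07)
print(round(float(probs.sum()), 6), round(float(probs[15]), 2))  # 1.0 0.93
```

Scoring these per-round distributions against observed choices with the same MSD metric used for the learning models makes the two hypotheses directly comparable.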

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.
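Since equation (6) is not reproduced in this excerpt, the sketch below assumes a geometrically discounted average of the opponent's past prices; the discount rate r and the price history are illustrative only:

```python
def anticipated_price(history, r):
    """Belief about the opponent's next price: a weighted average of past
    observed prices, with weight proportional to r**j for a price seen
    j rounds ago (geometric discounting; a stand-in for equation (6))."""
    recent_first = list(reversed(history))  # j = 0 is the most recent price
    weights = [r ** j for j in range(len(recent_first))]
    return sum(w * p for w, p in zip(weights, recent_first)) / sum(weights)

history = [20, 18, 16]  # oldest to newest
print(round(anticipated_price(history, r=0.1), 3))  # 16.216
```

With a small discount rate such as r = 0.1, the belief is dominated by the most recent observation, close to Cournot best reply; larger r spreads weight over the whole history.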

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 The estimated ε in each of the four blocks is as follows: α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the β dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1.)

Dependent variable: probability that the player 2 price equals the anticipated player 1 price.

                        (1)          (2)
D10                    0.014        0.017
                      (0.043)      (0.041)
D00                   −0.077       −0.043
                      (0.035)      (0.036)
D18                    0.046        0.032
                      (0.053)      (0.056)
D40                    0.008        0.041
                      (0.061)      (0.042)
ln(Round)              0.041        0.028
                      (0.011)      (0.010)
Match Cost                         −0.001
                                   (0.000)
ln(Round) × D10        0.014        0.012
                      (0.012)      (0.012)
ln(Round) × D00        0.003        0.000
                      (0.013)      (0.012)
ln(Round) × D18        0.006        0.002
                      (0.021)      (0.020)
ln(Round) × D40        0.001        0.015
                      (0.018)      (0.016)
Observations           9,588        9,588
Log pseudo-likelihood −3,243.21   −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significance at the 10-percent, 5-percent, and 1-percent levels is marked by stars in the original table. Dy is the dummy variable for treatment y; the excluded dummy is that for treatment α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),
p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,
      ṗ2 = m·p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function of system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − n·p1 − (1/2)(1 − m)p0² + n·p0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2
   = [(p1 − p2) + (1 − m)p1 − n]·(p2 − p1) + (p2 − p1)·(m·p1 − p2 + n)
   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.

Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
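The appendix argument can also be checked numerically. The parameter values below are illustrative assumptions, not the paper's calibration; the code verifies that the minimizer n/(1 − m) equals er/(e + c) and Euler-integrates system (12) to its rest point:

```python
# Illustrative parameters (assumptions): beta, c, e, r.
beta, c, e, r = 0.2, 1.0, 1.0, 10.0
m = (beta - 1.0 / (4 * c)) / (beta + e / (4 * c ** 2))
n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))
p_star = e * r / (e + c)

# The global minimizer n/(1 - m) coincides with er/(e + c), as in the proof.
assert m < 1 and abs(n / (1 - m) - p_star) < 1e-9

# Euler integration of (12): p1' = p2 - p1, p2' = m*p1 - p2 + n.
p1, p2, dt = 30.0, 0.0, 0.01
for _ in range(200_000):
    p1, p2 = p1 + dt * (p2 - p1), p2 + dt * (m * p1 - p2 + n)
print(round(p1, 4), round(p2, 4))  # 5.0 5.0
```

From any starting point, the trajectory settles at p1 = p2 = p*, illustrating the global stability established by the Lyapunov argument.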

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Science, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished Paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished Paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished Paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gachter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished Paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Herve. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.


Page 20: When Does Learning in Games Generate Convergence to Nash ...yanchen.people.si.umich.edu/published/Chen_Gazzale_AER_2004.pdf · When Does Learning in Games Generate Convergence to

indicating that players keep track of all past playwhen forming beliefs about opponentsrsquo nextmove18 Given these beliefs the increasing sen-sitivity parameter 13 for all treatments except1000 indicates that the likelihood of bestresponse increases over time The third columnreports calibration results for the fEWA modelAgain with the exception of treatment 1000the parameter 13 increases over time The finalcolumn reports calibration of parameters in thePA model For each treatment except 1000 amiddle block has the highest discount factorindicating more weight on new informationabout a strategyrsquos performance The second pa-rameter in the PA model represents the upperbound of the interval from which shocks aredrawn a Experimentation should decrease inthe final rounds Indeed for all treatments theestimated mood-shock ranges (weakly) de-crease from the third to the last block Finallythe decreasing similarity windows from thethird to final block indicate decreasing strategyspillover Relatively large discount factors inthe third block combined with relatively largesimilarity windows flatten the payoff assess-ments around the most recently played strate-gies consistent with the local experimentation(or relatively stabilized play) observed in thedata

C Validation

Using the parameters calibrated on the firsttwo sessions of each treatment we next com-pare the performance of the three learning mod-els in predicting play in the hold-out sampleFor comparison we also present the perfor-mance of two static models The random choicemodel assumes that each player randomlychooses any strategy with equal probability forall rounds This model incorporates only num-ber of strategies and thus provides a minimumstandard for a dynamic learning model Theequilibrium model assumes that each playerplays the subgame perfect Nash equilibriumevery round Its fit conveys the same informa-tion as the proportion of equilibrium play pre-

sented in Section IV but with a different metric(MSD)

Table 7 presents each modelrsquos MSD scoresfor each hold-out session Recall that two ofeach treatmentrsquos four or five independent ses-sions are used for calibration and the rest forvalidation The results indicate that all threelearning models perform significantly betterthan the random choice model (p-value 001one-sided permutation tests) and by a largermargin significantly better than the equilibriummodel (p-value 001 one-sided permutationtests) While the equilibrium model does a poorjob of explaining overall experimental data itsperformance improvement over time19 justifiesthe use of learning models to explain the dy-namics of play Within each of the top threepanels (ie learning models) session-levelMSD scores are lower for the supermodular andnear-supermodular treatments indicating thateach learning model does a better job explainingbehavior in those treatments Indeed in the em-pirical learning literature learning models fitbetter in experiments with better equilibriumconvergence (see eg Chen and Khoroshilov2003)

RESULT 4 (Comparison of Learning Mod-els) The performance of PA is weakly betterthan that of fEWA and strictly better than that ofsFP The performances of fEWA and sFP arenot significantly different

SUPPORT: Table 7 reports the MSD scores for each independent session in the hold-out sample under each model. Wilcoxon signed-rank tests show that the MSD scores under PA are weakly lower than those under fEWA (z = 0.085, p-value = 0.068) and strictly lower than those under sFP (z = 0.028, p-value = 0.023). Using the same test, we cannot reject the hypothesis that fEWA and sFP yield the same MSD scores (z = 0.662, p-value = 0.508).

Although the performance of PA is only weakly better than that of fEWA, the overall MSD scores for PA are lower than those for

18 As the discount factor is zero in Cournot best reply, we can reject this model based on our estimation of the discount factors.

19 We omit the table showing the performance of the equilibrium model in blocks of 15 rounds, as the information is repetitive with Figures 1 and 2.

1524 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

fEWA for every treatment. Therefore, we use PA for forecasting beyond 60 rounds.

Figure 3 presents the simulated path of the calibrated PA model and compares it with the actual data in the hold-out sample by superimposing the simulated mean (black line), plus and minus one standard deviation (grey lines), on the actual mean (black box) and standard deviation (error bars). The simulation does a good job of tracking both the mean and the standard deviation of the actual data. In treatments α10β00 and α20β40, however, the reduction in variance lags that seen in the middle rounds of the experiment.

D. Forecasting

We now report the results from the PA model. As the strategic complementarity predictions concern long-run performance, we use the calibrated parameters for the first 58 rounds to simulate play in later rounds. This exercise allows us to study convergence and efficiency in long but finite horizons.

In all forecasting exercises, we base all post-58-round parameters on those from the last block (calibrated from rounds 46 to 58). Since we do not know how these parameters might change beyond 60 rounds, we use three different specifications. First, we retain all final-block parameters. Second, we exponentially decay, in subsequent blocks, the probability of deviation from the best response in the production stage.20

Third, we exponentially decay the shock interval in subsequent blocks. As all three specifications yield qualitatively similar results, we report results from only the third specification due to space limitations.21 In presenting the

20 In block k > 4, we use the probability of deviation Pr_k = max{Pr_4/(k − 4)², 10⁻⁸}. We set a lower bound of 10⁻⁸ to avoid the division-by-zero problem.
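The decay schedule in footnote 20 is garbled in this copy; the sketch below adopts one plausible reading (quadratic decay in the block index with a 10⁻⁸ floor), which should be treated as an assumption rather than the paper's exact formula.

```python
def deviation_prob(k, pr4, floor=1e-8):
    """Probability of deviating from best response in block k (k >= 4),
    decaying from the last calibrated block's value pr4 and floored at
    a small positive constant (assumed quadratic decay)."""
    if k <= 4:
        return pr4
    return max(pr4 / (k - 4) ** 2, floor)
```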

21 Different sequences of random numbers produce slightly different parameter estimates. While the conver-

TABLE 7—VALIDATION ON HOLD-OUT SESSIONS

(MSD scores by hold-out session; "—" indicates the treatment had only two hold-out sessions)

Session    α10β00   α20β00   α20β18   α10β20   α20β20   α20β40

Stochastic fictitious play
1          0.955    0.934    0.940    0.930    0.888    0.940
2          0.960    0.946    0.890    0.917    0.973    0.878
3            —      0.948      —      0.883    0.873      —
Overall    0.957    0.943    0.915    0.910    0.912    0.909

fEWA
1          0.956    0.934    0.942    0.928    0.887    0.948
2          0.960    0.946    0.890    0.922    0.978    0.886
3            —      0.946      —      0.879    0.871      —
Overall    0.958    0.942    0.916    0.910    0.912    0.917

Payoff assessment
1          0.952    0.933    0.914    0.941    0.888    0.916
2          0.957    0.944    0.899    0.922    0.917    0.898
3            —      0.947      —      0.894    0.903      —
Overall    0.955    0.941    0.907    0.919    0.902    0.907

Equilibrium play
1          1.867    1.853    1.833    1.833    1.489    1.792
2          1.889    1.919    1.606    1.839    1.953    1.669
3            —      1.917      —      1.447    1.539      —
Overall    1.878    1.896    1.719    1.706    1.660    1.731

Random play
Overall    0.976    0.976    0.976    0.976    0.976    0.976

1525VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

results, we use the shorthand notation x > y to denote that a measure under treatment x is significantly higher than that under treatment y at the 5-percent level or less, and x ∼ y to denote that the measures are not significantly different at the 5-percent level.

Figure 4 reports the simulated proportion of ε-equilibrium play. We report only the results for the first 500 rounds, since the dynamics do not change much thereafter. In our simulation, all treatments reach higher convergence levels than those achieved in the 60 rounds of the

experiment. Since the simulated variance reduction lags that of the experiment, round-60 results are not achieved until approximately 90 rounds of simulation. We further note that these improvements slow down after 200 rounds. The proportion of ε-equilibrium play is bounded by 72 percent for player 1 and 93 percent for player 2. We now outline the simulation results.

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following level-of-convergence rankings:

(i) p1: α20β18 ∼ α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(ii) ε-p1: α20β18 > α20β40 > α20β20 > α10β20 > α20β00 > α10β00;

(iii) p2: α20β40 > α20β18 ∼ α10β20 > α20β20 > α20β00 > α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the calibration selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA


(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β = 0 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00 and actually increases compared with α10β00. Unlike in Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β = 0 treatments. In addition, an increase in α significantly improves convergence by a large margin (40 to 80 percent) for both players when β = 0.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared with the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3 CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in

convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

(Two panels, Player 1 and Player 2, plotting the percentage of ε-equilibrium play against the simulation round, 0 to 500, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the level of convergence L. We omit the two β = 0 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2's convergence, however, is more subtle. First, for the β = 20 and β = 18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, the supermodular and near-supermodular treatments continue to differ from the β = 0 treatments, with α = 20 superior to α = 10 among the β = 0 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy. Each entry is the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                       Player 1                      Player 2
Threshold     0.4   0.5   0.6   0.7      0.4   0.5   0.6   0.7   0.8   0.9
α20β18         54    83   126   316       38    52    62    87   127    —
α10β20        129   221    —     —        36    48    63    91   148    —
α20β20         61    91   149    —        39    51    61    84   137    —
α20β40        101   166   263   496       30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games)

Player 1 equilibrium price                      Player 2 equilibrium price
Treatment  Probability  H1               p-value   Probability  H1               p-value
α10β00     0.073                                   0.082
α20β00     0.188        α20β00 > α10β00  0.000     0.213        α20β00 > α10β00  0.000
α20β18     0.265        α20β18 > α20β40  0.187     0.381        α20β18 > α10β20  0.264
α10β20     0.211        α10β20 > α20β00  0.000     0.376        α10β20 > α20β20  0.001
α20β20     0.243        α20β20 > α10β20  0.000     0.353        α20β20 > α20β00  0.000
α20β40     0.259        α20β40 > α20β20  0.007     0.442        α20β40 > α20β18  0.000

Player 1 ε-equilibrium price                    Player 2 ε-equilibrium price
Treatment  Probability  H1               p-value   Probability  H1               p-value
α10β00     0.216                                   0.232
α20β00     0.519        α20β00 > α10β00  0.000     0.573        α20β00 > α10β00  0.000
α20β18     0.713        α20β18 > α20β40  0.045     0.870        α20β18 > α10β20  0.008
α10β20     0.584        α10β20 > α20β00  0.000     0.857        α10β20 > α20β20  0.000
α20β20     0.662        α20β20 > α10β20  0.000     0.834        α20β20 > α20β00  0.000
α20β40     0.701        α20β40 > α20β20  0.000     0.925        α20β40 > α20β18  0.000


the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for

mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

(Top panel, simulation data: fraction of maximum welfare over rounds 50 to 450. Bottom panel, experimental data: fraction of maximum welfare over rounds 0 to 60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.)


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal-point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with that of the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006): best response to a predicted price explains behavior significantly better than matching.
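The stochastic matching model can be written down in a few lines. The 41-point price grid {0, 1, ..., 40} is an assumption consistent with "one of the other 40 prices"; the predicted player 1 price would come from equation (6), which this excerpt does not reproduce.

```python
def matching_model_probs(predicted_price, eps, prices=tuple(range(41))):
    """Player 2's choice distribution under the stochastic matching
    model: match the predicted player 1 price with probability 1 - eps,
    and spread eps uniformly over the remaining prices."""
    n_other = len(prices) - 1
    return {p: (1 - eps) if p == predicted_price else eps / n_other
            for p in prices}
```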

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V.B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant. The coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games that have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding that is not consistent with a focal-point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1. Dependent variable: probability that the player 2 price equals the anticipated player 1 price.)

                          (1)         (2)
D10                      0.014       0.017
                        (0.043)     (0.041)
D00                     −0.077      −0.043
                        (0.035)     (0.036)
D18                      0.046       0.032
                        (0.053)     (0.056)
D40                      0.008       0.041
                        (0.061)     (0.042)
ln(Round)                0.041       0.028
                        (0.011)     (0.010)
Match Cost                          −0.001
                                    (0.000)
ln(Round) × D10          0.014       0.012
                        (0.012)     (0.012)
ln(Round) × D00          0.003       0.000
                        (0.013)     (0.012)
ln(Round) × D18          0.006       0.002
                        (0.021)     (0.020)
ln(Round) × D40          0.001       0.015
                        (0.018)     (0.016)

Observations             9,588       9,588
Log pseudo-likelihood   −3,243.21   −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is that for treatment α20β20. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors unrelated to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),
p2(t + 1) = m p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = [er/(4c²)]/[β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,
      ṗ2 = m p1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function for system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − np1 − ½(1 − m)p0² + np0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore, the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore, p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore, V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)ṗ1 + (∂V/∂p2)ṗ2
   = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m p1 − p2 + n)
   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.

Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
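The discrete best-reply system in the proof can be simulated directly. The expressions for m and n below follow this excerpt's partially garbled formulas (with β the complementarity parameter), so treat them as a reconstruction; the fixed point p* = er/(e + c) is as stated in the proof.

```python
def simulate_best_reply(beta, e, c, r, p1=0.0, p2=40.0, rounds=2000):
    """Iterate p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n and return the
    final price pair. With |m| < 1 the pair converges to the fixed
    point p* = e*r/(e + c), independent of beta."""
    denom = beta + e / (4 * c ** 2)
    m = (beta - 1 / (4 * c)) / denom
    n = (e * r / (4 * c ** 2)) / denom
    for _ in range(rounds):
        p1, p2 = p2, m * p1 + n
    return p1, p2
```

For example, with beta = 0, e = 2, c = 1, r = 3 the slope is m = −0.5 and both prices approach p* = 2; raising beta to 5 changes the slope but leaves the fixed point unchanged.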

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Responses." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.

1535VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Page 21: When Does Learning in Games Generate Convergence to Nash ...yanchen.people.si.umich.edu/published/Chen_Gazzale_AER_2004.pdf · When Does Learning in Games Generate Convergence to

fEWA for every treatment Therefore we usePA for forecasting beyond 60 rounds

Figure 3 presents the simulated path of thecalibrated PA model and compares it with theactual data in the hold-out sample by superim-posing the simulated mean (black line) plus andminus one standard deviation (grey lines) on theactual mean (black box) and standard deviation(error bars) The simulation does a good job oftracking both the mean and the standard devia-tion of the actual data In treatments 1000and 2040 however the reduction in variancelags that seen in the middle rounds of theexperiment

D Forecasting

We now report the results from the PAmodel As the strategic complementarity predic-tions concern long-run performance we use thecalibrated parameters for the first 58 rounds tosimulate play in later rounds This exercise al-

lows us to study convergence and efficiency inlong but finite horizons

In all forecasting exercises we base all post-58-round parameters on those from the lastblock (calibrated from rounds 46 to 58) Sincewe do not know how these parameters mightchange beyond 60 rounds we use three differ-ent specifications First we retain all final-blockparameters Second we exponentially decay insubsequent blocks the probability of deviationfrom the best response in the production stage20

Third we exponentially decay the shock inter-val in subsequent blocks As all three specifica-tions yield qualitatively similar results wereport results from only the third specificationdue to space limitations21 In presenting the

20 In block k 4 we use the probability of deviationPrk minPr4(k 4)2 108 We set a lower bound of108 to avoid the division-by-zero problem

21 Different sequences of random numbers produceslightly different parameters estimates While the conver-

TABLE 7mdashVALIDATION ON HOLD-OUT SESSIONS

Session

Stochastic fictitious play

1000 2000 2018 1020 2020 2040

1 0955 0934 0940 0930 0888 09402 0960 0946 0890 0917 0973 08783 0948 0883 0873Overall 0957 0943 0915 0910 0912 0909

fEWA

1 0956 0934 0942 0928 0887 09482 0960 0946 0890 0922 0978 08863 0946 0879 0871Overall 0958 0942 0916 0910 0912 0917

Payoff assessment

1 0952 0933 0914 0941 0888 09162 0957 0944 0899 0922 0917 08983 0947 0894 0903Overall 0955 0941 0907 0919 0902 0907

Equilibrium play

1 1867 1853 1833 1833 1489 17922 1889 1919 1606 1839 1953 16693 1917 1447 1539Overall 1878 1896 1719 1706 1660 1731

Random play

Overall 0976 0976 0976 0976 0976 0976

1525VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

results we use the shorthand notation x y todenote a measure under treatment x is signifi-cantly higher than that under treatment y at the5-percent level or less and x y to denote thatthe measures are not significantly different atthe 5-percent level

Figure 4 reports the simulated proportion of-equilibrium play We report only the resultsfor the first 500 rounds since the dynamics donot change much thereafter In our simulationall treatments reach higher convergence levelsthan those achieved in the 60 rounds of the

experiment Since the simulated variance reduc-tion lags that of the experiment round 60 resultsare not achieved until approximately 90 roundsof simulation We further note that these im-provements slow down after 200 rounds Theproportion of -equilibrium play is bounded by72 percent for player 1 and 93 percent for player2 We now outline the simulation results

RESULT 5 (Level of Convergence in Round 500): In the simulated data at round 500, we have the following convergence-level rankings:

(i) p1: α20β18 ∼ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(ii) ε-p1: α20β18 ≻ α20β40 ≻ α20β20 ≻ α10β20 ≻ α20β00 ≻ α10β00;

(iii) p2: α20β40 ≻ α20β18 ∼ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00; and

gence level of a given treatment for a given set of parameter estimates is not affected by the sequence of random numbers, this convergence level is somewhat sensitive to the exact parameter calibration. We therefore conduct four different calibrations for each treatment and select the parameters that yield the highest level. The relative rankings of treatments are quite robust with respect to the parameters selected.

FIGURE 3. SIMULATED DYNAMIC PATH VERSUS ACTUAL DATA

1526 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

(iv) ε-p2: α20β40 ≻ α20β18 ≻ α10β20 ≻ α20β20 ≻ α20β00 ≻ α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.
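The level-of-convergence measure behind these rankings is simply the share of announced prices that fall in the ε-equilibrium range. A minimal sketch (the prices are hypothetical; the range [15, 17] is the ε-equilibrium range reported for our parameterization):

```python
def epsilon_equilibrium_share(prices, lo=15, hi=17):
    """Fraction of announced prices inside the epsilon-equilibrium range [lo, hi]."""
    return sum(lo <= p <= hi for p in prices) / len(prices)

# Hypothetical prices announced by one player over ten rounds.
prices = [20, 19, 18, 16, 17, 15, 16, 16, 17, 16]
share = epsilon_equilibrium_share(prices)  # 7 of 10 prices fall in [15, 17]
```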

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four

and the β00 treatments. In addition, raising β from 0 significantly improves convergence by a large margin (40 to 80 percent) for both players when α = 20.
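The pairwise comparisons reported in Table 8 are one-sided tests on proportions of (ε-)equilibrium play across simulated games. With 1,500 games per treatment, such a test is well approximated by a normal (z) test; a sketch with two of the round-500 proportions (the pairing here is illustrative, not one of the paper's listed hypotheses):

```python
import math

def one_sided_z_test(p1, n1, p2, n2):
    """Large-sample test of H1: true proportion 1 > true proportion 2.
    Returns (z statistic, one-sided p-value from the normal approximation)."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    p_value = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))  # 1 - Phi(z)
    return z, p_value

# Illustrative pairing: 0.713 vs. 0.584 epsilon-equilibrium play, 1,500 games each.
z, p = one_sided_z_test(0.713, 1500, 0.584, 1500)
```

A difference of this size over 1,500 games is overwhelmingly significant, consistent with the near-zero p-values in Table 8.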

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3. CONTINUED


In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in

convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0

FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY

[Two panels, one for each player, plot the percentage of ε-equilibrium play (0–100) against the simulation round (0–500) for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]


improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents, for each treatment, the first time it reaches the convergence level L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.
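The Table 9 entries record the first round after which a treatment's share of ε-equilibrium play stays above the threshold L for good. Under that reading of Definition 2, the statistic can be computed by scanning for the last violation:

```python
def first_round_over_for_good(shares, threshold):
    """1-based round from which the share stays above `threshold` through the
    final round; None if the threshold is never held for good."""
    last_violation = 0
    for t, s in enumerate(shares, start=1):
        if s <= threshold:
            last_violation = t
    if last_violation == len(shares):
        return None  # still at or below the threshold in the final round
    return last_violation + 1

shares = [0.2, 0.5, 0.3, 0.6, 0.7, 0.65, 0.8]
crossed = first_round_over_for_good(shares, 0.5)  # rounds 4..7 all exceed 0.5
```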

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure defined in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 among the β00 treatments.
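Efficiency in Figure 5 is realized welfare as a share of the maximum attainable welfare, as defined in Section IV. The bookkeeping is a one-liner (the welfare numbers below are hypothetical):

```python
def efficiency(realized_welfare, max_welfare):
    """Realized total payoff as a fraction of the maximum attainable welfare."""
    return realized_welfare / max_welfare

# Hypothetical (realized, maximum) welfare per round.
per_round = [(70.0, 100.0), (85.0, 100.0), (96.0, 100.0)]
effs = [efficiency(w, m) for w, m in per_round]
```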

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy. Each entry indicates the round in which the treatment went over the threshold for good; "—" indicates that the treatment never achieved the threshold.)

                 Player 1                   Player 2
Threshold    0.4   0.5   0.6   0.7     0.4   0.5   0.6   0.7   0.8   0.9

α20β18        54    83   126   316      38    52    62    87   127    —
α10β20       129   221    —     —       36    48    63    91   148    —
α20β20        61    91   149    —       39    51    61    84   137    —
α20β40       101   166   263   496      30    38    56    94   166   339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games)

            Player 1 equilibrium price             Player 2 equilibrium price
Treatment   Probability  H1               p-value  Probability  H1               p-value
α10β00        0.073                                  0.082
α20β00        0.188      α20β00 > α10β00  0.000      0.213      α20β00 > α10β00  0.000
α20β18        0.265      α20β18 > α20β40  0.187      0.381      α20β18 > α10β20  0.264
α10β20        0.211      α10β20 > α20β00  0.000      0.376      α10β20 > α20β20  0.001
α20β20        0.243      α20β20 > α10β20  0.000      0.353      α20β20 > α20β00  0.000
α20β40        0.259      α20β40 > α20β20  0.007      0.442      α20β40 > α20β18  0.000

            Player 1 ε-equilibrium price           Player 2 ε-equilibrium price
Treatment   Probability  H1               p-value  Probability  H1               p-value
α10β00        0.216                                  0.232
α20β00        0.519      α20β00 > α10β00  0.000      0.573      α20β00 > α10β00  0.000
α20β18        0.713      α20β18 > α20β40  0.045      0.870      α20β18 > α10β20  0.008
α10β20        0.584      α10β20 > α20β00  0.000      0.857      α10β20 > α20β20  0.000
α20β20        0.662      α20β20 > α10β20  0.000      0.834      α20β20 > α20β00  0.000
α20β40        0.701      α20β40 > α20β20  0.000      0.925      α20β40 > α20β18  0.000


the compensation mechanisms, the slope of player 2's best response function is determined by β. As α and β determine the penalty for

mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilib-

FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA

[Top panel ("Simulation Data"): fraction of maximum welfare (75–100 percent) over simulation rounds 50–450. Bottom panel ("Experimental Data"): fraction of maximum welfare (40–90 percent) over experimental rounds 0–60. Both panels plot treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]


rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section, we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using the Wilcoxon signed-rank test to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). That is, best response to a predicted price explains behavior significantly better than matching.
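The stochastic matching model puts probability 1 − ε on the predicted player 1 price and spreads ε evenly over the remaining 40 prices; fit is scored by the mean squared deviation (MSD) between this predicted distribution and the observed choice. A sketch under those assumptions (the 41-point integer price grid is illustrative):

```python
def matching_probs(predicted_price, price_grid, eps):
    """Matching-model choice probabilities: 1 - eps on the predicted price,
    eps split evenly over the other prices on the grid."""
    other = len(price_grid) - 1
    return [1.0 - eps if p == predicted_price else eps / other for p in price_grid]

def msd(probs, chosen_index):
    """Mean squared deviation between predicted probabilities and the
    one-hot vector of the observed choice."""
    return sum((q - (1.0 if i == chosen_index else 0.0)) ** 2
               for i, q in enumerate(probs)) / len(probs)

grid = list(range(41))                  # illustrative price grid of 41 prices
probs = matching_probs(16, grid, eps=0.05)
score_hit = msd(probs, grid.index(16))  # subject matched the predicted price
score_miss = msd(probs, grid.index(20)) # subject chose another price
```

Lower MSD means better fit; the Wilcoxon signed-rank test then compares the per-session MSD scores of the two models on the hold-out sample.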

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V.B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.
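Equation (6) is not reproduced in this section; the sketch below assumes the standard operationalization of such a forecast, a geometrically discounted average of the opponent's past prices with discount rate r:

```python
def anticipated_price(history, r):
    """Geometrically discounted average of past prices: weight (1 - r)**k on
    the price observed k rounds ago (an assumed stand-in for equation (6))."""
    weights = [(1.0 - r) ** k for k in range(len(history))]
    recent_first = list(reversed(history))
    return sum(w * p for w, p in zip(weights, recent_first)) / sum(weights)

p_hat = anticipated_price([20, 18, 17, 16], r=0.1)  # most recent price weighted most
```

With r = 1 the forecast collapses to the last observed price; with r near 0 it approaches the simple average of the whole history.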

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε's in each of the four blocks are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant; the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to player 2 of matching the anticipated price of player 1)

Dependent variable: probability that the player 2 price equals the anticipated player 1 price

                          (1)                (2)
D10                    0.014 (0.043)      0.017 (0.041)
D00                   −0.077 (0.035)     −0.043 (0.036)
D18                    0.046 (0.053)      0.032 (0.056)
D40                    0.008 (0.061)      0.041 (0.042)
ln(Round)              0.041 (0.011)      0.028 (0.010)
Match Cost                               −0.001 (0.000)
ln(Round) × D10        0.014 (0.012)      0.012 (0.012)
ln(Round) × D00        0.003 (0.013)      0.000 (0.012)
ln(Round) × D18        0.006 (0.021)      0.002 (0.020)
ln(Round) × D40        0.001 (0.018)      0.015 (0.016)
Observations           9,588              9,588
Log pseudo-likelihood  −3,243.21          −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. * Significant at the 10-percent level; ** at the 5-percent level; *** at the 1-percent level. Dy is the dummy variable for treatment y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on the convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run, these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = mp1(t) + n,

where m = [β + (1/4c)]/[β + (e/4c²)] and n = (er/4c²)/[β + (e/4c²)]. The analogous differential equation system is
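These best-reply dynamics contract toward p* = n/(1 − m) whenever m < 1, which a quick numerical check confirms (the parameter values c, e, r, β below are illustrative, chosen only to satisfy m < 1):

```python
c, e, r, beta = 1.0, 2.0, 1.0, 1.0           # illustrative; gives m < 1
m = (beta + 1.0 / (4 * c)) / (beta + e / (4 * c ** 2))
n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))
p_star = n / (1 - m)                          # fixed point; equals e*r/(e - c) = 2 here

p1, p2 = 20.0, 0.0                            # arbitrary starting prices
for _ in range(400):
    p1, p2 = p2, m * p1 + n                   # p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n
```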

(12) ṗ1 = p2 − p1,

ṗ2 = mp1 − p2 + n.

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫_{p0}^{p1} (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is the Lyapunov function of system (12).

Define G(p1) ≡ ∫_{p0}^{p1} (p − mp − n) dp. We get

G(p1) = (1/2)(1 − m)p1² − np1 − (1/2)(1 − m)p0² + np0.

As c > 0 and e > c, for any β ≥ 0 we always have m < 1. Therefore the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore p1 = n/(1 − m) = er/(e − c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore V(p1, p2) attains its global minimum at p1 = p2 = p*.

Next, we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)ṗ1 + (∂V/∂p2)ṗ2

  = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(mp1 − p2 + n)

  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0.


Therefore, (p*, p*) is a globally asymptotically stable equilibrium of (12).
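The two identities driving this proof, V̇ = −2(p1 − p2)² and the global minimum of V at (p*, p*), can be spot-checked numerically (the m and n values are illustrative, with m < 1):

```python
m, n = 0.8, 0.5        # illustrative, with m < 1
p0 = 1.0               # constant lower limit of the integral in V
p_star = n / (1 - m)   # = 2.5

def V(p1, p2):
    """Lyapunov function: (p1 - p2)**2 / 2 + integral of (p - m*p - n) from p0 to p1."""
    G = 0.5 * (1 - m) * (p1 ** 2 - p0 ** 2) - n * (p1 - p0)
    return 0.5 * (p1 - p2) ** 2 + G

def V_dot(p1, p2):
    """Time derivative of V along system (12): grad V dotted with the vector field."""
    dV1 = (p1 - p2) + (1 - m) * p1 - n
    dV2 = -(p1 - p2)
    return dV1 * (p2 - p1) + dV2 * (m * p1 - p2 + n)

# Check V_dot == -2*(p1 - p2)**2 at sample points, and the minimum at (p*, p*).
pts = [(0.0, 0.0), (3.0, -1.0), (2.5, 2.5), (-4.0, 7.0)]
identity_ok = all(abs(V_dot(a, b) + 2 * (a - b) ** 2) < 1e-9 for a, b in pts)
minimum_ok = all(V(p_star + dx, p_star + dy) >= V(p_star, p_star)
                 for dx, dy in [(0.1, 0.0), (0.0, 0.1), (-0.2, 0.3)])
```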

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.


We next regress the probability that player 2matches the anticipated price Pr(p2 p1) onln(Round) treatment dummies and treatmentdummies interacted with ln(Round) In onespecification we include the cost of matching asa regressor If parameter changes make match-ing more focal then changes in the cost ofmatching will not explain all of the changes inprobability of matching

22 We thank an anonymous referee and the co-editor forpointing this out

23 We do not report the results of the deterministicmatching model as it performs significantly worse than itsstochastic counterpart

24 Estimated in each block are as follows 1000 10 7 6 8 2000 7 6 5 4 2018 7

2 3 3 1020 5 3 3 2 2020 1 0 0 02040 2 0 0 1

25 In all treatments the calibrated discount rate is r 01

1531VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Table 10 presents the results of two probitmodels with clustering at the individual level Inthe first specification we do not include the costof matching as a regressor In this specificationthe coefficient for D00 is negative and signifi-cant thus reducing from 20 to 0 decreases theprobability that player 2 will try to match player1rsquos price Including the cost of matching in theregression however we find that neither thedummies nor their interactions are signifi-cantmdashthe coefficient for the previously signifi-cant D00 dummy falls almost by a half whilethe precision of the estimate stays about thesame The coefficient on the cost of matchinghowever is significant and negative Thisfinding suggests that the rise in matching that

occurs when we increase from 0 to 20 is notbecause matching is more focal but ratherbecause an increase of decreases the cost ofmatching26

VII Concluding Remarks

The appeal of games with strategic comple-mentarities is simple as long as players areadaptive learners the game will in the limitconverge to the set bounded by the largest andsmallest Nash equilibria This convergence de-pends on neither initial conditions nor the as-sumption of a particular learning modelUnfortunately while many competitive envi-ronments are supermodular many are not Inthis study we examine whether there are non-supermodular games which have the same con-vergence properties as supermodular ones Wealso study whether there exists a clear conver-gence ranking among games with strategiccomplementarities

Our results confirm the findings of previousexperimental studies that supermodular gamesperform significantly better than games far be-low the supermodular threshold From a littlebelow the threshold to the threshold howeverthe change in convergence level is statisticallyinsignificant This results suggests that in thecontext of mechanism design the designerneed not be overly concerned with setting pa-rameters that are firmly above the supermodularthreshold close is just as good It also enlargesthe set of robustly stable games For examplethe Falkinger mechanism (Falkinger 1996) isnot supermodular but close These results sug-gest that near-supermodular games perform likesupermodular ones

Our next result concerns convergence perfor-mance within the class of games with strategiccomplementarities Variations in the degree ofcomplementarities have no significant effect onperformance within the 60 experimental roundsOur simulations suggest however an increased

26 We consider other specifications of anticipated priceincluding the averages of the last one two three and fourprices seen In only one specification was a dummy or itsinteraction with ln(Round) significant In that specificationthe coefficient for D18 is positive and its interaction withln(Round) is negative a finding which is not consistent witha focal point hypothesis

TABLE 10mdashPLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering comparingmodels with and without the cost to Player 2 of matching

the anticipated price of Player 1)

Model

Dependent variableProbability player 2 priceequals anticipated player 1

price

(1) (2)

D10 0014 0017(0043) (0041)

D00 0077 0043(0035) (0036)

D18 0046 0032(0053) (0056)

D40 0008 0041(0061) (0042)

ln(Round) 0041 0028(0011) (0010)

Match Cost 0001(0000)

ln(Round)D10 0014 0012(0012) (0012)

ln(Round)D00 0003 0000(0013) (0012)

ln(Round)D18 0006 0002(0021) (0020)

ln(Round)D40 0001 0015(0018) (0016)

Observations 9588 9588Log pseudo-likelihood 324321 317716

Notes Coefficients reported are probability derivatives Ro-bust standard errors in parentheses are adjusted for cluster-ing at the individual level Significant at 10-percent level 5-percent level 1-percent level Dy is the dummyvariable for treatment y Excluded dummy is D2020Match Cost is the difference in cents of matching andbest-responding to anticipated Player 1 price

1532 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

degree of strategic complementarities leads toimproved convergence in the long run

Finally the generalized compensation mech-anism we use to study convergence has aparameter unrelated to the degree of comple-mentarities We use this parameter to study theeffects factors not related to strategic comple-mentarities on the performance of supermodulargames For a given level of strategic comple-mentarity the factors have partial effects onconvergence level and speed but the effects ofstrategic complementarities are largely robust tovariations in these other factors In the long runthese effects persist and we find evidence thatstrengthening the strategic complementarities inone playerrsquos strategy can partially substitute forthe lack of strategic complementarities in theotherrsquos

A word of caution is in order In a singleexperimental setting it is infeasible to study alarge number of games in a wide range of com-plex environments While this is the first sys-tematic experimental study of the role ofstrategic complementarities in equilibrium con-vergence the applicability of our results toother games needs to be verified in future ex-periments In the only other study of this kindArifovic and Ledyard (2001) examine similarquestions using the Groves-Ledyard mecha-nism Their results are encouraging as theirresults are consistent with ours

APPENDIX PROOF OF PROPOSITION 2

We omit the proofs of parts (i) (ii) and (iv)as they follow directly from the previous anal-ysis and Proposition 1 We now present theproof for part (iii) If players follow Cournotbest-reply dynamics we can rewrite equations(4) and (5) as

p1 t 1 p2 t

p2 t 1 mp1 t n

where m [ (14c)][ (e4c2)] and n (er4c2)[ (e4c2)] The analogous differen-tial equation system is

(12) p1 p2 p1

p2 mp1 p2 n

We now use the Lyapunov second method toshow that system (12) is globally asymptoti-cally stable Define

V p1 p2 p1 p2 2

2

p0

p1

p mp n dp

where p0 0 is a constant We now show thatV(p1 p2) is the Lyapunov function of system(12)

Define G(p1) p0

p1 (p mp n) dp We get

G p1 12

1 mp12 np1

121 mp0

2 np0

As c 0 and e 0 for any 0 we alwayshave m 1 Therefore the function G(p1) hasa global minimum which is characterized bythe following first-order condition

dG p1

dp1 1 mp1 n 0

Therefore p1 n(1 m) er(e c) pis the global minimum When p1 p2 p(p1 p2)22 also reaches its global minimumTherefore V(p1 p2) is at its global minimumwhen p1 p2 p

Next we show that V 0

V V

p1p1

V

p2p2

p1 p2 1 mp1 np2 p1

p2 p1mp1 p2 n

2p1 p22 0

Let B be an open ball around (p p) in theplane For all (p1 p2) (p p) V 0

1533VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Therefore (p p) is a globally asymptoticallystable equilibrium of (12)



(iv) ε-p2: α20β40 > α20β18 > α10β20 > α20β20 > α20β00 > α10β00.

SUPPORT: The fourth and seventh columns of Table 8 report p-values for t-tests of the preceding alternative hypotheses for players 1 and 2, respectively.
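The pairwise comparisons in Table 8 are one-sided t-tests on per-game convergence indicators across simulated games. One plausible implementation uses Welch's two-sample t statistic; the indicator counts below are illustrative, not the paper's 1,500-game output:

```python
import math

def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se2 = va / na + vb / nb
    t = (ma - mb) / math.sqrt(se2)
    # Welch-Satterthwaite degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# One indicator per simulated game: 1 if play was at the (epsilon-)equilibrium
# in round 500. Hypothetical counts for two treatments.
high_beta = [1] * 44 + [0] * 56
zero_beta = [1] * 21 + [0] * 79
t, df = welch_t(high_beta, zero_beta)   # large positive t favors high_beta
```

The one-sided p-value then follows from the t distribution with `df` degrees of freedom.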

Comparing Results 1 and 5, we first note that the four supermodular and near-supermodular treatments continue to dominate the β00 treatments. In fact, the simulations suggest that the gap remains constant compared with α20β00, and actually increases compared with α10β00. Unlike Result 1, however, the four dominant treatments differ. In particular, α20β18 performs better in terms of player 1's convergence, and α20β40 in terms of player 2's, although the differences within the top four treatments are smaller than the differences between these four and the β00 treatments. In addition, an increase in β significantly improves convergence by a large margin (40 to 80 percent) for both players when α = 20.

It is instructive to compare these results with those concerning the speed of convergence in our experimental treatments. In terms of convergence to ε-p2, our regression results (Table 4) indicate that α20β18's performance in later rounds enables it to achieve the same convergence level as the supermodular treatments. This trend continues in our simulations, as α20β18 performs robustly well compared to the supermodular treatments. Likewise, in our analysis of the experimental results, the convergence speed of α20β40 was not significantly different from those of the other dominant treatments. In our simulations, however, this treatment dominates all treatments in player 2 equilibrium play.

FIGURE 3 CONTINUED

1527VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

In our simulations, the convergence improvement due to increasing β from 0 to 20 does depend on α (the α-effect on the β-effect). An increase in α significantly decreases the improvement in convergence level (all p-values for one-sided t-tests are less than 0.01). Due to the relatively poor performance of α10β00 in our simulations, increasing α from 10 to 20 when β = 0 improves convergence more dramatically in round 500 than in rounds 40 to 60. In the long run, an increase in α is a partial substitute for an increase in β.

[FIGURE 4. SIMULATED ε-EQUILIBRIUM PLAY. Two panels (Player 1 and Player 2) plot the percentage of ε-equilibrium play over rounds 0–500 for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]

We now use Definition 2 to examine convergence speed in the long run. Table 9 presents results for the first time a treatment reaches the convergence level L. We omit the two β00 treatments, since they do not converge to the same levels as the other treatments. In terms of player 1, α20β18 achieves the fastest convergence for all levels L. The picture for player 2 convergence, however, is more subtle. First, for the β20 and β18 treatments, the speeds of convergence to all L are remarkably similar. While α20β40 lags the other treatments in terms of achieving 80-percent ε-equilibrium play for player 2, it is the only treatment to achieve L = 90 percent.

Apart from convergence to equilibrium, it is also important to look at long-run welfare properties. We evaluate mechanism performance using the efficiency measure in Section IV.

Figure 5 summarizes the simulated and actual efficiency achieved by each of the treatments. The top panel reports simulated results, while the bottom panel reports efficiency in the experimental data. In the long run, supermodular and near-supermodular treatments continue to differ from the β00 treatments, with α20 superior to α10 in the β00 treatments.

VI. Interpretation and Discussion

The underlying force for convergence in games with strategic complementarities is the combination of the slope of the best-response functions and adaptive learning by players. In

TABLE 9—SPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing the ε-equilibrium strategy; each entry is the round in which the treatment went over the threshold for good, where "—" indicates that the treatment never achieved the threshold.)

             Player 1                  Player 2
Threshold   0.4   0.5   0.6   0.7     0.4   0.5   0.6   0.7   0.8   0.9
α20β18       54    83   126   316      38    52    62    87   127    —
α10β20      129   221    —     —       36    48    63    91   148    —
α20β20       61    91   149    —       39    51    61    84   137    —
α20β40      101   166   263   496      30    38    56    94   166   339
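Table 9's entries record the round in which a treatment went over a threshold "for good." Assuming the input is a per-round series of ε-equilibrium play shares, that statistic can be sketched as:

```python
def first_round_over_for_good(shares, threshold):
    """1-based round from which the share stays at or above `threshold`
    for the rest of the series; None if that never happens."""
    last_below = -1                      # last 0-based index below threshold
    for t, share in enumerate(shares):
        if share < threshold:
            last_below = t
    if last_below + 1 >= len(shares):
        return None                      # still below threshold at the end
    return last_below + 2                # convert 0-based index to 1-based round

# Hypothetical share series: crosses 0.5 in round 2, dips back, then holds.
example = [0.2, 0.5, 0.4, 0.6, 0.7]
```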

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games.)

Equilibrium price:

Treatment   Player 1 prob.   H1                p-value   Player 2 prob.   H1                p-value
α10β00      0.073                                        0.082
α20β00      0.188            α20β00 > α10β00   0.000     0.213            α20β00 > α10β00   0.000
α20β18      0.265            α20β18 > α20β40   0.187     0.381            α20β18 > α10β20   0.264
α10β20      0.211            α10β20 > α20β00   0.000     0.376            α10β20 > α20β20   0.001
α20β20      0.243            α20β20 > α10β20   0.000     0.353            α20β20 > α20β00   0.000
α20β40      0.259            α20β40 > α20β20   0.007     0.442            α20β40 > α20β18   0.000

ε-equilibrium price:

Treatment   Player 1 prob.   H1                p-value   Player 2 prob.   H1                p-value
α10β00      0.216                                        0.232
α20β00      0.519            α20β00 > α10β00   0.000     0.573            α20β00 > α10β00   0.000
α20β18      0.713            α20β18 > α20β40   0.045     0.870            α20β18 > α10β20   0.008
α10β20      0.584            α10β20 > α20β00   0.000     0.857            α10β20 > α20β20   0.000
α20β20      0.662            α20β20 > α10β20   0.000     0.834            α20β20 > α20β00   0.000
α20β40      0.701            α20β40 > α20β20   0.000     0.925            α20β40 > α20β18   0.000


the compensation mechanisms, the slope of player 2's best response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.[22] We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

[FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA. The top panel (Simulation Data) plots the fraction of maximum welfare over rounds 50–450; the bottom panel (Experimental Data) plots it over rounds 0–60, for treatments α10β00, α20β00, α20β18, α10β20, α20β20, and α20β40.]

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.[23] In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.[24] We then
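A minimal sketch of this stochastic matching model and the MSD-based estimation of ε; the 41-point price grid and the play data below are hypothetical:

```python
def matching_probs(predicted, eps, prices):
    """Stochastic matching model: play the predicted price with probability
    1 - eps, and each of the other prices with probability eps / 40
    (for a 41-price grid)."""
    other = eps / (len(prices) - 1)
    return {p: (1 - eps) if p == predicted else other for p in prices}

def msd(observations, eps, prices):
    """Mean squared deviation between the model's probabilities and the
    one-hot observed choices; observations = [(predicted, actual), ...]."""
    total = 0.0
    for predicted, actual in observations:
        probs = matching_probs(predicted, eps, prices)
        total += sum((probs[p] - (1.0 if p == actual else 0.0)) ** 2
                     for p in prices)
    return total / len(observations)

prices = list(range(41))                        # hypothetical price grid
obs = [(20, 20), (20, 20), (20, 19), (18, 18)]  # hypothetical (predicted, actual)
# Grid search for the eps that best fits the observed choices.
best_eps = min((e / 100 for e in range(51)), key=lambda v: msd(obs, v, prices))
```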

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we find that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). We conclude that best response to a predicted price explains behavior significantly better than matching.
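The Wilcoxon comparison operates on paired MSD scores. A minimal sketch of the signed-rank sums (the hold-out MSD numbers are invented for illustration; the p-value step, e.g. a normal approximation or exact table, is omitted):

```python
def signed_rank_sums(xs, ys):
    """Wilcoxon signed-rank sums for paired samples: zero differences are
    dropped, and tied absolute differences receive averaged ranks."""
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of tied absolute differences
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1            # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical paired MSD scores (scaled by 100) on five hold-out sessions.
sfp_msd = [10, 8, 12, 9, 11]
match_msd = [14, 13, 11, 15, 16]
wp, wm = signed_rank_sums(sfp_msd, match_msd)   # small wp: sFP usually lower
```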

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.[25] This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.
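A sketch of the calibration step, assuming a geometrically discounted average as the anticipated-price rule (the paper's exact form is equation (6), which is not reproduced in this excerpt):

```python
def anticipated_price(history, discount):
    """Geometrically discounted average of player 1's past prices: weight
    discount**k on the price seen k rounds ago. This weighting form is an
    assumption for illustration."""
    weights = [discount ** k for k in range(len(history))]
    recent_first = list(reversed(history))
    return sum(w * p for w, p in zip(weights, recent_first)) / sum(weights)

def calibrate_discount(histories, played, grid):
    """Pick the discount rate minimizing the squared deviation between the
    price a matching player 2 would choose and the price actually played."""
    def sse(d):
        return sum((anticipated_price(h, d) - p) ** 2
                   for h, p in zip(histories, played))
    return min(grid, key=sse)
```

With discount 0, only the most recent price matters; with discount 1, all past prices are weighted equally.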

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

[22] We thank an anonymous referee and the co-editor for pointing this out.

[23] We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

[24] The estimated ε in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

[25] In all treatments, the calibrated discount rate is r = 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: thus, reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.[26]

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

[26] We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1.)

Dependent variable: probability that player 2's price equals the anticipated player 1 price.

                           (1)          (2)
D10                       0.014        0.017
                         (0.043)      (0.041)
D00                      −0.077       −0.043
                         (0.035)      (0.036)
D18                       0.046        0.032
                         (0.053)      (0.056)
D40                       0.008        0.041
                         (0.061)      (0.042)
ln(Round)                 0.041        0.028
                         (0.011)      (0.010)
Match Cost                            −0.001
                                      (0.000)
ln(Round) × D10           0.014        0.012
                         (0.012)      (0.012)
ln(Round) × D00           0.003        0.000
                         (0.013)      (0.012)
ln(Round) × D18           0.006        0.002
                         (0.021)      (0.020)
ln(Round) × D40           0.001        0.015
                         (0.018)      (0.016)
Observations              9,588        9,588
Log pseudo-likelihood    −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significant at the 10-percent, 5-percent, and 1-percent levels. Dy is the dummy variable for treatment y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.

1532 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors unrelated to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

$p_1(t+1) = p_2(t),$

$p_2(t+1) = m p_1(t) + n,$

where $m = [\beta - 1/(4c)] / [\beta + e/(4c^2)]$ and $n = [er/(4c^2)] / [\beta + e/(4c^2)]$. The analogous differential equation system is

(12) $\dot{p}_1 = p_2 - p_1,$

$\dot{p}_2 = m p_1 - p_2 + n.$
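The discrete best-reply system can be iterated directly. The sketch below assumes illustrative values for e, c, r, and beta (they are not the experimental parameters) and checks that the map settles on the rest point n/(1 − m) = er/(e + c):

```python
# Cournot best-reply dynamics: p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n.
# e, c, r, beta below are illustrative assumptions, not the paper's values.
e, c, r, beta = 2.0, 1.0, 10.0, 20.0

denom = beta + e / (4.0 * c**2)
m = (beta - 1.0 / (4.0 * c)) / denom   # slope of player 2's best response
n = (e * r / (4.0 * c**2)) / denom     # intercept

p1, p2 = 0.0, 40.0                     # arbitrary starting prices
for _ in range(2000):                  # iterate the best-reply map
    p1, p2 = p2, m * p1 + n

p_star = e * r / (e + c)               # predicted rest point n/(1 - m)
```

With these values m ≈ 0.963 < 1, so every two rounds the deviation from the rest point shrinks by roughly that factor and both prices converge; for m ≥ 1 the same map would diverge, which is the role the m < 1 condition plays in the proof.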

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

$V(p_1, p_2) = \frac{(p_1 - p_2)^2}{2} + \int_{p_0}^{p_1} (p - mp - n) \, dp,$

where $p_0 \geq 0$ is a constant. We now show that $V(p_1, p_2)$ is the Lyapunov function of system (12).

Define $G(p_1) = \int_{p_0}^{p_1} (p - mp - n) \, dp$. We get

$G(p_1) = \tfrac{1}{2}(1 - m)p_1^2 - n p_1 - \tfrac{1}{2}(1 - m)p_0^2 + n p_0.$

As $c > 0$ and $e > 0$, for any $\beta \geq 0$ we always have $m < 1$. Therefore the function $G(p_1)$ has a global minimum, which is characterized by the following first-order condition:

$\frac{dG(p_1)}{dp_1} = (1 - m)p_1 - n = 0.$

Therefore $p_1 = n/(1 - m) = er/(e + c) \equiv p^*$ is the global minimum. When $p_1 = p_2 = p^*$, $(p_1 - p_2)^2/2$ also reaches its global minimum. Therefore $V(p_1, p_2)$ is at its global minimum when $p_1 = p_2 = p^*$.

Next we show that $\dot{V} \leq 0$:

$\dot{V} = \frac{\partial V}{\partial p_1} \dot{p}_1 + \frac{\partial V}{\partial p_2} \dot{p}_2$

$= [(p_1 - p_2) + (1 - m)p_1 - n](p_2 - p_1) + (p_2 - p_1)(m p_1 - p_2 + n)$

$= -2(p_1 - p_2)^2 \leq 0.$

Let $B$ be an open ball around $(p^*, p^*)$ in the plane. For all $(p_1, p_2) \neq (p^*, p^*)$, $\dot{V} < 0$.

1533VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Therefore $(p^*, p^*)$ is a globally asymptotically stable equilibrium of (12).
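The Lyapunov argument can be checked numerically by integrating system (12) with forward Euler and tracking V along the trajectory. The parameter values, starting point, and step size below are illustrative assumptions:

```python
# Forward-Euler integration of system (12): p1' = p2 - p1, p2' = m*p1 - p2 + n,
# tracking the Lyapunov candidate V. Parameters are illustrative assumptions.
e, c, r, beta = 2.0, 1.0, 10.0, 20.0
denom = beta + e / (4.0 * c**2)
m = (beta - 1.0 / (4.0 * c)) / denom
n = (e * r / (4.0 * c**2)) / denom
p_star = e * r / (e + c)               # the equilibrium price p*

def V(p1: float, p2: float, p0: float = 0.0) -> float:
    """V = (p1 - p2)^2 / 2 + integral from p0 to p1 of ((1 - m)p - n) dp."""
    G = 0.5 * (1.0 - m) * (p1**2 - p0**2) - n * (p1 - p0)
    return 0.5 * (p1 - p2) ** 2 + G

p1, p2, dt = 0.0, 40.0, 0.01
v_start = V(p1, p2)
for _ in range(200_000):               # integrate out to t = 2000
    p1, p2 = p1 + dt * (p2 - p1), p2 + dt * (m * p1 - p2 + n)
v_end = V(p1, p2)
```

Along the trajectory V falls from its starting value toward its minimum V(p*, p*), and the state approaches (p*, p*), consistent with the sign of the time derivative of V derived in the proof.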

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.




Page 25: When Does Learning in Games Generate Convergence to Nash ...yanchen.people.si.umich.edu/published/Chen_Gazzale_AER_2004.pdf · When Does Learning in Games Generate Convergence to

improves convergence more dramatically inround 500 than in rounds 40 to 60 In the longrun an increase in is a partial substitute for anincrease in

We now use definition 2 to examine the con-vergence speed in the long run Table 9 presentsresults for the first time a treatment reaches thelevel of convergence L We omit the two 00treatments since they do not converge to thesame levels as the other treatments In terms ofplayer 1 2018 achieves the fastest conver-gence for all levels L The picture for player 2convergence however is more subtle First forthe 20 and 18 treatments the speeds of con-vergence to all L are remarkably similarWhile 2040 lags the other treatments interms of achieving 80-percent -equilibriumplay for player 2 it is the only treatment toachieve L 90 percent

Apart from convergence to equilibrium it isalso important to look at long-run welfare prop-erties We evaluate mechanism performance us-ing the efficiency measure in Section IV

Figure 5 summarizes the simulated and actualefficiency achieved by each of the treatmentsThe top panel reports simulated results whilethe bottom panel reports efficiency in the ex-perimental data In the long run supermodularand near-supermodular treatments continue todiffer from the 00 treatments with 20 supe-rior to 10 in the 00 treatments

VI Interpretation and Discussion

The underlying force for convergence ingames with strategic complementarities is thecombination of the slope of the best-responsefunctions and adaptive learning by players In

TABLE 9mdashSPEED OF CONVERGENCE IN SIMULATED DATA

(Threshold is the percent of players playing epsilon-equilibrium strategy Entry indicates round in which went overthreshold for good where ldquomdashrdquo indicates that treatment never achieved threshold)

Threshold

Player 1 Player 2

04 05 06 07 04 05 06 07 08 09

2018 54 83 126 316 38 52 62 87 127 mdash1020 129 221 mdash mdash 36 48 63 91 148 mdash2020 61 91 149 mdash 39 51 61 84 137 mdash2040 101 166 263 496 30 38 56 94 166 339

TABLE 8—RESULTS OF t-TESTS COMPARING LEVEL OF CONVERGENCE OF SIMULATIONS OF EXPERIMENTAL TREATMENTS

(Values are for round 500 and based on 1,500 simulated games)

                 Player 1 equilibrium price             Player 2 equilibrium price
Treatment   Probability   H1                p-value     Probability   H1                p-value
α10β00      0.073         —                 —           0.082         —                 —
α20β00      0.188         α20β00 > α10β00   0.000       0.213         α20β00 > α10β00   0.000
α20β18      0.265         α20β18 > α20β40   0.187       0.381         α20β18 > α10β20   0.264
α10β20      0.211         α10β20 > α20β00   0.000       0.376         α10β20 > α20β20   0.001
α20β20      0.243         α20β20 > α10β20   0.000       0.353         α20β20 > α20β00   0.000
α20β40      0.259         α20β40 > α20β20   0.007       0.442         α20β40 > α20β18   0.000

                 Player 1 ε-equilibrium price           Player 2 ε-equilibrium price
Treatment   Probability   H1                p-value     Probability   H1                p-value
α10β00      0.216         —                 —           0.232         —                 —
α20β00      0.519         α20β00 > α10β00   0.000       0.573         α20β00 > α10β00   0.000
α20β18      0.713         α20β18 > α20β40   0.045       0.870         α20β18 > α10β20   0.008
α10β20      0.584         α10β20 > α20β00   0.000       0.857         α10β20 > α20β20   0.000
α20β20      0.662         α20β20 > α10β20   0.000       0.834         α20β20 > α20β00   0.000
α20β40      0.701         α20β40 > α20β20   0.000       0.925         α20β40 > α20β18   0.000

1529VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

the compensation mechanisms, the slope of player 2's best-response function is determined by β. As α and β determine the penalty for mismatched prices, we seriously consider the possibility that convergence in supermodular and near-supermodular treatments to an equilibrium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.²² We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

[FIGURE 5. EFFICIENCY IN THE SIMULATED AND ACTUAL DATA. Top panel (simulation data): fraction of maximum welfare, 75 to 100 percent, over rounds 50 to 450. Bottom panel (experimental data): fraction of maximum welfare, 40 to 90 percent, over rounds 0 to 60. Treatments: α10β00, α20β00, α20β18, α10β20, α20β20, α20β40.]

1530 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.²³ In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε and one of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.²⁴ We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using Wilcoxon signed-rank tests to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). That is, best response to a predicted price explains behavior significantly better than matching.
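The stochastic matching model's predicted choice distribution, and a per-observation MSD score of the kind compared above, can be sketched as follows. The function names and the 41-point price grid (0 through 40, implied by "one of the other 40 prices") are our assumptions for illustration, not the authors' code:

```python
# Illustrative sketch of the stochastic matching model and an MSD score.
# PRICES, matching_prediction, and msd_score are our own names; the
# 41-price grid [0, 40] is an assumption inferred from the text.
PRICES = list(range(41))  # assumed discrete price grid

def matching_prediction(predicted_p1, eps):
    """Player 2 matches the predicted player-1 price with prob 1 - eps
    and plays each of the other 40 prices with prob eps / 40."""
    return {p: (1 - eps) if p == predicted_p1 else eps / 40
            for p in PRICES}

def msd_score(prediction, actual_price):
    """Mean squared deviation between the predicted distribution and
    the one-hot vector for the price actually chosen (lower = better)."""
    return sum((prob - (1.0 if p == actual_price else 0.0)) ** 2
               for p, prob in prediction.items()) / len(PRICES)
```

A model fits an observation well when it put high probability on the chosen price: with ε = 0.1 and predicted price 15, the MSD score is much lower when the subject actually plays 15 than when she plays 20.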

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.²⁵ This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate for each t > 1 an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of α and β, and also depends on the price to be matched.
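Equation (6) is not reproduced in this excerpt; a generic geometric-discounting forecast consistent with the description (weight discountᵏ on the price seen k rounds ago) can be sketched as a stand-in:

```python
# Sketch of a discount-weighted anticipated price. This is our
# illustrative stand-in for equation (6), which is not reproduced in
# this excerpt; the function name and weighting form are assumptions.
def anticipated_price(history, discount):
    """history: player 1's past prices, oldest first. Returns the
    discount-weighted average, weighting the price observed k rounds
    ago by discount**k."""
    recent_first = list(reversed(history))
    weights = [discount ** k for k in range(len(recent_first))]
    return (sum(w * p for w, p in zip(weights, recent_first))
            / sum(weights))
```

With a discount rate of 0, this reduces to anticipating the most recently observed price; the calibrated rate of 0.1 reported in footnote 25 likewise puts almost all weight on the last observation.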

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.

²² We thank an anonymous referee and the co-editor for pointing this out.

²³ We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

²⁴ The estimated ε's in each block are as follows. α10β00: 10, 7, 6, 8; α20β00: 7, 6, 5, 4; α20β18: 7, 2, 3, 3; α10β20: 5, 3, 3, 2; α20β20: 1, 0, 0, 0; α20β40: 2, 0, 0, 1.

²⁵ In all treatments, the calibrated discount rate is 0.1.


Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 decreases the probability that player 2 will try to match player 1's price. Including the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant; the coefficient for the previously significant D00 dummy falls almost by half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that occurs when we increase β from 0 to 20 is not because matching is more focal, but rather because an increase in β decreases the cost of matching.²⁶

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends on neither initial conditions nor the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study we examine whether there are non-supermodular games which have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

²⁶ We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant: in that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding which is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability that player 2's price equals the anticipated player 1 price

                        (1)         (2)
D10                    0.014       0.017
                      (0.043)     (0.041)
D00                   −0.077      −0.043
                      (0.035)     (0.036)
D18                    0.046       0.032
                      (0.053)     (0.056)
D40                    0.008       0.041
                      (0.061)     (0.042)
ln(Round)              0.041       0.028
                      (0.011)     (0.010)
Match Cost                        −0.001
                                  (0.000)
ln(Round) × D10        0.014       0.012
                      (0.012)     (0.012)
ln(Round) × D00        0.003       0.000
                      (0.013)     (0.012)
ln(Round) × D18        0.006       0.002
                      (0.021)     (0.020)
ln(Round) × D40        0.001       0.015
                      (0.018)     (0.016)
Observations           9,588       9,588
Log pseudo-likelihood −3,243.21   −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors, in parentheses, are adjusted for clustering at the individual level. Significance levels: 10 percent, 5 percent, and 1 percent. Dy is the dummy variable for treatment y; the excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.


degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter, α, unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging, as they are consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)]/[β + e/(4c²)] and n = (er/(4c²))/[β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m·p1 − p2 + n.
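Before turning to the continuous-time argument, the discrete best-reply system can be checked numerically: with |m| < 1, iterated best replies converge to the rest point er/(e + c). The parameter values below (e, c, r, β) are illustrative assumptions, not the experimental ones.

```python
# Iterating the discrete best-reply system p1(t+1) = p2(t),
# p2(t+1) = m*p1(t) + n. Parameter values are illustrative
# assumptions, not the experimental ones.
e, c, r, beta = 4.0, 2.0, 15.0, 2.0
m = (beta - 1 / (4 * c)) / (beta + e / (4 * c ** 2))
n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))

p1, p2 = 0.0, 40.0            # arbitrary starting prices
for _ in range(200):
    p1, p2 = p2, m * p1 + n   # simultaneous best replies

p_star = e * r / (e + c)      # predicted rest point er/(e + c)
assert abs(m) < 1             # the two-step map p2(t+2) = m*p2(t) + n contracts
assert abs(p1 - p_star) < 1e-6 and abs(p2 - p_star) < 1e-6
```

Because player 1 simply copies player 2's last price, the system reduces every two steps to p2(t+2) = m·p2(t) + n, so |m| < 1 is exactly the contraction condition driving both prices to p*.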

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫ from p0 to p1 of (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function of system (12).

Define G(p1) = ∫ from p0 to p1 of (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − n·p1 − ½(1 − m)p0² + n·p0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, the term (p1 − p2)²/2 also reaches its global minimum. Therefore V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)·ṗ1 + (∂V/∂p2)·ṗ2
  = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(m·p1 − p2 + n)
  = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ < 0. Therefore (p*, p*) is a globally asymptotically stable equilibrium of (12).
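The two key algebraic steps of the proof, that V̇ collapses to −2(p1 − p2)² and that the rest point n/(1 − m) equals er/(e + c), can be verified numerically. The parameter values below are illustrative assumptions:

```python
import random

# Numerical sanity check of the Lyapunov derivation. Parameter values
# (e, c, r, beta) are illustrative assumptions, not the experimental ones.
e, c, r, beta = 4.0, 2.0, 15.0, 2.0
m = (beta - 1 / (4 * c)) / (beta + e / (4 * c ** 2))
n = (e * r / (4 * c ** 2)) / (beta + e / (4 * c ** 2))

def V_dot(p1, p2):
    """Time derivative of V along system (12), assembled from the
    partials dV/dp1 = (p1 - p2) + (1 - m)*p1 - n and dV/dp2 = -(p1 - p2)."""
    dV1 = (p1 - p2) + (1 - m) * p1 - n
    dV2 = -(p1 - p2)
    return dV1 * (p2 - p1) + dV2 * (m * p1 - p2 + n)

# The proof claims V_dot = -2*(p1 - p2)**2 at every point ...
random.seed(0)
for _ in range(1000):
    p1, p2 = random.uniform(-50, 50), random.uniform(-50, 50)
    assert abs(V_dot(p1, p2) + 2 * (p1 - p2) ** 2) < 1e-8

# ... and that the rest point n/(1 - m) equals er/(e + c).
assert abs(n / (1 - m) - e * r / (e + c)) < 1e-9
```

The first check confirms the cross terms involving m and n cancel exactly, so V̇ is negative everywhere off the diagonal p1 = p2; the second confirms the fixed point of the first-order condition matches the equilibrium price.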

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Mitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122, 2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.


Falkinger Josef Fehr Ernst Gachter Simonand Winter-Ebmer Rudlof ldquoA Simple Mech-anism for the Efficient Provision of Pub-lic Goods Experimental Evidencerdquo Ameri-can Economic Review 2000 90(1) pp 247ndash64

Fudenberg Drew and Levine David K ldquoTheTheory of Learning in Gamesrdquo in MIT Pressseries on economic learning and social evo-

1534 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

lution Vol 2 Cambridge MA MIT Press1998

Hamaguchi Yasuyo Maitani Satoshi and SaijoTatsuyoshi ldquoDoes the Varian MechanismWork Emissions Trading as an ExamplerdquoInternational Journal of Business and Eco-nomics 2003 2(2) pp 85ndash96

Harstad Ronald M and Marrese Michael ldquoBe-havioral Explanations of Efficient PublicGood Allocationsrdquo Journal of Public Eco-nomics 1982 19(3) pp 367ndash83

Haruvy Ernan and Stahl Dale ldquoInitial Con-ditions Adaptive Dynamics and FinalOutcomes An Approach to Equilibrium Se-lectionrdquo Unpublished Paper 2000

Ho Teck-Hua Camerer Colin F and ChongJuin-Kuan Economic value of EWA Lite Afunctional theory of learning in games Cal-ifornia Institute of Technology Division ofthe Humanities and Social Sciences WorkingPaper No 1122-2001

Milgrom Paul R and Shannon Chris ldquoMono-tone Comparative Staticsrdquo Econometrica1994 62(1) pp 157ndash80

Milgrom Paul R and Roberts John ldquoRational-izability Learning and Equilibrium inGames with Strategic ComplementaritiesrdquoEconometrica 1990 58(6) pp 1255ndash77

Milgrom Paul R and Roberts John ldquoAdaptiveand Sophisticated Learning in Normal FormGamesrdquo Games and Economic Behavior1991 3(1) pp 82ndash100

Moulin Herve ldquoDominance Solvability andCournot Stabilityrdquo Mathematical Social Sci-ences 1984 7(1) pp 83ndash102

Postlewaite Andrew and Vives Xavier ldquoBankRuns as an Equilibrium Phenomenonrdquo Jour-

nal of Political Economy 1987 95(3) pp485ndash91

Sarin Rajiv and Vahid Farshid ldquoPayoff Assess-ments without Probabilities A Simple Dy-namic Model of Choicerdquo Games andEconomic Behavior 1999 28(2) pp 294ndash309

Siegel Disney and Castellan N John Jr Non-parametric statistics for the behavioral sci-ences 2nd ed New York McGraw-Hill1988

Smith Vernon L ldquoAn Experimental Compari-son of Three Public Good Decision Mecha-nismsrdquo Scandinavian Journal of Economics1979 81(2) pp 198ndash215

Topkis Donald ldquoMinimizing a SubmodularFunction on a Latticerdquo Operations Research1978 26(2) pp 305ndash21

Topkis Donald ldquoEquilibrium Points in Nonzero-Sum N-Person Submodular Gamesrdquo SIAMJournal on Control and Optimization 197917(6) pp 773ndash87

Van Huyck John B Battalio Raymond C andBeil Richard O ldquoTacit Coordination GamesStrategic Uncertainty and Coordination Fail-urerdquo American Economic Review 199080(1) pp 234ndash48

Varian Hal R ldquoA Solution to the Problem ofExternalities When Agents Are Well In-formedrdquo American Economic Review 199484(5) pp 1278ndash93

Vives Xavier ldquoNash Equilibrium in OligopolyGames with Monotone Best Responserdquo Uni-versity of Pennsylvania CARESS WorkingPaper 85-10 1985

Vives Xavier ldquoNash Equilibrium with StrategicComplementaritiesrdquo Journal of Mathemati-cal Economics 1990 19(3) pp 305ndash21

1535VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Page 27: When Does Learning in Games Generate Convergence to Nash ...yanchen.people.si.umich.edu/published/Chen_Gazzale_AER_2004.pdf · When Does Learning in Games Generate Convergence to

rium where both players announce the same price is not due to strategic complementarities, but rather to increases in α and β making the penalty terms more prominent and matching strategies more focal.22 We thus have two hypotheses to explain the observed improvement in convergence to equilibrium play in supermodular and near-supermodular treatments: a best-response hypothesis and a matching hypothesis. In this section we present evidence that the improvement in convergence is due to payoff-relevant changes to best responses, and not to matching becoming more focal.

While the focal point hypothesis suggests that players converge to a match, it does not specify the price at which they match. In the first round of our experiments, almost 50 percent of all prices were 20, and less than 10 percent were in the ε-equilibrium range of [15, 17]. In the final rounds of our experiments, play in the four supermodular or near-supermodular treatments converges strongly to this range, suggesting more than simple matching.

By equation (4), player 1's best response is always to match, regardless of p2. By the General Incentive Hypothesis, we expect more matching behavior by player 1 as α increases. By equation (5), however, matching is a best response for player 2 only in equilibrium. Therefore, data for player 2 can help us separate the best-response and matching hypotheses.

We first investigate which model better explains our experimental data. We operationalize the best-response hypothesis with the stochastic fictitious play model of Section V, and the matching hypothesis with a stochastic matching model.23 In both models, the predicted player 1 price is specified by equation (6). In the matching model, player 2 plays this price with probability 1 − ε, and each of the other 40 prices with probability ε/40. We estimate ε using the first two sessions of each treatment.24 We then

compare the performance of the matching model with the sFP model using the hold-out sample. Using the Wilcoxon signed-rank test to compare the MSD scores in the hold-out sample, we conclude that sFP explains player 2's behavior significantly better than the matching model (p-value = 0.006). Best response to a predicted price thus explains behavior significantly better than matching.
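The stochastic matching model can be sketched as follows. This is a hypothetical illustration: the 41-price grid and the quadratic (MSD) scoring against choice indicators are assumptions standing in for details given earlier in the paper.

```python
import numpy as np

# Hypothetical price grid of 41 prices (the paper's strategy space has
# 41 prices; the exact grid values are an assumption here).
PRICE_GRID = np.arange(0, 41)

def matching_model_probs(anticipated_p1, eps):
    """Stochastic matching model: player 2 plays the anticipated player 1
    price with probability 1 - eps, and each of the other 40 prices with
    probability eps / 40."""
    probs = np.full(PRICE_GRID.size, eps / 40.0)
    probs[PRICE_GRID == anticipated_p1] = 1.0 - eps
    return probs

def msd(model_probs, observed_price):
    """Mean squared deviation between the model's predicted distribution
    and the indicator vector of the observed choice."""
    observed = (PRICE_GRID == observed_price).astype(float)
    return np.mean((model_probs - observed) ** 2)

probs = matching_model_probs(anticipated_p1=16, eps=0.05)
score_match = msd(probs, observed_price=16)  # player 2 matched
score_miss = msd(probs, observed_price=20)   # player 2 did not match
# a lower MSD indicates a better fit, so matching fits better here
```

A model's MSD score over the hold-out sample is then the average of such per-observation scores, which is what the Wilcoxon signed-rank test compares across models.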

We next investigate whether differences in the cost of matching, defined as the difference between matching and best-response profits, can explain all of the differences in matching behavior among treatments, or whether there exists additional matching behavior that cannot be explained by payoff-relevant factors.

We examine the probability that player 2 tries to match the anticipated player 1 price. We do not know what price player 2 anticipates. To estimate how players weigh history, we calibrate a matching model in a manner similar to that outlined in Section V B. We assume a player matches the price he anticipates, given by equation (6), and we find the discount rate for each treatment that minimizes MSD.25 This gives us the anticipated price that best explains the matching hypothesis. Using this discount rate, we calculate, for each t > 1, an anticipated player 1 price p̂1 for each player 2. We also calculate the opportunity cost of matching p̂1, equal to the difference between best-response and matching profits. This opportunity cost captures the payoff-relevant effects of β and also depends on the price to be matched.
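One simple way to operationalize the anticipated price is a geometrically discounted average of the prices player 2 has observed. The sketch below assumes this weighting; the actual forecast in equation (6) is defined earlier in the paper, so this is illustrative only.

```python
def anticipated_price(history, r):
    """Discounted average of observed player 1 prices: the price seen j
    periods ago gets weight r**j, and weights are normalized to sum to 1."""
    weights = [r ** j for j in range(len(history))]  # j = 0 is most recent
    prices = list(reversed(history))                  # history[-1] is most recent
    return sum(w * p for w, p in zip(weights, prices)) / sum(weights)

# With a small discount rate such as r = 0.1, the forecast stays close
# to the most recently observed price:
forecast = anticipated_price([20, 18, 16], r=0.1)  # close to 16
```

A heavily discounted history makes the anticipated price track recent play, which is consistent with calibrating a small discount rate.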

We next regress the probability that player 2 matches the anticipated price, Pr(p2 = p̂1), on ln(Round), treatment dummies, and treatment dummies interacted with ln(Round). In one specification, we include the cost of matching as a regressor. If parameter changes make matching more focal, then changes in the cost of matching will not explain all of the changes in the probability of matching.
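The logic of this test, that a treatment dummy should lose its explanatory power once the payoff-relevant cost channel is controlled for, can be illustrated with simulated data. This is a hypothetical sketch: the numbers are synthetic, the coefficients are made up, and a linear probability model estimated by least squares stands in for the probit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
treated = rng.integers(0, 2, size=n).astype(float)  # e.g., beta = 20 vs. beta = 0
# By construction, treatment affects matching ONLY through the cost of matching.
cost = 10.0 - 5.0 * treated + rng.normal(0.0, 1.0, size=n)
matched = (rng.random(n) < np.clip(0.8 - 0.05 * cost, 0.0, 1.0)).astype(float)

X1 = np.column_stack([np.ones(n), treated])        # dummy only
X2 = np.column_stack([np.ones(n), treated, cost])  # dummy plus cost
b1, *_ = np.linalg.lstsq(X1, matched, rcond=None)
b2, *_ = np.linalg.lstsq(X2, matched, rcond=None)

# Without the cost regressor, the dummy picks up the full effect;
# once cost is included, the dummy's coefficient shrinks toward zero
# and the cost coefficient is negative.
print(b1[1], b2[1], b2[2])
```

The same pattern, a dummy coefficient shrinking once the cost regressor enters, is what Table 10 reports for D00.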

22 We thank an anonymous referee and the co-editor for pointing this out.

23 We do not report the results of the deterministic matching model, as it performs significantly worse than its stochastic counterpart.

24 Estimated ε's in each block are as follows. 1000: 10, 7, 6, 8; 2000: 7, 6, 5, 4; 2018: 7, 2, 3, 3; 1020: 5, 3, 3, 2; 2020: 1, 0, 0, 0; 2040: 2, 0, 0, 1.

25 In all treatments, the calibrated discount rate is r = 0.1.

1531VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Table 10 presents the results of two probit models with clustering at the individual level. In the first specification, we do not include the cost of matching as a regressor. In this specification, the coefficient for D00 is negative and significant: reducing β from 20 to 0 thus decreases the probability that player 2 will try to match player 1's price. When we include the cost of matching in the regression, however, we find that neither the dummies nor their interactions are significant: the coefficient for the previously significant D00 dummy falls by almost half, while the precision of the estimate stays about the same. The coefficient on the cost of matching, however, is significant and negative. This finding suggests that the rise in matching that

occurs when we increase β from 0 to 20 arises not because matching is more focal, but rather because an increase in β decreases the cost of matching.26

VII. Concluding Remarks

The appeal of games with strategic complementarities is simple: as long as players are adaptive learners, the game will, in the limit, converge to the set bounded by the largest and smallest Nash equilibria. This convergence depends neither on initial conditions nor on the assumption of a particular learning model. Unfortunately, while many competitive environments are supermodular, many are not. In this study, we examine whether there are non-supermodular games that have the same convergence properties as supermodular ones. We also study whether there exists a clear convergence ranking among games with strategic complementarities.

Our results confirm the findings of previous experimental studies that supermodular games perform significantly better than games far below the supermodular threshold. From a little below the threshold to the threshold, however, the change in convergence level is statistically insignificant. This result suggests that, in the context of mechanism design, the designer need not be overly concerned with setting parameters that are firmly above the supermodular threshold: close is just as good. It also enlarges the set of robustly stable games. For example, the Falkinger mechanism (Falkinger, 1996) is not supermodular, but it is close. These results suggest that near-supermodular games perform like supermodular ones.

Our next result concerns convergence performance within the class of games with strategic complementarities. Variations in the degree of complementarities have no significant effect on performance within the 60 experimental rounds. Our simulations suggest, however, that an increased

26 We consider other specifications of the anticipated price, including the averages of the last one, two, three, and four prices seen. In only one specification was a dummy or its interaction with ln(Round) significant. In that specification, the coefficient for D18 is positive and its interaction with ln(Round) is negative, a finding that is not consistent with a focal point hypothesis.

TABLE 10—PLAYER 2 MATCHING HYPOTHESIS

(Probit models with individual-level clustering, comparing models with and without the cost to Player 2 of matching the anticipated price of Player 1)

Dependent variable: probability that the player 2 price equals the anticipated player 1 price.

                            (1)          (2)
D10                        0.014        0.017
                          (0.043)      (0.041)
D00                       −0.077       −0.043
                          (0.035)      (0.036)
D18                        0.046        0.032
                          (0.053)      (0.056)
D40                        0.008        0.041
                          (0.061)      (0.042)
ln(Round)                  0.041        0.028
                          (0.011)      (0.010)
Match Cost                             −0.001
                                       (0.000)
ln(Round) × D10            0.014        0.012
                          (0.012)      (0.012)
ln(Round) × D00            0.003        0.000
                          (0.013)      (0.012)
ln(Round) × D18            0.006        0.002
                          (0.021)      (0.020)
ln(Round) × D40            0.001        0.015
                          (0.018)      (0.016)
Observations               9,588        9,588
Log pseudo-likelihood    −3,243.21    −3,177.16

Notes: Coefficients reported are probability derivatives. Robust standard errors in parentheses are adjusted for clustering at the individual level. * Significant at 10-percent level; ** 5-percent level; *** 1-percent level. Dy is the dummy variable for treatment y. Excluded dummy is D2020. Match Cost is the difference, in cents, between matching and best-responding to the anticipated Player 1 price.

1532 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting, it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging and consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof for part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

p1(t + 1) = p2(t),

p2(t + 1) = m·p1(t) + n,

where m = [β − 1/(4c)] / [β + e/(4c²)] and n = [er/(4c²)] / [β + e/(4c²)]. The analogous differential equation system is

(12)  ṗ1 = p2 − p1,

      ṗ2 = m·p1 − p2 + n.
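As a numerical illustration of part (iii), one can iterate the discrete best-reply map directly and confirm convergence to p* = er/(e + c). The parameter values below are arbitrary illustrative choices, not the experimental parameters.

```python
# Iterate p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n for sample parameters.
e, c, r, beta = 2.0, 1.0, 25.0, 20.0        # illustrative values only
denom = beta + e / (4 * c**2)
m = (beta - 1 / (4 * c)) / denom            # m < 1 whenever c > 0, e > 0
n = (e * r / (4 * c**2)) / denom
p_star = e * r / (e + c)                    # equals n / (1 - m)

p1, p2 = 40.0, 5.0                          # arbitrary starting prices
for _ in range(20000):
    p1, p2 = p2, m * p1 + n

print(round(p1, 4), round(p_star, 4))  # prints: 16.6667 16.6667
```

Because m < 1, each two-step composition of the map is a contraction toward the fixed point, so the iteration converges from any starting prices.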

We now use Lyapunov's second method to show that system (12) is globally asymptotically stable. Define

V(p1, p2) = (p1 − p2)²/2 + ∫_{p0}^{p1} (p − mp − n) dp,

where p0 > 0 is a constant. We now show that V(p1, p2) is a Lyapunov function of system (12).

Define G(p1) = ∫_{p0}^{p1} (p − mp − n) dp. We get

G(p1) = ½(1 − m)p1² − np1 − ½(1 − m)p0² + np0.

As c > 0 and e > 0, for any β ≥ 0 we always have m < 1. Therefore the function G(p1) has a global minimum, which is characterized by the following first-order condition:

dG(p1)/dp1 = (1 − m)p1 − n = 0.

Therefore p1 = n/(1 − m) = er/(e + c) ≡ p* is the global minimum. When p1 = p2 = p*, (p1 − p2)²/2 also reaches its global minimum. Therefore V(p1, p2) is at its global minimum when p1 = p2 = p*.

Next we show that V̇ ≤ 0:

V̇ = (∂V/∂p1)ṗ1 + (∂V/∂p2)ṗ2

   = [(p1 − p2) + (1 − m)p1 − n](p2 − p1) + (p2 − p1)(mp1 − p2 + n)

   = −2(p1 − p2)² ≤ 0.

Let B be an open ball around (p*, p*) in the plane. For all (p1, p2) ≠ (p*, p*), V̇ ≤ 0.


Therefore (p*, p*) is a globally asymptotically stable equilibrium of (12).
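As a numerical check of the Lyapunov argument, one can follow an Euler discretization of system (12) and verify that V, computed with the integral term in closed form, falls toward its minimum at (p1, p2) = (p*, p*). Parameter values are again illustrative only.

```python
e, c, r, beta = 2.0, 1.0, 25.0, 20.0        # illustrative values only
denom = beta + e / (4 * c**2)
m = (beta - 1 / (4 * c)) / denom
n = (e * r / (4 * c**2)) / denom
p_star = e * r / (e + c)

def V(p1, p2):
    # (p1 - p2)^2 / 2 plus the integral term in closed form; the constant
    # from p0 is dropped, which shifts V but not its behavior over time.
    return (p1 - p2) ** 2 / 2 + 0.5 * (1 - m) * p1 ** 2 - n * p1

p1, p2, dt = 40.0, 5.0, 0.01                # Euler step for system (12)
values = []
for _ in range(100000):
    values.append(V(p1, p2))
    p1, p2 = p1 + dt * (p2 - p1), p2 + dt * (m * p1 - p2 + n)

# V falls from its initial value toward its minimum, and the state
# approaches (p_star, p_star), consistent with asymptotic stability.
```

Note that the Euler path only approximates the continuous flow, so V is checked for overall descent rather than exact step-by-step monotonicity.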

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games (MIT Press Series on Economic Learning and Social Evolution, Vol. 2). Cambridge, MA: MIT Press, 1998.

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum N-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania, CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.



Milgrom Paul R and Shannon Chris ldquoMono-tone Comparative Staticsrdquo Econometrica1994 62(1) pp 157ndash80

Milgrom Paul R and Roberts John ldquoRational-izability Learning and Equilibrium inGames with Strategic ComplementaritiesrdquoEconometrica 1990 58(6) pp 1255ndash77

Milgrom Paul R and Roberts John ldquoAdaptiveand Sophisticated Learning in Normal FormGamesrdquo Games and Economic Behavior1991 3(1) pp 82ndash100

Moulin Herve ldquoDominance Solvability andCournot Stabilityrdquo Mathematical Social Sci-ences 1984 7(1) pp 83ndash102

Postlewaite Andrew and Vives Xavier ldquoBankRuns as an Equilibrium Phenomenonrdquo Jour-

nal of Political Economy 1987 95(3) pp485ndash91

Sarin Rajiv and Vahid Farshid ldquoPayoff Assess-ments without Probabilities A Simple Dy-namic Model of Choicerdquo Games andEconomic Behavior 1999 28(2) pp 294ndash309

Siegel Disney and Castellan N John Jr Non-parametric statistics for the behavioral sci-ences 2nd ed New York McGraw-Hill1988

Smith Vernon L ldquoAn Experimental Compari-son of Three Public Good Decision Mecha-nismsrdquo Scandinavian Journal of Economics1979 81(2) pp 198ndash215

Topkis Donald ldquoMinimizing a SubmodularFunction on a Latticerdquo Operations Research1978 26(2) pp 305ndash21

Topkis Donald ldquoEquilibrium Points in Nonzero-Sum N-Person Submodular Gamesrdquo SIAMJournal on Control and Optimization 197917(6) pp 773ndash87

Van Huyck John B Battalio Raymond C andBeil Richard O ldquoTacit Coordination GamesStrategic Uncertainty and Coordination Fail-urerdquo American Economic Review 199080(1) pp 234ndash48

Varian Hal R ldquoA Solution to the Problem ofExternalities When Agents Are Well In-formedrdquo American Economic Review 199484(5) pp 1278ndash93

Vives Xavier ldquoNash Equilibrium in OligopolyGames with Monotone Best Responserdquo Uni-versity of Pennsylvania CARESS WorkingPaper 85-10 1985

Vives Xavier ldquoNash Equilibrium with StrategicComplementaritiesrdquo Journal of Mathemati-cal Economics 1990 19(3) pp 305ndash21

1535VOL 94 NO 5 CHEN AND GAZZALE LEARNING IN GAMES

Page 29: When Does Learning in Games Generate Convergence to Nash ...yanchen.people.si.umich.edu/published/Chen_Gazzale_AER_2004.pdf · When Does Learning in Games Generate Convergence to

… degree of strategic complementarities leads to improved convergence in the long run.

Finally, the generalized compensation mechanism we use to study convergence has a parameter unrelated to the degree of complementarities. We use this parameter to study the effects of factors not related to strategic complementarities on the performance of supermodular games. For a given level of strategic complementarity, these factors have partial effects on convergence level and speed, but the effects of strategic complementarities are largely robust to variations in these other factors. In the long run these effects persist, and we find evidence that strengthening the strategic complementarities in one player's strategy can partially substitute for the lack of strategic complementarities in the other's.

A word of caution is in order. In a single experimental setting it is infeasible to study a large number of games in a wide range of complex environments. While this is the first systematic experimental study of the role of strategic complementarities in equilibrium convergence, the applicability of our results to other games needs to be verified in future experiments. In the only other study of this kind, Arifovic and Ledyard (2001) examine similar questions using the Groves-Ledyard mechanism. Their results are encouraging and consistent with ours.

APPENDIX: PROOF OF PROPOSITION 2

We omit the proofs of parts (i), (ii), and (iv), as they follow directly from the previous analysis and Proposition 1. We now present the proof of part (iii). If players follow Cournot best-reply dynamics, we can rewrite equations (4) and (5) as

\[
p_1(t + 1) = p_2(t), \qquad p_2(t + 1) = m\,p_1(t) + n,
\]

where $m \equiv [\beta - 1/(4c)]/[\beta + e/(4c^2)]$ and $n \equiv (er/(4c^2))/[\beta + e/(4c^2)]$. The analogous differential equation system is

\[
\dot{p}_1 = p_2 - p_1, \qquad \dot{p}_2 = m p_1 - p_2 + n. \tag{12}
\]

We now use the Lyapunov second method to show that system (12) is globally asymptotically stable. Define

\[
V(p_1, p_2) = \frac{(p_1 - p_2)^2}{2} + \int_{p_0}^{p_1} (p - mp - n)\, dp,
\]

where $p_0 > 0$ is a constant. We now show that $V(p_1, p_2)$ is the Lyapunov function of system (12).

Define $G(p_1) \equiv \int_{p_0}^{p_1} (p - mp - n)\, dp$. We get

\[
G(p_1) = \tfrac{1}{2}(1 - m)p_1^2 - np_1 - \tfrac{1}{2}(1 - m)p_0^2 + np_0.
\]

As $c > 0$ and $e > 0$, for any $\beta > 0$ we always have $m < 1$. Therefore the function $G(p_1)$ has a global minimum, which is characterized by the following first-order condition:

\[
\frac{dG(p_1)}{dp_1} = (1 - m)p_1 - n = 0.
\]

Therefore $p_1 = n/(1 - m) = er/(e + c) \equiv p^*$ is the global minimum. When $p_1 = p_2 = p^*$, $(p_1 - p_2)^2/2$ also reaches its global minimum. Therefore $V(p_1, p_2)$ is at its global minimum when $p_1 = p_2 = p^*$.

Next we show that $\dot{V} \le 0$:

\[
\dot{V} = \frac{\partial V}{\partial p_1}\dot{p}_1 + \frac{\partial V}{\partial p_2}\dot{p}_2
= [(p_1 - p_2) + (1 - m)p_1 - n](p_2 - p_1) + (p_2 - p_1)(m p_1 - p_2 + n)
= -2(p_1 - p_2)^2 \le 0.
\]

Let $B$ be an open ball around $(p^*, p^*)$ in the plane. For all $(p_1, p_2) \ne (p^*, p^*)$, $\dot{V} < 0$.

VOL. 94 NO. 5 CHEN AND GAZZALE: LEARNING IN GAMES 1533

Therefore $(p^*, p^*)$ is a globally asymptotically stable equilibrium of (12).
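The stability result above is easy to check numerically. The sketch below (not part of the original proof) iterates the discrete best-reply system $p_1(t+1) = p_2(t)$, $p_2(t+1) = m p_1(t) + n$ and confirms convergence to the rest point $p^* = er/(e + c)$. The parameter values for $c$, $e$, $r$, and the mechanism parameter (written here as `beta`) are illustrative choices, not the paper's experimental treatments.

```python
# Simulate Cournot best-reply dynamics for the generalized compensation
# mechanism and check convergence to p* = e*r/(e + c).

def simulate(c=1.0, e=2.0, r=3.0, beta=1.0, p1=10.0, p2=0.0, steps=200):
    """Iterate p1(t+1) = p2(t), p2(t+1) = m*p1(t) + n for `steps` periods."""
    denom = beta + e / (4 * c**2)
    m = (beta - 1 / (4 * c)) / denom   # slope of player 2's best reply
    n = (e * r / (4 * c**2)) / denom   # intercept of player 2's best reply
    for _ in range(steps):
        p1, p2 = p2, m * p1 + n
    return p1, p2

c, e, r = 1.0, 2.0, 3.0
p_star = e * r / (e + c)               # predicted rest point er/(e + c) = 2.0
p1, p2 = simulate(c, e, r)
print(p1, p2, p_star)                  # both announcements converge to p_star
```

Since $m < 1$ for any $\beta > 0$ (given $c, e > 0$), the two-period map $p_1(t+2) = m p_1(t) + n$ is a contraction, so the iteration converges from any starting announcements, mirroring the global stability of the continuous-time system (12).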

REFERENCES

Amir, Rabah. "Cournot Oligopoly and the Theory of Supermodular Games." Games and Economic Behavior, 1996, 15(2), pp. 132–48.

Andreoni, James and Varian, Hal R. "Pre-Play Contracting in the Prisoners' Dilemma." Proceedings of the National Academy of Sciences, 1999, 96(19), pp. 10933–38.

Arifovic, Jasmina and Ledyard, John O. "Computer Testbeds and Mechanism Design: Application to the Class of Groves-Ledyard Mechanisms for Provision of Public Goods." Unpublished paper, 2001.

Boylan, Richard T. and El-Gamal, Mahmoud A. "Fictitious Play: A Statistical Study of Multiple Economic Experiments." Games and Economic Behavior, 1993, 5(2), pp. 205–22.

Camerer, Colin F. Behavioral game theory: Experiments in strategic interaction (Roundtable Series in Behavioral Economics). Princeton: Princeton University Press, 2003.

Camerer, Colin F. and Ho, Teck-Hua. "Experience-Weighted Attraction Learning in Normal Form Games." Econometrica, 1999, 67(4), pp. 827–74.

Chen, Yan. "A Family of Supermodular Nash Mechanisms Implementing Lindahl Allocations." Economic Theory, 2002, 19(4), pp. 773–90.

Chen, Yan. "Incentive-Compatible Mechanisms for Pure Public Goods: A Survey of Experimental Research," in C. R. Plott and V. L. Smith, eds., The handbook of experimental economics results. Amsterdam: Elsevier, forthcoming.

Chen, Yan. "Dynamic Stability of Nash-Efficient Public Goods Mechanisms: Reconciling Theory and Experiments," in Rami Zwick and Amnon Rapoport, eds., Experimental business research, Vol. 2. Norwell and Dordrecht: Kluwer Academic Publishers, in press.

Chen, Yan and Plott, Charles R. "The Groves-Ledyard Mechanism: An Experimental Study of Institutional Design." Journal of Public Economics, 1996, 59(3), pp. 335–64.

Chen, Yan and Tang, Fang-Fang. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study." Journal of Political Economy, 1998, 106(3), pp. 633–62.

Chen, Yan and Khoroshilov, Yuri. "Learning under Limited Information." Games and Economic Behavior, 2003, 44(1), pp. 1–25.

Cheng, Qiang John. "Essays on Designing Economic Mechanisms." Unpublished paper, 1998.

Cheung, Yin-Wong and Friedman, Daniel. "Individual Learning in Normal Form Games: Some Laboratory Results." Games and Economic Behavior, 1997, 19(1), pp. 46–76.

Cooper, Russell and John, Andrew. "Coordinating Coordination Failures in Keynesian Models." Quarterly Journal of Economics, 1988, 103(3), pp. 441–63.

Cox, James C. and Walker, Mark. "Learning to Play Cournot Duopoly Strategies." Journal of Economic Behavior and Organization, 1998, 36(2), pp. 141–61.

Diamond, Douglas W. and Dybvig, Philip H. "Bank Runs, Deposit Insurance, and Liquidity." Journal of Political Economy, 1983, 91(3), pp. 401–19.

Diamond, Peter A. "Aggregate Demand Management in Search Equilibrium." Journal of Political Economy, 1982, 90(5), pp. 881–94.

Dybvig, Philip H. and Spatt, Chester S. "Adoption Externalities as Public Goods." Journal of Public Economics, 1983, 20(2), pp. 231–47.

Erev, Ido and Haruvy, Ernan. "On the Potential Value and Current Limitations of Data-Driven Learning Models." Unpublished paper, 2000.

Falkinger, Josef. "Efficient Private Provision of Public Goods by Rewarding Deviations from Average." Journal of Public Economics, 1996, 62(3), pp. 413–22.

Falkinger, Josef; Fehr, Ernst; Gächter, Simon and Winter-Ebmer, Rudolf. "A Simple Mechanism for the Efficient Provision of Public Goods: Experimental Evidence." American Economic Review, 2000, 90(1), pp. 247–64.

Fudenberg, Drew and Levine, David K. The theory of learning in games. MIT Press Series on Economic Learning and Social Evolution, Vol. 2. Cambridge, MA: MIT Press, 1998.

1534 THE AMERICAN ECONOMIC REVIEW DECEMBER 2004

Hamaguchi, Yasuyo; Maitani, Satoshi and Saijo, Tatsuyoshi. "Does the Varian Mechanism Work? Emissions Trading as an Example." International Journal of Business and Economics, 2003, 2(2), pp. 85–96.

Harstad, Ronald M. and Marrese, Michael. "Behavioral Explanations of Efficient Public Good Allocations." Journal of Public Economics, 1982, 19(3), pp. 367–83.

Haruvy, Ernan and Stahl, Dale. "Initial Conditions, Adaptive Dynamics, and Final Outcomes: An Approach to Equilibrium Selection." Unpublished paper, 2000.

Ho, Teck-Hua; Camerer, Colin F. and Chong, Juin-Kuan. "Economic Value of EWA Lite: A Functional Theory of Learning in Games." California Institute of Technology, Division of the Humanities and Social Sciences Working Paper No. 1122-2001.

Milgrom, Paul R. and Shannon, Chris. "Monotone Comparative Statics." Econometrica, 1994, 62(1), pp. 157–80.

Milgrom, Paul R. and Roberts, John. "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities." Econometrica, 1990, 58(6), pp. 1255–77.

Milgrom, Paul R. and Roberts, John. "Adaptive and Sophisticated Learning in Normal Form Games." Games and Economic Behavior, 1991, 3(1), pp. 82–100.

Moulin, Hervé. "Dominance Solvability and Cournot Stability." Mathematical Social Sciences, 1984, 7(1), pp. 83–102.

Postlewaite, Andrew and Vives, Xavier. "Bank Runs as an Equilibrium Phenomenon." Journal of Political Economy, 1987, 95(3), pp. 485–91.

Sarin, Rajiv and Vahid, Farshid. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice." Games and Economic Behavior, 1999, 28(2), pp. 294–309.

Siegel, Sidney and Castellan, N. John, Jr. Nonparametric statistics for the behavioral sciences, 2nd ed. New York: McGraw-Hill, 1988.

Smith, Vernon L. "An Experimental Comparison of Three Public Good Decision Mechanisms." Scandinavian Journal of Economics, 1979, 81(2), pp. 198–215.

Topkis, Donald. "Minimizing a Submodular Function on a Lattice." Operations Research, 1978, 26(2), pp. 305–21.

Topkis, Donald. "Equilibrium Points in Nonzero-Sum n-Person Submodular Games." SIAM Journal on Control and Optimization, 1979, 17(6), pp. 773–87.

Van Huyck, John B.; Battalio, Raymond C. and Beil, Richard O. "Tacit Coordination Games, Strategic Uncertainty, and Coordination Failure." American Economic Review, 1990, 80(1), pp. 234–48.

Varian, Hal R. "A Solution to the Problem of Externalities When Agents Are Well Informed." American Economic Review, 1994, 84(5), pp. 1278–93.

Vives, Xavier. "Nash Equilibrium in Oligopoly Games with Monotone Best Response." University of Pennsylvania CARESS Working Paper 85-10, 1985.

Vives, Xavier. "Nash Equilibrium with Strategic Complementarities." Journal of Mathematical Economics, 1990, 19(3), pp. 305–21.
