
BASIC GAME THEORY

AN INTRODUCTORY TEXT ON MATHEMATICAL MODELING OF DECISION MAKING

ELI C.C. BYRNE


Dedicated to my wife Caryl Sue, who has always encouraged me to do things my own way, my advisor Leon Vaserstein, for whom the expression “former advisor” is as absurd as “former father”, my daughters Keely, Adrienne, and Grace, who started my teaching career almost 30 years ago, and, last but not least, my students, who make teaching a perennially rewarding pursuit.


Preface

Audience: This book is written first and foremost for my own teaching of Math 486 at Penn State University. It is further written for anyone who likes my organization and presentation of basic introductory game theory material. I undertook to write this book because no single text in existence included the selection of topics I like to cover in an introductory course and, perhaps in small part, to improve the presentation of this material to the audience I have come to expect in my class: a mixture of junior and senior undergraduates and a handful of graduate students from a variety of majors including math, engineering, biology, education, business, economics and operations research.

Perspective: The perspective that guides the choice of material and its presentation is first of all that the goals of teaching are

i) to develop students’ abilities for independent learning and critical thinking, more so than to merely convey information

ii) to make complicated ideas seem simpler, not to make simple ideas seem complicated

iii) in keeping with (i) and (ii), to introduce students to a sampling of the most prevalent models in game theory in their simplest form, leaving the more complicated refinements and extensions of these models as topics for future study or independent research that the students can pursue when they are so motivated.


Prerequisites: The course is primarily about using math to model applied problems, rather than about learning new mathematical theory or machinery. Advanced mathematics requiring specialized training is kept to a minimum.

Specifically, the most advanced topics expected of the students’ background are:

i) using derivatives to find local extrema,

ii) using matrix notation to represent linear operations and systems of equations,

iii) solving systems of linear equations,

iv) elementary probability and combinatorics (e.g. counting choices)

This does not make the material easy – it merely reduces the specialized prerequisites to make the material accessible to a wide variety of students. The challenging part of this course comes from three sources:

i) thinking in abstraction and understanding the maps between concise mathematical models and the complicated world they represent,

ii) employing logic to understand and sometimes produce proofs and/or solve problems that require logical derivation more so than formulaic computation, and

iii) combining (i) and (ii) to identify the implicit assumptions of models; i.e. using the logic skills of (ii) to derive properties of the “real world” that must be true to make the maps of (i) valid and/or consistent.


Legend to Symbols and Notation

Mathematics is a language that makes use of numerous symbols for convenience and efficiency and also for clarity and disambiguation. The list that follows is a key of mathematical symbols.

Symbol    Meaning
{A | B}   a set A of objects satisfying the qualifications B
∀         "for all"
Σ         a sum over an implicitly or explicitly indexed set
⇒         "implies"
∃         "there exist(s)"
∋         "such that"
≈         "approximately"
·         scalar multiplication or vector dot product, according to context
d/dx      the derivative operator
∇         the gradient operator
×         Cartesian product
∅         the empty set
∈         "is an element of"
∉         "is not an element of"
|A|       cardinality of the set A (number of elements, if A is finite)
f: X → Y  the function f maps elements of X to elements of Y
∩         intersection
∪         union
⊇         weak superset
⊃         strict superset
⊆         weak subset
⊂         strict subset
⊄         not strict subset
=, ≠, <, >, ≤, ≥   these symbols are used with the usual arithmetic meaning


Other Notation and Reference System

Many figures in this book labeled “Figure x.y.z” are in fact numerical or graphical examples of ideas discussed in the text. References elsewhere in the text use the expression “example x.y.z” in place of “figure x.y.z” to avoid clumsy expressions such as “the example of figure x.y.z”. Some examples requiring lengthy text have their own subsection heading “x.y.z Example”, so the expression “example x.y.z” in the text could refer to a subsection or a figure. Context and the TOC easily resolve any ambiguity.

Finally, this is a draft whose most urgent purpose is to document what I am currently teaching in lectures and to offer supplemental explanation and examples. There are ideas of others that I informally cite from memory but for which I have not yet included precise references. Don't let that stop any reader from looking up these references for themselves, and if any citations are incorrect or missing, I would be most grateful to have such errors and omissions pointed out so I can correct them.


Table of Contents

PREFACE
LEGEND TO SYMBOLS AND NOTATION
TABLE OF CONTENTS
LIST OF FIGURES
1 INTRODUCTION
1.1 BASICS OF MATHEMATICAL MODELING
1.2 DEFINITION AND CATEGORIES OF GAMES
1.3 HISTORY OF GAME THEORY
1.4 APPLICATIONS OF GAME THEORY
2 GAMES IN EXTENSIVE FORM
2.1 POSITIONS, MOVES, INFORMATION AND RECALL
2.1.2 MODELING SIMULTANEITY
2.1.3 MODELING RANDOMNESS
2.2 STRATEGIES AND STRATEGY PROFILES
2.3 NASH EQUILIBRIUM
2.4 BACKWARD INDUCTION
2.4.2 SUB-GAME PERFECTION
3 GAMES IN STRATEGIC FORM
3.1 STRATEGIC FORM – PROS AND CONS
3.2 PURE STRATEGY EQUILIBRIA IN STRATEGIC FORM
3.2.2 EXERCISES (EXTRACTED TO ANGEL)
3.3 DOMINATION OF STRATEGIES
3.3.1 EXERCISES (EXTRACTED TO ANGEL)
3.4 MORE ON MIXED STRATEGIES
3.5 MIXED STRATEGIES IN STRATEGIC FORM GAMES
3.6 MIXED NASH EQUILIBRIA IN STRATEGIC FORM GAMES
3.7 ALGORITHM TO FIND MIXED NASH EQUILIBRIA
3.8 TWO-PERSON ZERO-SUM GAMES
3.8.2 THE VALUE OF A GAME
3.8.3 LINEAR PROGRAMMING AND GAMES
3.8.4 GRAPHICAL METHOD FOR 2 X N AND M X 2 GAMES
3.8.5 EXERCISES (EXTRACTED TO ANGEL)
3.9 FAMOUS STRATEGIC FORM GAMES
3.9.1 PRISONER'S DILEMMA
3.9.2 BATTLE OF THE SEXES (COORDINATION)
3.9.3 HAWK DOVE
3.9.4 ROCK SCISSORS PAPER
3.9.5 STOP SIGN GAME
3.9.6 CHICKEN
3.9.7 MATCHING PENNIES
3.9.8 EXERCISES
4 GAMES WITH A CONTINUUM OF STRATEGIES
4.1 FINDING NASH EQUILIBRIA IN POLYNOMIAL GAMES
4.2 POLYNOMIAL GAME NASH EQUILIBRIUM EXAMPLE
4.3 COURNOT DUOPOLY MODEL
4.4 HOTELLING DUOPOLY MODEL
4.5 HIDE AND SEEK (A.K.A. SEARCH AND DESTROY)
4.6 GAMES OF TIMING
4.6.1 DUEL
4.6.2 MARKET PREEMPTION
4.6.3 WAR OF ATTRITION
4.7 A SIMPLE VOTING GAME
4.8 AUCTIONS
4.8.1 1ST PRICE AUCTION
4.8.2 2ND PRICE AUCTION
5 REPEATED GAMES AND ADAPTIVE LEARNING
5.1 BACKWARD INDUCTION
5.1.1 CHAIN STORE PARADOX
5.2 INFINITELY REPEATED GAMES AND DISCOUNT FACTOR
5.2.1 FOLK THEOREMS
5.3 REPEATED PRISONER'S DILEMMA
5.3.1 AXELROD'S TOURNAMENTS AND TIT-FOR-TAT
5.4 ADAPTIVE AND REINFORCEMENT LEARNING
5.4.1 BROWN'S METHOD OF FICTITIOUS PLAY
5.4.2 LOCAL INTERACTION MODELS AND IMITATION
6 COOPERATIVE GAME THEORY
6.1 JOINT STRATEGIES
6.2 IMPUTATIONS
6.3 CHARACTERISTIC FUNCTIONS AND SUPERADDITIVITY
6.3.2 EXERCISES
6.4 CHARACTERISTIC FUNCTION FORM GAMES
6.5 ESSENTIAL AND INESSENTIAL GAMES
6.6 DOMINANCE, COALITIONAL RATIONALITY, THE CORE
6.6.1 EXAMPLE OF AN EMPTY CORE
6.6.2 EXAMPLE OF AN ALL ENCOMPASSING CORE
6.6.3 EXAMPLE OF A 0-DIMENSIONAL CORE
6.6.4 EXAMPLE OF A 1-DIMENSIONAL CORE
6.6.5 EXAMPLE OF A NON-TRIVIAL 2-DIMENSIONAL CORE
6.7 STRATEGIC EQUIVALENCE AND NORMALIZATION
6.7.1 EXAMPLE OF NORMALIZING A GAME
6.8 SHAPLEY VALUES
6.8.1 EXAMPLE OF SHAPLEY VALUE COMPUTATION
6.9 SIMPLE GAMES, COMPOUND GAMES, ELECTIONS
6.9.1 LAKE WOBEGONE LOCAL GOVERNMENT
6.10 NASH ARBITRATION
6.10.1 THEOREM (NASH ARBITRATION)
6.10.2 EXAMPLE OF NASH ARBITRATION
6.10.3 SHAPLEY PROCEDURE
6.10.4 EXERCISES
7 EVOLUTIONARY GAME MODELS
7.1 ONE-POPULATION MODELS WITHOUT MUTATION
7.1.1 HAWK-DOVE EVOLUTION DYNAMICS
7.2 EVOLUTIONARY GAMES AS DYNAMICAL SYSTEMS
7.2.1 DYNAMIC EQUILIBRIA
7.2.2 STABLE AND UNSTABLE EQUILIBRIA
7.2.3 ATTRACTORS AND BASINS OF ATTRACTION
7.2.4 REPEATED PRISONER'S DILEMMA EVOLUTION
7.2.5 QUASI-MUTATION
7.2.6 ROCK–SCISSORS–PAPER EVOLUTION DYNAMICS
7.2.7 FINDING ONE-POPULATION DYNAMIC EQUILIBRIA
7.3 TWO-POPULATION EVOLUTION MODELS
7.3.1 TWO POPULATION NUMERICAL EXAMPLE
7.3.2 FINDING TWO-POPULATION DYNAMIC EQUILIBRIA
7.3.3 TWO POPULATION PREDATOR – PREY EXAMPLE
7.4 COMPARING NASH AND EVOLUTIONARY EQUILIBRIA


List of Figures

FIGURE 2.1 EXTENSIVE FORM COMPLETE INFORMATION GAME
FIGURE 2.2 EXTENSIVE FORM INCOMPLETE INFORMATION GAME
FIGURE 2.3 EXTENSIVE FORM INCOMPLETE RECALL GAME
FIGURE 2.4 SIMULTANEOUS MOVE GAME IN EXTENSIVE FORM
FIGURE 2.5 EXTENSIVE FORM GAME WITH RANDOM 'MOVES'
FIGURE 2.6 BRANCHING IN BACKWARD INDUCTION
FIGURE 2.7 NON-SUBGAME PERFECT NASH EQUILIBRIUM
FIGURE 3.1 STRATEGIC FORM OF EXAMPLE 2.4
FIGURE 3.2 BEST RESPONSES IN STRATEGIC FORM OF 2.7
FIGURE 3.3 NON-UNIQUENESS OF STRATEGIC FORM
FIGURE 3.4 3-PLAYER GAME IN STRATEGIC FORM
FIGURE 3.5 WEAK DOMINATION AND MIXED STRATEGIES
FIGURE 3.6 COMBINATORIAL POSSIBILITIES FOR MIXED EQUILIBRIA
FIGURE 3.7 STRATEGIC FORM GAME WITH MANY MIXED EQUILIBRIA
FIGURE 3.8 TABLE ANALYSIS OF EXAMPLE 3.7
FIGURE 3.9 CODES IN EXAMPLES 3.7 – 3.8
FIGURE 3.10 MORE MULTIPLE MIXED EQUILIBRIA
FIGURE 3.11 TABLE ANALYSIS OF THE GAME IN FIGURE 3.10
FIGURE 3.16 PLAYER 1'S LINEAR PROGRAM (LP)
FIGURE 3.17 PLAYER 2'S LINEAR PROGRAM (LP)
FIGURE 3.18 GAME 1.2.1 REVISITED
FIGURE 3.19 PLAYER 1'S LP FOR GAME 1.2.1 / 1.2.3
FIGURE 3.20 PLAYER 2'S LP FOR GAME 1.2.1 / 1.2.3
FIGURE 3.21 2 X N ZERO SUM GAME
FIGURE 3.22 GRAPHICAL SOLUTION FOR X* AND V
FIGURE 3.25 TRADITIONAL BATTLE OF THE SEXES GAME
FIGURE 3.26 ALTERNATE BATTLE OF THE SEXES GAME
FIGURE 3.27 SYMMETRIC HAWK DOVE GAME
FIGURE 3.28 ROCK SCISSORS PAPER GAME
FIGURE 3.29 STOP SIGN GAME
FIGURE 3.30 CHICKEN
FIGURE 3.31 MATCHING PENNIES
FIGURE 6.1 STRATEGIC FORM COOPERATIVE GAME
FIGURE 6.2 TWO-PLAYER ZERO-SUM GAME DETERMINING V({2,3})
FIGURE 6.3 GRAPHICAL SOLUTION FOR V({2,3})
FIGURE 6.4 TWO-PLAYER ZERO-SUM GAME DETERMINING V({1})
FIGURE 6.5 GRAPHICAL SOLUTION FOR V({1})
FIGURE 6.6 ZERO-SUM GAMES YIELDING CHARACTERISTIC VALUES
FIGURE 6.7 CHARACTERISTIC VALUES FROM FIGURE 6.1/6.6 GAME
FIGURE 7.1 HAWK-DOVE NUMERICAL EXAMPLE
FIGURE 7.2 HAWK-DOVE EVOLUTION (V=2, C=3, W=1, H0=3/4, D0=1/4)
FIGURE 7.3 HAWK-DOVE EVOLUTION (V=2, C=3, W=1, H0=1/5, D0=4/5)
FIGURE 7.4 HAWK-DOVE EVOLUTION (V=2, C=3, W=5, H0=1/5, D0=4/5)
FIGURE 7.5 HAWK-DOVE EVOLUTION (V=2, C=4, W=5, H0=1/5, D0=4/5)
FIGURE 7.6 HAWK-DOVE EVOLUTION (V=2, C=8, W=5, H0=1/5, D0=4/5)
FIGURE 7.7 PURE ESS IN A SINGLE POPULATION MODEL
FIGURE 7.8 FOSTER & YOUNG 10-ROUND PRISONER'S DILEMMA
FIGURE 7.10 ROCK SCISSORS PAPER AS A GENERATION GAME
FIGURE 7.11 ROCK SCISSORS PAPER EVOLUTION DYNAMICS
FIGURE 7.12 EQUILIBRIUM POSSIBILITIES FOR EXAMPLE 7.3.1


1 Introduction

This chapter offers a brief overview of the mental and cultural landscape in which game theory originally took root and continues to grow. Growth is driven partly by applications, as researchers in a variety of fields see mathematical commonalities with their problems but a need to adjust the modeling assumptions to improve the fit, and partly by the inevitable desire to remove the limitations of assumptions that provided necessary focus to achieve strong results in the earlier development of the theory.

Young students should realize as early as possible that mathematics is not a set of dogma created before the dawn of time and handed down faithfully through the ages. Rather, it is a language for describing ideas and logical relationships between ideas. The language has been invented gradually over time with contributions by thousands of thinkers. The existence of a common, international language of mathematics has enabled the development and documentation of a body of ideas and results regarding relationships between ideas that are known to be logically valid because the language of mathematics is capable of clearly representing rigorous logical thought.

The special quality of mathematics as a language is that it enforces unambiguous description. Young students are no doubt familiar with the expression "not well-defined". In plain English or any other vernacular, people are free to do their best at describing any topic of interest, and the potential for multiple, incomplete, or paradoxically inconsistent interpretations is sometimes tolerated as a routine occupational hazard and sometimes embraced as an art form, the latter especially in humor and also poetry and song. In mathematics, one is not at liberty to say absolutely anything – one can only say what can be made unambiguous. An idea is not valid math until it is well defined; i.e., unambiguously defined, at least at some level of precision.

One could say that mathematics is the only field in which anything can be absolutely proven – that all other sciences merely amass compelling experimental evidence but never escape the possibility of a counterexample. This is because mathematical results are not theorems about the state of the world, the state of the universe, or the state of reality at any level. Mathematical results tell you that if one thing is true, then something else is also true. The “one thing” is referred to as the hypothesis and the “something else” is referred to as the conclusion. It is up to those who apply mathematics in the world to worry about whether and when a hypothesis is true and, therefore, a corresponding conclusion is guaranteed.

It is precisely because of the formal system of definitions in mathematics that it is possible to state results with absolute certitude. An object or process in the physical world exists whether we can define it correctly or not, and has properties whether we can detect or understand them or not. The objects or processes discussed in mathematics, and the words and symbols used to define them, have absolutely no meaning except that meaning given to them by their definitions. This feature is what enables the mathematician to know that he or she has, literally, "thought of everything": the scope of relevant information has been kept manageable by limiting the definition to a short list of considerations. That is to say, mathematical objects are ideas – they are structured imaginings. "Exercise your imagination" is a common exhortation to children. In mathematics the expression becomes a rigorous challenge to imagine objects or processes with sometimes layered, complicated and seemingly impossible or paradoxical properties which are only successfully imagined after tenacious focus, arduous logical thought and sometimes luck conspire to evoke realizations and visualizations that exactly satisfy a predefined goal. The common metaphor "mental gymnastics" is an apt description of the controlled imagination that constitutes mathematical thought.

Mathematics majors probably know this already, but for my readers with limited experience with mathematics, it will help you digest this book, as well as any other math text, if you remember that everything hinges on definitions. Whenever you feel "stuck", whether trying to solve an exercise or trying to understand the statement or proof of a theorem, my advice is to elaborate the definitions of all the terminology in use. Every mathematical noun more specific than "entity" can be replaced with a less specific noun along with a specific set of criteria that must be satisfied. Elaborating these definitions is very helpful in tracking the layered meanings and, by repetition, making the required imaginations more salient, as a psychologist or neuroscientist might say.

1.1 Mathematical Modeling as an Approach to Analyze the World

The essence of mathematical modeling is to create a surrogate for something in the real world by using mathematical assumptions to describe it, thereby defining a model. We analyze the model as a proxy for the real subject of interest, with the understanding that so long as our assumptions are an accurate description of the real subject, the logical implications derived from our model will hold true for our real subject as well.

In practice, the assumptions of a model never capture all the details of the subject under study, and this is no more a flaw of the approach than it is the whole point. It has been said that all models are wrong, but some are more useful than others.

Two important ways to approximate reality with a model are by choice of scope and by choice of fidelity. Consider a model that uses variables to describe dynamic system properties (i.e., properties that can change). Variables whose values are computed by the model are called endogenous to the model, and variables whose values are injected into the model as inputs are called exogenous to the model. System properties that are left out of a model entirely are externalities. For example, a transportation model might consider the costs of vehicle purchase and maintenance, gas and oil, insurance, and manpower for loading and unloading trucks, and a company might use such a model to minimize costs by choosing the routes for its trucks to minimize mileage. The available routes and associated distances, vehicle prices, gas and oil prices, vehicle fuel efficiency, average maintenance cost per mile, and labor and insurance rates are all exogenous because their values must be supplied and cannot be computed from knowing the routes. Gas, oil, and maintenance costs are endogenous and can be computed from the distance traveled, which is chosen within the model. The cost of maintaining the roads is an externality if the model does not include that impact, as is often the case. If all roads had variable tolls that accurately paid for the maintenance impact of every vehicle, then these toll rates would be exogenous and the portion of road maintenance costs paid for using each road would be endogenous.

Externalities can be intentional, to deemphasize or even mask costs that an enterprise or activity is creating. A possible motive for such masking would be to avoid complaints from the person, people, or enterprise that will have to pay the costs if nobody notices. This is quite often the case with industrial activities that generate air and water pollution requiring expensive mitigation or increased health care costs or both, decrease property values, and impact food supplies and entire associated supply chains, including restaurants, intra-city transportation, entertainment, and quality of life impacts that are difficult to quantify in dollars and cents. The 2010 British Petroleum (BP) offshore spill near the Mississippi river delta had enormous costs to a wide range of commercial and governmental enterprises in New Orleans and the surrounding area. Where there is lost commercial revenue, there is lost tax revenue, in addition to other direct and indirect government costs for cleanup, healthcare, environmental mitigation, and so on. BP had a strong profit motive to leave as many costs out of any damage models as possible, though this is not to say that this motive governed any decisions. The mere existence of this motive, and similar motives in countless business situations, creates the need for analysts who understand mathematical modeling and how to identify externalities and their impact on model results. Externalities are not always wrong; they are a modeling choice that might or might not be appropriate. If a model is being used to optimize the variable costs of a short time horizon decision, then whether fixed costs are included as exogenous or excluded as externalities is essentially irrelevant to the application at hand.
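To make the exogenous/endogenous/externality distinction concrete, here is a minimal Python sketch of the trucking model just described; all names, prices, and routes are hypothetical illustrations of the idea, not data from the text.

```python
# Minimal sketch of a trucking cost model: exogenous inputs are supplied,
# endogenous quantities are computed, and externalities appear nowhere.

def route_cost(miles,                 # chosen within the model (the decision)
               gas_price=3.50,        # exogenous: $/gallon, supplied not computed
               mpg=8.0,               # exogenous: vehicle fuel efficiency
               maint_per_mile=0.15,   # exogenous: average maintenance $/mile
               labor_per_trip=120.0): # exogenous: loading/unloading labor $
    """Variable cost of a route; fuel and maintenance are endogenous."""
    fuel = miles / mpg * gas_price        # endogenous: computed from mileage
    maintenance = miles * maint_per_mile  # endogenous: computed from mileage
    # Road wear imposed on the public is an externality: it appears nowhere
    # in this model, so minimizing route_cost ignores it entirely.
    return fuel + maintenance + labor_per_trip

routes = {"highway": 410, "direct": 350, "scenic": 390}  # miles, exogenous
best = min(routes, key=lambda r: route_cost(routes[r]))
print(best, round(route_cost(routes[best]), 2))  # cheapest route and its modeled cost
```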

The other important modeling concept discussed in this introduction is fidelity, which means the level of detail captured (i.e., described) by a model. A high-fidelity model captures a deep level of detail and a low-fidelity model captures only the very general features of the subject of study. In an applied setting, the most useful model is the one that captures the details critical to answering the questions at hand while ignoring all details not relevant to those questions. The ideal model could be described as an optimal simplification of the problem at hand: as simple as possible without ignoring anything important.

For example, Newtonian mechanics is a famous model of the physical world. By studying Newton's model of the physical world, people have designed steam engines, air travel, and numerous other technologies. Yet Newtonian mechanics does not model the weak force or the nuclear force or the extremes of the relation between time and speed. If one is aiming an artillery shell, Newton's model produces a result which is highly accurate in the context of the application, and any details about the nuclear forces or speed effects on time would excessively complicate the model with no worthwhile improvement in the accuracy of the result. If one is considering the behavior of atomic particles in an accelerator, Newton's model is not useful because it does not address details that are critical to the questions under study.

An important distinction is that precision and accuracy are not synonyms in the domain-specific jargon of modeling. Precision is a synonym for fidelity and refers to the level of detail specified. Accuracy refers to "correctness" or "truth" relative to some external standard. If the true time is 12:00:01 PM and somebody asks "what time is it?", then "midday" is a rather imprecise but highly accurate answer, while the answer 4:37:04.987344456 AM is both extremely precise and extremely inaccurate. In some signal processing applications, it is common to model the echo of a pure tone off a reflective object at a great distance from the receiver as a single returning pure tone from a single location, but when the object is at a closer distance to the receiver, to model the echo as a temporally distributed collection of returning tones from spatially distributed locations on the object.

In practice, whether or not to invest in extra precision is a question of whether increased precision will change the bottom-line answer to the question being asked. Automobiles typically have a low fidelity odometer to display the approximate life of the car in miles traveled and a higher fidelity trip odometer in tenths of miles to enable the driver to distinguish between multiple opportunities to turn that are all within a one mile stretch of road.

Fidelity can also be related to the earlier discussion of cost models and externalities. In a cost minimization application, there can be costs that are truly variable but whose variation is minuscule compared to the variation of other costs. Regarding the almost-fixed costs as fixed could be thought of as treating their variations as externalities, but could also be regarded as a choice of fidelity.

These concepts should be kept in mind when studying game theory, which is the study of mathematical models of decision scenarios.

1.2 Definition of Games and Scope of Game Theory

To keep this introduction free of formal mathematical notation, a definition in English of games is given here which is consistent with the symbolic definitions given later in the text, and general enough to include the different varieties of models studied as games. A decision scenario is defined as a non-empty set of decision makers, called players, each of whom has a non-empty set of decisions to make, along with a set of outcomes which affect all players. A mathematical game is a model of a decision scenario which represents outcomes as numerical payoffs. The numbers might or might not have meaning in an absolute way, but the inequalities between the payoffs always model the valuations of the players. That is, a higher payoff from one outcome over another models that a player prefers the one outcome to the other. In a natural way, the relative magnitude of a payoff difference can model the strength of a player's preference for one outcome over another.

As discussed in the preceding remarks on modeling, there are many assumptions that go into defining a particular decision scenario and corresponding game. The set of players is often finite, though in principle it could be countably or uncountably infinite. The costs of implementing different alternatives or the probability of implementation error is often assumed away as a choice of fidelity by the modeler, though these issues are dealt with by some models.

The decision algorithm(s) of the players are not necessarily a part of the game, but they are necessary to predict the outcome of a game, which is often a key goal of analysis. Mathematician and Nobel laureate John Nash, one of the founders of the modern theory, argued that the boundaries of what we call game theory should be defined by the game and not the players, and that investigations of the consequences of different player models interacting with a game are an integration of game theory and psychology. Nash's founding work assumed pure optimization of payoffs, which provides a fixed lens, so to speak, through which to examine all games. Psychologist and Nobel laureate Herb Simon introduced the concept of bounded rationality, arguing that humans are not capable of the memory capacity and computational speed implicit in the assumption of optimization. Simon noted the widespread use of "rules of thumb", or heuristics, for decision making, especially in limited information situations, but also in situations where more information and processing would be possible for an additional investment of time, effort, and other information and processing costs. Nash's perspective emphasizes analysis of the game itself, as a puzzle or math problem, in terms of the strategic opportunities that the game presents, whereas Simon's perspective emphasizes the realism of predictions derived in an applied setting.

The applied predictive value of game theory can be improved even more by recognizing that an implicit assumption of the 1-dimensional numerical payoff for each player is that each player’s preferences over outcomes are well-ordered. This means that for any two outcomes, each player strictly prefers one to the other or is indifferent, and, moreover, these preferences are transitive; i.e. if A is preferred over B which is preferred over C, then A is preferred over C. This assumption precludes modeling scenarios in which outcomes can affect us in a variety of contradictory ways.

Evolutionary psychology suggests that different functions of our brain have evolved to respond to different environmental stimuli. For example, part of our brain responds to our need for food and environmental opportunities for food, while another part may respond to threats to our physical safety, and a third may respond to opportunities to mate [Minsky 1989]. Consider a situation in which, given a binary choice between food and safety, one prefers safety; a mating opportunity is preferred over physical safety; but food is preferred over a mating opportunity. This could occur if the response to a mating opportunity induced a feeling of invincibility, but hunger suppressed the response to a mating opportunity. The classical theory cannot model any scenario in which outcome A is preferred over outcome B, B is preferred over outcome C, and C is preferred over A.
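A tiny sketch can make the limitation concrete: no assignment of 1-dimensional numeric payoffs can reproduce the cyclic preferences just described. The outcome names and the brute-force check below are my own illustration.

```python
from itertools import permutations

# Hypothetical cyclic preferences from the text: safety over food,
# mating over safety, and food over mating.
prefers = {("safety", "food"), ("mating", "safety"), ("food", "mating")}

def consistent(utility):
    """True if a single numeric payoff reproduces every stated preference."""
    return all(utility[a] > utility[b] for a, b in prefers)

outcomes = ["food", "safety", "mating"]
rankings = [dict(zip(outcomes, p)) for p in permutations(range(3))]
print([u for u in rankings if consistent(u)])   # [] -- no ranking works
```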

Some researchers are beginning to extend models to enable non-well-ordered (multi-dimensional) preferences [Byrne 1995, Byrne & Kurland 2001, Achampong 2009, Byrne, Achampong & Haney 2009] and also dynamic preferences, which are the next natural extension and are also consistent with evolutionary psychology and the adaptive learning literature. Research in agent systems, swarms, finite state automata, and neural networks focuses on the emergent dynamics resulting from local neighborhood interactions of distributed agents with (possibly) multidimensional preferences. Much of this research has been created and published by computer scientists working independently from the game theory community; both research efforts could be greatly enhanced by cross-pollination of ideas, and the emphasis on multidisciplinary research in grant programs can help realize this potential.

The technical material covered in this text is limited to the historical models with 1-dimensional (real valued) payoffs and fixed preferences, and readers should not consider these wrong or useless. They are very valuable for situations where these assumptions are valid. Readers are encouraged to ask of every model presented: "what are some realistic scenarios in which this model makes sense?", "what are some realistic scenarios in which this model does not make sense?", "how could violation of each assumption alter the prediction of the model?", and "how could the model be altered to make the assumptions better fit a scenario?".

1.3 Categories of Games

Games can be distinguished by several characteristics, so there is not a single strict hierarchy of games, but rather several categories of games, some of which are overlapping.

Finite games are finite in every way: the number of players, the number of decisions, and the number of alternatives.

Infinite games can be infinite due to infinitely many alternatives for one or more decisions, or due to infinitely many decisions for one or more players, or even due to infinitely many players. Infinite games are often created from finite games by allowing repetition.

Strategic form games use the concept of strategy to reduce the multiple decisions of all players into a single decision by each player, made by all players simultaneously, enabling representation of the entire game by a payoff function mapping strategy choices into payoffs. For a discrete set of players this payoff function is typically represented by a matrix with one dimension for each player, indexed by the strategies of that player, with the payoffs given in each cell as a vector.
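For instance, a two-player strategic form game can be stored directly as one payoff matrix per player, so a strategy profile is just a pair of indices; the numbers in this minimal Python sketch are hypothetical.

```python
# One payoff matrix per player, indexed by (player 1's strategy,
# player 2's strategy); a cell's payoff vector is read across the matrices.
payoffs = {
    1: [[4, -1],
        [-3, 2]],   # player 1's payoffs
    2: [[-3, 2],
        [4, -1]],   # player 2's payoffs
}
row, col = 0, 1     # a profile: player 1 plays row 0, player 2 plays column 1
print((payoffs[1][row][col], payoffs[2][row][col]))   # payoff vector (-1, 2)
```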

Extensive form games are directed graphs in which each node with successors represents a decision point and each node with no successors represents an outcome and has a payoff vector. Extensive form games explicitly show the sequencing of decisions (moves).

Page 26: Game theory, byrne

12

Static games are those which, when played, are played just once. Dynamic games are those which extend in time, although this notion can sometimes be a matter of scale. Strategic form games can be regarded as the only truly static games. Any extensive form game explicitly models sequencing of moves and therefore has a dynamic aspect to the game, but some extensive form games are only a few moves long and are usually treated as static games. In multi-stage dynamic games, each sequential stage of the game is referred to as a period or a stage and the subgame that occurs in a period is called the stage game. A period could be simply a move by any player and the subgame is thus a one player game. A period could be a longer sequence of moves or a set of simultaneous moves by some or all of the players.

Repeated games are dynamic games made from static (e.g. strategic form) games by repeating the static game in each period. Total payoffs are computed as a function of stage game payoffs, typically an average or weighted sum. These games are also called supergames and are the usual context in which reputation effects are studied.
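As a sketch of the two usual conventions, the snippet below computes a discount-weighted sum and a plain average of hypothetical stage game payoffs; the discount factor delta is assumed to lie strictly between 0 and 1.

```python
def discounted_total(stage_payoffs, delta=0.9):
    """Weighted sum: the stage-t payoff is weighted by delta**t."""
    return sum(p * delta**t for t, p in enumerate(stage_payoffs))

def average_payoff(stage_payoffs):
    """Plain average of the stage game payoffs."""
    return sum(stage_payoffs) / len(stage_payoffs)

stages = [3, 3, 1, 3]   # hypothetical payoffs from four repetitions
print(discounted_total(stages), average_payoff(stages))
```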

Evolutionary games are dynamic models in which a stage game is played by one or more populations of strategies and the populations adjust based on the relative success of the member strategies. Evolutionary games are used to model Darwinian evolution and are also used by analogy to model "cultural inheritance"; e.g., adoption of technology such as cell phones, smart phones, or social network providers. Evolutionary games are also used as computer learning algorithms; e.g., genetic algorithms.

Page 27: Game theory, byrne

13

Cooperative vs non-cooperative games are distinguished by whether communication, commitments, and side payments are available to the players.

Signaling games put severe restrictions on communication and do not include side payments.

Constant sum games, including zero-sum games, are those in which the aggregate payoff is constant across all outcomes, so some player(s) must receive less for any player(s) to receive more.

Two-player zero-sum is a special category in which the players’ interests are always diametrically opposed. These games are called strictly competitive for this reason.

General sum games have no restrictions on the relation of payoffs across the players. Within these broad categories there are subcategories, such as coordination games, where players try to coordinate their actions without the benefit of communication.

Complete information and incomplete information refer to whether players always know their exact position when choosing their move. Chess has complete information, but blackjack does not.

Incomplete recall is a special case of incomplete information where players fail to retain information that they possessed earlier in the game.

Imperfect information refers to the possibility of players receiving incorrect information, usually with some probability of being correct or incorrect.

Page 28: Game theory, byrne

14

1.4 History of Game Theory

The earliest examples of game theory usually cited are models of capitalist competition by Cournot and Bertrand in the 19th century and Hotelling in the early 20th century. Zermelo's 1913 paper on chess is also an early example. Many of the ideas of game theory are as ancient as war and business, two major applications of the theory. John von Neumann and Oskar Morgenstern first declared its existence as a theory with their 1944 book Theory of Games and Economic Behavior, in which the theory of two-person zero-sum games was treated thoroughly, including the definition of optimal mixed strategies and the value of a game, and the proof that these always exist for every such game.

Von Neumann and Morgenstern’s book, combined with the military interest in operations research driven by World War II, sparked a concerted effort to develop the theory. Luce and Raiffa, John Nash, Kuhn and Tucker and many other works followed quickly. The theory includes contributions from biologists including John Maynard-Smith, Richard Dawkins, Robert Trivers, and Amotz Zahavi, psychologists, notably Herbert Simon, Tversky and Kahneman who all won Nobel prizes in economics for their contributions. Computer scientists have made important contributions, such as Tim Roughgarden’s analysis of the cost of chaos. Robert Axelrod and other political scientists have used game theory to study the emergence of cooperation in competitive environments. Architecture faces game theory in how the design of a building affects the actions of people using it. Civil and industrial engineers address game theory problems in how the layout of a city or a factory affects the actions of occupants.

Page 29: Game theory, byrne

15

Public policy questions, from energy to transportation to welfare, all pose game theory problems.

Anyone whose work focuses on or is affected by human or animal behavior or automated decision making can use game theory, at least now and then, if not as a primary approach. Indeed, as humans who must make dozens if not hundreds of small and sometimes big decisions every day of our lives, we can all relate to the objective of game theory, if not its methods. The biggest limitation to the application of game theory is that imposed by the assumptions, which, once you understand the principles, can be adjusted and the variant model reanalyzed.

The interdisciplinary nature of the subject has led to the formation of the Game Theory Society (GTS). For decades, game theorists were a sparse minority in every academic department and therefore at every traditional subject based conference, such as math or political science. At GTS conferences, everyone is studying game theory, as a research method if not a primary focus of inquiry, and nearly every academic discipline is represented. Those interested in further study will find their website very helpful: www.gts.org.

Page 30: Game theory, byrne

16

2 Finite Games in Extensive Form

2.1 Positions, Moves, Information and Recall

A finite extensive form game G = (I, Γ) is a finite set I = {1,…,n} of players and a directed graph Γ satisfying

1) Γ has exactly one start node with no predecessors,

2) every node in Γ can be reached by traversing edges from the start node,

3) every node with no successors has an associated vector of real numbers, one for each player,

4) every node with successors is labeled to reference exactly one player i ∈ I.

The start node denotes the starting position of the game. The nodes with no successors are called terminal nodes and represent final outcomes of the game and therefore have payoff vectors associated with them. The labels denoting players indicate which player "moves" at each non-terminal node; that is, which player chooses amongst the successor nodes.
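The four conditions can be checked mechanically. Below is a minimal Python sketch, assuming the graph is stored as an adjacency map; the node names, owners, and payoff vectors are hypothetical.

```python
from collections import deque

succ = {"root": ["a", "b"], "a": ["t1", "t2"], "b": ["t3"]}  # directed edges
owner = {"root": 1, "a": 2, "b": 2}       # condition 4: mover at each decision node
payoff = {"t1": (3, -1), "t2": (0, 2), "t3": (1, 1)}         # condition 3

def is_extensive_form(succ, owner, payoff, start="root"):
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    preds = {v for vs in succ.values() for v in vs}
    if {n for n in nodes if n not in preds} != {start}:
        return False                          # condition 1: unique start node
    seen, frontier = {start}, deque([start])  # condition 2: reachability
    while frontier:
        for v in succ.get(frontier.popleft(), []):
            if v not in seen:
                seen.add(v)
                frontier.append(v)
    if seen != nodes:
        return False
    # conditions 3 and 4: terminals carry payoffs, non-terminals carry owners
    return all((n in owner) if succ.get(n) else (n in payoff) for n in nodes)

print(is_extensive_form(succ, owner, payoff))   # True
```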

Texts differ on whether cycles are allowed or even whether any node can be reached by more than one path. This text will allow both of these phenomena but point out the modeling implications. A node represents all the information a player has about his or her position, so reaching a node by more than one path indicates that the full history of play is not available to the deciding player at that node or any subsequent nodes reached from that node.

Page 31: Game theory, byrne

17

Reaching the same node by multiple paths models that a player does not know by which path the node was reached. This is called incomplete information. If history lost includes one or more of a player’s own moves, this is called incomplete recall. Two different but related concepts are imperfect information and imperfect recall. Incomplete information means the players always have correct information, but not all information, and incomplete recall means players can correctly recall some of the history of play but not all of it. Imperfect information means players receive information about their position that is incorrect with some positive probability and imperfect recall means players make mistakes in recalling the history of play.

Multiple paths to a given node are one type of incomplete information, specifically, with the history of play being the missing information. There can be other types of incomplete information as well, such as the payoffs or which player moves next. The general method of representing incomplete information is to group sets of nodes into information sets. The meaning of an information set is that if the true position is at any node in the set, the player whose move it is knows the position is a node in the set but does not know which node in the set. The case discussed above, of incomplete information regarding the history of play, could be modeled by showing distinct positions for the distinct histories and grouping the distinct positions into a single information set, but sometimes using a single node yields a clearer representation. Of course, there are logical limitations to what nodes can be grouped into an information set. The nodes must be positions at which the same player makes the move, and that player must at the very least have the same number of choices at each node, because the set of move choices is always known, so two nodes with differing choices can always be distinguished – even under incomplete information – so long as the information is not imperfect (Figures 2.1, 2.2).

Cycles in an extensive form game model incomplete recall, and also admit the possibility of infinite play via cycling. In an infinite game it is necessary to specify payoffs when no terminal node is reached, and the convention is to set them to zero.

Figure 2.1 Extensive Form Complete Information Game

Page 33: Game theory, byrne

19

Example 2.1 above shows a game with 3 players, one of whom could have more than one move, depending on the moves of the other players.

Figure 2.2 Extensive Form Incomplete Information Game

In example 2.2, the positions are the same, but player II does not know whether player I or player III has passed her the move, and player III does not know whether player I or player II has passed her the move.

Page 34: Game theory, byrne

20

Figure 2.3 Extensive Form Incomplete Recall Game

In Example 2.3 if player I gets a second move, she does not know whether player II or player III has passed her the move. This demonstrates incomplete recall in that player I does not remember her own past choice of a or b that would otherwise enable her to derive which player passed the move.

Page 35: Game theory, byrne

21

Figure 2.4 Simultaneous Move Game in Extensive Form

2.1.2 Modeling Simultaneity

Incomplete information is used to model simultaneous move games in extensive form. The extensive form structure forces a sequential “picture” of a game, but if the players do not know each other’s moves, then it is equivalent to simultaneous moves from a tactical or strategic point of view (Figure 2.4).


Figure 2.5 Extensive Form Game with Random 'Moves'

2.1.3 Modeling Randomness

Randomness, such as the shuffle of a deck of cards or a roll of dice, can also be modeled in extensive form by adding an extra player, called Nature, which makes a move corresponding to each possible outcome with a corresponding probability. In example 2.5, the first player to move is chosen randomly.


The two differences between Nature and other players are that

1) Nature always moves according to a fixed probability distribution whereas other players can choose any available move, so Nature’s move is not included in strategy profiles

2) Nature does not receive a payoff.

The shuffling of the deck at the beginning of a hand of poker or other card games is a common example of random moves. There are 52! shuffles, so drawing the extensive form is not practical, but symmetry and other properties can be used to break down the analysis so that a computer can be used to navigate the extensive form [Hundal].
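As a sketch of how Nature's fixed distribution enters the analysis, the recursion below replaces a chance node with the probability-weighted average of its subtrees' payoff vectors; the node format and numbers are hypothetical, loosely echoing example 2.5.

```python
def expected_payoffs(node):
    """Payoff vector of a subtree, averaging over Nature's fixed distribution."""
    if "payoff" in node:                          # terminal node
        return node["payoff"]
    if node["player"] == "Nature":                # fixed probabilities, no choice
        kids = [(p, expected_payoffs(child)) for p, child in node["moves"]]
        n = len(kids[0][1])                       # number of players
        return tuple(sum(p * v[i] for p, v in kids) for i in range(n))
    raise ValueError("decision nodes need a strategy to resolve")

chance = {"player": "Nature",
          "moves": [(2/3, {"payoff": (0, -1)}),   # probabilities are exogenous
                    (1/3, {"payoff": (3, 1)})]}
print(expected_payoffs(chance))                   # (1.0, -0.3333...)
```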

2.2 Strategies and Strategy Profiles

Definition 2.1 Strategy

Given an extensive form game, a strategy for a player is a choice of a move at every information set where the player must make a move, including those which are impossible to reach due to earlier moves.

A strategy is a contingency plan specifying what move will be made in each and every possibility of what the other players do. A strategy can be likened to a set of instructions that can be handed off to a representative, who could then play the game by looking up each position at each turn and making the move specified for its information set.

It is generally the case that some positions of a game will never be reached in any given play of the game, but to completely define a strategy for a player, decisions must be specified for every node in the tree where that player makes a move.

In example 2.1 above, agg, agh, ahg, ahh, bgg, bgh, bhg, bhh are strategies for player I; cc, cd, dc, dd are strategies for player II; and ee, ef, fe, ff are strategies for player III. What these strategies have in common is that they are deterministic; that is, there is no randomization. A strategy could specify a random decision at a node; e.g., in example 2.2, ½(c), ½(d) is a strategy for player II.

Definition 2.2 Pure Strategy

A pure strategy is a strategy that dictates a definite move choice at each information set.

Definition 2.3 Behavioral Strategy

A behavioral strategy is a strategy that specifies a probability distribution at each information set, rather than a definite choice.

In examples 2.1 and 2.2, a behavioral strategy for player I is 1/2(a),1/2(b) | 2/3(g),1/3(h) | 1/4(g),3/4(h), meaning player I randomizes 1/2,1/2 for the choice of whether to move a or b, 2/3,1/3 for the choice of whether to move g or h after a (implicitly treating this as the 2nd decision point), and 1/4,3/4 for the choice of whether to move g or h after b (implicitly treating this as the 3rd decision point). Any particular set of moves specified for Nature to model exogenous randomization constitutes a behavioral strategy.

A third type of strategy is a mixed strategy,

Definition 2.4 Mixed Strategy

A mixed strategy is a probability distribution over the set of pure strategies.


It is easily verified by the reader that this definition indeed satisfies the definition of a strategy by specifying a decision for a player at every node where the player ever makes a decision. The distribution 1/3 agg, 2/3 bhg is an example of a mixed strategy for player I in example 2.2. The difference between behavioral strategies and mixed strategies is subtle: the probability distributions at each information set in a behavioral strategy are independent. In particular, they do not depend on the previous choices of the player. Mixed strategies place probabilities on entire sets of choices together, and therefore the probability distributions that are induced, or realized, at each information set can be dependent on the previous choices of the player.

In a game with complete recall, there is a unique history of a player's own moves that makes reaching any given information set possible, and therefore the question of a randomization at the information set depending on the player's own previous choices taken to get there is moot. In this case, the set of all behavioral strategies for a player and the set of all mixed strategies for that player are equivalent, in the sense that any mixed strategy induces a behavioral strategy, and any behavioral strategy can be induced by one or more mixed strategies.
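As a small worked illustration of the induced behavioral strategy, the sketch below uses the mixture 1/3 agg, 2/3 bhg mentioned earlier; encoding each pure strategy as a (first move, move after a, move after b) triple is my own assumption about the tree's layout.

```python
# The mixed strategy from the text, as probabilities over pure strategies.
mixed = {("a", "g", "g"): 1/3, ("b", "h", "g"): 2/3}

# Induced behavioral probability of the first move a: a marginal probability.
p_a = sum(p for s, p in mixed.items() if s[0] == "a")

# Induced probability of g at the information set reached after playing a:
# a conditional probability, conditioning on the history that leads there.
p_g_after_a = sum(p for s, p in mixed.items()
                  if s[0] == "a" and s[1] == "g") / p_a

print(p_a, p_g_after_a)   # 0.333..., 1.0 -- g is certain once a was chosen
```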

In a game with incomplete recall, there can be multiple sets of previous choices leading up to an information set, and a player cannot use mixed strategies that induce probabilities dependent on information the player cannot recall. In example 2.3, ag and bh are legitimate pure strategies, because player I is moving "always left" (g) or "always right" (h); however, the mixed strategy 1/3 ag, 2/3 bh is not possible. This is because the induced behavioral strategy would not only dictate moving g 1/3 of the time and h 2/3 of the time (which is a feasible behavior), but would further dictate always moving g when a was player I's earlier choice and always moving h when b was player I's earlier choice, and this behavior is rendered infeasible by the lack of memory modeled by the information set. Remember that in this small example game tree such poor memory might seem like clinical amnesia, but the model is high-level, with no indication of time duration. Can you remember what you had for lunch exactly 3 weeks ago today? One year ago today?

Definition 2.5 Strategy Profile

A strategy profile for a game specifies a strategy for each player, and is usually notated as a vector; e.g., (ajk, cm, eg) denotes a strategy profile of pure strategies for the game of Figure 2.6 below. In general, if σi denotes a mixed strategy for player i, i = 1,…,n, then σ = (σ1,…,σn) denotes a strategy profile. Given a strategy profile, one can determine the outcome and corresponding payoffs of a game by tracing the path through the game graph that the decisions in the players' strategies lead to. The nodes that are reached by a given strategy profile are called on path and the nodes that are not reached are called off path.
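This tracing procedure is easy to sketch in Python, treating each pure strategy as a lookup table from positions to moves, as described in the previous section; the tree and the numbers are hypothetical.

```python
tree = {
    "root": {"player": 1, "moves": {"a": "n1", "b": "n2"}},
    "n1":   {"player": 2, "moves": {"c": "t1", "d": "t2"}},
    "n2":   {"payoff": (2, -1)},
    "t1":   {"payoff": (4, -3)},
    "t2":   {"payoff": (-1, 2)},
}

def play(tree, profile, node="root"):
    """Follow each player's instructions until a terminal node is reached."""
    current = tree[node]
    while "payoff" not in current:
        move = profile[current["player"]][node]   # look up this position
        node = current["moves"][move]
        current = tree[node]
    return current["payoff"]   # nodes visited were on path; the rest are off path

profile = {1: {"root": "a"}, 2: {"n1": "d"}}
print(play(tree, profile))    # (-1, 2)
```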

2.3 Nash Equilibrium

Having defined a strategy profile, one of the central concepts of game theory can now be described: Nash equilibrium, named for John Nash, one of the seminal contributors to game theory. A Nash equilibrium is a strategy profile in which no single player can increase its payoff by unilaterally changing strategies, that is, by changing its strategy while all other players' strategies remain fixed. Another way to phrase this property is that each player's strategy is a best response to the strategies of the other players.

Nash equilibrium has become the backbone of modern economic theory and plays a key role in evolutionary biology and sociology and every other discipline where game theory is applied. A Nash equilibrium is called a pure strategy (respectively, mixed, behavioral) Nash equilibrium, if it specifies a pure strategy (respectively mixed, behavioral) for every player. Note that pure strategies can be regarded as special cases of mixed strategies and also as special cases of behavioral strategies.

A few key concepts will be formalized for future reference.

Definition 2.6 Σi will denote player i's set of strategies. Whether this includes pure strategies only or also mixed or behavioral strategies will depend on context and will be specified as needed. Σ = Σ1 × … × Σn is the set of strategy profiles; i.e., σ = (σ1,…,σn) ∈ Σ, where σi ∈ Σi is a strategy for the ith player. The fact that a strategy profile determines an outcome payoff is represented by a payoff function

π : Σ → Rn

where π(σ) = (π1(σ),…,πn(σ))

and πi(σ) is the payoff to player i given that the players follow σ, that is, player i plays σi. It will also be useful to let ~i denote all the players except the ith player, σ~i denote a strategy profile for all players except the ith player, and Σ~i denote the set of all such profiles. Using this notation, a Nash equilibrium can be defined as a strategy profile σ* with the property that

∀i ∈ I, πi(σi*, σ~i*) ≥ πi(σi, σ~i*) ∀σi ∈ Σi.


1.7 Backward Induction

Given a strategy profile, one can verify whether it is a Nash equilibrium simply by testing, one player at a time, whether any player can improve its payoff by switching strategies. Finding a Nash equilibrium in a game generally requires some analysis. Backward induction is an algorithm that will find a pure strategy Nash equilibrium in any finite extensive form game of complete information. Backward induction begins at “penultimate” nodes; i.e. nodes at which the player’s move ends the game. The algorithm assumes that the players at these nodes move to maximize payoff. The move choices are noted and the penultimate nodes are replaced with terminal nodes using the payoffs from the moves that were assumed. The game is now one move shorter and the procedure repeats.
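To make the procedure concrete, here is a minimal sketch of backward induction in Python. The tree encoding, the function name, and the example game (the two-stage game that appears later as Figure 1.7) are my own illustrative choices, not the text’s:

def backward_induction(node):
    """Return (payoff_vector, strategy) for the subtree rooted at node.

    A node is a leaf {"payoffs": (...)} or a decision point
    {"id": ..., "player": ..., "moves": {move: subtree}}.
    Ties are broken arbitrarily here; the text's full algorithm
    branches over ties to find every such equilibrium.
    """
    if "payoffs" in node:                        # terminal node: game is over
        return node["payoffs"], {}
    player = node["player"]
    strategy, best_move, best_payoffs = {}, None, None
    for move, child in node["moves"].items():
        payoffs, sub = backward_induction(child)
        strategy.update(sub)                     # keep choices in every subtree (off path too)
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_move, best_payoffs = move, payoffs
    strategy[node["id"]] = best_move
    return best_payoffs, strategy

# Player 0 picks A or B; player 1 then moves (payoffs as reconstructed in Figure 1.7).
game = {"id": "root", "player": 0, "moves": {
    "A": {"id": "after A", "player": 1, "moves": {
        "C": {"payoffs": (-5, -5)}, "D": {"payoffs": (1, 10)}}},
    "B": {"id": "after B", "player": 1, "moves": {
        "E": {"payoffs": (-5, -5)}, "F": {"payoffs": (10, 1)}}}}}
print(backward_induction(game))
# ((10, 1), {'after A': 'D', 'after B': 'F', 'root': 'B'})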


Figure 1.6 Branching in backward induction

Because the game is finite, the algorithm will end in finitely many steps. If a player is ever faced with equal payoffs from two or more choices at a node, the algorithm branches and must be carried out again for each alternate move of each player at each node where the player is indifferent. In the game of Figure 2.6, backward induction branches once, leading to the two strategy profiles (aik,dm,eg) and (ajk,cm,eg).

The move choices noted at each step of the algorithm, when taken as a set for each player, define a strategy profile [Figure 2.6]. In fact, the strategy profile produced by the algorithm is a Nash equilibrium. To see this, consider two ways in which a player could change his or her strategy: on-path or off-path. If a player changes a choice at a node that is off-path, i.e., a node that is not reached given the

[Figure 1.6: the extensive form tree, with players I, II and III, first-stage moves a and b, second-stage moves c–f, and third-stage moves g–n; the leaf payoff vectors are garbled in this extraction.]


strategy profile, then no payoffs are changed. If a choice is changed on-path, i.e. at a node that is reached given the strategy profile, then the player who changes cannot raise their payoff, because the choices made already assumed payoff maximizing decisions.

Zermelo (1913) observed that backward induction implies that chess has a pure strategy Nash equilibrium with a definite outcome, namely white wins, black wins, or a draw. That is, either one player can always force a win or, like tic-tac-toe, either player can force a draw. We will revisit this result when zero-sum games are studied.

Even with complete information, backward induction is not guaranteed to find all Nash equilibria in an extensive form game, because backward induction has players maximize at every node. Because Nash equilibrium is defined for a strategy profile as a whole, it is not necessary that every player’s strategy maximize at every node. It is only necessary for Nash equilibrium that players maximize on-path. The Nash equilibria in which all players’ strategies dictate maximization on and off path – i.e. the equilibria found by backward induction – are called subgame perfect.

i. Sub-game perfection

Reinhard Selten, who shared the Nobel prize in economics with John Nash and John Harsanyi in 1994, defined the property of subgame perfection to distinguish between equilibria that relied on pure maximization (on and off path) and equilibria that relied on off-path sub-optimal moves by some players. Selten observed that a strategy can threaten to make a sub-optimal move because it leads to a bad payoff for another player as well, creating an incentive for the other player to avoid that node, choosing instead a path


that benefits the player threatening sub-optimal play. Despite the incentive created by the threat, Selten reasoned that if the other player(s) called the bluff, it would not be in the threatening player’s best interest to carry out the threat, so he called such threats incredible; that is, the literal meaning of not credible. He said Nash equilibria relying on such incredible threats should be taken with a grain of salt.

Figure 1.7 Non-subgame perfect Nash equilibrium

[Figure 1.7: player I chooses A or B; after A, player II chooses C, payoff (−5,−5), or D, payoff (1,10); after B, player II chooses E, payoff (−5,−5), or F, payoff (10,1). The equilibrium (A,DE) is marked in red in the original figure.]

Mathematically, subgame perfection is defined in what follows. First, a subgame of an extensive form game is any subgraph of the extensive form graph that is also an extensive form game by itself; that is, a subgraph starting with a particular node in the original graph and including all successors along any branches. A strategy profile of the original game can be said to induce a strategy profile on any subgame in a natural way, simply by restricting the strategy profile to only those decisions at nodes in the subgame. A Nash equilibrium is subgame perfect if the induced strategy profile in any subgame is a Nash


equilibrium. In the one-player, one-move subgames rooted at the penultimate nodes of a game, Nash equilibrium simply means the player is maximizing.

In example 2.7, (A,DE), marked in red, is a Nash equilibrium but is not subgame perfect.


3 Games in Strategic Form

As discussed earlier in the text, the payoff vector of an extensive form game is completely determined by a strategy profile σ. Formally speaking, the payoff of the game is a function of the strategy profile σ. If Σ = Σ1 × … × Σn is the strategy space of the game, then π : Σ → Rn denotes the payoff function of the game. If Σ is finite, for example when only pure strategies are considered in a finite extensive form game, then the payoff function can be represented in a table, or matrix, where the indices i1,…,in represent pure strategy choices for the n players and the (i1,…,in) matrix entry is π(i1,…,in). This matrix representation is called the strategic form of a game.

The strategic form is used for games with finitely many players and pure strategies, and also for finitely many players with countable infinities of pure strategies, in which case the strategic form is an infinite matrix of finite dimension. Strategic form can even be used for a countable infinity of players, though in practice this is rarely considered. Strategic form cannot be used for games in which players have a continuum of pure strategies, such as hide and seek, where a submarine can hide anywhere along a continuous stretch of ocean, or duel, where each player can wait any duration of time before pulling the trigger. These games with a continuum of pure strategies will be considered later in the book. The strategic form of the game in Figure 2.4 is given below in Figure 1.8.

1.8 Strategic Form – Pros and Cons

The strategic form is efficient for finding mixed strategy Nash equilibria, and indeed the strategic form is a better


model than the extensive form for games in which all players have a single decision that is made simultaneously by all players – because that is exactly how the strategic form models every game.

Figure 1.8 Strategic Form of Example 2.4

        C        D
A     2, −1    −3, 4
B    −1, 2     4, −3

But, for games that are actually sequential and not simultaneous, all time-sequencing of moves has been abstracted away in the strategic form. The assumption of strategic form is that players choose their entire strategy up front before any players have any information about what other players are doing and, once chosen, players are committed to carrying out their strategy. Extensive form is a higher fidelity model (more detail) and includes the sequencing of moves. This information can have important implications for how the game is played. For example, a player who moves earlier than another in an extensive form can use her move to signal her intentions about future moves. This cannot be analyzed with strategic form. Consider non-subgame perfect equilibria that rely on threats: what if the threatened player moves first, thus calling the bluff of the threatening player? Would the threatening player still carry out the threat after it has failed to have its desired effect as a deterrent? In real life, the answer is “sometimes”. For the purposes of math models, one must use extensive form to analyze such situations, because subgame perfection cannot even be identified in the strategic form. All that can be seen in the strategic



form is the highest level choices of strategies and the final consequences of using all players’ strategy choices together – chosen simultaneously and implemented without opportunity for change.

What can be done with the strategic form is the quick identification of all pure strategy and even mixed strategy Nash equilibria. The hard part of finding Nash equilibria in extensive form games is all the tracing of the graph that must be done to evaluate all the “what if” questions implicit in verifying an equilibrium: “What if player i made this change? What if player i made that change?” The strategic form represents the results of all those traces in a matrix which is easy to read by humans when small and low-dimensional and easy to index by computer even when large or high-dimensional.

Figure 1.9 Best Responses in Strategic Form of Figure 2.7

        CE        DE        CF        DF
A    −5, −5     1, 10    −5, −5     1, 10
B    −5, −5    −5, −5    10, 1     10, 1

1.9 Pure Nash Equilibria in Strategic Form

In strategic form, pure strategy Nash equilibria can be found by underlining the best response payoff for each player, while systematically fixing the strategies of the other player(s). In example 2.9, this means for each row (fixed strategy for player I), underline the best payoff for player



II, as many times as it occurs, for these are all best response strategies. For each column (fixed strategy of player II), underline the best payoff for player I, as many times as it occurs. Any cell in which all players’ payoffs are underlined indicates that the corresponding strategy profile is a Nash equilibrium (highlighted with red or blue font). Referring back to the extensive form in Figure 2.7, it can be seen that only the profile (B,DF) is subgame perfect (highlighted in blue). This is because DF is the only strategy in which player II is maximizing in all one player subgames.
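The underlining procedure mechanizes directly. Below is a minimal sketch in Python (the representation and function name are mine, not the text’s); run on the game of Figure 1.9 it reports the three pure equilibria:

import numpy as np

def pure_nash_equilibria(A, B):
    """Cells that are best responses for both players at once --
    the cells in which both payoffs would be underlined."""
    A, B = np.asarray(A), np.asarray(B)
    eqs = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            if A[i, j] == A[:, j].max() and B[i, j] == B[i, :].max():
                eqs.append((i, j))
    return eqs

# Figure 1.9: rows A, B; columns CE, DE, CF, DF.
A = [[-5, 1, -5, 1], [-5, -5, 10, 10]]   # player I's payoffs
B = [[-5, 10, -5, 10], [-5, -5, 1, 1]]   # player II's payoffs
print(pure_nash_equilibria(A, B))
# [(0, 1), (1, 2), (1, 3)] -> (A,DE), (B,CF), (B,DF)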

Having established the meaning of the strategic form representation, one can imagine that any 2-dimensional matrix with entries in R2 is the strategic form of some two-player extensive form game. Because the strategic form does not represent the sequencing information of the extensive form, there are necessarily multiple extensive form games that have the same strategic form. For example, the extensive form game below has the same strategic form as the game of Figure 2.7, namely that given in Figure 1.9 above.


Figure 1.10 Strategic form (extensive form) is not 1-to-1

[Figure 1.10: an extensive form in which player I chooses A or B and player II, without observing that choice, picks one of CE, DE, CF, DF; the leaf payoffs equal the corresponding entries of the strategic form in Figure 1.9.]

The extensive form game of Figure 1.10 is better represented by the strategic form of Figure 1.9 than is the game of Figure 2.7, because the information structure of Figure 1.10 is equivalent to the simultaneous move game implicit in the strategic form. Noting this ambiguity, one can nonetheless analyze any n-dimensional matrix with entries in Rn as an n-player simultaneous move game.

The algorithm for finding pure strategy Nash equilibria demonstrated above for a two-player game works in analogous fashion for higher numbers of players. For example, consider a three-player game in which player I has pure strategies A and B, player II has pure strategies C and D, and player III has pure strategies E and F (Figure 1.11).



Figure 1.11 3-Player Game in Strategic Form

Slice E:
        C           D
A    −1, 3, 2    3, −2, 0
B     2, 2, 2   −3, 2, 0

Slice F:
        C            D
A    4, −1, −2   −2, 4, 1
B    0, 1, 1      2, 0, 0

The strategic form is a 3-dimensional matrix shown in 2 dimensions by showing the slices corresponding to player III’s strategies next to each other instead of stacked. Fixing the strategy of player I means fixing a row, fixing the strategy of player II means fixing a column, and fixing a strategy of player III means fixing a slice. Thus, the algorithm underlines player I’s best payoff in each column of each slice, player II’s best payoff in each row of each slice, and player III’s best payoff in each row-column cell, comparing entries across slices. The single pure-strategy Nash equilibrium, (B,C,E), is highlighted in blue in the original figure.

In principle, this can be done in any dimension, but it naturally gets more cumbersome to visualize the more dimensions are added.

i. Exercises (extracted to separate doc on Angel)

1.10 Domination of Strategies

One strategy is said to strictly dominate another if the payoff is always better using that strategy, no matter what



the other players do. One strategy is said to weakly dominate another if it always pays at least as well, and pays strictly better at least some of the time. These definitions are expressed in terms of the payoff function in what follows.

Let σ, τ ∈ Σi be particular strategies for player i and let σ~i denote any strategy profile for all players except player i. Then σ strictly dominates τ if

πi(σ, σ~i) > πi(τ, σ~i) ∀σ~i ∈ Σ~i

σ weakly dominates τ if

πi(σ, σ~i) ≥ πi(τ, σ~i) ∀σ~i ∈ Σ~i

and

πi(σ, σ~i) > πi(τ, σ~i) for at least one σ~i.

Domination is a useful property for finding Nash equilibria – especially mixed strategy equilibria. A strictly dominated strategy is never used in a Nash equilibrium, as it is never a best response. A weakly dominated strategy might be part of a Nash equilibrium because the definition allows for equal payoff against some strategy profiles of the other players, so it can happen that a profile using a weakly dominated strategy is a Nash equilibrium.

Iterated elimination of weakly (respectively strictly) dominated strategies is an algorithm that starts with any player and eliminates any weakly (resp., strictly) dominated strategies from the strategic form matrix. That is, rows or columns are completely removed if the row (resp. column) player would never use them because they are dominated. The remaining matrix can be examined, and weakly (resp. strictly) dominated strategies for another player can be removed. This procedure is then iterated until no players


have any remaining dominated strategies. If the algorithm terminates leaving only a single cell in the matrix, then that cell is a Nash equilibrium. The algorithm does not require a particular order regarding which player’s strategies are eliminated first, second, third and so on. It turns out that if there are multiple pure strategy Nash equilibria in a matrix, then the order of elimination can determine which equilibria are left at the end of the algorithm.
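A minimal sketch of iterated elimination in Python (the function names are mine; setting strict=False switches to weak domination, where, as just noted, the elimination order can matter):

import numpy as np

def iterated_elimination(A, B, strict=True):
    """Return the row and column indices surviving iterated
    elimination of dominated strategies in the bimatrix (A, B)."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    rows, cols = list(range(A.shape[0])), list(range(A.shape[1]))

    def find_dominated(own, other, payoff):
        # payoff(k, others) = vector of strategy k's payoffs against `others`
        for r in own:
            for s in own:
                if r != s:
                    d = payoff(s, other) - payoff(r, other)
                    dominated = (d > 0).all() if strict else ((d >= 0).all() and (d > 0).any())
                    if dominated:
                        return r          # r is dominated by s
        return None

    while True:
        r = find_dominated(rows, cols, lambda i, cs: A[i, cs])
        if r is not None:
            rows.remove(r); continue
        c = find_dominated(cols, rows, lambda j, rs: B[rs, j])
        if c is not None:
            cols.remove(c); continue
        return rows, cols

# A Prisoner's-Dilemma-style game: a single cell survives, so it is a Nash equilibrium.
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
print(iterated_elimination(A, B))   # ([1], [1])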


i. Exercises (extracted to Angel)

1.11 More on Mixed Strategies

Mixed strategies were defined in section 2.2 as probability distributions over pure strategies. There has been much debate over how these probability distributions should be interpreted in modeling terms. If a game is played only once, one must ask how a mixed strategy could be implemented. If a game is played many times, one might imagine a mixed strategy being realized over time. John Harsanyi suggested interpreting a mixed strategy of one player as beliefs held by others regarding how that player will play. This explanation makes sense out of the notion of optimizing against a mixed strategy even if the mixture cannot be truly realized in play. While the interpretation of mixed strategies in different contexts can be challenging, the mathematical existence of mixed strategy equilibria in every finite game has helped to establish them as a standard element of game theoretic analysis. As the purpose of this course is to survey a collection of major models and methods, I will not undertake to improve on any of these explanations, but rather we will focus on finding mixed strategy Nash equilibria in games. As with other solution concepts in game theory, they are best regarded as valuable information about the strategic structure of each game they address, but not necessarily a prediction of how the game will be played.

As probability distributions, mixed strategies can exist for games in which players have a finite, countably infinite, or continuous choice of pure strategies. Mixed strategies are fundamentally a non-cooperative concept, meaning players


do not communicate or make side agreements on how to play the game. Therefore when computing the expected payoff from mixed strategies, the assumption is made that the randomizations are independent. That is, given any combination of players’ pure strategies, the probability of the resulting payoff is computed by multiplying the probabilities each player’s mixed strategy puts on the particular pure strategy involved. The total expected payoff from the mixed strategies is the integral over all pure strategy profiles of the resulting payoffs weighted by their probability. In the case of a countably infinite set of strategies, the integral reduces to an infinite sum. In the case of finitely many combinations the computation reduces to a finite sum.

1.12 Mixed Strategies in Strategic Form Games

The strategic form of a game makes mixed strategies especially easy to represent by establishing a specific order for each player’s pure strategies corresponding to the jth position in each player’s dimension of the matrix (row, column, slice, and so on). A mixed strategy for a player with n pure strategies can be represented by a vector in Rn, with non-negative components summing to unity. The joint probability of independent randomizations is computed by multiplying the individual probabilities, so if G = (aij, bij) is an m × n bi-matrix, A = (aij) is the m × n matrix of player 1’s payoffs, B = (bij) is the m × n matrix of player 2’s payoffs, x = (x1,…,xm) is a mixed strategy for player I and y = (y1,…,yn) is a mixed strategy for player II, then player I’s expected payoff is

π1(x,y) = ΣiΣj xi aij yj = xAyt

and player II’s expected payoff is


π2(x,y) = ΣiΣj xi bij yj = xByt.
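In code these bilinear forms are one-liners. A small numpy sketch using the game of Figure 1.8 (the variable names and the sample mixtures are mine):

import numpy as np

# Figure 1.8: rows A, B; columns C, D.
A = np.array([[2, -3], [-1, 4]])    # player I's payoffs (a_ij)
B = np.array([[-1, 4], [2, -3]])    # player II's payoffs (b_ij)

x = np.array([0.5, 0.5])            # row player's mixed strategy
y = np.array([0.25, 0.75])          # column player's mixed strategy

# Independent randomization: expected payoffs x A y^t and x B y^t.
print(x @ A @ y)    # 0.5
print(x @ B @ y)    # 0.5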

Recall that every finite extensive form game of complete information has an equilibrium in pure strategies (found by backward induction), but without complete information there might not be a pure strategy equilibrium. Nash proved that every finite game has an equilibrium when mixed strategies are allowed. The definition of Nash equilibrium remains the same no matter what the form of the strategy sets Σi. In the case of finite games, certain properties of the structure of the mixed strategy payoffs lead to an algorithm for systematically finding all Nash equilibria in a game, considering pure strategies as a special case of mixed strategies.

1.13 Mixed Nash Equilibria in Strategic Form

To discuss mixed strategy Nash equilibria, the first step is to carefully examine mixed strategy best responses. Two observations are key. Given any strategy profile by the other player(s), the following facts obtain:

Theorem 2 A mixed strategy can be a best response only if each pure strategy given positive weight is a best response all by itself, and

Theorem 3 any mixed strategy that puts positive weight only on pure strategy best responses is a best response.

These two observations can be combined into an equivalence between all mixed strategy best responses and all mixtures of pure strategy best responses. These observations can be explained in terms of the mixed


strategy payoff computation. Fixing the strategy choices of all players except one particular player i renders the payoff to that player i a function of i’s strategy choice alone. Let f(k) be the payoff to player i from using his/her kth pure strategy. Let x = (x1,…,xn) be a mixed strategy of player i and let M = {k | xk > 0}. In other words, M is the set of pure strategies given positive weight by x. Then player i’s payoff from using x is Σk∈M xk f(k). Now, suppose M* is the set of best responses; i.e., f(k) ≥ f(j) ∀k ∈ M* and ∀j. It should be clear that f(k) = c ∀k ∈ M*, where c = Maxj f(j), the maximum being taken over all pure strategies j. Therefore, if M ⊆ M*, then Σk∈M xk f(k) = Σk∈M xk c = c Σk∈M xk = c(1) = c. Conversely, if any strategy k given positive weight by x satisfies f(k) < c, then Σk∈M xk f(k) < c. In summary,

Mink∈M f(k) ≤ Σk∈M xk f(k) ≤ Maxk∈M f(k).

If all f(k) are equal, equality holds throughout. If Min f(k) < Max f(k), each inequality is strict.

This characterization of mixed strategy best responses enables us to use weak domination to narrow the search for mixed strategy Nash equilibria. Consider the game in Figure 1.12, below.

Figure 1.12 Weak domination and mixed strategies

        y1       y2       y3
x1    1, −1     0, 1    −1, 0
x2   −1, 0     1, −1     0, 1
x3    0, 1    −1, 0     1, −1



In the original figure, player I’s pay is shown in black and player II’s pay in red, with pure strategy best response payoffs bolded; it can be seen that there is no pure strategy Nash equilibrium. Because Nash proved this game must have a Nash equilibrium, there must be at least one non-trivial mixed strategy equilibrium (x*,y*), non-trivial meaning more than one row (resp. column) must be given positive weight by x* (resp. y*). Based on Theorem 2, we can indeed conclude that both x* and y* must be non-trivial mixtures. To understand this, notice that for any pure strategy of player II, player I has a unique pure strategy best response and therefore cannot have a mixed best response by Theorem 2. This means player II must be mixing for player I to have a mixed best response. Likewise, for any pure strategy of player I, player II has a unique pure strategy best response and so cannot have a mixed best response. Therefore player I must be mixing for player II to have a mixed best response.

Combinatorics yield the possibilities that exist for non-trivial mixes for x* and y*. Specifically, x* (resp. y*) can put positive weight on rows (resp. columns) 1 and 2, 1 and 3, 2 and 3, or 1, 2 and 3. For each of these possibilities, the tricky thing to keep in mind is that whether one player can have a mixed best response depends on what the other player is mixing. To see this, suppose that player 2 is mixing columns 1 and 2, that is, playing a strategy putting positive probability on columns 1 and 2 but not 3. Accordingly, the payoffs in column 3 do not affect Player 1’s payoffs. When column 3 is ignored, row 1 weakly dominates row 3. The implication of weak domination relative to a set of columns, coupled with the assumption of putting positive probability on all columns in that set, is that the payoff from the weakly dominated row will be


strictly less than the payoff from the dominating row. Without the assumption on the column mixture, weak domination would only mean that the dominating row sometimes pays strictly better. But the assumption of positive probability on all the columns under consideration means that there is at least one column in which the dominated row is strictly inferior and gets positive weight, thus lowering the weighted average that is the expected payoff. Thus, a strategy that is weakly dominated relative to a subset of strategies is strictly dominated against any mixture putting positive probability on all strategies in the subset, and therefore is not a best response to any such mixture and cannot be part of a mixed best response. Therefore we can eliminate the possibility of a Nash equilibrium in which player II puts positive weight on columns 1-2 but not 3 and player I puts positive weight on row 3, ruling out the mixes for player I of 1-3, 2-3, and 1-2-3. Thus if there is any Nash equilibrium in which player II mixes exactly columns 1-2, then player I must be mixing exactly rows 1-2.

Continuing with this analysis, suppose player I is mixing exactly rows 1-2. When row 3 is not considered, then column 1 is dominated by column 3, so player II has no best response to any mix of rows 1-2 that uses column 1. This rules out the possibility of any Nash equilibrium in which player II mixes exactly columns 1-2. It also rules out the possibility of any Nash equilibrium in which player I mixes rows 1-2 and player II puts positive weight on column 1. Each of the combinatorial possibilities for different subsets of pure strategies receiving positive weight in mixtures can be analyzed in this way as a possibility for a Nash equilibrium and most will be ruled out. The possibilities for Nash equilibria that are not


eliminated in this way will be further analyzed. Organizing all possibilities for mixed strategies in a table facilitates a systematic analysis (Figure 1.13).

Figure 1.13 Combinatorial Possibilities for Mixed Equilibria

         1    2    3    12   13   23   123
1        X    X    X    X    X    X    X
2        X    X    X    X    X    X    X
3        X    X    X    X    X    X    X
12       X    X    X    X    X    1    X
13       X    X    X    X    2    3    4
23       X    X    X    X    5    6    7
123      X    X    X    X    8    9    10

(rows: player I’s support; columns: player II’s support)

The shaded area represents both players mixing. The black X’s represent possibilities that have been ruled out so far by the preceding logic, including any combination of one player mixing and one player playing a pure strategy, as well as any possibility of player II mixing exactly columns 1-2, and any possibility of player I mixing exactly rows 1-2 and player II putting positive weight on column 1. The remaining possibilities are numbered 1 through 10 in the table. If one considers player II mixing exactly columns 1 and 3, then row 3 dominates row 2, ruling out any mixture using row 2 (red X’s in cells 5 and 8). If one considers player II mixing exactly columns 2 and 3, then row 2 dominates row 1, so there are no mixed best responses


using row 1 (blue X’s in cells 1, 3 and 9). Now consider player I mixing rows 1 and 3. Column 2 dominates column 3, so any mixture of player II using column 3 can be eliminated (brown X’s in cells 2 and 4). Now consider player I mixing rows 2 and 3. Column 1 dominates column 2, so all cells representing column 2 in the mix can be eliminated (green X’s in cells 6 and 7). One possibility remains: each player mixing all three pure strategies. There are no dominated rows when we consider all three columns, and there are no dominated columns when we consider all three rows, so this cell represents a possible Nash equilibrium. In fact, because it is the only type of mixture that has not been ruled out, there must be an equilibrium here, because every finite game has at least one equilibrium.

To find the equilibrium, we must find the exact mixed strategy of player II that makes player I’s payoff equal from all rows he/she is mixing and find the exact strategy of player I that makes player II’s payoff equal from all the columns he/she is mixing. These conditions can be expressed as the solution to two constrained systems of equations:

π1((1,0,0),y) = 1y1 + 0y2 – 1y3 = –1y1 + 1y2 + 0y3 = π1((0,1,0),y)
π1((0,1,0),y) = –1y1 + 1y2 + 0y3 = 0y1 – 1y2 + 1y3 = π1((0,0,1),y)
y1 + y2 + y3 = 1
y ≥ 0

and

π2(x,(1,0,0)) = –1x1 + 0x2 + 1x3 = 1x1 – 1x2 + 0x3 = π2(x,(0,1,0))
π2(x,(0,1,0)) = 1x1 – 1x2 + 0x3 = 0x1 + 1x2 – 1x3 = π2(x,(0,0,1))


x1 + x2 + x3 = 1
x ≥ 0

The solutions, respectively, are y* = (1/3, 1/3, 1/3) and x* = (1/3,1/3,1/3). Because the payoffs to player I from all rows are equal given y*, any and all rows are best responses to y* and any mixture (x1,x2,x3) is a best response. In particular, the mixture x* is a best response. Because the payoffs to player II from all columns are equal given x*, any and all columns are best responses to x* and any mixture (y1,y2,y3) is a best response. In particular, the mixture y* is a best response. Thus (x*,y*) is a Nash equilibrium.
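The two systems above are small linear solves. A numpy sketch for the game of Figure 1.12 (variable names are mine):

import numpy as np

A = np.array([[ 1,  0, -1],
              [-1,  1,  0],
              [ 0, -1,  1]])        # player I's payoffs in Figure 1.12
B = np.array([[-1,  1,  0],
              [ 0, -1,  1],
              [ 1,  0, -1]])        # player II's payoffs

# y* equalizes player I's payoff across all three rows, and sums to 1.
My = np.vstack([A[0] - A[1], A[1] - A[2], np.ones(3)])
y_star = np.linalg.solve(My, [0, 0, 1])

# x* equalizes player II's payoff across all three columns, and sums to 1.
Mx = np.vstack([B[:, 0] - B[:, 1], B[:, 1] - B[:, 2], np.ones(3)])
x_star = np.linalg.solve(Mx, [0, 0, 1])

print(x_star, y_star)   # [1/3 1/3 1/3] for both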

The analysis just performed can be systematized in an algorithm.

1.14 Algorithm to Find Mixed Nash Equilibria

The following algorithm will find all Nash equilibria in a finite strategic form two-player general sum game.

Definitions: Assume an m × n bi-matrix, that is, a matrix G = (aij, bij) in which, for each entry (aij, bij), aij is the payoff to the row player and bij is the payoff to the column player. Player 1 will refer to the row player and player 2 will refer to the column player. “Pure strategy for 1” is synonymous with “matrix row”. “Pure strategy for 2” is synonymous with “matrix column”.

0) (Pre-processing)

a. Eliminate from the game matrix any row or column which is strictly dominated. No strictly dominated strategy can ever be given positive weight in any Nash equilibrium strategy profile.

Page 64: Game theory, byrne

50

b. Create a table with rows representing all non-empty subsets of player 1’s pure strategies that remain after step (0.a) and columns representing all non-empty subsets of player 2’s pure strategies that remain after step (0.a). Each entry in this table will be regarded as a possibility, meaning a possibility for one or more Nash equilibria in which each player puts positive weight on every pure strategy in the respective subset corresponding to the table entry. The algorithm will methodically eliminate or validate these equilibrium possibilities.

1) For each row of the table, eliminate from consideration any possibilities that put positive weight on strategies of player 2 that are dominated (weakly or strictly) relative to the subset of player 1’s strategies represented by the table row. That is, if the table row represents a subset I of pure strategies of player 1 and the table column represents a subset J of pure strategies of player 2, eliminate the entry if any matrix column j ∈ J is dominated relative to the subset I of rows. Elaborating yet further, eliminate the table entry if there exists any matrix column j ∈ J and any other matrix column k (that need not be in J) such that bik ≥ bij for all i ∈ I and bik > bij for at least one i ∈ I. Note that if the set I contains only one row index i, then the rule eliminates every column j except those which satisfy bij = Maxk bik.

2) For each column of the table, eliminate from consideration any (remaining) possibilities that put positive weight on strategies of player 1 that are dominated relative to the subset of player 2’s strategies represented by the table column. That is, if the table


column represents a subset J of pure strategies of player 2 and the table row represents a subset I of pure strategies of player 1, eliminate the table entry if any matrix row i ∈ I is dominated relative to the subset J of matrix columns. Elaborating yet further, eliminate the table entry if there exists any matrix row i ∈ I and any other matrix row k (that need not be in I) such that akj ≥ aij for all j ∈ J and akj > aij for at least one j ∈ J. Note that if the set J contains only one column index j, then the rule eliminates every row i except those which satisfy aij = Maxk akj.

3) For each possibility that has not been eliminated by steps (1) and (2), use algebra to either discard the possibility or find one or more equilibria. In so doing, keep in mind two things:

The criterion determining Nash equilibrium is reciprocal best response.

The purpose of the table is to organize the search for equilibria so as to avoid missing any possibilities. The divisions of the table are relatively unimportant once the equilibria have been found. Specifically, each table entry corresponds to strategies for the players that put positive weight on every pure strategy in the subsets I and J represented by the table entry. Therefore, a continuous set of equilibria can be split across two or more table entries. Do not let this confuse you. The equilibria, once found, are of lasting importance. The table is just a tool to find them.

a. If an entry corresponds to each player using a pure

strategy, then that strategy profile can be recognized as a Nash equilibrium without further analysis.

b. If an entry corresponds to a set I of two or more pure strategies for player 1 and a pure strategy j for


player 2, then player 1 has multiple pure best responses to player 2’s pure strategy and any mixture of those is equally a best response. Therefore, any mixture x for player 1 to which player 2’s pure strategy is a best response makes the strategy profile (x,j) a Nash equilibrium. Such an equilibrium is found by solving the system of inequalities expressing that player 2’s pure strategy j pays as well against x as any other pure strategy k. That is,

x·(b1j,…,bmj) ≥ x·(b1k,…,bmk) ∀k (3.7.3.b)

c. If an entry corresponds to a set J of two or more

pure strategies for player 2 and a pure strategy i for player 1, then player 2 has multiple pure best responses to player 1’s pure strategy and any mixture of those is equally a best response. Therefore, any mixture y for player 2 to which player 1’s pure strategy is a best response makes the strategy profile (i,y) a Nash equilibrium. Such an equilibrium is found by solving the system of inequalities expressing that player 1’s pure strategy i pays as well against y as any other pure strategy k. That is,

y·(ai1,…,ain) ≥ y·(ak1,…,akn) ∀k (3.7.3.c)

d. If an entry corresponds to a set I of two or more

pure strategies i for player 1 and a set J of two or more strategies j for player 2, then condition (3.7.3.b) must be met ∀i ∈ I and condition (3.7.3.c) also must be met ∀j ∈ J, so the possibility is eliminated if either test is failed. These tests are elaborated in (i) and (ii) below.

i. An equilibrium (x,y) exists only if every pure

strategy in the set I is a best response for 1 to


player 2’s mixed strategy y, which means y is a solution to the set of inequalities

y·(ai1,…,ain) ≥ y·(ak1,…,akn) ∀i ∈ I and ∀k

which can only happen if y is a solution to the system of equations

y·(ai1,…,ain) = y·(ai′1,…,ai′n) ∀i, i′ ∈ I

The solution to the system of equations is generally easier to check first, but if a solution is found, it must still be verified that the solution y also solves the system of inequalities

y·(ai1,…,ain) ≥ y·(ak1,…,akn) ∀k ∉ I, for one (representative) i ∈ I

or else the strategies in I are not best responses to y and so, neither, is the mixture x. In solving the above systems, recall that y must solve the equation y·(1,…,1) = 1 and the inequality y ≥ (0,…,0) or y is not a mixed strategy. This equation can be explicitly included in a system of |I| equations in |J| unknowns or implicitly included in a system of |I|−1 equations in |J|−1 unknowns by expressing the last non-zero component of y as 1 minus all the other non-zero components.

ii. An equilibrium (x,y) exists only if every pure strategy j ∈ J is a best response for 2 to player 1’s mixed strategy x, which means x is a solution to the set of inequalities

x·(b1j,…,bmj) ≥ x·(b1k,…,bmk) ∀j ∈ J and ∀k

which can only happen if x is a solution to the system of equations

x·(b1j,…,bmj) = x·(b1j′,…,bmj′) ∀j, j′ ∈ J

It is efficient to check the system of equations first, but still it must be verified that any solution x also solves the system of inequalities


x·(b1j,…,bmj) ≥ x·(b1k,…,bmk) ∀k ∉ J, for one (representative) j ∈ J

or else the strategies in J are not best responses to x and so, neither, is the mixture y. Any solution x must also satisfy

x·(1,…,1) = 1 and x ≥ (0,…,0)

This equation can be explicitly included in a system of |J| equations in |I| unknowns or implicitly included in a system of |J|−1 equations in |I|−1 unknowns by expressing the last non-zero component of x as 1 minus all the other non-zero components.

iii. If a strategy profile (x,y) has been found that satisfies (i) and (ii), then it is a Nash equilibrium. If sets of strategy profiles {(x,y)} have been found that satisfy (i) and (ii), they are all Nash equilibria.
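The table procedure can be mechanized for two players. Below is a compact illustrative sketch in Python – my own code, not the text’s – that brute-forces supports of equal size, solving the indifference systems of steps (3.d.i)–(3.d.ii) and keeping solutions that pass the best-response inequalities. Restricting to equal-size supports finds all equilibria of a nondegenerate game; degenerate games with continua of equilibria, like the next two examples, still need the full table analysis:

import numpy as np
from itertools import combinations

def equalizing_mix(P):
    """Probability vector z making all entries of P @ z equal, or None.

    P is k x k; the system is k-1 indifference equations plus sum = 1."""
    k = P.shape[0]
    M = np.vstack([P[:-1] - P[1:], np.ones(k)])
    rhs = np.zeros(k); rhs[-1] = 1.0
    try:
        return np.linalg.solve(M, rhs)
    except np.linalg.LinAlgError:
        return None

def support_enumeration(A, B, tol=1e-9):
    A, B = np.asarray(A, float), np.asarray(B, float)
    m, n = A.shape
    found = []
    for k in range(1, min(m, n) + 1):
        for I in combinations(range(m), k):
            for J in combinations(range(n), k):
                y_J = equalizing_mix(A[np.ix_(I, J)])      # makes rows in I indifferent
                x_I = equalizing_mix(B[np.ix_(I, J)].T)    # makes columns in J indifferent
                if y_J is None or x_I is None or (y_J < -tol).any() or (x_I < -tol).any():
                    continue
                x = np.zeros(m); x[list(I)] = x_I
                y = np.zeros(n); y[list(J)] = y_J
                # No pure strategy outside the support may pay strictly more.
                if (A @ y).max() <= (A @ y)[I[0]] + tol and (x @ B).max() <= (x @ B)[J[0]] + tol:
                    found.append((x, y))
    return found

A = np.array([[1, 0, -1], [-1, 1, 0], [0, -1, 1]])   # the game of Figure 1.12
B = np.array([[-1, 1, 0], [0, -1, 1], [1, 0, -1]])
for x, y in support_enumeration(A, B):
    print(x, y)   # the single equilibrium x* = y* = (1/3, 1/3, 1/3)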

This algorithm can be extended in a natural way to higher dimensional matrix games. The next example illustrates the analysis for the case of equilibrium in which one player mixes and the other uses a pure strategy (Figures 1.14 – 1.16).

Figure 1.14 Strategic Form Game with Many Mixed Equilibria

        c1       c2       c3
r1     2, 1     0, 1     1, −1
r2    −1, 1     1, 2     2, 1
r3     2, 3     1, 0     1, 1



Figure 1.15 Table Analysis of Example 3.7

[Figure 1.15: the 7 × 7 table of support possibilities for the game of Figure 1.14; eliminated cells carry the codes X1–X7 and Z1–Z7 explained in Figure 1.16, and the surviving possibilities are labeled A–E.]

Figure 1.16 Codes in Example 3.7 – 3.8

X1: column 3 is an inferior response to row 1
X2: columns 1 and 3 are inferior responses to row 2
X3: columns 2 and 3 are inferior responses to row 3
X4: column 2 weakly dominates columns 1 and 3 relative to rows 1 and 2
X5: column 1 weakly dominates columns 2 and 3 relative to rows 1 and 3
X6: column 1 weakly dominates column 3 relative to rows 2 and 3
X7: column 1 weakly dominates column 3 relative to rows 1, 2 and 3
*Note that X7 is not enough to remove column 3 from any equilibrium. Column 3 is eliminated from any equilibrium because column 3 is never a best response – one of the two columns 1 and 2 is always strictly better.
Z1: row 2 is an inferior response to column 1
Z2: row 1 is an inferior response to column 2
Z3: row 2 is the unique best response to column 3
Z4: row 3 dominates rows 1 and 2 relative to columns 1 and 2



Z5: absent because there is no row domination relative to columns 1 and 3
Z6: row 2 dominates rows 1 and 3 relative to columns 2 and 3
Z7: row 3 dominates row 1 relative to columns 1, 2 and 3
*Note that Z7 is not enough to remove row 1 from any equilibrium. In fact, row 1 is part of a Nash equilibrium (A) because it is sometimes a best response.

Now the remaining possibilities are labeled A through E for ease of reference. Keep in mind that each case, A,…,E is defined by assumptions on precisely which xi and yj are equal to zero (all others being assumed strictly positive). In what follows, a Nash equilibrium meeting the assumptions of each case will either be precisely characterized or ruled out with detailed algebra, even if it could not be eliminated a priori by domination. Throughout the discussion, the following notation will be used:

r1 = (1,0,0), r2 = (0,1,0), r3 = (0,0,1),

c1 = (1,0,0), c2 = (0,1,0), c3 = (0,0,1)

x = (x1, x2, x3) where xi ≥ 0 and Σixi = 1

y = (y1, y2, y3) where yj ≥ 0 and Σjyj = 1

A: (x*,y*) where x* = r1 and y* = c1 is a Nash equilibrium because any pure strategy profile surviving elimination is a Nash equilibrium.

B: (x*,y*) where x* = r2 and y* = c2 is a Nash equilibrium for the same reason.


C: (x*,y*) where x* = r3 and y* = c1 is a Nash equilibrium for the same reason.

D: Because r1 and r3 are both best responses to c1, any mixture x = (x1, 0, 1 – x1) is a best response to c1. Conversely, because c1 dominates both other columns when both r1 and r3 receive positive weight, c1 is a best response to any mixture of r1 and r3. Thus {x*=(x1,0,1 – x1),y* = (1,0,0) | 0 < x1 < 1} is a continuous set of Nash equilibria. Notice that we could allow the inequalities to be weak, i.e. 0 ≤ x1 ≤ 1, and the extra two solutions are cases A and C.

E: Any mixture of r2 and r3 is a best response to c2. Column c2 is a best response to x = (0, x2, 1 – x2) when

π2(x,c2) ≥ π2(x,c1) and π2(x,c2) ≥ π2(x,c3), i.e.

2x2 + 0(1 – x2) ≥ 1x2 + 3(1 – x2) => x2 ≥ 3/4.

(The other constraint, π2(x,c2) ≥ π2(x,c3), reads 2x2 ≥ 1x2 + 1(1 – x2) = 1, i.e. x2 ≥ 1/2, which is implied by x2 ≥ 3/4.)

Therefore {(x*,y*) | y*=(0,1,0) & x*=(0,x2,1 – x2), 3/4 ≤ x2 < 1} is a continuous set of Nash equilibria. The strict inequality x2 < 1 is only to match the assumptions of position E in the table. x2 = 1 is still an equilibrium – it is the equilibrium at position B in the table.

The next example (Figures 1.17 – 1.18) demonstrates solutions that include both players mixing fewer than all of their available pure strategies and also demonstrates that some table entries that are not eliminated by domination are eliminated when the detailed analysis is carried out.


Figure 1.17 More multiple mixed equilibria

        c1       c2       c3
r1    −1, −1    0, 2     2, 1
r2    −1, 3     1, 0     1, −1
r3     1, 3     2, 1     0, 2

Figure 1.18 Table analysis of the game in Figure 1.17

         1    2    3    12   13   23   123
1        X    Z    X    X    X    X    X
2        Z    X    X    X    X    X    X
3        A    X    X    X    X    X    X
12       Z    Z    X    Z    X    X    X
13       Z    Z    Z    Z    B    C    D
23       Z    X    X    X    X    X    X
123      Z    Z    Z    Z    Z    E    F

(rows: player 1’s support; columns: player 2’s support)

(Justifications for the eliminated cells are left for the reader.)

A x1 = x2 = 0, y2 = y3 = 0 is a pure strategy Nash equilibrium (x,y) where x = (0,0,1), y = (1,0,0).

B x2 = 0, y2 = 0. For player 1 to have a best response mixing rows 1 and 3, these rows must each be best responses, i.e.,

π1(r1,y) = π1(r3,y) ≥ π1(r2,y)

–y1 + 2(1 – y1) = y1 + 0(1 – y1) ≥ –y1 + (1 – y1)



Because r1 dominates r2 when y2 = 0, the inequality is satisfied by any mixture of columns 1 and 3. The equation implies y1 = y3 = 1/2.

For player 2 to have a best response mixing columns 1 and 3, the columns must each be best responses, i.e.

π2(x,c1) = π2(x,c3) ≥ π2(x,c2)

–x1 + 3(1 – x1) = x1 + 2(1 – x1) ≥ 2x1 + (1 – x1)

The equation implies x1 = 1/3 and the inequality holds for this value of x1, so (x,y) is a Nash equilibrium where x = (1/3, 0, 2/3) and y = (1/2, 0, 1/2).

C x2 = 0, y1 = 0. For player 1 to have a best response mixing rows 1 and 3, the rows must each be best responses, i.e.,

π1(r1,y) = π1(r3,y) ≥ π1(r2,y)

0y2 + 2(1 – y2) = 2y2 + 0(1 – y2) ≥ y2 + (1 – y2)

The equation implies y2 = y3 = 1/2, which yields a three way equality, so r1 and r3 (just) qualify as best responses.

For player 2 to have a best response mixing columns 2 and 3, the columns must each be best responses, i.e.,

π2(x,c2) = π2(x,c3) ≥ π2(x,c1)

2x1 + (1 – x1) = x1 + 2(1 – x1) ≥ –x1 + 3(1 – x1)

The equation implies x1 = 1/2 and the inequality holds for this value of x1, so (x,y) is a Nash equilibrium where x = (1/2, 0, 1/2) and y = (0, 1/2, 1/2).

D x2 = 0, yj > 0 ∀j. For player 2 to have a best response mixing columns 1, 2 and 3, the columns must each be best responses, i.e.,


π2(x,c2) = π2(x,c3) = π2(x,c1)

2x1 + (1 – x1) = x1 + 2(1 – x1) = –x1 + 3(1 – x1)

From the analysis of (C), the first equation holds only when x1 = 1/2 and from the analysis of position B in the table, the second equation holds only when x1 = 1/3, so there is no value of x1 that makes all three columns best responses and therefore no Nash equilibrium satisfying the assumptions of position D in the table.

E xi > 0 ∀i, y1 = 0. From the analysis of C, y = (0, 1/2, 1/2) renders all three row payoffs equal. It remains to be checked whether there is a mixture of all three rows such that columns 2 and 3 are both best responses, i.e.,

π2(x,c2) = π2(x,c3) ≥ π2(x,c1)

2x1 + (1 –x1 –x2) = x1 – x2 + 2(1 –x1 –x2) ≥ –x1 + 3x2 + 3(1 –x1 –x2)

The first equation implies x2 = 1/2 – x1. Substituting this into the inequality yields x1 ≥ 5/12. Therefore, there is a set of Nash equilibria:

{ (x,y) | y = (0,1/2,1/2), x = (x1, 1/2 – x1, 1/2), 5/12 ≤ x1 < 1/2 }

Note that allowing x1 = 1/2 does not violate the reciprocal best response condition, but rather forces x2 = 0, which violates the assumption of position E in the table and instead refers to position C.

F xi > 0 ∀i, yj > 0 ∀j. The analysis from E shows x = (5/12, 1/12, 1/2) renders all three column payoffs equal. For player 1 to mix all three rows, the column mix of player 2 must render all three row payoffs equal:

π1(r1,y) = π1(r2,y) = π1(r3,y)


–y1 + 2(1 –y1 –y2) = –y1 + y2 + (1 –y1 –y2) = y1 + 2y2 + 0(1 –y1 –y2)

The only solution to this system is y1 = 0, y2 = 1/2, which does not meet the assumption y1 > 0 of position F in the table, so no additional Nash equilibria come from position F.

1.15 Two-Person Zero-Sum Games

Two-person zero-sum games are, as the name suggests, games with two players in which, at every outcome, the payoffs to the two players always sum to zero. These games are called strictly competitive because the players’ interests are diametrically opposed – one player can only improve his or her payoff at the expense of the other player. In the case of finite or countably infinite strategies, the structure of the payoff function admits a streamlined representation of the strategic form of the game, namely the suppression of the column player’s payoff, which need not be explicitly written because it is always known to be the negative of the row player’s payoff. Thus, an m × n matrix A = (aij) with a single real number aij in each cell represents the strategic form of a finite two-person zero-sum game. In this representation, aij is the row player’s payoff given strategies i and j and the payoff to the column player is –aij. Note that any matrix can be regarded as a two-person zero-sum game, just as any bi-matrix can be regarded as a general sum game.

It is the convention to discuss two-person zero-sum games entirely in terms of the row player’s payoffs and to refer to the payoff of the game, rather than the payoffs to the two players. It is also convention, as with general sum games, to refer to the row player as player 1 if there is no external


context establishing some other names for the players. Except where explicitly noted to the contrary, player 1 and row player will be used synonymously in the discussion that follows, as will player 2 and column player. In this context, player 1 strives to maximize the payoff of the game and player 2 strives to minimize the payoff of the game.

It is important to remember that because two-person zero-sum games are a special case of general sum strategic form games, all of the results discussed in the preceding sections regarding finite games obtain in finite two-person zero-sum games. In particular, weak and strict domination work the same, with the adjustment that “better” for the column player means “smaller”. Also, at least one Nash equilibrium exists in every such game and the algorithm for finding all pure strategy Nash equilibria and the algorithm for finding all Nash equilibria (mixed and pure) in a finite strategic form game work on zero-sum games, with the slight adjustment that a best response for player 2 is the column(s) with the smallest payoff rather than the column(s) with the largest payoff (Figure 1.19).


Figure 1.19 Dominated column in a two-person zero-sum game

[Figure 1.19: a 4 × 4 payoff matrix; its entries are garbled in this extraction.]

Column 4 strictly dominates column 2 because in each row, the payoff in column 4 is strictly less than the payoff in column 2. In searching for Nash equilibria, column 2 can be entirely ignored because no strictly dominated strategy is ever used in any Nash equilibria. Analysis of the game can therefore continue as a 4x3 game with column 2 omitted (Figure 1.20).

Figure 1.20 Game 1.19 with dominated column removed

[Figure 1.20: the 4 × 3 matrix of Figure 1.19 with column 2 removed; in the original figure the maximum of each column is marked * and the minimum of each row is marked ^, and one cell carries both marks.]

The algorithm for finding pure strategy Nash equilibria in a two-person zero-sum game is the same as for general sum games: mark the best response(s) in each column for the



row player and the best response(s) in each row for the column player. That is, mark the max in each column (shown with a * in Figure 1.20) and the min in each row (shown with a ^ in Figure 1.20). When they occur in the same cell, the corresponding row and column constitute a Nash equilibrium. In more general algebraic terms, this is called a saddlepoint.

The value of a saddlepoint, when one exists, is called the minmax (also called minimax) value of the game, and can be found by first finding the max in each column and then finding the minimum of these maxima, or it can also be found by finding the minimum in each row and then finding the maximum among these minima. The max of the minima (called the maxmin, or maximin, for short) is always less than or equal to the min of the maxima (the minmax). The maxmin equals the minmax if and only if the matrix has a saddlepoint. When this occurs, the saddlepoint has the minmax value, the min of the maxima occurs in the column of the saddlepoint, and the max of the minima occurs in the row of the saddlepoint. When there is no saddlepoint, the Nash equilibrium, which Nash’s theorem states must exist, must therefore occur with mixed strategies.

It can be found by the same algorithm used for general sum strategic form games (Figures 1.21 – 1.22).
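Before turning to that example, here is a minimal sketch of the saddlepoint test in Python (names are mine), run on the game of Figure 1.21 below:

import numpy as np

def saddlepoints(A):
    """Cells that are simultaneously a row minimum (^) and a column maximum (*)."""
    A = np.asarray(A)
    row_min = A.min(axis=1, keepdims=True)
    col_max = A.max(axis=0, keepdims=True)
    return list(zip(*np.where((A == row_min) & (A == col_max))))

A = np.array([[2, 0, 1], [-1, 1, 2], [1, 2, 1]])   # the game of Figure 1.21
print(A.min(axis=1).max())   # maxmin = 1
print(A.max(axis=0).min())   # minmax = 2 > maxmin, so ...
print(saddlepoints(A))       # [] -- no saddlepoint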


Figure 1.21 Two-Person Zero-Sum game with no saddlepoint

       c1   c2   c3
r1      2    0    1
r2     -1    1    2
r3      1    2    1

Figure 1.22 Table analysis of game 1.21

[Figure 1.22: the table of mixing possibilities for the game of Figure 1.21; every cell except full mixing by both players is eliminated, and the sole surviving possibility is labeled A.]

A Nash equilibrium is (x*,y*), where x* = (x1, x2, 1–x1 –x2), y* = (y1, y2, 1–y1 –y2) and

π(r1,y*) = π(r2,y*) = π(r3,y*)

π(x*,c1) = π(x*,c2) = π(x*,c3)

or x* = (3/8, 1/8, 1/2) and y* = (1/4, 1/8, 5/8).

In this example, all three columns are used, so setting the payoffs equal suffices to prove they are all best responses. If any column were not used, it would be necessary to show that the payoff from the unused column is not less than the



payoff from the columns given positive weight, because player 2 maximizes his/her own payoff by minimizing player 1’s payoff.

i. The Value of a Game

An interesting property of two-person zero-sum games is that when a game has multiple Nash equilibria, every equilibrium has exactly the same payoff. It is straightforward to prove that when multiple saddlepoints exist, they all have the same value.

Theorem 4

Let A=(aij) be an m x n matrix and let ai*j* and ak*l* be two saddlepoints in A. Then ai*j* = ak*l* .

Proof Any two saddlepoints in the same row are each the min in the row and thus equal. Any two saddlepoints in the same column are each the max in the column and thus equal. If ai*j* and ak*l* are both saddlepoints with i* ≠ k* and j* ≠ l*, then ai*j* ≤ ai*l* ≤ ak*l* ≤ ak*j* ≤ ai*j*, which implies that equality holds throughout. The first inequality holds because ai*j* is the min in its row. The second is because ak*l* is the max in its column. The third is because ak*l* is the min in its row. The fourth is because ai*j* is the max in its column.

ii. Linear Programming and Games

The proof that every mixed strategy Nash equilibrium has the same value relies on Linear Programming, which can be used to find the mixed strategy equilibria more efficiently than the table method – especially for large games.

The formulation of a linear program to solve a two-person zero-sum game relies on the strictly competitive nature of


the game – the fact that player 1 wants to maximize the payoff and player 2 wants to minimize the payoff. Because player 2 wants to minimize, player 1 has a basis for assuming that no matter what he/she does, the worst case payoff will result. Now, the worst case payoff that can result depends on what strategy is chosen. For example, the minimum in each row is the worst case payoff if the row is used as a pure strategy, and the maxmin payoff is player 1’s best guaranteed payoff using pure strategies. Likewise, player 1 will give player 2 the worst case payoff given any strategy for player 2, so the minmax payoff is player 2’s best guaranteed payoff using pure strategies. The purpose of using linear programming is for each player to optimize their worst case payoff (i.e. their guaranteed payoff) over their respective sets of mixed strategies – not just pure strategies.

Formally, recall that when player 1 uses x and player 2 uses y, the payoff of the game is xAyt. Therefore, player 1 seeks

Maxx Miny xAyt

and player 2 seeks

Miny Maxx xAyt

Now, these are not linear programs as stated, because the min (respectively, max) being optimized by player 1 (respectively, player 2) are not linear. However, maximizing a lower bound instead of the min linearizes the problem for player 1 and minimizing an upper bound instead of the max linearizes the problem for player 2. The maximum of a lower bound equals the maximum of the minimum and is achieved at the same value of x*. Likewise, the minimum of an upper bound equals the minimum of the maximum and is achieved at the same


value of y*. Therefore the linear problem finds the same value as the original problem and the same x* (respectively, y*) that optimizes the minimum (respectively, maximum).

Figure 1.23 Player 1’s linear program (LP)

Max L

Subject to

π(x,cj) ≥ L, j = 1,…,n
Σixi = 1
x ≥ 0

Figure 1.24 Player 2’s linear program (LP)

Min U

Subject to

π(ri,y) ≤ U, i = 1,…,m
Σjyj = 1
y ≥ 0

It suffices for each player to consider the worst case payoff against their opponent’s pure strategies, because the payoff against any mixed strategy of the opponent is a weighted average of the payoff against pure strategies, which is bounded above and below by the max (respectively min) of the pure strategy payoffs. That is, the minimum payoff against mixed strategies equals the minimum payoff against pure strategies, and the maximum payoff against mixed strategies equals the maximum payoff against pure strategies. The example that follows shows how to set up a


pair of linear programs that solve for optimal strategies x* and y* for Player 1 and Player 2, respectively.
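As a quick numeric illustration of the point just made – that the worst case over mixed strategies equals the worst case over pure strategies – the check below is my own (an arbitrary sample mixture against the game of Figure 1.21):

import numpy as np

A = np.array([[2, 0, 1], [-1, 1, 2], [1, 2, 1]])   # Figure 1.21
x = np.array([0.2, 0.3, 0.5])                      # any mix for player 1

pure_worst = (x @ A).min()        # worst column payoff against x

# x A y^t is a weighted average of the column payoffs, so no mixed y
# can push the payoff below the worst pure column.
rng = np.random.default_rng(0)
Y = rng.dirichlet(np.ones(3), size=1000)           # 1000 random mixes for player 2
print(pure_worst, (Y @ (x @ A)).min() >= pure_worst)   # 0.6 True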

Figure 1.25 Game 1.21 Revisited

       c1   c2   c3
r1      2    0    1
r2     -1    1    2
r3      1    2    1

Figure 1.26 Player 1’s LP for the game of Figure 1.25

Player 1 (symbolic):

max L
subject to
π(x,c1) ≥ L
π(x,c2) ≥ L
π(x,c3) ≥ L
x1 + x2 + x3 = 1
x1, x2, x3 ≥ 0

Player 1 (numeric):

max L
subject to
2x1 – x2 + x3 ≥ L
0x1 + x2 + 2x3 ≥ L
1x1 + 2x2 + x3 ≥ L
x1 + x2 + x3 = 1
x1, x2, x3 ≥ 0

Figure 1.27 Player 2’s LP for the game of Figure 1.25

Player 2 (symbolic):

min U
subject to
π(r1,y) ≤ U
π(r2,y) ≤ U
π(r3,y) ≤ U
y1 + y2 + y3 = 1
y1, y2, y3 ≥ 0

Player 2 (numeric):

min U
subject to
2y1 + 0y2 + 1y3 ≤ U
–1y1 + 1y2 + 2y3 ≤ U
1y1 + 2y2 + 1y3 ≤ U
y1 + y2 + y3 = 1
y1, y2, y3 ≥ 0

There are many ways to solve linear programs, including many commercial software packages. This text will focus



primarily on setting up the linear programs, not solving them. The Mathematica section for this chapter will show how to solve them with Mathematica, which is useful in the context of this course, though it is not the most efficient LP solver for general purposes.
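(As an aside, the same LP is also easy to pose in Python with scipy.optimize.linprog, which minimizes, so player 1’s max L is written as min −L; the variable packing below is my own illustrative choice:)

import numpy as np
from scipy.optimize import linprog

A = np.array([[2, 0, 1], [-1, 1, 2], [1, 2, 1]])   # the game of Figures 1.21 / 1.25
m, n = A.shape

# Decision variables z = (x1, ..., xm, L); maximize L == minimize -L.
c = np.r_[np.zeros(m), -1.0]
A_ub = np.hstack([-A.T, np.ones((n, 1))])      # L - pi(x, c_j) <= 0 for each column j
b_ub = np.zeros(n)
A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)   # x1 + ... + xm = 1
b_eq = [1.0]
bounds = [(0, None)] * m + [(None, None)]      # x >= 0, L unrestricted

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x[:m], res.x[m])   # x* = (3/8, 1/8, 1/2), V = max L = 9/8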

Two important facts about the linear programs of Figures 1.23 and 1.24 are critical to the study of two-person zero-sum games:

1) the LPs of Figures 1.23 and 1.24 always have optimal solutions, and

2) the LPs of Figures 1.23 and 1.24 are dual programs to one another. As such, they have the same optimal value.

That is, max L = min U – always! For this reason the value of the game is defined as V = max L = min U. Compare this to general sum games in which multiple Nash equilibria can have different payoffs! In a zero-sum game, player 1 has at least one strategy that guarantees the payoff to be at least V and player 2 has a strategy that guarantees the payoff to be at most V. Therefore, player 1 is not making a best response in any strategy profile that results in a payoff of less than V and player 2 is not making a best response in any strategy profile that results in a payoff of more than V. Therefore, every Nash equilibrium must have payoff exactly V. Furthermore, the strategies x* and y* can truly be called optimal, a concept not well defined for general sum games. Because of this strong solution concept for two-person zero-sum games, the expression solve the game means find the value of the game and the optimal strategies for both players.

Recall that tic-tac-toe, chess and all two-player games with a win-lose outcome can be modeled as zero-sum games,


including sports competitions like football, basketball and baseball, although people betting on those games may be modeling them as general sum games if they are concerned with point spreads and the like. This is yet one more example of adjusting the fidelity of a model to accommodate the level of detail relevant in an application. It is well known that the value of tic-tac-toe is 0 and by backward induction there are optimal pure strategies for both players.

iii. Graphical Method for 2 x n and m x 2 Games

When one player has only two pure strategies, the game – indeed the LP – can be solved graphically (Figure 1.28). If player 1 has two strategies, then x = (1 – x2, x2) may be used to represent an arbitrary mixed strategy. The reason for choosing this rather than (x1, 1 – x1) will become clear in what follows. For each of player 2's pure strategies, the payoff of the game can be graphed as a function of player 1's mixture, as represented by x2 varying from 0 to 1 (Figure 1.29).

Figure 1.28 2 x n zero sum game

[2 × 4 payoff matrix for Player 1; from the solution that follows, its 2nd column pays –2 against row 1 and 0 against row 2, and its 4th column pays 5 against row 1 and –4 against row 2.]

Figure 1.29 graphs the payoff resulting from each of player 2's pure strategies (columns) as a function of player 1's mixed strategy, represented by the weight on the 2nd row, so that payoffs move from the 1st-row value to the 2nd-row value as x2 goes from 0 to 1 (left to right). Player 1's strategy could just as easily be represented by x1 instead of x2, i.e. x = (x1, 1 – x1), and then payoffs would go from the 2nd row to the 1st as x1 went from 0 to 1. The method shown struck the author as more intuitive. For any value of the variable x2, the minimum payoff is highlighted with a thick red line. The concept of the LP can be seen in this picture: the feasible space for L depends on the choice of x.

Figure 1.29 Graphical solution for x* and V

[Figure: the four lines (x,c1), (x,c2), (x,c3), (x,c4) plotted over x2 from 0 to 1 on a vertical axis from –4 to 5; the lower envelope is drawn in red and its highest point, maxmin = V, occurs at x2*.]

The constraints on L, which is itself just a single real variable, are that L be less than or equal to the payoff against each column, that payoff being a function of player 1's strategy x. Thus, for any x, the feasible space for L is the interval from the thick red line down to –∞, so the feasible set for the LP is the set of pairs (x2, L) that lie between x2 = 0 and x2 = 1 and below the red line. The linear program is solved by choosing x* so that the constrained maximum of L is as big as possible. From the picture it is clear that this is exactly the same as maximizing over x the minimum payoff that is possible given x.

Using the graph to identify which column(s) are responsible for the minimum payoff enables solving for x2* and V. In the numerical example at hand, the maxmin occurs where

(x,c2) = (x,c4)

–2 + 2x2 = 5 – 9x2

=> x2 = 7/11, so x = (4/11, 7/11) and V = –2 + 2(7/11) = –8/11

Once the maxmin has been found, player 2’s optimal strategy can be derived. Player 2 must be making a best response to player 1, and only c2 and c4 can be used in a best response to x* = (4/11, 7/11). This can be seen in the graph, where

(x*,c2) = (x*,c4) > (x*,c3), and

(x*,c2) = (x*,c4) > (x*,c1)

These inequalities imply that any mixture of c2 and c4 will be a best response to x*, but, as in the "table method" for general-sum games, y* is determined by equating player 1's payoffs from the rows that player 1 is mixing, i.e.,

(r1, y*) = (r2, y*)

where y* = (0, y2*, 0, y4*), y2* + y4* =1 and y2*,y4* > 0

–2y2* + 5(1 – y2*) = 0y2* – 4(1 – y2*)

=> y2* = 9/11, so y* = (0, 9/11, 0, 2/11).

In summary, the value of the game is –8/11 and the Nash equilibrium for the game is (x*, y*), where x* = (4/11, 7/11) is the optimal strategy for the row player and y* = (0, 9/11, 0, 2/11) is the optimal strategy for the column player.
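The arithmetic above is easy to confirm with exact rational arithmetic. A quick sketch (not part of the text) using Python's fractions module:

from fractions import Fraction

# maxmin: (x,c2) = (x,c4), i.e. -2 + 2*x2 = 5 - 9*x2
x2 = Fraction(5 - (-2), 2 + 9)        # 7/11
V = -2 + 2 * x2                       # -8/11
# player 2 mixes c2 and c4: (r1,y) = (r2,y), i.e. -2*y2 + 5*(1-y2) = -4*(1-y2)
y2 = Fraction(5 + 4, 2 + 5 + 4)       # 9/11
print(x2, V, y2)                      # 7/11 -8/11 9/11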

iv. Exercises (extracted to angel)

1.16 Famous Strategic Form Games

The final section on strategic form games is a summary of some of the most famous strategic form games. These games have been the subject of hundreds and even thousands of research papers, books, experiments and symposia. Anyone who has had a course in game theory should know these games well enough to reproduce them from memory and explain what is interesting about them.

i. Prisoner’s Dilemma

The Prisoner's Dilemma gets its name from a story first used in the literature by Howard Raiffa to describe the game. Two prisoners are arrested and placed in separate interrogation rooms. They are guilty but the police have no evidence, so if neither one talks, they both go free. The police offer each freedom plus a reward if they "squeal" on their accomplice. If one squeals, the other gets a stiff prison sentence. If they both squeal, they both get moderate prison sentences. These outcomes are modeled generically with parameters for general proofs and numerically for experiments (Figures 1.30, 1.31).

Figure 1.30 Traditional Prisoner's Dilemma

         C      D
  C    R,R    S,T
  D    T,S    P,P

  T > R > P > S

Figure 1.31 Typical Numerical Prisoner's Dilemma

         C      D
  C    3,3    0,5
  D    5,0    1,1

In the generic notation above, T stands for the temptation payoff, after the temptation to defect for selfish gain; R is for the reward that comes to partners who are faithful to one another; P is for the punishment that befalls the mutually unfaithful; and S is for the sucker who was faithful to an unfaithful partner.

Notice that the inequalities T > R > P > S imply 2R > 2P. Akimov and Soutchanski, who formally defined social dilemma games as an n-player generalization of the prisoner's dilemma, made the additional assumption that aggregate social welfare strictly increases with the number of cooperators (see Appendix A). In the two-player case, this assumption is that 2R > T + S > 2P. It is not always assumed across the extensive literature devoted to the 2-player prisoner's dilemma, but it is upheld by the most common set of numerical payoffs (Figure 1.31). As with all modeling assumptions, whether or not to include this assumption depends on whether the assumption holds in the real-world scenario being studied through the model.

ii. Battle of the Sexes (Coordination)

The Battle of the Sexes, still quaintly referenced by the name it was given in the 1950's, could also be called "Meet Me You Know Where". It is a coordination game, in that both players' payoffs are maximized when they achieve a coordinated outcome even though they cannot communicate while making their simultaneous moves. The story that goes with the original name is about a stereotypical, heterosexual couple. The man wants to go to the Fights, and the woman wants to go to the ballet, but each would rather go out together than alone. The traditional payoffs for this game are given below (Figure 1.32) as well as an alternative set (Figure 1.33).

Figure 1.32 Traditional Battle of the Sexes Game

         F      B
  F    5,1    0,0
  B    0,0    1,5

Figure 1.33 Alternate Battle of the Sexes Game

         F        B
  F    5,1     –1,–1
  B   –5,–5     1,5

In both payoff matrices, the row player prefers the Fights (F) and the column player prefers the ballet (B). When the players both choose the same event, i.e., coordinate, the biggest payoff goes to the player who prefers the chosen event. The other player still gets a positive payoff for not being alone. In the traditional payoff matrix, both players get 0 if coordination fails, i.e. if they choose different events. In the alternate payoff matrix, going to one's least favorite event, and alone, adds insult to injury, so to speak, and the payoff is –5. Going to one's favorite event alone is still negative (–1), modeling the fact that going out alone was not desired, but at least the player attended his or her first-choice event. The traditional payoffs model the idea that going out alone is so undesirable that the event is essentially irrelevant.

In the traditional story, the people each buy two tickets in advance in the hopes of surprising the other. This is exactly the plot of a Francis Ford Coppola movie called One from the Heart. The same thing actually happened to a friend of mine. His wife, as a surprise, bought plane tickets to Las Vegas, hotel rooms and tickets to a martial arts tournament (his favorite), while my friend bought a vacation package to Cancun (her favorite). The stakes do not need to be so high. The game is played quite often by all sorts of people (not necessarily lovers) who can't communicate but hope to meet by chance at the favorite hangout of one of them – hence the alternate title Meet Me You Know Where.

iii. Hawk Dove

The Hawk Dove game was made famous by the evolutionary biologist John Maynard Smith in his book Evolution and the Theory of Games. The game models competition for a resource among members of a species, e.g. humans, dogs, or lions. The game begins with a display contest in which the contestants try to intimidate one another. The Hawk strategy is to fight if the other escalates, and the Dove strategy is to retreat if the other escalates. The resource is assumed to have value V, the cost of a fight is C, and the assumption C > V > 0 is made. Therefore, if both players play Hawk, their average payoff is assumed to be (V – C)/2, typically explained as a symmetric contest in which each player has equal probability of either winning the resource worth V or paying the cost C of the fight. A more satisfying explanation might be that C is the total cost of the fight, which is shared equally (C/2 each) in a symmetric contest, while each player wins the resource V with equal probability. The difference may seem subtle – in the former interpretation, both the cost and the benefit of the fight are random; in the latter, the cost is deterministic and the benefit is random. My view is that in a fight between equally matched opponents, they both suffer equally, and "the winner" is the winner by a slight margin and does not walk away unscathed. The other outcomes are straightforward: Hawk wins the resource from Dove without a fight, and two competitors who both play Dove will go unharmed, as no fight ensues, and each will have equal chance of outlasting the other in the lengthy display contest (Figure 1.34).

Figure 1.34 Symmetric Hawk Dove Game

              H                        D
  H   (V – C)/2, (V – C)/2           V, 0
  D           0, V                 V/2, V/2

iv. Rock Scissors Paper

Rock Scissors Paper is an age-old children's game in which two players count to three and simultaneously reveal their strategy choices (Rock, Scissors, or Paper). It is a zero-sum game with outcomes determined by the rules Rock breaks Scissors, Scissors cuts Paper, and Paper covers Rock (Figure 1.35).

Figure 1.35 Rock Scissors Paper Game (payoffs to the row player)

         R     S     P
  R      0     1    –1
  S     –1     0     1
  P      1    –1     0

v. Stop Sign Game

The Stop Sign Game is played when two cars simultaneously arrive at stop signs on opposite sides of a busy through street that has no stop signs and both wish to make left turns. Because they arrived simultaneously, there is no clear right of way, and they can't communicate clearly across the busy traffic. When a hole opens up in the traffic, they simultaneously choose to "wait" or "go". If they both go, they crash. If they both wait, they must play again when the next hole opens up in traffic. If one waits and one goes, the one who goes clearly profits, and the one who waits now has the clear right of way when the next hole in traffic appears (Figure 1.36).

Figure 1.36 Stop Sign Game

           G          W
  G    –10,–10     10, 1
  W     1, 10      –1,–1

vi. Chicken

Chicken is a game dating at least to the 1950's in American popular culture and most likely pre-dates automobiles in one form or another. It was played with farm tractors in the Kevin Bacon movie "Footloose". Two players drive straight toward each other and must choose to "turn" or continue "straight". If neither turns, they crash. If one turns, that player is shamed as "chicken" and the other is hailed as "brave". If both turn, they are both chicken, but this is probably less humiliating than being the lone chicken (Figure 1.37).

Figure 1.37 Chicken

             S            T
  S    –100,–100      10,–10
  T     –10, 10       –1, –1

vii. Matching Pennies

Matching Pennies is a competitive (zero-sum) coordination game. Each of two players simultaneously chooses "heads" or "tails", but no coin is flipped. If the players choose the same thing (both heads or both tails), then player 1 wins. If they make different choices, player 2 wins (Figure 1.38). This game is not played for fun so much as used as a substitute for flipping a physical coin. Like Rock Scissors Paper, it is often used as a way to arbitrate minor disputes.

Figure 1.38 Matching Pennies (payoffs to player 1)

         H     T
  H      1    –1
  T     –1     1

viii. Exercises

1 – 7. Find all Nash equilibria of each of the games above.
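For checking answers, the pure-strategy equilibria of any of the bimatrix games above can be found by brute force. A sketch (an illustration, not part of the text; the matrices below are the numerical Prisoner's Dilemma of Figure 1.31):

import numpy as np

def pure_nash(A, B):
    """A[i,j], B[i,j]: payoffs to row and column player. Returns (i,j) pairs."""
    eqs = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            # (i,j) is an equilibrium if neither player can improve unilaterally
            if A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max():
                eqs.append((i, j))
    return eqs

A = np.array([[3, 0], [5, 1]])   # row player's payoffs (0 = C, 1 = D)
B = A.T                          # symmetric game: column payoffs are the transpose
print(pure_nash(A, B))           # [(1, 1)] -- mutual defection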


4 Continuous Games The strategic form requires that all players have a discrete set of strategies – either finite or at most countably infinite. In some games, one or more players have a continuum of pure strategies. The pure strategy payoffs for these games must be expressed by formulas rather than explicitly listed in a lookup table, which is exactly what the strategic form is. Many models from classical economics are continuous games. Spatial games, where players choose positions, are continuous, including many war games. Games of timing are continuous, such as duels. Auctions are continuous in principle, though quantized by the allowable units of money.

Note that while the strategy space may be a continuum, this does not guarantee that payoffs will be continuous functions of the players' choices, let alone differentiable functions. That being said, this elementary text will indeed focus on games with twice differentiable payoff functions, to which basic calculus techniques can be applied to find the Nash equilibria.

Consider a game with players {1,…,n} where each player i chooses xi from some strategy set Σi ⊆ R and payoffs are given by π(x1,…,xn) = (π1(x1,…,xn),…,πn(x1,…,xn)). In accordance with definition 2.1, a Nash equilibrium is a vector (x1*,…,xn*) satisfying

πi(x1*,…,xi*,…,xn*) ≥ πi(x1*,…,xi,…,xn*) for all xi ∈ Σi and all i = 1,…,n

Mixed strategies can be defined for continuous games just as they are for discrete games, as probability distributions over the set of pure strategies. That is, any function μi: Σi → [0,1] such that ∫ μi(x) dx = 1 (the integral taken over Σi) may be interpreted as a mixed strategy on Σi. Given a mixed strategy profile μ = (μ1,…,μn), payoffs are computed by integrating the pure strategy payoffs against the corresponding probabilities, which are multiplied to model independence:

π(μ) = ∫ ⋯ ∫ π(x1,…,xn) μ1(x1) ⋯ μn(xn) dxn ⋯ dx1     (4.i)

The payoff function π: M → Rn is continuous, where M denotes the set of mixed strategy profiles. The best response correspondence Br: M → M is upper hemicontinuous; that is, the set of all best responses to any pure or mixed strategy profile is a closed set. Therefore, if the joint strategy space Σ = Σ1 × ⋯ × Σn is a compact subset of Rn, then M, the set of all mixed strategy profiles, is both compact and convex, and, as in Nash's theorem on strategic form games, Kakutani's fixed point theorem ensures the existence of an equilibrium in mixed strategies.

Later in this chapter, examples will be presented of two-player games with compact strategy sets and Nash equilibria, and also of games with non-compact strategy sets which have no equilibria. First, we present a method for finding equilibria in two-player games with twice differentiable payoff functions.

1.17 Finding Nash Equilibria in Continuous Games

A Nash equilibrium in a continuous game is, like its discrete analog, a strategy profile in which each player is making a best response to his or her co-players. If Bri: Σ~i → Σi is player i's best response function (namely, the ith component of the function Br defined above, where x~i denotes the choices of all players other than i), then a Nash equilibrium is a strategy profile (x1,…,xn) satisfying the system of n equations

xi = Bri(x~i),   i = 1,…,n

Keep in mind these equations might not be linear, and the best response functions might not even have closed forms. It might take some ad hoc analysis to determine the best response to a given strategy profile, but these equations serve as an outline for finding or verifying whether a profile is a Nash equilibrium. In the two-player case, the equilibrium equations are

x1 = Br1(Br2(x1))     (4.1a)

and

x2 = Br2(Br1(x2))     (4.1b)

Either one of the equations suffices to verify an equilibrium – if either equation is satisfied, they both are. A useful metaphor to remember the concept might be standing two mirrors facing each other, for, if the first equation is satisfied, then so is

x1 = Br1(Br2(Br1(Br2(x1))))

and so on and so on, ad infinitum.

The equations that must be satisfied lead naturally to a method for finding Nash equilibria: start with a candidate strategy for one player, find the best response(s) of the co-player, find the best response(s) of the first player to those, and if the strategy you started with is among them, you have a Nash equilibrium. This method has the potential to become a wild goose chase, but for twice differentiable payoff functions of one strategy variable for each player, some basic calculus enables us to simplify the problem and narrow the search.

i) For any fixed value of one player's strategy, the other player's payoff function reaches a maximum either in the interior of the strategy space or at a boundary.

ii) Any interior maximum will have a zero first derivative with a negative or zero second derivative (that is, first and second partial derivatives with respect to the maximizing player's own variable).

Therefore, if there is a Nash equilibrium which is an interior point for both players, then each player's first partial derivative is simultaneously zero, and candidates can be found by solving the simultaneous equations. If there is a Nash equilibrium which is not interior for both players, at least one player is using a boundary point for a strategy. Starting with each boundary point x* for player 1, 4.1a can be checked by determining y* = Br2(x*), then Br1(y*), and (x*,y*) is a Nash equilibrium if and only if 4.1a is satisfied, i.e. if and only if Br1(y*) = x*. Similarly, starting with each boundary point y* for player 2, 4.1b can be checked by determining x* = Br1(y*), then Br2(x*), and (x*,y*) is a Nash equilibrium if and only if 4.1b is satisfied, i.e. if and only if Br2(x*) = y*. Note that if player 1 and player 2 have continuous payoff functions and each chooses a single real strategy variable from a closed and bounded (i.e. compact) strategy set of real numbers, then a player's best response is always defined – a continuous function always attains its maximum on a compact set. If any strategy set is not compact, then that player's best response might not be defined.

The above discussion can be summarized as a general method for finding Nash equilibria in two-player games with twice differentiable payoff functions. Consider the generic game in which player 1 chooses x ∈ [a,b] ⊂ R, player 2 chooses y ∈ [c,d] ⊂ R, and ui(x,y) is player i's payoff, or utility function, as it is often called in economics, where many continuous models arise. Take the steps listed below to find all Nash equilibria of the game.

1) Find all solutions (x,y) to the simultaneous equations

∂u1/∂x (x,y) = 0

∂u2/∂y (x,y) = 0

2) At each (x,y) found in (1), check whether

∂²u1/∂x² (x,y) ≤ 0

and whether

∂²u2/∂y² (x,y) ≤ 0

If both 2nd partials are negative at a given (x,y), then (x,y) is a local maximum for both utility functions. If either 2nd partial is zero, further analysis must test whether x (respectively, y) is a local maximum for u1 (respectively, u2). If either 2nd partial is positive, the corresponding variable is at a local minimum of the function, not a maximum.

In general, the derivative information must be used sensibly to determine best responses, which are global maxima, at least within the constraints of the strategy sets. For example, if the 2nd derivative is positive and constant, the corresponding utility function increases uniformly from the minimum out to the boundary in either direction, so the best response must be at the lower or upper limit of the strategy set, and which of the two it is can be found by evaluating the utility function at both endpoints. If a 2nd partial is negative and constant, then the local max is actually a global max and no further checking is required. If a local max has a 2nd partial which is not constant, further analysis must be done to determine the true global max given the co-player's choice. Depending on derivative information, checking the endpoints of the strategy sets is not always necessary, but it never hurts, so when in doubt, check!

3) If no interior Nash equilibrium is found in (2), then systematically start with each endpoint and test 4.1a and 4.1b until all possibilities are exhausted. That is, using the methods of (1) and (2), find the best response to a, namely,

y* = argmax{ u2(a,y) : y ∈ [c,d] }     (i)

then find the best response to y*, which equals

x* = argmax{ u1(x,y*) : x ∈ [a,b] }     (ii)

and if x* = a, then (a,y*) is a Nash equilibrium. Then repeat (i) and (ii) using b in place of a. Then, if a corner such as (a,c) or (b,c) was not already found as a Nash equilibrium, repeat (i) and (ii) starting with c and d, checking 4.1b instead of 4.1a.
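Steps (1) and (2) are routine to automate with a computer algebra system. The sketch below is an illustration using Python's SymPy rather than this text's Mathematica; the quadratic payoffs are hypothetical, chosen only so that an interior equilibrium exists:

import sympy as sp

x, y = sp.symbols('x y', real=True)
u1 = 4*x - x**2 + x*y/2        # hypothetical twice differentiable payoffs
u2 = 4*y - y**2 + x*y/2

# Step (1): simultaneous zeros of the first partials
candidates = sp.solve([sp.diff(u1, x), sp.diff(u2, y)], [x, y], dict=True)
# Step (2): sign of the second partials at each candidate
for s in candidates:
    print(s, sp.diff(u1, x, 2).subs(s), sp.diff(u2, y, 2).subs(s))
# {x: 8/3, y: 8/3} -2 -2  -> an interior equilibrium for these payoffs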

The above listed steps (1) – (3) will find all pure strategy Nash equilibria. If, at any stage, multiple best responses are found to some strategy, then each path must be followed to find all pure strategy equilibria, and mixed strategies can be found by replacing a player's original strategy variable with a set of linear variables putting weights on the player's multiple best responses. The co-player's utility function can be re-written in terms of the mixture, and the same methods (1) – (3) above can be used to find the co-player's best response to the mixture and test 4.1a or 4.1b. Nash equilibria where both players are mixing are the hardest to derive. In practice, the derivative information guides the search for what mixtures need to be tested. This text will focus on classical examples that do not test the limits of complexity possible with models of this form. The remainder of the chapter presents a purely numerical example to demonstrate the method just given, followed by some famous examples.

1.18 Continuous Game Nash Equilibrium Example

Player 1 chooses x between 0 and 10 inclusive, and player 2 chooses y between 0 and 10 inclusive. Payoffs are given by the functions u1 and u2:

u1(x,y) = xy – x²

u2(x,y) = y² – xy

Following algorithm 4.1, compute the 1st and 2nd partial derivatives, check for simultaneous zeros and interpret the results.

∂u1/∂x (x,y) = y – 2x = 0  ⟺  x = y/2

∂²u1/∂x² (x,y) = –2 < 0

∂u2/∂y (x,y) = 2y – x = 0  ⟺  y = x/2

∂²u2/∂y² (x,y) = 2 > 0

The derivative information for u1 implies that x = y/2 is a global max and the unique best response to any y chosen by player 2. It is a max because the 2nd derivative is negative. It is a global max because the 2nd derivative is a constant.³ Therefore player 1's best response function for any y is Br1(y) = y/2. The derivative information for u2 indicates that y = x/2 is a global minimum, not a maximum! Moreover, the fact that the derivative is positive and constant implies that the maximum of u2(x,y), for any fixed x, must be at one boundary or the other of the strategy interval.⁴ Therefore player 2's best response function for any x is Br2(x) = argmax{ u2(x,y) : y ∈ {0,10} }. That is, for any fixed x, max u2(x,y) is either u2(x,0) or u2(x,10), so the max can be found by checking two points.

Because player 2’s best response is always an endpoint for any x, there are no purely interior Nash equilibria. Therefore it suffices to check for equilibria where at least one player’s choice is an endpoint. These possibilities x* = 0, x* = 10, y* = 0, y* = 10 can be methodically checked one after the other.

³ It suffices that ∂²u1/∂x² (x,y) < 0 on the strategy interval [0,10].

⁴ It suffices that ∂²u2/∂y² (x,y) > 0 on the strategy interval [0,10].

If x* = 0, Br2(0) must be either 0 or 10. Evaluating the function yields u2(0,0) = 0 and u2(0,10) = 100, so Br2(0) = 10. Br1(Br2(0)) = Br1(10) = 5 ≠ 0 = x*, so there is no Nash equilibrium (x*,y*) where x* = 0.

If x* = 10, Br2(10) must be either 0 or 10. u2(10,0) = 0 and u2(10,10) = 0, so both y* = 0 and y* = 10 are best responses for player 2, but Br1(0) = 0 ≠ 10 and Br1(10) = 5 ≠ 10, so there is no pure strategy Nash equilibrium (x*,y*) where x* = 10. Because there are two best responses to x* = 10, there is a possibility of an equilibrium where player 2 uses a mixed strategy y = (y0, y10) putting weight on 0 and 10. If this happens, player 1's payoff function becomes

u1(x,y) = x(0·y0 + 10·y10) – x² = 10x·y10 – x²

∂u1/∂x (x,y) = 10y10 – 2x

and ∂²u1/∂x² (x,y) still equals –2, so Br1(y) = 5y10 ≤ 5 ≠ 10 = x*, so there is no mixed equilibrium (x*,y*) where x* = 10.

If y* = 0, Br1(y*) = 0/2 = 0. Br2(Br1(0)) = Br2(0) = 10 ≠ 0 = y* (from above), and therefore there is no Nash equilibrium (x*,y*) where y* = 0.

If y* = 10, Br1(10) = 10/2 = 5. When x = 5, Br2(5) must be either 0 or 10. u2(5,0) = 0 and u2(5,10) = 50, so Br2(5) = 10 = y*. Therefore (x*,y*) = (5,10) is a Nash equilibrium.

The only possibility that has not been considered is that both players use mixed strategies. If f: [0,10] → [0,1] is a probability distribution, i.e., f(y) ≥ 0 ∀y and ∫ f(y) dy = 1 over [0,10], then

u1(x,f) = x·m – x², where m = ∫ y f(y) dy is the mean of player 2's mixed strategy, and

∂u1/∂x (x,f) = m – 2x

so player 1's best response will be Br1(f) = m/2, with 0 ≤ Br1(f) ≤ 5. Thus player 1 has a unique pure strategy best response to any mixed strategy by player 2, so there is no Nash equilibrium in which both players mix. Therefore, (x*,y*) = (5,10) is the one and only Nash equilibrium of this game.
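The conclusion can be sanity-checked numerically. A brute-force sketch (not part of the text) that searches a grid of pure strategy profiles for mutual best responses:

import numpy as np

xs = np.linspace(0, 10, 201)           # grid with step 0.05
u1 = lambda x, y: x*y - x**2
u2 = lambda x, y: y**2 - x*y

def is_equilibrium(x, y, tol=1e-9):
    best_x = xs[np.argmax(u1(xs, y))]  # player 1's grid best response to y
    best_y = xs[np.argmax(u2(x, xs))]  # player 2's grid best response to x
    return (abs(u1(best_x, y) - u1(x, y)) < tol and
            abs(u2(x, best_y) - u2(x, y)) < tol)

print([(x, y) for x in xs for y in xs if is_equilibrium(x, y)])  # [(5.0, 10.0)]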

This example demonstrates the general method of analyzing best responses using calculus when payoff (utility) functions are differentiable. In practice, there are important examples in which the analysis is far easier, for example when only pure strategies make sense in the model and/or when both derivatives yield unique interior best responses. The applied examples that follow illustrate some famous models that are solved with very basic analysis.


1.19 Cournot Duopoly Model

One of the earliest examples of modern game theory is from Antoine Augustin Cournot (28 August 1801 – 31 March 1877), a French economist, philosopher and mathematician. He published his economics masterpiece, the Recherches, in 1838, which included the famous model of duopolistic competition that follows.

Two firms, firm 1 and firm 2, produce and sell an identical product. The market clearing price, at which supply equals demand, is a decreasing function p(q) of the total quantity q = q1 + q2 available from the two firms combined. Each firm i chooses a quantity qi and incurs a production cost ci(qi). With these assumptions the profit or utility function of each firm can be modeled as revenue minus cost, where revenue equals quantity produced and sold at the market price:

ui(q1,q2) = p(q1,q2)·qi – ci(qi), i = 1,2.

The example that follows shows how the equilibrium can be found by following algorithm 4.1, if the price and cost functions are made precise.

p(q1,q2) = 20 – q1 – q2

c1(q1) = 2q1

c2(q2) = 2q2

u1(q1,q2) = (20 – q1 – q2)q1 – 2q1 = 18q1 – q1² – q1q2

u2(q1,q2) = (20 – q1 – q2)q2 – 2q2 = 18q2 – q2² – q1q2

∂u1/∂q1 = 18 – 2q1 – q2

∂²u1/∂q1² = –2

∂u2/∂q2 = 18 – 2q2 – q1

∂²u2/∂q2² = –2

∂u1/∂q1 = 0 = ∂u2/∂q2  ⟹  18 – 2q1 – q2 = 0 = 18 – 2q2 – q1  ⟹  q1 = q2 = 6

The fact that the 2nd partial derivatives are negative for all values of q1 (respectively, q2) implies that u1 (respectively, u2) has a global maximum at q1 = 6 (respectively, q2 = 6). Even if this were not known, some simple evaluations of u1 and u2 would confirm that q1 = 6 is a best response to q2 = 6 and vice versa: u1(6,6) = 36 > 0 = u1(0,6) = u1(12,6), and u1(q1,6) < 0 for all q1 > 12.
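The computation is easy to reproduce with a computer algebra system. A sketch using Python's SymPy (an illustration, not this text's Mathematica workflow):

import sympy as sp

q1, q2 = sp.symbols('q1 q2', nonnegative=True)
p = 20 - q1 - q2                  # market clearing price
u1 = p*q1 - 2*q1                  # firm 1's profit: revenue minus cost
u2 = p*q2 - 2*q2                  # firm 2's profit

foc = [sp.diff(u1, q1), sp.diff(u2, q2)]       # first-order conditions
print(sp.solve(foc, [q1, q2]))                 # {q1: 6, q2: 6}
print(sp.diff(u1, q1, 2), sp.diff(u2, q2, 2))  # -2 -2: global maxima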

1.20 Hotelling Duopoly Model

The Hotelling duopoly model was published by the American economist Harold Hotelling (1895–1973) in the Economic Journal in 1929.

Firms A and B are located at opposite ends of town. They sell identical products at a price of their choosing. They produce exactly enough to meet demand and have equal per unit costs. Demand for their products is uniformly distributed across the town, and is inelastic, i.e. does not vary with price. From the customer's point of view, the true cost of the item is the purchase price plus the transportation cost of travel from home to the firm and back. These circumstances are modeled in what follows.

Town is represented by the interval [0,1]. Firm A is located at 0 and B at 1. Transportation cost is t per unit distance traveled. Thus, if A charges price pA and B charges pB, then the total cost to a customer living at x, 0 ≤ x ≤ 1, is pA + tx from A and pB + t(1 – x) from B.

The decision to be made by each firm is what price to charge to maximize profits. If we assume that customers minimize total cost, then for any pair of prices (pA, pB), firm A's market share is x* and firm B's market share is (1 – x*), where x* is the point at which customer cost is equal from firm A and firm B. Accordingly, if per unit production cost is c, the firms' payoffs are given by

uA(pA,pB) = (pA – c)·x*

and

uB(pA,pB) = (pB – c)·(1 – x*)

Solving for the location of the indifferent customer based on pA, pB and t yields

x* = (pB – pA + t) / (2t)

1 – x* = (pA – pB + t) / (2t)

Substituting this back into the payoff functions yields

uA(pA,pB) = (pA – c)(pB – pA + t) / (2t)

uB(pA,pB) = (pB – c)(pA – pB + t) / (2t)

∂uA/∂pA (pA,pB) = (pB – 2pA + c + t) / (2t) = 0

∂²uA/∂pA² (pA,pB) = –1/t < 0

∂uB/∂pB (pA,pB) = (pA – 2pB + c + t) / (2t) = 0

∂²uB/∂pB² (pA,pB) = –1/t < 0

∂uA/∂pA = ∂uB/∂pB = 0  ⟹  pA = pB = c + t

The per unit transportation cost t is a constant in the model, so the 2nd partials are both negative and constant, implying a global maximum for each player, which both confirms a Nash equilibrium at pA = pB = c + t and rules out any other pure strategy Nash equilibrium, as well as equilibria where only one player mixes. Identifying or ruling out equilibria where both players mix takes more work, with an emphasis on analysis rather than game theory, and would thus be a distraction in the context of this course. For this course it is important to remember that the possibility of a mixed equilibrium may exist, and also to consider what a mixed equilibrium in the model would mean. It is not consistent with the situation being modeled that firms would actually vary the price throughout the day. One interpretation could be Harsanyi's concept that a mixed strategy represents one player's belief about what the other is charging, but that would require additional assumptions about lack of information between the two players at the time of price setting. Perhaps the best interpretation would be that the stores sell multiple products and compete by setting some prices low to attract market share and increase their profit by setting prices high on other goods. Indeed, this is common practice among retail stores. Modeling this last interpretation would probably involve a cost variable and demand distribution for each product, as well. The main point at this stage is to forever keep in mind the map between the model and the real-world situation being modeled, to think about how each model assumption should be interpreted, and also to consider how variations in either the application or the model would be reflected in the other.

One set of variations easily possible with the basic model presented is to explore model sensitivities by varying the production costs between the two firms and varying the magnitude of the production costs relative to the transportation costs. These excursions are explored in the exercises.
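As with the Cournot model, the first-order conditions can be checked symbolically. A SymPy sketch (an illustration, not from the text; the symbol names are mine):

import sympy as sp

pA, pB, c, t = sp.symbols('p_A p_B c t', positive=True)
xstar = (pB - pA + t) / (2*t)          # indifferent customer's location
uA = (pA - c) * xstar                  # firm A's profit
uB = (pB - c) * (1 - xstar)            # firm B's profit

sol = sp.solve([sp.diff(uA, pA), sp.diff(uB, pB)], [pA, pB])
print(sol)                             # {p_A: c + t, p_B: c + t}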

1.21 Hide and Seek (a.k.a. search and destroy)

This game models a submarine S hiding from a destroyer D in a narrow stretch of water, modeled as an interval [a,b], e.g. [0,10]. The submarine chooses a hiding location x ∈ [0,10] and the destroyer chooses a location y ∈ [0,10] to drop depth charges. It can be modeled as a zero-sum game with payoffs

uS(x,y) = (x – y)² – 50          uD(x,y) = 50 – (x – y)²

One can check this with derivatives, but it is clear from inspection that the sub's best response is to maximize its distance from the destroyer, i.e., choose the farthest endpoint, and the destroyer's best response is to minimize the distance, i.e. choose y = x. Therefore, there is no Nash equilibrium in which both players are using pure strategies. If the destroyer chooses the midpoint y = 5, the sub has two best responses: x = 0 and x = 10. To check for an equilibrium we look for a strategy x = (x0, x10) for the sub that mixes the two endpoints x = 0 and x = 10 such that y = 5 is a best response for the destroyer. When the sub uses x = (x0, x10), the destroyer's payoff becomes

uD(x,y) = x0(50 – y²) + x10(50 – (10 – y)²)

Substituting x10 = 1 – x0 and collecting like terms yields

uD(x,y) = 50 – y² – (1 – x0)(100 – 20y)

∂uD/∂y (x,y) = –2y + 20(1 – x0) = 2(10 – 10x0 – y)

It follows that y = 5 is a best response for the destroyer if the submarine chooses x0 = 1/2, that is, hides at either endpoint with probability 1/2, and this strategy profile is a Nash equilibrium.
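A quick numerical check (a sketch, not from the text) that y = 5 maximizes the destroyer's expected payoff against the 50/50 endpoint mixture:

import numpy as np

y = np.linspace(0, 10, 1001)
# sub mixes x = 0 and x = 10 with equal probability
uD = 0.5 * (50 - (0 - y)**2) + 0.5 * (50 - (10 - y)**2)
print(y[np.argmax(uD)])   # 5.0: the midpoint is the destroyer's best response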

1.22 Games of Timing

Duels, Market Preemption, and War of Attrition are all games of timing and thus have continuous strategy spaces. That is, each player chooses a time for action. In a duel the players choose when to shoot, in Market Preemption they choose when to release a new product, and in War of Attrition they choose when to give up.

i. Duel


ii. Market Preemption

iii. War of Attrition

1.23 A Simple Voting Game

1.24 Auctions

i. 1st Price Auction

ii. 2nd Price Auction


5 Repeated Games and Adaptive Learning

1.25 Backward Induction

i. Chain Store Paradox

1.26 Infinitely Repeated Games and Discount Factor

i. Folk Theorems

1.27 Repeated Prisoner’s Dilemma

i. Axelrod’s tournaments

1.28 Adaptive and Reinforcement Learning

i. Brown’s Method of Fictitious Play

ii. Local Interaction Models and Imitation

(Cascades)

1. Prisoner’s Dilemma (Sigmund, Byrne)

2. Adoption of Technology (Chatterjee, Krishna)


6 Cooperative Game Theory Cooperative game theory differs from non-cooperative theory in the assumptions that players

i) communicate,

ii) coordinate their strategies,

iii) reallocate the payoffs, and

iv) make binding agreements.

The communication of assumption (i) is necessary to support assumptions (ii) and (iii). In addition, assumption (iv) that agreements are binding allows enforcement issues to be ignored.

1.29 Coalitions and Joint Strategies

A set of players is called a coalition. A joint strategy is a mixed strategy for a given set of players. Assumption (ii) provides a mechanism for players to carry out their agreements and leads to the following definition: A joint strategy for a set of any k players is a probability distribution on a set of k-tuples (s1,…,sk) where sj is a pure strategy for the jth player in the set. Any subset of players can be regarded as a coalition, so joint strategies are mixed strategies for coalitions. For technical reasons to be addressed in what follows, the empty set is regarded as a coalition, so if N = {1,…,n} is the set of all players then the power set of N, c(N), is the set of coalitions. The coalition of all players is called the grand coalition.

Assumption (iv) enables us to focus on what can be gained from coalition formation, rather than on what can go wrong, though the theory should one day deal with the latter as well.


Players can now act in coalitions to maximize payoffs to their coalition, even at a reduction to their personal payoff in the original game, because they have a binding agreement to get a different payoff when the coalition’s payoffs from the original game are pooled and reallocated.

1.30 Imputations

Because of these assumptions, the focus of the analysis shifts from what payoffs players can get alone in the original game to what payoffs players are likely to get after reallocation, which depends on what the different coalitions can get using joint strategies. These reallocations are called imputations, and their definition embodies two more assumptions of cooperative game theory.

Definition: An imputation for an n-player game is an element x ∈ Rn such that

v) individual rationality: no player settles for less than he or she could guarantee without any cooperation from any other players, and

vi) collective rationality: the grand coalition always forms and acts to maximize the aggregate payoff available for reallocation.

The set of all possible imputations could be allowed to vary through all of Rn, but the two additional assumptions are intended to focus the analysis. Assumption (v) rules out martyrdom, but this is consistent with the general nature of traditional game theory, namely a focus on what strategic opportunities exist for optimizing players in each game, not on what players will actually play given the complex realities of actual human decision making. Assumption (v) probably holds in the real world in far more occurrences than assumption (vi). Readers are encouraged to imagine how various outcomes might differ without assumption (vi). It is somewhat analogous to the subgame perfection criterion for Nash equilibria in non-cooperative games: would a player who fails to negotiate his or her desired share of the aggregate payoff follow through on a threat to play so as to reduce the aggregate, even though it means getting a personal payoff which is reduced even further? Experimental evidence on the ultimatum game suggests the answer is yes. This text will focus on presenting the conventional theory as a baseline for such excursions, so assumptions (v) and (vi) will be enforced in what follows.

1.31 Characteristic Functions and Superadditivity

The idea of rationality embodied in (v) is the notion that people will not settle for less than they can guarantee themselves. This idea is generalized to coalitions of any size by the characteristic function. The characteristic function maps the set of coalitions into the real numbers by assigning to each coalition the maximum aggregate payoff that the coalition can guarantee itself without cooperation from players outside the coalition. Formally,

v: c(N) → R.

For any non-empty proper subset S of N,

v(S) = V

where V is the value of the two-player zero-sum game between S and Sc based on the aggregate payoff to S, where Sc = N – S is the complement of S in N. Sc is called the counter-coalition. The characteristic value of the grand coalition N is simply the maximum aggregate payoff in the game, because there are no players in the counter-coalition. The characteristic value of the empty coalition is set to 0.

The characteristic function value is not the same as the best paying Nash equilibrium in the two-player general-sum game based on the aggregate payoffs to both the coalition and the counter-coalition. In computing the maximum guarantee to S, the assumption is made that the counter-coalition is conspiring to minimize the payoff to S, not conspiring to maximize its own payoff. Therefore, the aggregate payoff to Sc arising in the original game is ignored when computing the characteristic value of S, and instead a zero-sum game is created based on the payoffs to S. This methodology truly finds the maximum guarantee to S, because a guarantee is actually the worst case scenario, and assuming every player outside of S conspires to minimize the payoff to S is indeed the worst case scenario. This assumption is not made throughout the cooperative theory – it is specifically made to compute the characteristic function. The characteristic function is naturally relevant to bargaining positions, as will be seen in what follows. The example below illustrates computation of the characteristic function from a strategic form game (Figures 1.39 – 1.45).


Figure 1.39 Strategic Form Cooperative Game

Player 1 chooses the row (A or B), player 2 the column (C or D), and player 3 the matrix (E or F); payoffs are listed (player 1, player 2, player 3).

              E                            F
           C           D              C            D
  A    –1, 3, 2    1, –3, –1      4, –2, –1     0, 4, 4
  B    2, –1, –3   –2, 2, 2       –2, 1, 3      0, –5, –7

In a 3-player game (Figure 1.39), the possible coalitions are ∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}. The characteristic function value for the coalition {2,3} is the value of the zero-sum game between {2,3} and {1}, with payoffs equal to the aggregate payoff of the coalition {2,3}. The joint pure strategies are CE, DE, CF, DF for {2,3} and A, B for {1}. For each joint strategy profile in the 2-player game, the payoffs for the corresponding 3-player strategy profile in the original game are summed over players 2 and 3 (Figures 1.40, 1.41).

Figure 1.40 Two-Player Zero-Sum Game Determining v({2,3})

Zero-Sum Game {2,3} vs. {1} (payoffs to {2,3}):

          A      B
  CE      5     –4
  DE     –4      4
  CF     –3      4
  DF      8    –12

Figure 1.41 Graphical Solution for v({2,3})

[Graph: the four row payoffs plotted over {1}'s mixture weight y2 from 0 to 1, vertical axis from –12 to 8; minmax = 1/2 at y = (1/2, 1/2), so v({2,3}) = 1/2.]

The characteristic value for the singleton coalition {1} is computed from the two-player zero-sum game between the same two coalitions, but the roles of S and Sc are switched, so for each joint strategy profile the payoffs are for the coalition {1}, i.e., player 1 (Figures 1.42, 1.43).

Figure 1.42 Two-Player Zero-Sum Game Determining v({1})

Zero-Sum Game {1} vs. {2,3} (payoffs to {1}):

         CE    DE    CF    DF
  A      –1     1     4     0
  B       2    –2    –2     0

Figure 1.43 Graphical Solution for v({1})

[Graph: the four column payoffs plotted over {1}'s weight on B from 0 to 1, vertical axis from –4 to 5; maxmin = 0 at x = (2/3, 1/3), so v({1}) = 0.]

The characteristic values for all the coalitions are found in the same way – for each coalition S, as the value of the two-player zero-sum game between S and Sc – with the exception of v(∅) and v({1,2,3}). The grand coalition is {1,2,3}, so v({1,2,3}) = 8, the largest aggregate payoff in the game, attained at (A,D,F), and, as in every game, v(∅) = 0. The game of Figure 1.39 is shown again with all derived zero-sum games for cross-reference (Figure 1.44), after which the complete characteristic function is given (Figure 1.45).

Figure 1.44 Zero-Sum Games Yielding Characteristic Values

Original game (Figure 1.39):

              E                            F
           C           D              C            D
  A    –1, 3, 2    1, –3, –1      4, –2, –1     0, 4, 4
  B    2, –1, –3   –2, 2, 2       –2, 1, 3      0, –5, –7

Zero-Sum Game {1} vs. {2,3}:          Zero-Sum Game {2,3} vs. {1}:

         CE    DE    CF    DF                  A      B
  A      –1     1     4     0          CE      5     –4
  B       2    –2    –2     0          DE     –4      4
                                       CF     –3      4
                                       DF      8    –12

Zero-Sum Game {2} vs. {1,3}:          Zero-Sum Game {1,3} vs. {2}:

         AE    BE    AF    BF                  C      D
  C       3    –1    –2     1          AE      1      0
  D      –3     2     4    –5          BE     –1      0
                                       AF      3      4
                                       BF      1     –7

Zero-Sum Game {3} vs. {1,2}:          Zero-Sum Game {1,2} vs. {3}:

         AC    AD    BC    BD                  E      F
  E       2    –1    –3     2          AC      2      2
  F      –1     4     3    –7          AD     –2      4
                                       BC      1     –1
                                       BD      0     –5

Figure 1.45 Characteristic Values from the Game of Figures 1.39/1.44

  S          v(S)
  ∅            0
  {1}          0
  {2}        –1/2
  {3}         –1
  {1,2}        2
  {1,3}        3
  {2,3}       1/2
  {1,2,3}      8

i. Exercises

1. Compute the characteristic function for the given game.

              E                            F
           C           D              C             D
  A    3, –1, 0    –2, 1, 1       –1, 1, 2      6, 8, –4
  B    –4, 1, 2    2, –1, 0       –1, –4, 10    –1, 0, –1
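Computations like the one just illustrated – and exercises like the one above – can be cross-checked mechanically: for each coalition S, build the zero-sum game against Sc and solve its LP. A sketch (an illustration using Python with NumPy/SciPy, not part of the text), set up for the game of Figure 1.39:

from itertools import product, combinations
import numpy as np
from scipy.optimize import linprog

# payoff[i, j, k] = (p1, p2, p3) at profile (i, j, k);
# 0/1 index the strategies A/B, C/D, E/F of players 1, 2, 3
payoff = np.array([
    [[(-1, 3, 2), (4, -2, -1)],      # (A,C,E), (A,C,F)
     [(1, -3, -1), (0, 4, 4)]],      # (A,D,E), (A,D,F)
    [[(2, -1, -3), (-2, 1, 3)],      # (B,C,E), (B,C,F)
     [(-2, 2, 2), (0, -5, -7)]],     # (B,D,E), (B,D,F)
])

def game_value(M):
    """Value (maximin) of the zero-sum game with row-player payoffs M."""
    m, n = M.shape
    c = np.zeros(m + 1); c[-1] = -1.0            # maximize V
    A_ub = np.hstack([-M.T, np.ones((n, 1))])    # M^T x >= V for each column
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  A_eq=[[1.0] * m + [0.0]], b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[-1]

players = (0, 1, 2)
for r in (1, 2):
    for S in combinations(players, r):
        Sc = tuple(p for p in players if p not in S)
        rows = list(product((0, 1), repeat=len(S)))    # joint strategies of S
        cols = list(product((0, 1), repeat=len(Sc)))   # joint strategies of Sc
        M = np.zeros((len(rows), len(cols)))
        for a, ra in enumerate(rows):
            for b, cb in enumerate(cols):
                prof = [0, 0, 0]
                for p, s in zip(S, ra): prof[p] = s
                for p, s in zip(Sc, cb): prof[p] = s
                M[a, b] = sum(payoff[tuple(prof)][p] for p in S)
        print({p + 1 for p in S}, "v =", round(game_value(M), 3))
# Expected (Figure 1.45): {1}: 0.0, {2}: -0.5, {3}: -1.0,
#                         {1,2}: 2.0, {1,3}: 3.0, {2,3}: 0.5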

An important property of the characteristic function computed from any game in this way is that when any number of players and/or small coalitions get together to form a larger coalition, the aggregate guaranteed payoff never decreases. That is, people can always do as well or better in a group than as individuals and better in a large group than in multiple small groups. This property is called superadditivity.

Definition: Let N = {1,…,n} and v: c(N) → R. Superadditivity is the property that ∀ S, T ∈ c(N) with S ∩ T = ∅, v(S ∪ T) ≥ v(S) + v(T).

Superadditivity extends to any finite collection of pairwise disjoint coalitions by induction. In particular, v(N) ≥ Σi v({i}) for every game.
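Superadditivity is mechanical to verify. A sketch (not from the text) checking the characteristic function of Figure 1.45, stored as a dict from frozensets to values:

from itertools import combinations

v = {frozenset(): 0, frozenset({1}): 0, frozenset({2}): -0.5,
     frozenset({3}): -1, frozenset({1, 2}): 2, frozenset({1, 3}): 3,
     frozenset({2, 3}): 0.5, frozenset({1, 2, 3}): 8}

def superadditive(v):
    coalitions = list(v)
    # v(S u T) >= v(S) + v(T) for every pair of disjoint coalitions
    return all(v[S | T] >= v[S] + v[T]
               for S in coalitions for T in coalitions if not (S & T))

print(superadditive(v))   # True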

1.32 Characteristic Function Form Games

Definition: Let N = {1,…,n} and let v: c(N) → R satisfy superadditivity and v(∅) = 0. Then v is a characteristic function form game.

The significance of this last definition is that one need not start with a strategic form game to have a characteristic function. Just as the derivation of the strategic form from the extensive form led to the creation of strategic form games as a starting point for analysis, so does the derivation of characteristic function form games lead to the analysis of characteristic functions that are created directly from a verbal description of a game, or even created purely as numerical examples without any associated story. Any function satisfying the definition may be regarded as a cooperative game, i.e., a bargaining situation.


The characteristic function enables a more concise statement of the definition of an imputation.

Definition: An imputation for a characteristic function form game is an element x ∈ Rn such that

i) xi ≥ v({i}) ∀i (individual rationality), and

ii) Σi xi = v(N) (collective rationality)

Corollary: The set of all imputations is convex. (It is the intersection of finitely many half-spaces defined by a system of linear inequalities.)

1.33 Essential and Inessential Games

If v(N) = Σi v({i}), then v(S) = Σi∈S v({i}) for every intermediate coalition S ⊂ N, because any gain by a small coalition would be preserved by superadditivity in larger coalitions. Recall that every imputation must be both individually and collectively rational, so if v(N) = Σi v({i}) then there is only one payoff vector that qualifies as an imputation: (v({1}), v({2}), …, v({n})). This observation motivates the classification of inessential games.

Definition: If v: c(N) → R is a characteristic function, v is called inessential if v(N) = Σi v({i}); v is called essential if v is not inessential.

Inessential games are uninteresting to analyze from the perspective of cooperative game theory, because the focus of cooperative game theory is on how the structure of the payoff function affects the dynamics of the bargaining process – in particular, what imputations are more or less likely to result as outcomes. There is absolutely no incentive for anyone to form a coalition in an inessential game, because the players can do exactly as well playing the same game as a non-cooperative game with no pre-game communication or binding agreements. While inessential games are not interesting to analyze from a cooperative point of view, their existence is very interesting: in some games there truly is no value added by collaboration or communication of any kind. An important class of inessential games is that of two-person zero-sum games. As noted in chapter 3, these games are called strictly competitive because there is never an advantage to cooperation.

1.34 Dominance, Coalitional Rationality, and the Core

The characteristic function captures information on the strategic position of coalitions in the bargaining process, from individuals all the way up to the grand coalition. The concept of dominance of imputations formalizes one way in which the characteristic function can bear on negotiations.

Definition: Given a characteristic function form game v, imputations x and y, and a coalition S, x dominates y through S, written

x ≻S y

if

(i) xi > yi ∀ i ∈ S, and

(ii) Σi∈S xi ≤ v(S)

The first condition states that the members of S all prefer x to y. The second condition states that the members of S have the ability to guarantee the portion of x that pertains to them, that is, xi for i ∈ S. Together, these conditions indicate that in some sense y is not a likely outcome of the pre-game bargaining process, because the members of S could discover x and reject y in favor of x. This reasoning is valid, but there is a problem in some games: an imputation can be dominated by a second imputation, which is dominated by a third, which is in turn dominated by the first (see example 6.4.1 below).

Example 6.4.1. Consider the characteristic function

  S          v(S)
  {1}         0
  {2}         0
  {3}         0
  {1,2}       4
  {1,3}       4
  {2,3}       4
  {1,2,3}     5

and, for instance, the imputations x = (2, 2, 1), y = (1, 9/4, 7/4), and z = (3/2, 3/2, 2). Checking the two conditions of the definition for each pair shows that

y ≻{2,3} x,   z ≻{1,3} y,   and   x ≻{1,2} z,

so domination runs in a circle.

In other words, domination through a coalition is not even a partial order of imputations, let alone a total order. Dominance should be thought of as capturing a critical aspect of the bargaining process, but it still does not guarantee a "neat" prediction of the outcome, because the underlying structure of the game could admit many outcomes that are not dominated through any coalition (and thus dominance cannot rank them), or could admit no imputations that are not dominated, so that any outcome is in some sense unstable.

Definition: The core of a game in characteristic function form is the set of all imputations that cannot be dominated through any coalition.

The core can be empty, can be the entire set of imputations, or can be a subset of imputations of any dimension. To fully understand the core it is instructive to observe the relationship between a coalition's payoffs and the coalition's characteristic function value. Given an imputation y, consider a coalition S and suppose Σi∈S yi ≥ v(S). Then no imputation x can dominate y through S, because if condition (i) of the definition of domination does not fail, i.e., xi > yi ∀i ∈ S, then Σi∈S xi > Σi∈S yi ≥ v(S), so condition (ii) fails. Conversely, if Σi∈S yi < v(S), then setting xi = yi + (v(S) – Σj∈S yj)/|S| for i ∈ S satisfies (i) and (ii), and also satisfies xi ≥ v({i}) for i ∈ S, because xi > yi ≥ v({i}). If, in addition, the payoffs for i ∉ S satisfy xi ≥ v({i}) and Σi∉S xi = Σi∉S yi – (v(S) – Σj∈S yj), then Σi xi = v(N), which is required for x to be an imputation. In other words, in order to account for the increase of v(S) – Σi∈S yi > 0 allocated evenly among i ∈ S, it is necessary to allocate a decrease of the same total among the players i ∉ S. The allocation of the increase across S need not be even, but it must be positive for all i ∈ S. The allocation of the decrease across Sc need not be even and is not required to affect all players; for example, one player could suffer the entire reduction. Because yi ≥ v({i}) is required ∀i, it might be impossible to reduce some players' payoffs. These complications preclude a general formula for allocating the reduction to Sc that always works, but some allocation is always possible, because

Σi∉S xi = v(N) – v(S) ≥ v(Sc) ≥ Σi∉S v({i})

by superadditivity. The result of this derivation can be summarized by the following

Theorem: An imputation y can be dominated by some imputation x through a coalition S if and only if Σi∈S yi < v(S).

Corollary: An imputation y is in the core of a characteristic function form game if and only if Σi∈S yi ≥ v(S) ∀ S ∈ c(N).

Corollary: The core of a game in characteristic function form is convex. (Like the set of all imputations, the core is the intersection of finitely many half-spaces defined by a system of linear inequalities.)
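The first corollary gives an immediate computational test for core membership. A sketch (not from the text), using the Figure 1.45 characteristic function:

v = {frozenset(): 0, frozenset({1}): 0, frozenset({2}): -0.5,
     frozenset({3}): -1, frozenset({1, 2}): 2, frozenset({1, 3}): 3,
     frozenset({2, 3}): 0.5, frozenset({1, 2, 3}): 8}

def in_core(x, v):
    """x: dict player -> payoff. Core test: sum over S of x_i >= v(S) for all S."""
    return all(sum(x[i] for i in S) >= val for S, val in v.items())

print(in_core({1: 3, 2: 3, 3: 2}, v))   # True
print(in_core({1: 8, 2: 0, 3: 0}, v))   # False: {2,3} can block, since 0 < 1/2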

The examples that follow show that the core can be empty or can be the whole set of imputations or can be any dimension in between.

i. Example of an Empty Core

  S          v(S)
  {1}        –1
  {2}        –2
  {3}         1
  {1,2}       2
  {1,3}       2
  {2,3}      5/2
  {1,2,3}     3

Core conditions (Σi∈S xi ≥ v(S), with x1 + x2 + x3 = 3):

  x1 ≥ –1      x1 + x2 ≥ 2  ⟺  x3 ≤ 1
  x2 ≥ –2      x1 + x3 ≥ 2  ⟺  x2 ≤ 1
  x3 ≥ 1       x2 + x3 ≥ 5/2  ⟺  x1 ≤ 1/2

Core = ∅

ii. Example of an All-Encompassing Core

  S          v(S)
  {1}         0
  {2}         0
  {3}         0
  {1,2}       0
  {1,3}       0
  {2,3}       0
  {1,2,3}     3

Core conditions (with x1 + x2 + x3 = 3):

  x1 ≥ 0      x1 + x2 ≥ 0  ⟺  x3 ≤ 3
  x2 ≥ 0      x1 + x3 ≥ 0  ⟺  x2 ≤ 3
  x3 ≥ 0      x2 + x3 ≥ 0  ⟺  x1 ≤ 3

Core = all imputations

iii. Example of a 0-dimensional Core

  S          v(S)
  {1}        –1
  {2}        –2
  {3}         1
  {1,2}       2
  {1,3}       2
  {2,3}       2
  {1,2,3}     3

Core conditions (with x1 + x2 + x3 = 3):

  x1 ≥ –1      x1 + x2 ≥ 2  ⟺  x3 ≤ 1
  x2 ≥ –2      x1 + x3 ≥ 2  ⟺  x2 ≤ 1
  x3 ≥ 1       x2 + x3 ≥ 2  ⟺  x1 ≤ 1

Core (shown in red in the original figure) = { (1, 1, 1) }

iv. Example of a 1-dimensional Core

  S          v(S)
  {1}        –1
  {2}        –2
  {3}         1
  {1,2}       2
  {1,3}      5/4
  {2,3}      4/3
  {1,2,3}     3

Core conditions (with x1 + x2 + x3 = 3):

  x1 ≥ –1      x1 + x2 ≥ 2  ⟺  x3 ≤ 1
  x2 ≥ –2      x1 + x3 ≥ 5/4  ⟺  x2 ≤ 7/4
  x3 ≥ 1       x2 + x3 ≥ 4/3  ⟺  x1 ≤ 5/3

Core (shown in red in the original figure): the segment of imputations with x3 = 1 and x1 + x2 = 2, where 1/4 ≤ x1 ≤ 5/3.

v. Example of a Non-Trivial 2-dimensional Core

  S          v(S)
  {1}        –1
  {2}        –2
  {3}         1
  {1,2}      3/2
  {1,3}       2
  {2,3}       2
  {1,2,3}     3

Core conditions (with x1 + x2 + x3 = 3):

  x1 ≥ –1      x1 + x2 ≥ 3/2  ⟺  x3 ≤ 3/2
  x2 ≥ –2      x1 + x3 ≥ 2  ⟺  x2 ≤ 1
  x3 ≥ 1       x2 + x3 ≥ 2  ⟺  x1 ≤ 1

Core (shown in red in the original figure): the 2-dimensional set of imputations with 1 ≤ x3 ≤ 3/2, x1 ≤ 1, and x2 ≤ 1.


1.35 Strategic Equivalence and Normalization

Strategic equivalence is motivated by the idea that if the payoff units are changed from dollars to yen to euros, the strategic essence of the game should be unchanged. Likewise, if each player i receives a constant side payment ci regardless of what imputation finally results, then these payments should not affect the strategic essence of the game, precisely because the players receive the same amount no matter how the game is played – including what negotiations lead to what imputation.

Definition: If v and w are characteristic functions, w is said to be strategically equivalent to v if ∃ k > 0 and c1,…,cn ∈ R such that

w(S) = k·v(S) + Σi∈S ci.

The positive constant k represents a change of units, and must therefore be positive. A negative change would reverse the incentives, and a zero constant would nullify all incentives. The constants ci represent the side payments to each player that are independent of what imputation results, and these constants can be positive, negative or zero.

If w(S) = k·v(S) + Σi∈S ci, then v(S) = (1/k)·w(S) + Σi∈S (–ci/k). Since k > 0 ⟹ 1/k > 0, v is strategically equivalent to w. That is, strategic equivalence is a symmetric relation. Taking k = 1 and ci = 0 ∀i shows that strategic equivalence is also a reflexive relation. The verification that strategic equivalence is a transitive relation is left as an exercise. Combined, these three properties establish strategic equivalence as a true equivalence relation in the standard mathematical sense of a reflexive, symmetric and transitive relation.

Proposition 6.7.1: Suppose v and w are strategically equivalent (with constants k and c1,…,cn), and suppose x ≻S y under v. If x′i = k·xi + ci and y′i = k·yi + ci, i = 1,…,n, then x′ ≻S y′ under w. That is, strategic equivalence preserves dominance of imputations. The proof is left as an exercise.

Corollary 6.7.2: Strategic equivalence preserves the dimensionality of the core of a characteristic function form game. Proof: strategic equivalence induces a 1-to-1 map between cores.

Proposition 6.7.3: Every essential characteristic function form game is strategically equivalent to exactly one game satisfying w({i}) = 0 ∀i and w(N) = 1. Proof: Let

k = 1 / (v(N) – Σi v({i})),   ci = –k·v({i}),   and   w(S) = k·v(S) + Σi∈S ci.

Then k > 0 because v is essential, and w({i}) = 0 ∀i and w(N) = 1.

Definition: For any essential game, the strategically equivalent game known to exist by proposition 6.7.3 is called the (0,1)–reduced form of the game and is also referred to as the zero–one normalization of the game. The expressions normalize the game and put the game in (0,1)–reduced form are synonymous.

The (0,1)–reduced form is useful for determining whether two arbitrary characteristic functions are strategically equivalent. Rather than solving for the constants k and ci on a case-by-case basis, each function can quickly be normalized and the (0,1)–reduced forms compared.

Proposition 6.7.4. If v and w are strategically equivalent games in (0,1)–reduced form, then v = w. Proof: w({i}) = k·v({i}) + ci with v({i}) = w({i}) = 0 ⟹ ci = 0 ∀i. With ci = 0 ∀i, w(N) = k·v(N) with v(N) = w(N) = 1 ⟹ k = 1.

The (0,1)–reduced form is useful for characterizing classes of games because only one representative of each class need be analyzed. Strategic equivalence allows one to think of all characteristic function form games as existing in the first quadrant of Rn. That is, if any game exists possessing a given set of strategic properties, such a game exists in the first quadrant.

i. Example of Normalizing a Game

The results of normalizing a game focus attention on relative values rather than absolute values. In example 6.7.1, ν({1,2}) < ν({1,3}), but in the normalized game, ν′({1,2}) > ν′({1,3}). The normalized characteristic value of each coalition is the value added of forming the coalition as a fraction of the total value added of forming the grand coalition.

Example 6.7.1. The characteristic function:

  S         ν(S)
  {1}         3
  {2}        −5
  {3}         0
  {1,2}       3
  {1,3}       5
  {2,3}       2
  {1,2,3}     5

The normalization constants are

  k = 1/(ν(N) − Σ_i ν(i)) = 1/(5 − (3 − 5 + 0)) = 1/7

  c_1 = −kν(1) = −3/7
  c_2 = −kν(2) = 5/7
  c_3 = −kν(3) = 0

and the normalized coalition values are

  ν′({1,2}) = kν({1,2}) + c_1 + c_2 = 3/7 − 3/7 + 5/7 = 5/7
  ν′({1,3}) = kν({1,3}) + c_1 + c_3 = 5/7 − 3/7 + 0 = 2/7
  ν′({2,3}) = kν({2,3}) + c_2 + c_3 = 2/7 + 5/7 + 0 = 1

The normalized game:

  S         ν′(S)
  {1}         0
  {2}         0
  {3}         0
  {1,2}      5/7
  {1,3}      2/7
  {2,3}       1
  {1,2,3}     1

For instance, ν({1,2}) − (ν({1}) + ν({2})) = 3 − (−2) = 5 (out of 7), whereas ν({1,3}) − (ν({1}) + ν({3})) = 5 − 3 = 2 (out of 7). Another surprising feature of the normalized game is that ν′({2,3}) = 1 = ν′({1,2,3}). That is, the coalition {2,3} appears to be capable of garnering the entire aggregate payoff for itself, whereas in the original game, player 1 appears to command 60% of the aggregate payoff. This, again, is because by normalizing the individually rational payoffs to zero, the (0,1)–reduced characteristic function illuminates the relative value added that can be gained through coalition formation. Because ν({1,2,3}) = 5 and ν({1}) = 3, there is only an extra value added of 2 above what player 1 is guaranteed to get alone, and the normalized characteristic function tells us that players 2 and 3 can get all the value added that the grand coalition makes possible without the help of player 1. This is an important observation that is not obvious in the original game but is highlighted by normalizing the game.
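Since normalization is entirely mechanical, it is easy to automate. The following Python sketch is not from the text; the encoding of coalitions as frozensets and all names are illustrative choices. It normalizes the characteristic function of example 6.7.1 and reproduces the values computed above.

from fractions import Fraction as F

# A minimal sketch of zero-one normalization, assuming coalitions are
# encoded as frozensets of player labels.
def normalize(v, players):
    # k = 1/(v(N) - sum of v(i)); c_i = -k v(i)
    k = F(1) / (v[frozenset(players)] - sum(v[frozenset([i])] for i in players))
    c = {i: -k * v[frozenset([i])] for i in players}
    return {S: k * v[S] + sum(c[i] for i in S) for S in v}

# Example 6.7.1
v = {frozenset([1]): F(3), frozenset([2]): F(-5), frozenset([3]): F(0),
     frozenset([1, 2]): F(3), frozenset([1, 3]): F(5),
     frozenset([2, 3]): F(2), frozenset([1, 2, 3]): F(5)}
vn = normalize(v, [1, 2, 3])
print(vn[frozenset([1, 2])], vn[frozenset([1, 3])], vn[frozenset([2, 3])])
# prints: 5/7 2/7 1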

Exercise. Normalize examples 6.6.1 – 6.6.5, graph each normalized core, and compare to the graphs of 6.6.1 – 6.6.5.

1.36 Shapley Values

The core is one solution concept for cooperative games. Shapley values, named for Lloyd Shapley, their creator, are a different type of solution concept for cooperative games. The vector of Shapley values, called the Shapley vector, is an imputation. A player’s Shapley value can also be thought of as an index of power: the greater the Shapley value, the more bargaining power the player has.

Shapley values are based on each player's “value added” when joining a coalition, defined as

Δ_i = ν(S ∪ {i}) − ν(S)

When player i joins coalition S, the amount of aggregate payoff the new coalition can command is increased by Δ_i. If the assumption that the grand coalition always forms is refined to specify an order in which each player joins, then Δ_i can be computed for each player, taking for S the subcoalition already formed before player i joins. For example, if the order in which the grand coalition forms is player 2, followed by player 3, followed by player 1, then

Δ_2 = ν({2}) − ν(∅) = ν({2}) − 0 = ν({2})

Δ_3 = ν({2} ∪ {3}) − ν({2}) = ν({2,3}) − ν({2})

Δ_1 = ν({2,3} ∪ {1}) − ν({2,3}) = ν({1,2,3}) − ν({2,3})

If the order is 1, followed by 2, followed by 3, then

Δ_1 = ν({1}) − ν(∅) = ν({1}) − 0 = ν({1})

Δ_2 = ν({1} ∪ {2}) − ν({1}) = ν({1,2}) − ν({1})

Δ_3 = ν({1,2} ∪ {3}) − ν({1,2}) = ν({1,2,3}) − ν({1,2})

Because the value of Δ_i depends on the order in which the grand coalition is assumed to form, the Shapley value averages each Δ_i over all possible orders. Each order is a permutation of the set of players, i.e. each order is an element p ∈ S_n, the set of permutations on n elements. If Δ_i is written explicitly as a function of the ordering p, that is, Δ_i: S_n → R, then the ith player's Shapley value, φ(i), can be written as

φ(i) = (1/n!) Σ_{p ∈ S_n} Δ_i(p)

i. Example of Shapley Value Computation

The tables below show the data for computing the Shapley values for the given characteristic function, one table per permutation of the players. The first table presents one example in extreme detail, after which the rows for the coalition S − {i} and its value are omitted from the tables for the remaining permutations. The coalition S − {i} in each column of the first table is the coalition S from the preceding column.

The characteristic function:

  S         ν(S)
  {1}        −3
  {2}        −1
  {3}         0
  {1,2}       2
  {1,3}       1
  {2,3}       2
  {1,2,3}     4

Order 1, 2, 3:

  Players in order:                   1       2        3
  Coalition S after i joins:         {1}    {1,2}    {1,2,3}
  ν(S):                              −3       2        4
  Coalition S − {i} before i joins:   Ø      {1}     {1,2}
  ν(S − {i}):                         0      −3        2
  Value added ν(S) − ν(S − {i}):     −3       5        2

Order 1, 3, 2:

  Players in order:                   1       3        2
  Coalition S after i joins:         {1}    {1,3}    {1,2,3}
  ν(S):                              −3       1        4
  Value added:                       −3       4        3

Order 2, 1, 3:

  Players in order:                   2       1        3
  Coalition S after i joins:         {2}    {1,2}    {1,2,3}
  ν(S):                              −1       2        4
  Value added:                       −1       3        2

Order 2, 3, 1:

  Players in order:                   2       3        1
  Coalition S after i joins:         {2}    {2,3}    {1,2,3}
  ν(S):                              −1       2        4
  Value added:                       −1       3        2

Order 3, 1, 2:

  Players in order:                   3       1        2
  Coalition S after i joins:         {3}    {1,3}    {1,2,3}
  ν(S):                               0       1        4
  Value added:                        0       1        3

Order 3, 2, 1:

  Players in order:                   3       2        1
  Coalition S after i joins:         {3}    {2,3}    {1,2,3}
  ν(S):                               0       2        4
  Value added:                        0       2        2

Player i's Shapley value is its value added averaged over all permutations:

  φ(1) = [(−3) + (−3) + 3 + 2 + 1 + 2]/6 = 2/6
  φ(2) = [5 + 3 + (−1) + (−1) + 3 + 2]/6 = 11/6
  φ(3) = [2 + 4 + 2 + 3 + 0 + 0]/6 = 11/6

Proposition: The Shapley vector is always an imputation.

Proof: The set of all imputations for a game is a convex set, so it suffices to show that the Shapley vector is a weighted average of imputations; that is, that for each p ∈ S_n, (Δ_1(p), …, Δ_n(p)) is an imputation. Δ_i(p) = ν(S ∪ {i}) − ν(S), where S depends on p. Superadditivity implies ν(S ∪ {i}) ≥ ν(S) + ν({i}), which implies Δ_i(p) = ν(S ∪ {i}) − ν(S) ≥ ν({i}), so individual rationality holds. Furthermore, given any p = (i_1,…,i_n),

Σ_i Δ_i(p) = ν({i_1}) + [ν({i_1,i_2}) − ν({i_1})] + [ν({i_1,i_2,i_3}) − ν({i_1,i_2})] + … + [ν({i_1,…,i_n}) − ν({i_1,…,i_{n−1}})] = ν({i_1,…,i_n}) = ν(N),

because the sum telescopes, so collective rationality holds. Therefore, (Δ_1(p), …, Δ_n(p)) is an imputation ∀p, and it follows that the weighted average (φ(1),…,φ(n)) is an imputation.
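For games with more than a few players the averaging becomes tedious by hand, and a direct translation of the definition into code is a useful check. The following Python sketch is illustrative, not from the text; it brute-forces the Shapley values of the example above.

from fractions import Fraction as F
from itertools import permutations
from math import factorial

# Brute-force Shapley values by averaging each player's value added over
# all orderings. Coalition keys are sorted tuples (an illustrative choice).
v = {(): F(0), (1,): F(-3), (2,): F(-1), (3,): F(0),
     (1, 2): F(2), (1, 3): F(1), (2, 3): F(2), (1, 2, 3): F(4)}

def val(coal):
    return v[tuple(sorted(coal))]

players = (1, 2, 3)
phi = {i: F(0) for i in players}
for p in permutations(players):
    so_far = []
    for i in p:
        phi[i] += val(so_far + [i]) - val(so_far)  # player i's value added
        so_far.append(i)
phi = {i: phi[i] / factorial(len(players)) for i in players}
print(phi)  # {1: Fraction(1, 3), 2: Fraction(11, 6), 3: Fraction(11, 6)}
# note 2/6 reduces to 1/3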


1.37 Simple Games, Compound Games, Elections

Elections, legislative bodies and voting situations of many kinds all have something in common: a coalition can do nothing until it reaches the critical composition required to carry the vote, at which point it has all the power it needs and extra members do not add value. These scenarios motivate the following definition:

Definition: A game ν in characteristic function form is called a simple game if

i) ν(i) = 0 ∀i

ii) ν(N) = 1

iii) ν(S) ∈ {0, 1} ∀S

In simple games, Δ_i(p) = ν(S ∪ {i}) − ν(S) ∈ {0, 1} ∀p ∈ S_n, because ν(S ∪ {i}) − ν(S) can only evaluate to 0 − 0, 1 − 0, or 1 − 1. In the cases when Δ_i(p) = 1, player i is the “swing vote”, turning a losing coalition into a winning coalition. Therefore, the Shapley value for each player i can be computed most efficiently by identifying exactly those permutations p of the players that make player i the swing vote. If P_i ⊆ S_n denotes the set of these permutations, then player i's Shapley value is φ(i) = |P_i|/n!

Any permutation p = (i_1,…,i_k, i, i_{k+2},…,i_n) ∈ P_i can be partitioned into three segments p = (b, i, a), where b = (i_1,…,i_k) is an ordering of players who join the grand coalition before i and form a losing coalition until player i joins, and a = (i_{k+2},…,i_n) is an ordering of players who join after i. If b′ is a reordering of b and a′ is a reordering of a, then p′ = (b′, i, a′) ∈ P_i as well. Given a representative p = (b, i, a), the number of permutations of the form p′ = (b′, i, a′) is k!(n − k − 1)!. This further simplifies computation of the Shapley value for player i to the identification of a set of distinct representatives b of losing coalitions that become winning when i joins – distinct in the sense that none is a permutation of another. Identifying these representatives could require some reasoning tailored to the specifics of the problem, after which the permutations of each representative can be counted by simple application of the formula. Depending on the specifics of the problem, it might even be possible to count the number of distinct representatives b without listing them all, and thereby further simplify the computation of φ(i).

The following example of a simple game is taken from Morris.

i. Lake Wobegone Local Government

Lake Wobegone has a mayor as well as a city council consisting of 6 aldermen and one chairman.

There are two ways a bill can be passed into law:

1) a majority vote by the council followed by the signature of the mayor (chairman only votes in case of a tie)

2) a super-majority vote of 6 council members will override a veto by the mayor (chairman always votes in case of veto override)


The Shapley values of the aldermen will all be identical, so there are three Shapley values to compute: that of an alderman (A), of the chairman (C), and of the mayor (M).

The permutations in which the mayor is the swing vote have four distinct forms, listed below. The number of each can be counted based on the fact that there are a total of six aldermen from which to choose a subset to precede the mayor, multiplied by the orderings of each predecessor set b and successor set a, as discussed above.

  b (losing coalition before M)   swing   a (after M)       count
  4 aldermen                        M     2 aldermen + C    (6 choose 4) × 4! × 3! = 2160
  3 aldermen + C                    M     3 aldermen        (6 choose 3) × 4! × 3! = 2880
  5 aldermen                        M     1 alderman + C    (6 choose 5) × 5! × 2! = 1440
  4 aldermen + C                    M     2 aldermen        (6 choose 4) × 5! × 2! = 3600
                                                            total = 10080

so the mayor's Shapley value = 10080/8!

The permutations in which the chairman is the swing vote have two distinct forms, listed and counted below.

  b (losing coalition before C)   swing   a (after C)       count
  3 aldermen + M                    C     3 aldermen        (6 choose 3) × 4! × 3! = 2880
  5 aldermen                        C     M + 1 alderman    (6 choose 5) × 5! × 2! = 1440
                                                            total = 4320

so the chairman's Shapley value = 4320/8!


The permutations in which an alderman is the swing vote have four distinct forms, listed and counted below. Notice that when computing the Shapley value of a fixed alderman, there are only 5 other aldermen from which to choose the coalition b that comes before the swing vote.

  b (losing coalition before A)   swing   a (after A)       count
  3 aldermen + M                    A     2 aldermen + C    (5 choose 3) × 4! × 3! = 1440
  2 aldermen + C + M                A     3 aldermen        (5 choose 2) × 4! × 3! = 1440
  5 aldermen                        A     M + C             (5 choose 5) × 5! × 2! = 240
  4 aldermen + C                    A     1 alderman + M    (5 choose 4) × 5! × 2! = 1200
                                                            total = 4320

so each alderman's Shapley value = 4320/8!

Note that the Shapley values add up to ν(N) = 1, as they must: six aldermen and the chairman contribute 7 × 4320 which, plus the mayor's 10080, sums to 40320 = 8!, exactly the denominator. Thus φ(M) = 10080/40320 = 1/4 and φ(C) = φ(A) = 4320/40320 = 3/28.
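The counting arguments above can also be verified by brute force, since 8! = 40320 permutations is a trivial number for a computer. The Python sketch below is illustrative and not from the text; the wins() predicate simply encodes the two ways a bill passes, and the player labels are arbitrary.

from itertools import permutations

# 'A1'..'A6' are aldermen, 'C' the chairman, 'M' the mayor (illustrative).
def wins(coal):
    a = sum(1 for p in coal if p.startswith('A'))
    c = 'C' in coal
    m = 'M' in coal
    if m and (a >= 4 or (a == 3 and c)):  # council majority + mayor's signature
        return True
    return a + c >= 6                     # 6 council members override a veto

players = ['A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'C', 'M']
swings = {p: 0 for p in players}
for order in permutations(players):       # all 8! = 40320 orderings
    coal = set()
    for p in order:
        if not wins(coal) and wins(coal | {p}):
            swings[p] += 1                # p is the swing vote in this order
        coal.add(p)
print(swings['M'], swings['C'], swings['A1'])  # prints: 10080 4320 4320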

1.38 Nash Arbitration Solution

The last solution concept to be covered here is the Nash arbitration solution. It is specifically designed for two-player games, and it is not intended to be a bargaining model but truly an arbitration solution – that is, a method for an arbitrator to impose a settlement on two parties who are at an impasse in negotiation. The arbitrator should be fair, and that is why the players are willing to consent to arbitration. Nash set out to derive the ideal arbitration function.


In an arbitration scenario, the assumptions of characteristic function form games are modified. Communication and collaboration (i.e. joint strategies) are assumed, but reallocation of payoffs is not. That is, any payoff pair that is feasible as a joint strategy outcome is a possible arbitration outcome, but the payoffs will not be reallocated. The feasible set P for payoff pairs (u,v) is the full set of feasible joint payoff pairs, i.e. the convex hull of the pure strategy payoff pairs.

To derive the arbitration outcome, Nash made assumptions about what properties an ideal arbitration outcome should have. He denoted the outcome (u*,v*) = Ψ(P, u_0, v_0) to indicate that the outcome depends on the payoff region and the status quo point.

1) Feasibility: the arbitration outcome must be feasible, i.e. (u*,v*) ∈ P

2) Individual Rationality. This assumption is generalized slightly from the context of imputations. Nash assumed that the result of negotiations up to the point of arbitration was that each player had achieved some status quo payoff, i.e. a payoff that they will receive if any further negotiation or arbitration fails. Thus, the characteristic function values inherent in the underlying game are replaced by a pair of status quo payoffs (u0, v0), so individual rationality means u* ≥ u0 and v* ≥ v0.

3) Pareto Optimality: (u*,v*) is Pareto optimal. Pareto optimal means “on the Pareto boundary”, which is the set of feasible (u,v) at which u cannot increase and stay feasible unless v decreases, and vice versa.


4) Symmetry: If P is symmetric (i.e., (u,v) ∈ P ⇔ (v,u) ∈ P), and if u_0 = v_0, then u* = v*.

5) Independence of Irrelevant Alternatives: If P′ ⊆ P, (u_0,v_0) ∈ P′, and (u*,v*) ∈ P′, then Ψ(P′, u_0, v_0) = (u*,v*)

6) Invariance Under Affine Transformations: If P′ is obtained from P by a transformation of the form

u′ = au + b, v′ = cv + d, where a, c > 0

then

Ψ(P′, (au_0 + b, cv_0 + d)) = (au* + b, cv* + d).

The first assumption needs no justification and might be taken for granted if not stated, but Nash based an important proof on these assumptions so it is important to list them all. The second is easily defended as common sense, and the third is also intuitively appealing: naturally, each party would expect the arbitrator to grant any gains that could be had at no expense to the other party. The fourth assumption, symmetry, is motivated by fairness. Independence of irrelevant alternatives is not as obvious as the earlier assumptions. In essence, it labels the entire feasible set “irrelevant” except for the convex hull of the status quo point (u_0,v_0) and the solution point (u*,v*). It states that if one throws away any subset of the feasible set to get a smaller set containing the status quo point and the solution point, then the ideal arbitration function should return the same solution for the smaller set. This assumption implies that if you take a larger set instead of a smaller set, the function will also return the same point, so long as a new solution point is not added. Because the solution is assumed to be Pareto optimal, if the Pareto boundary is not changed then the solution doesn't change. Finally, consider invariance under affine transformations with positive coefficients. Multiplication by a positive constant can be taken as a change of units, which should not change the solution. The assumption that addition or subtraction of wealth does not change the solution is more questionable, especially in light of utility theory, but this is mentioned only to remind readers to always question assumptions. Nash's result is remarkable, and even this last assumption is innocuous in light of the strong result he obtained.

i. Theorem (Nash Arbitration Solution)

There is a unique function Ψ that satisfies assumptions (1) – (6), given by

(u*,v*) = Ψ(P, u_0, v_0) = argmax_{(u,v) ∈ P, u ≥ u_0, v ≥ v_0} (u − u_0)(v − v_0)

The proof is deferred to Appendix A.

The variables that are input to a function are also called the arguments. The argmax of a function is the set of values for the arguments that maximize the function. For example, the solution set to a Linear Program is the argmax (or argmin) of the objective function on the feasible set.

In this case, the function being optimized is not linear. However, because the function is clearly increasing in both variables, the maximum always occurs on the Pareto boundary of the feasible set (as it must in accordance with the 6 assumptions). The Pareto boundary is always either a point or a piecewise linear curve composed of segments with negative slope – negative slope is the condition that one variable cannot increase without the other decreasing. On each line segment, the function can be expressed as a function of just one variable and maximized using standard calculus techniques.

ii. Example of Nash Arbitration Solution

Find the Nash arbitration outcome for the given game using individually rational payoffs as the status quo point (12/5,12/5).

1) Plot the feasible space and identify the Pareto boundary. The game is

         (2,3)   (3,2)
         (4,0)   (0,4)

The feasible set is the convex hull of the pure strategy payoff pairs (2,3), (3,2), (4,0) and (0,4), and the Pareto boundary has three segments, with corners at (0,4), (2,3), (3,2) and (4,0).

2) Write functions to express v as a function of u on the boundary.

Segment 1: v = 4 − u/2

Segment 2: v = 5 − u

Segment 3: v = 8 − 2u

4) Maximize f(u,v) = (u – 12/5)(v – 12/5) over the Pareto boundary

The max must be found separately on each boundary segment and then the overall max chosen from the three.

Segment 1: u ∈ [0,2]; v = 4 − u/2

f(u,v) = (u − 12/5)(4 − u/2 − 12/5) = −u²/2 + (14/5)u − 96/25 = f(u)

df(u)/du = −u + 14/5 > 0 on [0,2], so the max is f(2) = −6/25 at u = 2.

Segment 2: u ∈ [2,3]; v = 5 − u

f(u,v) = (u − 12/5)(5 − u − 12/5) = −u² + 5u − 156/25 = f(u)

df(u)/du = −2u + 5 = 0 at u = 5/2, and d²f(u)/du² = −2 < 0, so the max is f(5/2) = 1/100 at u = 5/2.

Segment 3: u ∈ [3,4]; v = 8 − 2u

f(u,v) = (u − 12/5)(8 − 2u − 12/5) = −2u² + (52/5)u − 336/25 = f(u)

df(u)/du = −4u + 52/5 < 0 on [3,4], so the max is f(3) = −6/25 at u = 3.

The overall max = 1/100 at u = 5/2 and v = 5 − u = 5/2, so the Nash arbitration solution is (5/2, 5/2).
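As a sanity check, the maximization can be redone numerically. The following Python sketch is illustrative, not from the text: it grids each Pareto segment and recovers the same solution.

# Maximize (u - 12/5)(v - 12/5) over the three Pareto segments with a grid.
segments = [(0.0, 2.0, lambda u: 4 - u / 2),
            (2.0, 3.0, lambda u: 5 - u),
            (3.0, 4.0, lambda u: 8 - 2 * u)]
best = None
for lo, hi, vfun in segments:
    for k in range(1001):
        u = lo + (hi - lo) * k / 1000      # 1001 grid points per segment
        f = (u - 2.4) * (vfun(u) - 2.4)    # status quo point (12/5, 12/5)
        if best is None or f > best[0]:
            best = (f, u, vfun(u))
print(best)  # roughly (0.01, 2.5, 2.5), matching the calculus above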

iii. Shapley Procedure

As mentioned above, when Nash created his arbitration solution he assumed that the negotiation had been going on for some time and had reached an impasse, thus requiring arbitration. The status quo point was the result of this preceding negotiation process, for which no particular method was specified, so the status quo point was simply an exogenous input and could be any feasible point.

Lloyd Shapley suggested taking as the status quo point the characteristic function values, i.e., the maximin values derived from the original game. This has become known as the Shapley procedure – to find the Nash arbitration solution using the characteristic function values of the two players as the status quo point.

iv. Exercises

6.10.3.1 Find the Nash arbitration solution for the given game, using u_0 = 21/6 and v_0 = 25/6 as the status quo values.

         (3,5)   (4,4)
         (6,0)   (1,5)


7 Evolutionary Game Models

In evolutionary game models, a strategic form game, referred to as the generation game⁵, is used to model a fitness competition, such as hunting, mate selection, or possibly more abstract choices. The idea is that payoffs are in Darwinian fitness units. Pure strategies model genotypes, usually called types or sometimes pure types to emphasize the representation by pure strategies. Mixed strategies model heterogeneous populations comprised of multiple types – many individuals of each type. Multiple generations are modeled as successive plays of the generation game, starting with an initial population (i.e. mixed strategy). In each generation t, the payoff is computed for each pure strategy type paired with a co-player that uses the mixed strategy representing the population in generation t. A new mixed strategy is computed to represent the population in the next generation t+1, and types increase or decrease as a fraction of the new population according to how their payoffs compared given the population in generation t. The cycle then repeats to compute the population for generation t+2 and so on.

Payoffs are only computed for each type, rather than each individual, implicitly assuming that all players of a given type each score the same payoff. Assigning to each type the payoff of its pure strategy, when matched with a co-player that uses the mixed strategy representing the population, models each individual playing the game many, many times throughout its lifetime (i.e., one generation), facing co-players randomly selected from the population.

5 or sometimes the period game or stage game in the language of generic dynamic or multi-stage games


The “law of large numbers”⁶ loosely implies⁷ that the lifetime history of co-players for each individual will resemble, on average, the mixed strategy representing the population. Thus, the average lifetime payoff will be approximately the payoff when a single co-player uses the mixed strategy. In the model, the qualifier is dropped and each player of a given type scores exactly the payoff of the pure strategy representing the type, given the population mixed strategy for the co-player. A constant base fitness term is added to all payoffs, independent of type. This parameter provides a control to calibrate the importance of the game to the overall fitness of the types. High base fitness makes the game payoff relatively unimportant and low base fitness makes fitness highly sensitive to the game payoff. If the generation game allows any zero or negative payoffs, base fitness must be at least large enough to ensure positive payoffs to all types, for technical reasons discussed later.

An implicit assumption in this modeling paradigm is that any fractional distribution of types in the population is feasible, which is actually only true for infinite populations, and even then only rational fractions would be feasible; e.g. not (1/π, 1 − 1/π). Modeling finite populations with probability distributions on types avoids the complications of a finite system, such as how to interpret a fractional number of offspring. In its most basic form the model ignores mating by assuming that types pass their exact type to their offspring 100% of the time.

6 A folk theorem of sorts from statistics that states that when sampling a random distribution, as the sample size tends to infinity the samples tend to represent the distribution ever more closely.
7 There is no rigor associated with how many samples guarantee how well the distribution is represented by the samples.


Despite all these very high level assumptions, or perhaps because of them, evolutionary game models have proved vastly useful in biology – often more useful than in economics, where the time frame is perhaps too short for the law of large numbers to be valid. Also, in economics, there is an ongoing debate as to whether being successful leads one to have more children or fewer, and economic behavior may also be more learned than genetic, so types do not perfectly inherit the strategy of their parents. When considering questions such as these, one must always remember that all the models in this book, and all models in general, can sometimes be used at face value to directly model phenomena in the world, and other times they can be used indirectly to model phenomena by analogy. For example, evolutionary models are often used to model cultural inheritance, which is, in essence, imitation – a learning process. The dynamics can be essentially the same, with successive generations imitating the strategies of their parents' generation in proportion to the success they witnessed growing up. There are several variations of evolution models, including one-population models, two-population models, n-population models, models with random quasi-mutation, genetic algorithms with structured mutation, and overlapping generations models in which a player lives 2 or more generations so parents and children play the game together. The computational details of basic one- and two-population models are presented in the subsections that follow.


1.39 One-population models without Mutation

Given a symmetric 2-player strategic form general-sum game G = (a_ij, b_ij), a one-population evolutionary game model is a population

p_t = (x_{1,t},…,x_{n,t})

where x_{i,t} is the fraction of the ith pure type (i.e., the ith pure strategy of G) in p_t, the population at time t, along with an evolution dynamic D: Σ → Σ that yields a new population from an existing population, where Σ is the set of all mixed strategies of G. Note that because G is symmetric, the strategy sets for player 1 and player 2 are the same, so Σ needs no subscript to distinguish it. D is difficult at best, and generally counter-intuitive, to express in closed form. It is useful to name the evolution dynamic simply for convenience of reference and to indicate that other dynamics besides the standard dynamic are admissible. That being said, one standard evolution dynamic is virtually always used, and is computed as follows:

A one-population evolutionary game model is initialized with a starting population

p_0 = (x_{1,0},…,x_{n,0})

For any t ≥ 0, the payoff to each pure type i in generation t is computed as

σ_{i,t} = Σ_{j=1}^{n} a_ij x_{j,t} + w

where w is base fitness and x_{j,t} and a_ij are as above in the definitions of p_t and G. As will be seen below, the model produces nonsense if payoffs go negative, and if a type's payoff ever equals zero, that type will go extinct and never return to the population without quasi-mutation, so w is generally required to be large enough so that

σ_{i,t} > 0 ∀i, t

and always required to be large enough to prevent negative payoffs.

The average payoff is the continuous analog of a straight average, namely the total payoff to all players divided by the number of players. The continuous version of this computation is simply the sum, over types, of the fraction of players of each type times that type's payoff, or

σ_{ave,t} = Σ_{i=1}^{n} x_{i,t} σ_{i,t}.

Given the type payoffs and the average payoff, the fraction of each type in the population is updated as

x_{i,t+1} = x_{i,t} · σ_{i,t} / σ_{ave,t}

This update computation guarantees that the resulting population qualifies as a mixed strategy, because w guarantees type payoffs are positive, so the average payoff is positive, so the ratio is positive, so x_{i,t+1} > 0 ∀i, t, and

Σ_{i=1}^{n} x_{i,t+1} = Σ_{i=1}^{n} x_{i,t} σ_{i,t} / σ_{ave,t} = (1/σ_{ave,t}) Σ_{i=1}^{n} x_{i,t} σ_{i,t} = σ_{ave,t} / σ_{ave,t} = 1

Starting with an initial mixed strategy population, the evolution dynamic forever produces a new population from the last. Individual types can converge to zero, or to unity, or can converge to some mixture that is in balance with other types, or can forever wander through the set of possible populations (mixed strategies) without ever settling down. The dynamics of different games are the subject of a great deal of study. Some classic examples will be presented in this chapter, based on some of the famous games presented in chapter 3.

A note on symmetry: In a one-population model, the symmetry requirement is necessary to avoid accounting for which player has the role of player 1 and which the role of player 2 when two players from the same population are matched. If a strategic form game at least has symmetrical strategy sets, then if each player took the role of player 1 half the time and player 2 half the time, the average payoffs would be symmetric. This reasoning yields a method for transforming a square two-player game with asymmetric payoffs into a fully symmetric two-player game suitable for a one-population model.

i. Hawk-Dove Evolution Dynamics

Consider the Hawk-Dove game (Section 3.9.3) with V = 2 and C = 3 (Figure 7.1).


Figure 1.46 Hawk-Dove Numerical Example

          H              D
  H   (−1/2, −1/2)    (2, 0)
  D   (0, 2)          (1, 1)

The parameter w (base fitness) must be greater than ½ to keep payoffs positive, so for this example assume w = 1, and let

p_0 = (x_{1,0}, x_{2,0}) = (h_0, d_0) = (3/4, 1/4)

Update the population from time 0 to time 1:

σ_{h,0} = (3/4)(−1/2) + (1/4)(2) + 1 = 9/8

σ_{d,0} = (3/4)(0) + (1/4)(1) + 1 = 10/8

σ_{ave,0} = (3/4)(9/8) + (1/4)(10/8) = 37/32

h_1 = (3/4) · (9/8)/(37/32) = 27/37

d_1 = (1/4) · (10/8)/(37/32) = 10/37

p_1 = (h_1, d_1) = (27/37, 10/37)

At t = 0, the doves outscore the hawks, so at t = 1 the hawks have decreased in the population and the doves have increased. The computations are then carried out for the new population:

p_1 = (h_1, d_1) = (27/37, 10/37)

σ_{h,1} = (27/37)(−1/2) + (10/37)(2) + 1 = 87/74

σ_{d,1} = (27/37)(0) + (10/37)(1) + 1 = 94/74

σ_{ave,1} = (27/37)(87/74) + (10/37)(94/74) = 3289/2738

h_2 = (27/37) · (87/74)/(3289/2738) = 2349/3289

d_2 = (10/37) · (94/74)/(3289/2738) = 940/3289

p_2 = (h_2, d_2) = (2349/3289, 940/3289)


As can be seen in this example, the computations quickly become quite messy and are best suited to a computer, especially since the focus of analysis of evolution models is the long run behavior. Example 7.1 was computed and graphed for 100 generations with Mathematica (figure 7.2).
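For readers without Mathematica, an equivalent computation is easy in any language. The following Python sketch is illustrative, not from the text: it reproduces the exact update above and then iterates in floating point.

from fractions import Fraction as F

# One-population update for hawk-dove with V=2, C=3, w=1 (a sketch).
A = [[F(-1, 2), F(2)],   # payoffs: H vs (H, D)
     [F(0),     F(1)]]   # payoffs: D vs (H, D)
w = 1

def update(p):
    pay = [sum(A[i][j] * p[j] for j in range(2)) + w for i in range(2)]
    avg = sum(p[i] * pay[i] for i in range(2))
    return [p[i] * pay[i] / avg for i in range(2)]

p = update([F(3, 4), F(1, 4)])
print(p)                     # [Fraction(27, 37), Fraction(10, 37)]
q = [float(v) for v in p]
for _ in range(99):          # switch to floats for the long run
    q = update(q)
print(q)                     # approaches (V/C, 1 - V/C) = (2/3, 1/3)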

Figure 1.47 Hawk-Dove Evolution (V=2,C=3,w=1,h0=3/4,d0=1/4)

The fraction of hawks in the population is graphed in blue and the fraction of doves in red. This is redundant because the fractions always add up to 1, but it may help clarity to actually see the values together. Further insight can be gained by making the same computations beginning with a different starting population, or by varying the values of the parameters (figures 7.3 to 7.6).


Figure 1.48 Hawk-Dove Evolution (V=2,C=3,w=1,h0=1/5,d0=4/5)

Figure 1.49 Hawk-Dove Evolution (V=2,C=3,w=5,h0=1/5,d0=4/5)


Figure 1.50 Hawk-Dove Evolution (V=2,C=4,w=5,h0=1/5,d0=4/5)

Figure 1.51 Hawk-Dove Evolution (V=2,C=8,w=5,h0=1/5,d0=4/5)

From figure 7.2 to 7.3, the starting population is changed from (3/4, 1/4) to (1/5, 4/5). This time the hawks outscore the doves and increase in the population, but with a decreasing rate of growth as their proportion in the population appears to asymptotically approach (2/3, 1/3) = (V/C, (C−V)/C), in both 7.2 and 7.3.

Comparing 7.3 to 7.4, the only difference is the value of base fitness. Game payoffs range from a minimum of −½ to a maximum of 2. When base fitness w = 1, total fitness ranges from ½ to 3, a factor of 6 from the minimum to the maximum. When w = 5, total fitness ranges from 4½ to 7, a factor of less than 2. Because of the smaller relative variation in total fitness, the population adjusts more slowly but ultimately converges asymptotically to the same ratio.

In comparing graphs 7.4 through 7.6, the value V is held at 2 while the cost C varies through the values 3, 4 and 8. In all cases the population converges to (h_t, d_t) = (V/C, 1 − V/C). This is no accident. When the fraction of hawks is exactly V/C, both types score the same payoff, so the population will stay at that ratio in the next generation, and therefore in the next after that, and so on. That is, the population (h_t, d_t) = (V/C, 1 − V/C) is an equilibrium of the population dynamics of this model. Moreover, if the fraction of hawks is less than V/C, then the hawk payoff is greater than the dove payoff, and if it is more than V/C, the doves outscore the hawks. Therefore, in any population containing both hawks and doves, the dynamics will drive the population toward this equilibrium mixture. This behavior is explored more generally in the next section.

1.40 Evolutionary Games as Dynamical Systems

Evolutionary game models are an example of a discrete time dynamical system. The state space is the set of all heterogeneous populations in which every individual is a pure genotype; that is, every individual plays a pure strategy. The state space is mathematically identical to the set of all mixed strategies of the underlying strategic form game that is played each generation. The update formula that determines the population at time t+1 from the existing population at time t is the evolution function of the system, sometimes called the system dynamic. In evolutionary games, the evolution function literally represents evolution and is often called the selection dynamic in this context, referring to Charles Darwin's theory of natural selection.

Evolutionary games have the property that the image of any population under the selection dynamic is independent of time. That is, assuming no mutation or quasi-mutation, each population x_t evolves to one and only one new population x_{t+1}, regardless of the time period t at which the population arrives at x_t in the state space. For a general discrete time dynamical system, the evolution function maps the cross product of the state space and time into the state space, i.e. for x ∈ X, t ∈ Z, Φ(x,t) ∈ X.


i. Dynamic Equilibria

An equilibrium of a dynamical system is a fixed point of the evolution function, that is, a point x in the state space such that Φ(x,t) = x ∀t, where Φ is the evolution function. It was observed in the Hawk-Dove model of example 7.1.1 that the population (h_t, d_t) = (V/C, 1 − V/C) is an equilibrium.

This text shall refer to equilibria of evolutionary game models as dynamic equilibria in order to clearly distinguish this concept from the related but distinctly different concept of a Nash equilibrium.

A dynamic equilibrium in any evolutionary game model is a population that remains fixed under the selection dynamic. The only way this can happen is for all strategies present in the population to score the same payoff. The qualification that strategies are present in the population specifically means that they are present as a positive fraction of the population. Under the selection dynamic, each mixed strategy weight at time t+1 is a positive multiple of the weight at time t. Therefore, a strategy can only have weight zero at a time t if it starts at zero in the population specified as initial conditions.

If a game has n pure strategies, the set of dynamic equilibria can include equilibria in which any number of strategies are absent from the population – i.e. have zero weight. These cases are routinely investigated by checking each combination of types and solving the system of linear equations that equates the payoffs of every type included in the combination. Unlike a Nash equilibrium, which requires that strategies given positive weight be best responses, dynamic equilibria of an evolutionary game model can exist among inferior strategies if the higher scoring strategies in the strategic form are left out of the population. A thorough example of this, as well as other concepts, is given in section 7.2.4.

If true evolution is being modeled, then extinction is a true physical possibility. This could be accommodated in an evolution model by allowing non-negative update multipliers instead of requiring strictly positive multipliers, but doing so could mask the possibility of strategies almost going extinct and growing back again to a substantial portion of the population. If a strategy is truly destined for extinction, it will converge to zero. After this has been identified, the model can be analyzed again starting the would-be extinct type at zero. It should be noted that in a model with no mutation or quasi-mutation, a population consisting of exactly one genotype (strategy) is always trivially at equilibrium by virtue of having no competition. It should also be noted that in a model with mutation or quasi-mutation, there are no true equilibria, due to the randomness of the evolution function. The convention when analyzing evolution models is that equilibria are defined as strategies (or populations) alone, with no reference to time because, as discussed above, the selection dynamic is constant with respect to time for any particular population. Pure strategies are often used to denote homogeneous populations, i.e., populations consisting of only one type.

ii. Stable and Unstable Equilibria

In any dynamical system, an important property of equilibria is their stability. A useful example of stability to keep in mind is a pendulum. A pendulum has two equilibria, at 0 degrees and 180 degrees – at 6 o'clock and 12 o'clock, so to speak. The equilibrium at 6 o'clock is stable: if the pendulum is pushed away from 6 o'clock, the system dynamic, namely the physical forces described by Newtonian mechanics including gravity and momentum, brings the pendulum back toward 6 o'clock. Of course, it may very likely swing past 6 o'clock, but once on the other side of 6 o'clock, the system dynamic will again act to return it to 6 o'clock. This behavior illustrates the basic concept of stability. The equilibrium at 12 o'clock is very different: if the pendulum is pushed the slightest bit off 12 o'clock, the system dynamic takes it farther away from 12 o'clock, rather than returning it. The 12 o'clock equilibrium is unstable. The mathematical definition of stable equilibrium is given below.

1. Definition: Stable Equilibrium

An equilibrium x* is stable if

∃ ε > 0 such that ||x* − x_t|| < ε ⇒ lim_{t→∞} x_t = x*

In other words, an equilibrium x* is stable if there exists an open neighborhood such that if the system state is within that neighborhood, the dynamics will pull it toward the equilibrium. Therefore, if the system is pushed off the equilibrium, for example by random events or an external shock, then the dynamics will tend to restore the equilibrium, as in the case of the pendulum at 6 o’clock, so long as the perturbation is small enough.

If an equilibrium is unstable, then an infinitesimally small shock or random perturbation of the system can send it reeling away from the equilibrium, perhaps never to come close again – like the pendulum at 12 o’clock.


2. Definition: Evolutionarily Stable Strategy

An evolutionarily stable strategy (ESS) is a stable equilibrium of an evolutionary game, and by analogy the term is also used to refer to apparently stable equilibria of the actual evolution process of the physical world. An ESS can be a pure strategy or a mixed strategy. If it is a mixed strategy, the usual interpretation, which should be considered the default in case it is not specified, is that the equilibrium exists between a mixture of pure genotypes in the population, not that the population consists of players all having the same mixed type. The latter is possible to consider mathematically, but generally gives different results than the default assumption, and it is always stated explicitly when mixed types are intended. John Maynard-Smith introduced the concept of ESS in single population models in his seminal book Evolution and the Theory of Games. Maynard-Smith gave the test that follows to identify a pure strategy ESS in a single population model by inspection of the strategic form generation game.

3. Maynard-Smith Criteria for Pure ESS

A pure strategy σ is a single population ESS if either

i) π_1(σ,σ) > π_1(τ,σ) for every pure strategy τ ≠ σ, or

ii) π_1(σ,σ) ≥ π_1(τ,σ) for every pure strategy τ, and π_1(σ,τ) > π_1(τ,τ) whenever π_1(τ,σ) = π_1(σ,σ)

The technical proof is given in the appendix. The ideas behind the proof are summarized here. If the first condition holds, then in a population that is “close to” all σ, the payoff of any type in the population will be “close to” the pure type payoff against σ as given in the strategic form. That is, when the population is almost all σ, then the pure strategy “all σ” is a good approximation of the population, and the pure strategy payoff of each type matched with σ is a good approximation of the type's real payoff – namely the payoff computed with the precise mixed strategy representing the population. An example follows.

Figure 1.52 Pure ESS in a Single Population Model

         A       B       C
  A   (1,1)   (3,2)   (2,2)
  B   (2,3)   (3,3)   (3,1)
  C   (2,2)   (1,3)   (4,4)

In this example, C is an ESS by criterion (i) because π_1(C,C) = 4, π_1(B,C) = 3 and π_1(A,C) = 2, so π_1(C,C) > π_1(B,C) and π_1(C,C) > π_1(A,C). B is an ESS by criterion (ii) because π_1(B,B) = 3, π_1(A,B) = 3 and π_1(C,B) = 1, so π_1(B,B) > π_1(C,B), π_1(B,B) = π_1(A,B), and π_1(B,A) = 2 > 1 = π_1(A,A).
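The test is mechanical enough to code directly. The following Python sketch is illustrative, not from the text; it applies the two criteria to the payoff matrix of figure 1.52 and confirms that B and C are ESS while A is not.

# pi1[s][t] is the row player's payoff for s against t (figure 1.52).
pi1 = {'A': {'A': 1, 'B': 3, 'C': 2},
       'B': {'A': 2, 'B': 3, 'C': 3},
       'C': {'A': 2, 'B': 1, 'C': 4}}

def is_pure_ess(s, strategies):
    for t in strategies:
        if t == s:
            continue
        if pi1[t][s] > pi1[s][s]:      # some invader scores strictly better
            return False
        if pi1[t][s] == pi1[s][s] and pi1[s][t] <= pi1[t][t]:
            return False               # tie not broken in s's favor
    return True

for s in 'ABC':
    print(s, is_pure_ess(s, 'ABC'))    # A False, B True, C True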

iii. Attractors and Basins of Attraction

An attractor in a dynamical system is a point in the state space with the property that, at least in a local neighborhood, the system dynamics pull the system state toward the point. The neighborhood in which the dynamics pull toward the attractor is called the basin of attraction, making the analogy of the attractor as a drain down into which all the water in the basin runs. If the basin of attraction of an attractor is the entire state space, then the attractor is called a global attractor. An attractor that is not a global attractor is called a local attractor. Sometimes the term global attractor is used with additional qualifiers; e.g. the equilibrium of V/C hawks and 1 − V/C doves is a global attractor within the interior of the state space of the hawk-dove game, meaning that as long as there are both hawks and doves present in the population, the selection dynamic pulls the population toward the equilibrium. Because pure type populations are always at equilibrium without mutation or quasi-mutation, there are no unqualified global attractors in evolutionary game models.

The hawk-dove example is somewhat typical in that attractors are usually stable equilibria. It is possible for a system dynamic to actually repel a system from an attractor if the attractor is ever reached, but this is uncommon (see playing dead model – coming soon). While attractors are usually stable equilibria, stable equilibria are always attractors – at least local attractors. The definition of stable equilibrium is precisely an equilibrium that is a local attractor in some neighborhood, however small.

The hawk-dove model is also an example of a mixed strategy ESS. Maynard-Smith's test makes it easy to find pure strategy ESS in any strategic form game, but finding mixed strategy ESS can be much harder if there are more than two pure strategies. Finding equilibria is straightforward and will be covered in the next section. In a game of two types, such as the hawk-dove game, determining stability is also straightforward because there is only one possible dimension of deviation from the equilibrium: the ratio of types can only be more or less than the equilibrium ratio. Observing the selective pressure on either side of the equilibrium will quickly verify or negate stability. When there are three or more pure strategies, there are infinitely many directions in which the ratio can be perturbed from the equilibrium, and infinitely many paths by which the dynamics can pull the state back toward the equilibrium. If there is a single direction of perturbation which does not result in a return to the equilibrium, this negates stability. To prove stability one must prove that all directions of perturbation result in a return toward equilibrium. The example that follows is from Dean Foster and Peyton Young. It illustrates stable and unstable equilibria and basins of attraction.

iv. Repeated Prisoner’s Dilemma Evolution

Foster and Young considered as a generation game a prisoner's dilemma repeated for 10 periods, with payoffs from each round accumulating. They took as stage payoffs the common Prisoner's Dilemma payoff matrix of figure 3.24 and considered three pure strategies for the 10-round repeated game: consistent cooperation, denoted C, consistent defection, denoted D, and tit-for-tat, denoted T. Recall tit-for-tat is the strategy “cooperate on the first move and on each subsequent move echo the co-player's move in the previous round”. The accumulated payoffs for each pure strategy match after ten rounds of play are simply 10 times the stage payoffs for the constant strategies. T paired with D can be computed by considering the outcome in each round. In round one, D defects and T cooperates, so the payoffs are zero for T and 5 for D. In rounds 2 through 10, both T and D defect and get payoff 1 each. Over 10 rounds, T accumulates 9 and D accumulates 14. T cooperates in every round when matched with C, so they get 30 each (figure 7.8).

Figure 1.53 Foster & Young 10-Round Prisoner's Dilemma

         C         T         D
  C   (30,30)   (30,30)   (0,50)
  T   (30,30)   (30,30)   (9,14)
  D   (50,0)    (14,9)    (10,10)

The pure strategies are all equilibria, as noted above, but only D (that is, a population of all D players) is an ESS, by criterion (i) of the Maynard-Smith test. T is not stable because if C players enter the population, they score as well as the T players, so the dynamics do not take the population back to all T players. To look for mixed equilibria, we check populations of C and D, C and T, D and T, and populations consisting of all three types. C and D can never be at equilibrium because D dominates C relative to any mix of C and D. C and T can be at equilibrium in any mix because their payoffs when matched with each other are identical. To check for a mix of D and T, we set their payoffs equal:

π_1((0,1,0), (0,d,1 − d)) = π_1((0,0,1), (0,d,1 − d))

or

10d + 14(1 − d) = 9d + 30(1 − d)

which yields d = 16/17, or population = (0, 16/17, 1/17), where populations are written in the order (c, d, t). To check for the possibility of all three types in the mix, observe that as long as there are D's present in the mix, T dominates C, so there is no equilibrium. The dynamics are illustrated in a graph of the state space (figure 7.9).

Although T is not an ESS, it is a local attractor in the interior of the state space, i.e. when all three types are present. D is a local attractor because it’s an ESS. The basins of attraction of D and T can be found by determining when there is enough D in the population to result in an inexorable movement of the population toward ever more D, and likewise when there is enough T to lead to ever more T.

Figure 1.54 Foster & Young PD Evolution Dynamics (dynamic equilibria marked in red; the state space is the triangle with vertices C, D and T, and the dividing line runs from (0, 16/17, 1/17) to (4/9, 0, 5/9))

The dividing line is where D and T are getting the same payoff – but rather than the equilibrium, we are interested in the interior points where there are C's in the population:

π_1((0,1,0), (c,d,1 − c − d)) = π_1((0,0,1), (c,d,1 − c − d))

or

50c + 10d + 14(1 − c − d) = 30c + 9d + 30(1 − c − d)

or

d = 16/17 − (36/17)c

Thus, the set of points along which D and T get the same payoff is a line segment from the c = 0 boundary to the d = 0 boundary. The end of the segment at c = 0 is the mixed equilibrium (0, 16/17, 1/17). The end of the segment at d = 0 is the mixed equilibrium (4/9, 0, 5/9). Recall that all populations with d = 0 are equilibria, and they are unstable precisely because all their neighbors on that boundary are also equilibria.
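The dividing line and its endpoints can be checked with exact arithmetic. The short Python sketch below is illustrative, not from the text; it encodes the two payoff functions and the solved line.

from fractions import Fraction as F

# Payoffs to D and T against population (c, d, t), with t = 1 - c - d.
def payoff_D(c, d): return 50*c + 10*d + 14*(1 - c - d)
def payoff_T(c, d): return 30*c + 9*d + 30*(1 - c - d)

def dividing_d(c):                # payoff_D = payoff_T gives 36c + 17d = 16
    return (F(16) - 36*c) / 17

print(dividing_d(F(0)))           # 16/17 -> endpoint (0, 16/17, 1/17)
print(dividing_d(F(4, 9)))        # 0     -> endpoint (4/9, 0, 5/9)
assert payoff_D(F(0), F(16, 17)) == payoff_T(F(0), F(16, 17))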

Consider an interior point at which d > 16/17 − (36/17)c. D will outscore T, which always outscores C when d > 0, so the D's will grow the fastest in the population, and D will outscore T and C by even more, accelerating the convergence of the population toward all D. Likewise, if d < 16/17 − (36/17)c, T will outscore D, T's will grow in the population relative to D's, and d will fall even farther below 16/17 − (36/17)c. This last point may not be obvious, but it can be checked by recasting the dividing line as d = 20/19 − (36/19)t. From this formulation it is clear that if T outscores D in one generation, it will do so again in the next, and vice versa. Whether or not C outscores D depends on the fraction of D in the population. At any given point in the state space, one can compute the direction in which the selection dynamic will pull the population – called the selective pressure.

v. Quasi-Mutation

Quasi-mutation in evolutionary game models means that each pure type in the population is reproduced in the next generation in part randomly. That is, instead of the proportions in the next generation being entirely determined by the selection dynamic, the population at time t+1 is

pop_{t+1} = (1 − ε_t)·sel_{t+1} + ε_t·pop_ave

where sel_{t+1} is the population determined by the selection dynamic at time t+1 and pop_ave is the population consisting of all pure types weighted equally. In other words, each pure type is reproduced randomly with proportion ε_t/n, where n is the number of pure types and ε_t is a random number in an interval [0, ε] for some ε > 0. The same idea can be made even finer scaled by selecting a different perturbation magnitude for each pure type each period. The name quasi-mutation refers to the fact that no new strategies are being introduced into the population. Genetic algorithms, discussed in Appendix B, go farther toward modeling true mutation by constructing strategies (types) out of “building blocks” and exercising recombination of these building blocks in the reproduction process. The basic evolution models, even with quasi-mutation, have no recombination, so the basic set of strategies remains fixed.
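In code, quasi-mutation is a one-line blend of the selection result with the uniform population. A minimal Python sketch follows; the bound EPS is an arbitrary illustrative choice.

import random

EPS = 0.01   # upper bound on the per-generation perturbation (illustrative)

def quasi_mutate(sel_pop):
    """Blend the selection-dynamic output with the uniform population."""
    n = len(sel_pop)
    eps_t = random.uniform(0.0, EPS)   # random magnitude in [0, EPS]
    return [(1 - eps_t) * s + eps_t / n for s in sel_pop]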

The first order effect of quasi-mutation is that there are no more true equilibria. In particular, all pure type equilibria cease to exist, as do all equilibria with any number of types missing from the population. Thus quasi-mutation guarantees that the selection dynamics, which will continue to account for “most” of the evolution of the population, will be based on payoffs against all types, not just a subset. Even low-scoring types can play an important role under quasi-mutation if they slightly influence the balance between high scoring types.

The only equilibria from deterministic models that continue to have importance in models with quasi-mutation are stable equilibria, i.e. local or global attractors. A stable equilibrium of a deterministic model will be a quasi-equilibrium and remain quasi-stable so long as the ε governing the quasi-mutation is considerably smaller than the basin of attraction of the deterministic equilibrium – that is, so long as the random perturbation does not carry the population outside the basin of attraction, at least not very often. A perturbation can take the population part way to the edge of the basin in one generation, and before the population can return, another perturbation can take it closer to the edge in a subsequent generation; depending on the selective pressure, it can be more or less likely that the population randomly leaves the basin of attraction of one attractor and falls into the basin of another attractor.

Foster and Young presented this argument (the author takes no credit for it) in the paper in which they introduced the 3-type repeated prisoner's dilemma model of 7.2.4. The pure defection type remains quasi-stable, and the tit-for-tat type, which was not technically stable in the deterministic model but which was almost stable, becomes quasi-stable. All other equilibria cease to have special properties. Foster and Young went further by using Monte Carlo computer simulation experiments to test how often such a random switch of attractors might occur over tens of millions of generations. They found that, on average, the population would spend about 90% of the time in the near vicinity of the pure defection attractor and only about 10% of its time near the tit-for-tat attractor. They explained this by observing that the pure cooperation type was not selected against very strongly in a population with very few defectors. Thus it was much easier for the population to randomly stray across the “bottom” of the state space with low d until the ratio of c/t became large enough that the population had fallen into the basin of the defection attractor. Once in that basin, the payoff differential eliminated cooperators rather quickly and hurried the convergence toward the defection attractor. Once in the neighborhood of the defection attractor, the selective pressure against tit-for-tat was much stronger than the selective pressure against cooperation in the basin of tit-for-tat, so random shifts toward the tit-for-tat basin were quickly counteracted by selection, and only the luckiest sequence of shifts toward the tit-for-tat attractor would spill into the basin of attraction for tit-for-tat.

vi. Rock–Scissors–Paper Evolution Dynamics

Rock–Scissors–Paper is somewhat famous as an evolutionary game because of its interesting dynamics. The payoff matrix given in figure 3.28 is repeated here for convenience (figure 7.10).


Figure 1.55 Rock Scissors Paper as a Generation Game

         R         P         S
  R   (0,0)    (−1,1)    (1,−1)
  P   (1,−1)   (0,0)     (−1,1)
  S   (−1,1)   (1,−1)    (0,0)

The only equilibrium in the interior is (1/3, 1/3, 1/3), because in each pair of different types, one beats the other, and they all tie themselves. This can be seen in the payoff matrix, where in any 2×2 symmetric sub-matrix

  a_ii   a_ij
  a_ji   a_jj

one type always dominates, so no two types ever get the same payoff without the third type present.

The equilibrium is far from stable. In fact, it can never be reached if the population is not initialized at that point. Instead, the selection dynamics take the population on a cyclic path toward one pure type and then another, until the population is virtually tracing the boundary of the triangular state space. Figure 7.11 was generated with a starting population of (1/3 + 2ε, 1/3 − ε, 1/3 − ε), where ε = .005. Because at the start there are more rocks, paper does the best and scissors does the worst. When the rocks have declined enough and the paper increased enough, the scissors become the high scorers and increase while the paper declines the fastest, until there are so many scissors and so few papers that the rocks become the high scorers.


Figure 1.56 Rock Scissors Paper Evolution Dynamics

The smaller a type gets in the population, the longer it takes to grow back even when it is getting the better score, and the bigger a type is in the population, the faster it takes over the population when it gets a better score, so the dynamics take ever wilder swings, leading to a type of convergence called a limit cycle. A rigorous proof of this outcome and its inevitability is deferred to Appendix A.

1. Definition: Limit Cycle

To be completed later.
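Although the formal definition is deferred, the cycling is easy to observe numerically. This Python sketch is illustrative, not from the text; base fitness w = 2 is chosen to keep all payoffs positive, and the printout shows the oscillation growing away from (1/3, 1/3, 1/3).

# Rock-scissors-paper one-population dynamic (a sketch).
A = [[0, -1, 1],    # R vs (R, P, S)
     [1, 0, -1],    # P vs (R, P, S)
     [-1, 1, 0]]    # S vs (R, P, S)
w = 2.0
eps = 0.005
p = [1/3 + 2*eps, 1/3 - eps, 1/3 - eps]  # start near the interior equilibrium
for t in range(301):
    if t % 100 == 0:
        print(t, [round(x, 4) for x in p])
    pay = [sum(A[i][j] * p[j] for j in range(3)) + w for i in range(3)]
    avg = sum(p[i] * pay[i] for i in range(3))
    p = [p[i] * pay[i] / avg for i in range(3)]
# the smallest component drifts ever closer to 0 as the cycle widens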

vii. Finding one-population dynamic equilibria

This section merely crystallizes the key steps from the preceding examples of how to exhaustively search for all dynamic equilibria of a one-population evolutionary game model.


1. All pure strategies (i.e. all populations with only one type present in non-zero proportion) are trivially in equilibrium by virtue of no competition.

2. For each subset {i_1,…,i_k} ⊆ {1,…,n} of k pure strategies, 2 ≤ k ≤ n, check for a mixed strategy x = (x_1,…,x_n) such that x_i > 0 for i ∈ {i_1,…,i_k} and x_i = 0 for i ∉ {i_1,…,i_k}, and such that the payoffs a_i·x are equal for each pure strategy i in the set {i_1,…,i_k}, where a_i = (a_{i1},…,a_{in}). Any such x, for any value of k, is an equilibrium population.

Step (2) is quite similar to the table method of section 3.7, but it is simpler in two ways. First, only the diagonal of the table is checked, because in a one-population model the mixed strategy of the column player is the one and only population, and player one alternately takes on the role of each pure strategy, all of which must get the same payoff against the population mixture to be in equilibrium. Second, we are not concerned about the strategies not in the population. Therefore, we do not eliminate strategies if they are dominated only by strategies outside the set. If some strategy is dominated by a strategy inside the set, we don't just eliminate the dominated strategy – we eliminate the entire set as a possible equilibrium. A short automated version of this search is sketched below.
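Step (2) amounts to solving a small linear system for each candidate support. The following Python sketch is illustrative, not from the text; degenerate supports whose systems are singular (such as whole families of equilibria) are simply skipped, and base fitness is omitted because adding w to every payoff does not change the equalities.

import numpy as np
from itertools import combinations

def dynamic_equilibria(A):
    A = np.array(A, dtype=float)
    n = len(A)
    found = []
    for k in range(2, n + 1):
        for supp in combinations(range(n), k):
            cols = list(supp)
            # Unknowns: the k positive weights plus the common payoff level v.
            # Equations: (A x)_i = v for i in supp, and the weights sum to 1.
            M = np.zeros((k + 1, k + 1))
            b = np.zeros(k + 1)
            for r, i in enumerate(supp):
                M[r, :k] = A[i, cols]
                M[r, k] = -1.0
            M[k, :k] = 1.0
            b[k] = 1.0
            try:
                sol = np.linalg.solve(M, b)
            except np.linalg.LinAlgError:
                continue                 # singular system: skip this support
            if all(wgt > 1e-12 for wgt in sol[:k]):
                x = np.zeros(n)
                x[cols] = sol[:k]
                found.append(x)
    return found

# Hawk-dove (V=2, C=3): recovers the interior equilibrium (2/3, 1/3).
print(dynamic_equilibria([[-0.5, 2.0], [0.0, 1.0]]))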

1.41 Two-population Evolution Models

In a one-population model, members of a single population are repeatedly matched with one another to play a two player game, In a two-population model, members of one population are repeatedly matched with members of a second population to play a two-player game. This fundamental difference is the reason for every other difference between one- and two-population models. One

Page 179: Game theory, byrne

165

difference is that the generation game for a two-population model need not be symmetric. Recall that in a one-population model, the symmetry requirement is necessary to avoid accounting for which player has the role of player 1 and which the role of player 2 when two players from the same population are matched. In a two population model, members of the “player 1” population always have the role of player 1 and members of the “player 2” population always have the role of player 2. Another difference is that base fitness need not be the same for the two populations, so there can be two base fitness parameters so the rates of change of the two populations can be controlled independently. Naturally, the computational formulas of two-population models are different from the one-population formulas, but entirely consistent when the fundamental difference in the model is taken into account.

Let G = (a_{ij}, b_{ij}) be any general-sum game with the usual convention that a_{ij} and b_{ij} are the payoffs to players 1 and 2, respectively, when pure strategies i and j are used. A two-population evolutionary game model is initialized with two starting populations

x_0 = (x_{1,0}, …, x_{m,0})

y_0 = (y_{1,0}, …, y_{n,0})

For any t ≥ 0, the payoff to each pure type i in population 1 in generation t is computed as

π_{i,t} = Σ_{j=1}^{n} a_{ij} y_{j,t} + w_1

where w_1 is base fitness for population 1. Notice that the payoff to any member of population 1 depends on the mixture of types in population 2, because it is from population 2 that the opponents are randomly selected.

The average payoff in population 1, as in a one-population model, is

π_{ave1,t} = Σ_{i=1}^{m} x_{i,t} π_{i,t}.

Given the type payoffs and the average payoff, the fraction of each type in population 1 is updated just as in a one-population model:

x_{i,t+1} = x_{i,t} π_{i,t} / π_{ave1,t}

For any t ≥ 0, the payoff to each pure type j in population 2 in generation t is computed as

π_{j,t} = Σ_{i=1}^{m} x_{i,t} b_{ij} + w_2

where w_2 is base fitness for population 2. The payoff to any member of population 2 depends on the mixture of types in population 1 because it is from population 1 that the opponents are randomly selected.

The average payoff in population 2 is

π_{ave2,t} = Σ_{j=1}^{n} y_{j,t} π_{j,t}.

The fraction of each type in population 2 is updated as

y_{j,t+1} = y_{j,t} π_{j,t} / π_{ave2,t}


Starting with an initial pair of mixed-strategy populations, each population updates in response to the other, after which they each react to the other's update, and this cycle continues forever. The dynamic possibilities are analogous to the one-population case, with a considerable increase in possible complexity, since the two populations are not only interacting but can also evolve at different rates, controlled by the two base fitness parameters w_1 and w_2.
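In code, one generation of this mutual update is compact. The sketch below (the function name and argument order are illustrative assumptions) implements exactly the four formulas above:

```python
import numpy as np

def update_step(x, y, A, B, w1, w2):
    """One generation of the two-population dynamics (sketch).

    x, y   : current mixtures in populations 1 and 2
    A, B   : payoff matrices a_ij (to population 1) and b_ij (to population 2)
    w1, w2 : base fitness parameters
    """
    pi1 = A @ y + w1        # payoff to each pure type in population 1
    pi2 = x @ B + w2        # payoff to each pure type in population 2
    ave1 = x @ pi1          # average payoff in population 1
    ave2 = y @ pi2          # average payoff in population 2
    return x * pi1 / ave1, y * pi2 / ave2
```

Because pi1 depends on y and pi2 depends on x, each population's growth rates are driven by the other population's mixture, exactly as the formulas require.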

i. Two-Population Numerical Example

This example is solely to illustrate the calculations just described. Subsequent examples will relate the modeling to the evolution of predator-prey interactions. An arbitrary payoff matrix is given along with initial values and base fitness for the two populations, after which the update computations are shown.

w_1 = 1; w_2 = 2

Payoff matrix (rows are population 1's types A and B; columns are population 2's types C, D, and E; each cell is (a_{ij}, b_{ij})):

        C      D      E
A     (2,1)  (2,1)  (1,4)
B     (1,2)  (1,1)  (3,1)

Initial populations:

x_0 = (2/3, 1/3)

y_0 = (1/5, 2/5, 2/5)

Update population 1 from time 0 to time 1:


π_{A,0} = 2(1/5) + 2(2/5) + 1(2/5) + 1 = 13/5

π_{B,0} = 1(1/5) + 1(2/5) + 3(2/5) + 1 = 14/5

π_{ave1,0} = (2/3)(13/5) + (1/3)(14/5) = 40/15

x_{A,1} = (2/3)(13/5) / (40/15) = 26/40

x_{B,1} = (1/3)(14/5) / (40/15) = 14/40

Update population 2 from time 0 to time 1:

π_{C,0} = (2/3)(1) + (1/3)(2) + 2 = 10/3

π_{D,0} = (2/3)(1) + (1/3)(1) + 2 = 9/3

π_{E,0} = (2/3)(4) + (1/3)(1) + 2 = 15/3

π_{ave2,0} = (1/5)(10/3) + (2/5)(9/3) + (2/5)(15/3) = 58/15

y_{C,1} = (1/5)(10/3) / (58/15) = 10/58

y_{D,1} = (2/5)(9/3) / (58/15) = 18/58

y_{E,1} = (2/5)(15/3) / (58/15) = 30/58


Notice in the example how the mixture of a type’s opponent population is what affects the game payoffs and the mixture of a type’s own population is what affects the average payoff in that population. This should be completely clear if you understand the map between the formulas and the interactions being modeled. It is repeated here to urge the reader to pause and review if it is not clear.
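As a quick arithmetic check (a standalone sketch; the variable names are illustrative), the update formulas reproduce the fractions computed above:

```python
import numpy as np

A = np.array([[2, 2, 1],
              [1, 1, 3]])        # a_ij: payoffs to population 1 (rows A, B)
B = np.array([[1, 1, 4],
              [2, 1, 1]])        # b_ij: payoffs to population 2 (columns C, D, E)
x = np.array([2/3, 1/3])
y = np.array([1/5, 2/5, 2/5])
w1, w2 = 1.0, 2.0

pi1 = A @ y + w1                 # [13/5, 14/5]
pi2 = x @ B + w2                 # [10/3, 9/3, 15/3]
print(x * pi1 / (x @ pi1))       # [26/40, 14/40] = [0.65, 0.35]
print(y * pi2 / (y @ pi2))       # [10/58, 18/58, 30/58]
```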

ii. Finding Two-Population Dynamic Equilibria

Finding dynamic equilibria is even more similar to finding Nash equilibria in a two-population evolution model than in a one-population model, because the restriction of symmetry is removed. The table method of section 3.7 is recommended as a book-keeping structure, but, as with one-population equilibria, domination by strategies given zero weight is irrelevant. The only requirement is that every strategy given positive weight in each mixture under consideration can get the same payoff by some choice of weights on the strategies assumed to be present in the opponent population. The game of 7.3.1 is used here as an example of how to exhaustively search for all two-population equilibria.


Figure 1.57 Equilibrium Possibilities for Example 7.3.1

Population 1 mix (rows) vs. population 2 mix (columns):

        C     D     E     C,D     C,E    D,E    C,D,E
A       E0    E0    E0    E1      X1     X1     X1
B       E0    E0    E0    X2      X2     E2     X2
A,B     Z1    Z1    Z2    X3,Z3   E3     X3     X3

The following codes explain the reasons for eliminating the possibility of equilibria satisfying the mixture assumptions of the table cells. Multiple codes in some cells explain all possible justifications for elimination, though a single justification always suffices.

X1 – When population 1 consists only of type A, type E dominates C and D so no mixture including E will be in equilibrium.

X2 – When population 1 consists only of type B, type C dominates D and E, so no mixture including C will be in equilibrium.

X3 – When population 1 is a mix of types A and B, type D is dominated by both C and E, and so no mixture including D will be in equilibrium.

Z1 – When population 2 consists solely of type C or D, type A dominates type B, so no mixture of A and B will be in equilibrium.

Z2 – When population 2 consists solely of type E, type B dominates type A, so no mixture of A and B will be in equilibrium.


Z3 – When population 2 is a mix of types C and D, A dominates B so no mixture of A and B will be in equilibrium.

Cells not eliminated from the table are then further analyzed for equilibria.

E0 – All profiles consisting of a single type in each population are trivially in equilibrium by virtue of having no competition.

E1 – Population 1 is trivially in equilibrium and any mixture of C and D in population 2 is in equilibrium because C and D get the same payoff against A.

E2 – Population 1 is trivially in equilibrium and any mixture of D and E in population 2 is in equilibrium because D and E get the same payoff against B.

E3 – Population 1 is in equilibrium if A and B get the same payoff, which occurs if the mix in population 2 is (2/3, 0, 1/3). Population 2 is in equilibrium if C and E get the same payoff, which occurs if the mix in population 1 is (1/4,3/4).
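These two equalizing mixtures are easy to verify numerically (a standalone check; the matrices are those of the example above):

```python
import numpy as np

A = np.array([[2, 2, 1],
              [1, 1, 3]])        # payoffs to population 1 (types A, B)
B = np.array([[1, 1, 4],
              [2, 1, 1]])        # payoffs to population 2 (types C, D, E)

y = np.array([2/3, 0, 1/3])      # population 2 mix for the E3 cell
print(A @ y)                     # [5/3, 5/3]: A and B tie

x = np.array([1/4, 3/4])         # population 1 mix for the E3 cell
print(x @ B)                     # [7/4, 1, 7/4]: C and E tie (D has zero
                                 # weight, so its lower payoff is irrelevant)
```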

iii. Two-Population Predator-Prey Example

Coming soon. Nothing technically new; just an application showing how the math structures can model a real-world example.

1.42 Comparing Nash and Evolutionary Equilibria

This very short section is given a high level in the outline because of its importance and its generality. The similarities and differences between Nash equilibria and dynamic equilibria of evolutionary models are summarized for quick reference and with an emphasis on concept over computation.

The first important difference is conceptual: a Nash equilibrium is a strategy choice for each of several players in a game. A dynamic equilibrium in an evolution model is not a set of strategy choices; it is a set of one or more populations, each made up of individuals who each play a particular pure strategy exclusively. A Nash equilibrium for a two-player game includes two strategies, one for each player, but when a one-population model is made from a two-player game, a dynamic equilibrium is just a single population. Members of the population are randomly matched over and over to play each other in the two-player game, so the payoff to an individual player, over a lifetime of matches in a generation, is on average the mixed-strategy payoff of that player's exclusive pure strategy against the mixed strategy that represents the population.

A mathematical similarity between Nash equilibrium and evolutionary equilibrium is that both require all strategies in the mix to get equal payoff. The difference is that for dynamic equilibria we don't care about the strategies not in the mix, while for Nash equilibria we require that strategies not in the mix get at most the payoff of those in the mix. In two-population models that is the only mathematical difference, so any Nash equilibrium of a two-player game corresponds to exactly one dynamic equilibrium of the two-population evolution model corresponding to the game. A one-population model can only be made from a symmetric two-player game, and the only Nash equilibria that correspond to dynamic equilibria are the symmetric equilibria; i.e., those of the form (x,x). Therefore, while the set of Nash equilibria can be thought of as a subset of the set of dynamic equilibria for two-population models, the same cannot be said of one-population models. For a one-population model, there can be Nash equilibria of the game that do not correspond to dynamic equilibria. For both one- and two-population models, there can be dynamic equilibria that do not correspond to Nash equilibria, because the population (mixed strategy) may be such that all pure strategies given positive weight are equally inferior responses to the mixture. Note that if all strategies in the generation game are given positive weight by a strategy x, and all get equal payoff against x, then x is a dynamic equilibrium of a one-population model, (x,x) is a dynamic equilibrium of a two-population model, and (x,x) is a Nash equilibrium of the game. Thus there is a one-to-one correspondence between Nash equilibria giving positive weight to all pure strategies and evolutionary equilibria of either one- or two-population models containing all pure types in positive proportion.
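The fixed-point observation in the last sentence is easy to see computationally (a sketch reusing the rock-scissors-paper game; the base fitness value is an arbitrary positive choice): if every pure strategy gets equal payoff against x, the replicator update leaves x unchanged.

```python
import numpy as np

A = np.array([[ 0,  1, -1],
              [-1,  0,  1],
              [ 1, -1,  0]])       # symmetric rock-scissors-paper game
x = np.array([1/3, 1/3, 1/3])      # full support; every type scores 0 vs x

pi = A @ x + 2.0                   # equal payoffs, shifted by base fitness
x_next = x * pi / (x @ pi)
print(np.allclose(x, x_next))      # True: x is a dynamic equilibrium,
                                   # and (x, x) is a Nash equilibrium
```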


a. Deferred Proofs

i. Chapter 1 Proofs

None.


ii. Chapter 2 Proofs

1. Backward Induction

iii. Chapter 3 Proofs

1. Existence of Mixed Equilibria in Strategic Form Games

iv. Chapter 4 Proofs


1. Existence of Mixed Equilibria in Games with a Continuum of Strategies

v. Chapter 5 Proofs

1. Folk Theorem of Equilibrium in the Infinitely Repeated Prisoner’s Dilemma

vi. Chapter 6 Proofs


1. Existence and Uniqueness of Nash Arbitration Solution

vii. Chapter 7 Proofs

1. Maynard-Smith Pure ESS Criteria

2. Convergence to Limit Cycles in the Rock-Scissors-Paper Game


b. Supplemental Topics

viii. Chapter 1 Topics

1. WWII and Operations Research

2. The Rand Corporation

ix. Chapter 2 Topics

1. Chess

2. Multi-stage Games

3. Refinements of Nash Equilibrium

x. Chapter 3 Topics


1. Infinite Strategic Form Games

xi. Chapter 4 Topics

1. More Economic Models

xii. Chapter 5 Topics

1. Repeated Prisoner’s Dilemma Tournaments


2. More on Adaptive and Reinforcement Learning

3. Games on Graphs - Local Interaction Networks - Cascades

xiii. Chapter 6 Topics

1. What if the Grand Coalition Doesn’t Form?


xiv. Chapter 7 Topics

1. Genetic Algorithms

2. Selection via multiple games

3. Overlapping Generations Models

4. Signaling Games and the Handicap Principle


5. Byrne-Kurland Self-Deception Model


c. Applications of Game Theory

Auctions (FCC + eBay)

Taxation – Audits

War

Policy Analysis/Mechanism design (welfare, traffic, healthcare, pollution, climate change, …)

Price choosing, investment analysis, other economic & business questions

Biology, Psychology, Computer Science, Sociology


d. Solutions to Exercises


e. References