Unit III: The Evolution of Cooperation


Transcript of Unit III: The Evolution of Cooperation

Page 1: Unit III: The Evolution of Cooperation

Unit III: The Evolution of Cooperation

• Can Selfishness Save the Environment?
• Repeated Games: the Folk Theorem
• Evolutionary Games
• A Tournament
• How to Promote Cooperation

3/30

Page 2: Unit III: The Evolution of Cooperation

Can Selfishness Save the Environment?

• The Problem of Cooperation
• The Tragedy of the Global Commons?
• Common Resource Game
• We Play a Game
• Repeated Games
• Discounting
• The Folk Theorem

Page 3: Unit III: The Evolution of Cooperation

How can a number of individuals, each behaving as a utility maximizer, come to behave as a group and maximize joint utility?

The Problem of Cooperation

Page 4: Unit III: The Evolution of Cooperation

Assurance Game

        C      D
  C   6,6    0,5
  D   5,0    1,1

Players may fail to cooperate (i.e., fail to maximize joint payoffs) because they lack information.

If each has reason to believe the other will cooperate, the problem is solved!

Page 5: Unit III: The Evolution of Cooperation

Assurance Game

        C      D
  C   6,6    0,5
  D   5,0    1,1

Prisoner’s Dilemma

        C      D
  C   3,3    0,5
  D   5,0    1,1

Page 6: Unit III: The Evolution of Cooperation

Assurance Game (easy)

        C      D
  C   6,6    0,5
  D   5,0    1,1

Prisoner’s Dilemma

        C      D
  C   3,3    0,5
  D   5,0    1,1

In the Prisoner’s Dilemma, there is no belief that will lead the players to cooperate.

Rather than a problem of information, this is a problem of incentives.

Cooperation is both inaccessible and unstable.

Prisoner’s Dilemma

Page 7: Unit III: The Evolution of Cooperation

Can Selfishness Save the Environment?
The Problem of Cooperation

The problem of cooperation arises in several important contexts, including:

public goods: everyone can enjoy the good even if they don’t pay for it, e.g., national defense, public TV.
– subject to the free-rider problem
– undersupplied by a voluntary contribution scheme

common (property) resources: raw or natural resources that are owned by everyone (or no one), e.g., clean air, clean water, biodiversity.
– subject to the “Tragedy of the Commons” (Hardin, 1968)
– overconsumed (depleted)

Page 8: Unit III: The Evolution of Cooperation

Can Selfishness Save the Environment?

Arguments to the effect that “polluting is wrong” are less likely to be effective than measures that get the incentives right over the long run (Ridley & Low, 1993).

Our Common Future: Transboundary pollution, ozone depletion, nuclear proliferation, global warming, loss of biodiversity, deforestation, overfishing are all the consequences of continuing economic growth and development (Brundtland, 1987).

Negative externalities

Tragedy of the Global Commons?

Page 9: Unit III: The Evolution of Cooperation

Tragedy of the Global Commons?

Consider a country deciding on its optimal level of economic growth (X), in the presence of a negative externality (e.g., transboundary pollution). National utility is a positive function of own growth and a negative function of overall growth (X, X’):

P(X, X’) = a(X) – b(X, X’) + c

where P = national utility, X = own choice of growth, and X’ = the growth choices of all countries.

Alternatively: X can be the voluntary contribution level in the case of a public good (bad); or the consumption level in the case of a common resource.
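The wedge between individual and collective optimality can be made concrete with a small numerical sketch. The functional forms below (a(X) = X, a quadratic damage term b, c = 0, ten identical countries) are illustrative assumptions, not from the slides:

```python
# Sketch of P(X, X') = a(X) - b(X, X') + c with assumed functional forms:
# a(X) = X, b = 0.05 * (total growth)^2, c = 0, n = 10 identical countries.

n = 10
GRID = [i / 20 for i in range(201)]   # candidate growth levels 0.00 .. 10.00

def payoff(own_x, total_x):
    return own_x - 0.05 * total_x ** 2     # own benefit minus shared damage

def best_response(others_total):
    # each country maximizes its own payoff, ignoring the harm its growth
    # imposes on the others
    return max(GRID, key=lambda x: payoff(x, others_total + x))

# symmetric Nash equilibrium: a growth level that is a best response to itself
x_eq = min(GRID, key=lambda x: abs(best_response((n - 1) * x) - x))

# social optimum: the common growth level maximizing the SUM of payoffs
x_soc = max(GRID, key=lambda x: n * payoff(x, n * x))

print(f"equilibrium growth per country: {x_eq:.2f}")
print(f"socially optimal growth:        {x_soc:.2f}")
```

Under these assumed forms each country grows ten times more in equilibrium than is socially optimal, since it bears only 1/n of the damage its own growth causes.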

Page 10: Unit III: The Evolution of Cooperation

Common Resource Game

Two fishermen fish from a single lake. Each year, there are a fixed number of fish in the lake and two periods during the year that they can be harvested, spring and fall. Each fisherman consumes all the fish he catches each period, and their identical preferences are described by the following consumption function:

Ui = Cs × Cf

where Cs = spring catch; Cf = fall catch.

Each spring, each fisherman decides how many fish to remove from the lake. In the fall, the remaining fish are equally divided between the two.

Page 11: Unit III: The Evolution of Cooperation

Common Resource Game

Consider two fishermen deciding how many fish to remove from a commonly owned pond. There are Y fish in the pond.
• Period 1: each fisherman chooses his consumption (c1, c2).
• Period 2: the remaining fish are equally divided: (Y – (c1 + c2))/2.

Ui = ct × ct+1, where ct = today’s consumption and ct+1 = tomorrow’s.

Best-response functions:
c1 = (Y – c2)/2
c2 = (Y – c1)/2

[Figure: the two best-response lines in (c1, c2) space, intersecting at c1 = c2 = Y/3.]

Page 12: Unit III: The Evolution of Cooperation

Common Resource Game

Consider two fishermen deciding how many fish to remove from a commonly owned pond. There are Y fish in the pond.
• Period 1: each fisherman chooses his consumption (c1, c2).
• Period 2: the remaining fish are equally divided: (Y – (c1 + c2))/2.

Best-response functions:
c1 = (Y – c2)/2
c2 = (Y – c1)/2

Social Optimality: c1 = c2 = Y/4

[Figure: the best-response lines, with Y/3 (NE) and Y/4 (social optimum) marked on each axis.]

Page 13: Unit III: The Evolution of Cooperation

Common Resource Game

Consider two fishermen deciding how many fish to remove from a commonly owned pond. There are Y fish in the pond.
• Period 1: each fisherman chooses his consumption (c1, c2).
• Period 2: the remaining fish are equally divided: (Y – (c1 + c2))/2.

Best-response functions:
c1 = (Y – c2)/2
c2 = (Y – c1)/2

[Figure: the best-response lines, with the NE at c1 = c2 = Y/3 and the social optimum at c1 = c2 = Y/4 marked on each axis.]

If there are 12 fish in the pond, each will consume (Y/3 =) 4 in the spring and 2 in the fall in a NE. Both would be better off consuming (Y/4 =) 3 in the spring, leaving 3 for each in the fall.

Page 14: Unit III: The Evolution of Cooperation

If there are 12 fish in the pond, each will consume (Y/3 =) 4 in the spring and 2 in the fall in a NE. Both would be better off consuming (Y/4 =) 3 in the spring, leaving 3 for each in the fall.

Common Resource Game

        C          D
  C   9, 9      7.5, 10
  D   10, 7.5   8, 8

C = consume 3 in the spring; D = consume 4 in the spring.

A Prisoner’s Dilemma. What would happen if the game were repeated?
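The numbers on these slides can be checked directly. A quick sketch for Y = 12, with the utility and best-response functions as given:

```python
# Common Resource Game with Y = 12 fish (the slides' example).
# Utility = spring catch * fall catch; the fall catch is an equal split
# of whatever remains after both spring catches.

Y = 12

def utility(own, other):
    return own * (Y - own - other) / 2

def best_response(other):
    # maximizing own * (Y - own - other)/2 over own gives (Y - other)/2
    return (Y - other) / 2

# Nash equilibrium: iterate the best-response map to its fixed point, Y/3
c = Y / 2
for _ in range(200):
    c = best_response(c)
print("NE spring catch:", c, "utility:", utility(c, c))

# Social optimum: the symmetric catch maximizing joint utility is Y/4
c_opt = max((i / 100 for i in range(601)), key=lambda x: 2 * utility(x, x))
print("Optimal spring catch:", c_opt, "utility:", utility(c_opt, c_opt))

# The payoff matrix on the slide: C = catch 3 in spring, D = catch 4
for mine, yours in [(3, 3), (3, 4), (4, 3), (4, 4)]:
    print(f"({mine}, {yours}) -> {utility(mine, yours)}")
```

This reproduces the matrix above: (C, C) gives 9 each, (D, D) gives 8 each, and the defector against a cooperator gets 10 while the cooperator gets 7.5.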

Page 15: Unit III: The Evolution of Cooperation

We Play a Game

At each round of the game, you will have the chance to contribute to a public good (e.g., national defense; public tv).

The game is repeated for several rounds, and payoffs are calculated as follows:

1 pt. for each contribution made by anyone. + 3 pts. for each round you don’t contribute.

See Holt and Laury, JEP 1997: 209-215.

Page 16: Unit III: The Evolution of Cooperation

We Play a Game

Payoff: 1 pt. for each contribution made by anyone; + 3 pts. for each round you don’t contribute.

Assume n = 30. Your payoff, given the contribution rate among the other n – 1 players:

                 0%   10%   …   50%   …   90%   100%
  Contribute      1     4        16        28     30
  Don’t           3     6        18        30     33

n-person Prisoner’s Dilemma: Don’t Contribute is a dominant strategy. But if none Contribute, the outcome is inefficient.
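The dominance argument is mechanical enough to verify in a few lines. In this sketch k is the exact number of other contributors (the slide’s table works in rounded percentages, so its entries can differ slightly):

```python
# Classroom public-goods payoffs: 1 point per contribution made by anyone,
# plus 3 points per round you keep your endowment (don't contribute).

def payoff(contribute, k):
    """k = number of contributions by the other n - 1 players."""
    total_contributions = k + (1 if contribute else 0)
    return total_contributions + (0 if contribute else 3)

for k in (0, 3, 15, 26, 29):
    print(f"{k:2d} others contribute: "
          f"contribute -> {payoff(True, k):2d}, don't -> {payoff(False, k):2d}")

# Don't Contribute is dominant: whatever the others do, withholding gains
# 3 points and forgoes only the 1 point your own contribution would add.
assert all(payoff(False, k) > payoff(True, k) for k in range(30))
```

Yet if everyone follows the dominant strategy, each earns 3 per round instead of the 30 available under full contribution — the inefficiency the slide points out.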

Page 17: Unit III: The Evolution of Cooperation

We Play a Game

Public Goods Games

Round   Contributions     Round   Contributions
  1     7                   8     17
  2     5                   9     10
  3     3                  10      3
  4     4                  11      1
  5     2                  12      …
  6     1                  13
  7     1                  14

Data from 2009. N = 20. Communication was allowed between rounds 7 and 8.

Page 18: Unit III: The Evolution of Cooperation

We Play a Game

Public Goods Games

Typically, contribution rates:

• 40-60% in one-shot games & first round of repeated games
• <30% on announced final round
• Decrease with group size
• Increase with “learning”

Page 19: Unit III: The Evolution of Cooperation

Repeated Games

Examples of Repeated Prisoner’s Dilemma

• Overfishing
• Transboundary pollution
• Cartel enforcement
• Labor union
• Public goods

The Tragedy of the Global Commons

Free-rider Problems

Page 20: Unit III: The Evolution of Cooperation

Repeated Games

Some Questions:

• What happens when a game is repeated?
• Can threats and promises about the future influence behavior in the present?
• Cheap talk
• Finitely repeated games: Backward induction
• Indefinitely repeated games: Trigger strategies

Page 21: Unit III: The Evolution of Cooperation

The Evolution of Cooperation

Under what conditions will cooperation emerge in a world of egoists without central authority?

Axelrod uses an experimental method – the indefinitely repeated PD tournament – to investigate a series of questions: Can a cooperative strategy gain a foothold in a population of rational egoists? Can it survive better than its uncooperative rivals? Can it resist invasion and eventually dominate the system?      

Page 22: Unit III: The Evolution of Cooperation

The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament
Axelrod (1980a, b, Journal of Conflict Resolution).

A group of scholars were invited to design strategies to play indefinitely repeated prisoner’s dilemmas in a round robin tournament.

Contestants submitted computer programs that select an action, Cooperate or Defect, in each round of the game, and each entry was matched against every other, itself, and a control, RANDOM.

Page 23: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament
Axelrod (1980a, b, Journal of Conflict Resolution).

Contestants did not know the length of the games. (The first tournament lasted 200 rounds; the second varied probabilistically with an average of 151.)

The first tournament had 14 entrants, including game theorists, mathematicians, psychologists, political scientists, and others.

Results were published and new entrants solicited. The second tournament included 62 entrants . . .

The Evolution of Cooperation

Page 24: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

TIT FOR TAT won both tournaments!

TFT cooperates in the first round, and then does whatever the opponent did in the previous round.

TFT “was the simplest of all submitted programs and it turned out to be the best!” (31).

TFT was submitted by Anatol Rapoport to both tournaments, even after contestants could learn from the results of the first.

The Evolution of Cooperation
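The rule is simple enough to state in a few lines of code. A minimal sketch, using the payoffs that appear throughout these slides (T=5, R=3, P=1, S=0):

```python
# TIT FOR TAT as described: cooperate in round 1, then do whatever the
# opponent did in the previous round. Here it plays ALWAYS DEFECT and itself.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(opponent_history):
    return 'C' if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return 'D'

def play(s1, s2, rounds):
    h1, h2, total1, total2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)          # each strategy sees the other's history
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        total1 += p1; total2 += p2
    return total1, total2

print(play(tit_for_tat, always_defect, 3))   # (2, 7): suckered once, then mutual P
print(play(tit_for_tat, tit_for_tat, 3))     # (9, 9): mutual cooperation throughout
```

Note that TFT never scores more than its opponent in any single match; it won the tournaments by accumulating near-cooperative payoffs against a wide range of partners.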

Page 25: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

TIT FOR TAT won both tournaments!

In addition, Axelrod provides a “theory of cooperation” based on his analysis of the repeated prisoner’s dilemma game.

In particular, if the “shadow of the future” looms large, then players may have an incentive to cooperate. A cooperative strategy such as TFT is “collectively stable.”

He also offers an evolutionary argument, i.e., TFT wins in an evolutionary competition in which payoffs play the role of reproductive rates.

The Evolution of Cooperation

Page 26: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

This result has been so influential that “some authors use TIT FOR TAT as though it were a synonym for a self-enforcing, cooperative agreement” (Binmore, 1992, p. 433). And many have taken these results to have shown that TFT is the “best way to play” in the IRPD.

• While TFT won these, will it win every tournament?
• Is showing that TFT is collectively stable equivalent to predicting a winner in the computer tournaments?
• Is TFT evolutionarily stable?

The Evolution of Cooperation

Page 27: Unit III: The Evolution of Cooperation

The Evolution of Cooperation
Class Tournament

Imagine a population of strategies matched in pairs to play repeated PD, where outcomes determine the number of offspring each leaves to the next generation.

– In each generation, each strategy is matched against every other, itself, and RANDOM.

– Between generations, the strategies reproduce, where the chance of successful reproduction (“fitness”) is determined by the payoffs (i.e., payoffs play the role of reproductive rates).

 

Then, strategies that do better than average will grow as a share of the population and those that do worse than average will eventually die out.
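These dynamics can be sketched as a small discrete replicator simulation. Everything below except the payoff matrix and the strategy descriptions is an illustrative assumption (the strategy pool, round count, generation count, and random seed):

```python
# Class tournament as replicator dynamics: each generation, every strategy
# is matched against every strategy (itself and RANDOM included), and
# population shares grow in proportion to average payoff ("fitness").

import random
random.seed(1)

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tft(opp):   return 'C' if not opp else opp[-1]
def all_d(opp): return 'D'
def all_c(opp): return 'C'
def rnd(opp):   return random.choice('CD')

POOL = {'TFT': tft, 'ALL-D': all_d, 'ALL-C': all_c, 'RANDOM': rnd}

def avg_payoff(f, g, rounds=200):
    h1, h2, total = [], [], 0
    for _ in range(rounds):
        m1, m2 = f(h2), g(h1)
        total += PAYOFF[(m1, m2)][0]
        h1.append(m1); h2.append(m2)
    return total / rounds           # time-average payoff to f

shares = {name: 0.25 for name in POOL}
for generation in range(30):
    # fitness = population-weighted average payoff (payoffs as birth rates)
    fitness = {a: sum(shares[b] * avg_payoff(POOL[a], POOL[b]) for b in POOL)
               for a in POOL}
    mean = sum(shares[a] * fitness[a] for a in POOL)
    shares = {a: shares[a] * fitness[a] / mean for a in POOL}

print({name: round(s, 3) for name, s in shares.items()})
```

In runs of this sketch, ALL-D grows first by exploiting ALL-C and RANDOM; but as its prey dies out, TFT overtakes it — the evolutionary argument the slides describe.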

Page 28: Unit III: The Evolution of Cooperation

Repeated Games

Some Questions:

• What happens when a game is repeated?
• Can threats and promises about the future influence behavior in the present?
• Cheap talk
• Finitely repeated games: Backward induction
• Indefinitely repeated games: Trigger strategies

Page 29: Unit III: The Evolution of Cooperation

Can threats and promises about future actions influence behavior in the present? Consider the following game, played 2X:

Repeated Games

        C      D
  C   3,3    0,5
  D   5,0    1,1

See Gibbons: 82-104.

Page 30: Unit III: The Evolution of Cooperation

Repeated Games

Draw the extensive form game. The first-round payoffs are (3,3), (0,5), (5,0), (1,1); the sixteen terminal payoffs, summing both rounds, are:

After (C,C): (6,6) (3,8) (8,3) (4,4)
After (C,D): (3,8) (0,10) (5,5) (1,6)
After (D,C): (8,3) (5,5) (10,0) (6,1)
After (D,D): (4,4) (1,6) (6,1) (2,2)

Page 31: Unit III: The Evolution of Cooperation

Repeated Games

Now, consider three repeated game strategies:

D (ALWAYS DEFECT): Defect on every move.
C (ALWAYS COOPERATE): Cooperate on every move.
T (TRIGGER): Cooperate on the first move, then cooperate after the other cooperates. If the other defects, then defect forever.

Page 32: Unit III: The Evolution of Cooperation

Repeated Games

If the game is played twice, the V(alue) to a player using ALWAYS DEFECT (D) against an opponent using ALWAYS DEFECT (D) is:

V(D/D) = 1 + 1 = 2, and so on:
V(C/C) = 3 + 3 = 6
V(T/T) = 3 + 3 = 6
V(D/C) = 5 + 5 = 10
V(D/T) = 5 + 1 = 6
V(C/D) = 0 + 0 = 0
V(C/T) = 3 + 3 = 6
V(T/D) = 0 + 1 = 1
V(T/C) = 3 + 3 = 6

Page 33: Unit III: The Evolution of Cooperation

Repeated Games

And 3x:

V(D/D) = 1 + 1 + 1 = 3
V(C/C) = 3 + 3 + 3 = 9
V(T/T) = 3 + 3 + 3 = 9
V(D/C) = 5 + 5 + 5 = 15
V(D/T) = 5 + 1 + 1 = 7
V(C/D) = 0 + 0 + 0 = 0
V(C/T) = 3 + 3 + 3 = 9
V(T/D) = 0 + 1 + 1 = 2
V(T/C) = 3 + 3 + 3 = 9

Page 34: Unit III: The Evolution of Cooperation

Repeated Games

Time average payoffs, n = 3:

V(D/D) = (1 + 1 + 1)/3 = 1
V(C/C) = (3 + 3 + 3)/3 = 3
V(T/T) = (3 + 3 + 3)/3 = 3
V(D/C) = (5 + 5 + 5)/3 = 5
V(D/T) = (5 + 1 + 1)/3 = 7/3
V(C/D) = (0 + 0 + 0)/3 = 0
V(C/T) = (3 + 3 + 3)/3 = 3
V(T/D) = (0 + 1 + 1)/3 = 2/3
V(T/C) = (3 + 3 + 3)/3 = 3

Page 35: Unit III: The Evolution of Cooperation

Repeated Games

Time average payoffs, as n grows large:

V(D/D) = (1 + 1 + 1 + ...)/n = 1
V(C/C) = (3 + 3 + 3 + ...)/n = 3
V(T/T) = (3 + 3 + 3 + ...)/n = 3
V(D/C) = (5 + 5 + 5 + ...)/n = 5
V(D/T) = (5 + 1 + 1 + ...)/n = 1 + e
V(C/D) = (0 + 0 + 0 + ...)/n = 0
V(C/T) = (3 + 3 + 3 + ...)/n = 3
V(T/D) = (0 + 1 + 1 + ...)/n = 1 - e
V(T/C) = (3 + 3 + 3 + ...)/n = 3
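These values can be checked by direct simulation; a short sketch with the three strategies defined as on the slides:

```python
# V(row/column): total payoff to the row strategy over n rounds.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def D(opp): return 'D'                          # ALWAYS DEFECT
def C(opp): return 'C'                          # ALWAYS COOPERATE
def T(opp): return 'D' if 'D' in opp else 'C'   # TRIGGER: defect forever after any D

def V(row, col, n):
    h_row, h_col, total = [], [], 0
    for _ in range(n):
        m_row, m_col = row(h_col), col(h_row)   # each sees the other's history
        total += PAYOFF[(m_row, m_col)][0]
        h_row.append(m_row); h_col.append(m_col)
    return total

pairs = [(D, D), (C, C), (T, T), (D, C), (D, T), (C, D), (C, T), (T, D), (T, C)]
for n in (2, 3):
    print(f"n = {n}:")
    for a, b in pairs:
        print(f"  V({a.__name__}/{b.__name__}) = {V(a, b, n)}"
              f"  (time average {V(a, b, n) / n:.3g})")
```

As n grows, V(D/T)/n approaches 1 from above and V(T/D)/n approaches 1 from below — the 1 + e and 1 - e entries in the matrices that follow.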

Page 36: Unit III: The Evolution of Cooperation

Repeated Games

Now draw the matrix form of this game (1x):

        C      D      T
  T   3,3    0,5    3,3
  C   3,3    0,5    3,3
  D   5,0    1,1    5,0

Page 37: Unit III: The Evolution of Cooperation

Repeated Games

Time Average Payoffs

        C           D            T
  T   3,3       1-e, 1+e       3,3
  C   3,3         0,5          3,3
  D   5,0         1,1        1+e, 1-e

If the game is repeated, ALWAYS DEFECT is no longer dominant.

Page 38: Unit III: The Evolution of Cooperation

Repeated Games

        C           D            T
  T   3,3       1-e, 1+e       3,3
  C   3,3         0,5          3,3
  D   5,0         1,1        1+e, 1-e

… and TRIGGER achieves “a NE with itself.”

Page 39: Unit III: The Evolution of Cooperation

Repeated Games

Time Average Payoffs

T(emptation) > R(eward) > P(unishment) > S(ucker)

        C           D            T
  T   R,R       P-e, P+e       R,R
  C   R,R         S,T          R,R
  D   T,S         P,P        P+e, P-e

Page 40: Unit III: The Evolution of Cooperation

Discounting

The discount parameter, d, is the weight of the next payoff relative to the current payoff.

In an indefinitely repeated game, d can also be interpreted as the likelihood of the game continuing for another round (so that the expected number of moves per game is 1/(1-d)).

The V(alue) to someone using ALWAYS DEFECT (D) when playing with someone using TRIGGER (T) is the sum of T for the first move, dP for the second, d²P for the third, and so on (Axelrod: 13-4):

V(D/T) = T + dP + d²P + …

“The Shadow of the Future”

Page 41: Unit III: The Evolution of Cooperation

Discounting

Writing this as V(D/T) = T + dP + d²P + ..., we have the following:

V(D/D) = P + dP + d²P + … = P/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)
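The closed forms are just geometric series, and can be checked against brute-force discounted sums. The payoffs T=5, R=3, P=1, S=0 are the slides'; d = 0.9 is an illustrative choice:

```python
# Brute-force discounted sums vs. the closed forms on the slide.

T, R, P, S = 5, 3, 1, 0
d = 0.9

def discounted(stream):
    """Sum x_k * d**k over a long truncation of the payoff stream."""
    return sum(x * d ** k for k, x in enumerate(stream))

N = 5000  # 0.9**5000 is far below machine precision, so truncation is safe

v_dt = discounted([T] + [P] * (N - 1))   # defect against TRIGGER: T, then P forever
v_tt = discounted([R] * N)               # TRIGGER against itself: R forever
v_td = discounted([S] + [P] * (N - 1))   # TRIGGER suckered once, then P forever

print(v_dt, T + d * P / (1 - d))   # both ≈ 14
print(v_tt, R / (1 - d))           # both ≈ 30
print(v_td, S + d * P / (1 - d))   # both ≈ 9
```

At d = 0.9 mutual cooperation (30) is already worth far more than defecting against TRIGGER (14) — the "shadow of the future" at work.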

Page 42: Unit III: The Evolution of Cooperation

Discounting

Discounted Payoffs: T > R > P > S; 0 < d < 1

          C                        D                              T
  T   R/(1-d), R/(1-d)   S + dP/(1-d), T + dP/(1-d)   R/(1-d), R/(1-d)
  C   R/(1-d), R/(1-d)   S/(1-d), T/(1-d)             R/(1-d), R/(1-d)
  D   T/(1-d), S/(1-d)   P/(1-d), P/(1-d)             T + dP/(1-d), S + dP/(1-d)

Page 43: Unit III: The Evolution of Cooperation

Discounting

Discounted Payoffs: T > R > P > S; 0 < d < 1

          C                        D                              T
  T   R/(1-d), R/(1-d)   S + dP/(1-d), T + dP/(1-d)   R/(1-d), R/(1-d)
  C   R/(1-d), R/(1-d)   S/(1-d), T/(1-d)             R/(1-d), R/(1-d)
  D   T/(1-d), S/(1-d)   P/(1-d), P/(1-d)             T + dP/(1-d), S + dP/(1-d)

T weakly dominates C

Page 44: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1):

V(D/D) = P + dP + d²P + … = P/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)


Page 46: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1):

V(D/D) = P + dP + d²P + … = P + dP/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)

V(D/D) > V(T/D): D is a best response to D

Page 47: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1):

V(D/D) = P + dP + d²P + … = P + dP/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)

Comparing V(D/T), V(D/D), and V(T/D): since T > P > S, V(D/T) > V(D/D) > V(T/D) for all values of d. How does V(T/T) compare?

Page 48: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1). For all values of d:

V(D/T) > V(D/D) > V(T/D)
V(T/T) > V(D/D) > V(T/D)

Is there a value of d s.t. V(D/T) = V(T/T)? Call this d*:

V(D/T) = V(T/T)
T + dP/(1-d) = R/(1-d)
T(1-d) + dP = R
T - dT + dP = R
T - R = d(T - P)
d* = (T - R)/(T - P)

If d < d*, the following ordering holds:

V(D/T) > V(T/T) > V(D/D) > V(T/D)

D is dominant: GAME SOLVED

Page 49: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1). For all values of d:

V(D/T) > V(D/D) > V(T/D)
V(T/T) > V(D/D) > V(T/D)

Is there a value of d s.t. V(D/T) = V(T/T)? Call this d*. d* = (T-R)/(T-P)

If d > d*, the following ordering holds:

V(T/T) > V(D/T) > V(D/D) > V(T/D)

D is a best response to D; T is a best response to T; multiple NE.
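With the slides' payoffs, d* = (5 - 3)/(5 - 1) = 1/2, and the two orderings can be verified on either side of it:

```python
# The critical discount parameter d* and the orderings around it,
# using the payoffs from these slides: T=5, R=3, P=1, S=0.

T, R, P, S = 5, 3, 1, 0
d_star = (T - R) / (T - P)
print("d* =", d_star)   # 0.5

values = {
    'T/T': lambda d: R / (1 - d),
    'D/T': lambda d: T + d * P / (1 - d),
    'D/D': lambda d: P / (1 - d),
    'T/D': lambda d: S + d * P / (1 - d),
}

for d in (0.25, 0.75):   # one value below d*, one above
    order = sorted(values, key=lambda k: values[k](d), reverse=True)
    print(f"d = {d}: " + " > ".join(order))
```

Below d* the future is too heavily discounted and defection dominates; above it, TRIGGER is a best response to itself and cooperation becomes sustainable.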

Page 50: Unit III: The Evolution of Cooperation

Discounting

Graphically: the V(alue) to a player using ALWAYS DEFECT (D) against TRIGGER (T), and V(T/T), as functions of the discount parameter (d):

V(T/T) = R/(1-d)
V(D/T) = T + dP/(1-d)

[Figure: V on the vertical axis, with intercepts T and R at d = 0; d on the horizontal axis from 0 to 1; the two curves cross at d*.]

Page 51: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

The payoff set of the repeated PD is the convex closure of the points [(T,S); (R,R); (S,T); (P,P)].

Page 52: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

The shaded area is the set of payoffs that Pareto-dominate the one-shot NE (P,P).

Page 53: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

Theorem: Any payoff that Pareto-dominates the one-shot NE can be supported in a SPNE of the repeated game, if the discount parameter is sufficiently high.

Page 54: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

In other words, in the repeated game, if the future matters “enough” (i.e., d > d*), there are zillions of equilibria!

Page 55: Unit III: The Evolution of Cooperation

• The theorem tells us that in general, repeated games give rise to a very large set of Nash equilibria. In the repeated PD, these are Pareto-rankable, i.e., some are efficient and some are not.

• In this context, evolution can be seen as a process that selects for repeated game strategies with efficient payoffs.

“Survival of the Fittest”

The Folk Theorem

Page 56: Unit III: The Evolution of Cooperation

Next Time

4/6 Evolutionary Games.
Axelrod, Ch. 5: 88-105.
Gintis, Chs. 7, 9: 148-163; 188-219.