Unit III: The Evolution of Cooperation


Transcript of Unit III: The Evolution of Cooperation

Page 1: Unit III: The Evolution of Cooperation

Unit III: The Evolution of Cooperation

• Can Selfishness Save the Environment?
• Repeated Games: the Folk Theorem
• Evolutionary Games
• A Tournament
• How to Promote Cooperation

3/30

Page 2: Unit III: The Evolution of Cooperation

Can Selfishness Save the Environment?

• The Problem of Cooperation
• The Tragedy of the Global Commons?
• Common Resource Game
• We Play a Game
• Repeated Games
• Discounting
• The Folk Theorem

Page 3: Unit III: The Evolution of Cooperation

How can a number of individuals, each behaving as a utility maximizer, come to behave as a group and maximize joint utility?

The Problem of Cooperation

Page 4: Unit III: The Evolution of Cooperation

Assurance Game

        C      D
  C   6,6    0,5
  D   5,0    1,1

Players may fail to cooperate (i.e., fail to maximize joint payoffs) because they lack information.

If each has reason to believe the other will cooperate, the problem is solved!

Page 5: Unit III: The Evolution of Cooperation

Assurance Game

        C      D
  C   6,6    0,5
  D   5,0    1,1

Prisoner’s Dilemma

        C      D
  C   3,3    0,5
  D   5,0    1,1

Page 6: Unit III: The Evolution of Cooperation

Assurance Game (easy)

        C      D
  C   6,6    0,5
  D   5,0    1,1

Prisoner’s Dilemma

        C      D
  C   3,3    0,5
  D   5,0    1,1

In the Prisoner’s Dilemma, there is no belief that will lead the players to cooperate.

Rather than a problem of information, this is a problem of incentives.

Cooperation is both inaccessible and unstable.

Prisoner’s Dilemma

Page 7: Unit III: The Evolution of Cooperation

Can Selfishness Save the Environment?
The Problem of Cooperation

The problem of cooperation arises in several important contexts, including:

public goods: everyone can enjoy the good even if they don’t pay for it, e.g., national defense, public TV.
– subject to the free-rider problem
– undersupplied by a voluntary contribution scheme

common (property) resources: raw or natural resources that are owned by everyone (or no one), e.g., clean air, clean water, biodiversity.
– subject to the “Tragedy of the Commons” (Hardin, 1968)
– overconsumed (depleted)

Page 8: Unit III: The Evolution of Cooperation

Can Selfishness Save the Environment?

Arguments to the effect that “polluting is wrong” are less likely to be effective than measures that get the incentives right over the long run (Ridley & Low, 1993).

Our Common Future: Transboundary pollution, ozone depletion, nuclear proliferation, global warming, loss of biodiversity, deforestation, overfishing are all the consequences of continuing economic growth and development (Brundtland, 1987).

Negative externalities

Tragedy of the Global Commons?

Page 9: Unit III: The Evolution of Cooperation

Tragedy of the Global Commons?

Consider a country deciding on its optimal level of economic growth (X), in the presence of a negative externality (e.g., transboundary pollution). National utility is a positive function of own growth and a negative function of overall growth (X, X’):

P(X, X’) = a(X) – b(X, X’) + c

where P = national utility, X = own choice of growth, and X’ = the growth choices of all countries.

Alternatively: X can be the voluntary contribution level in the case of a public good (bad); or the consumption level in the case of a common resource.
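The wedge between individual and collective optimality can be made concrete with a small numerical sketch. The functional forms below (a(X) = X, a quadratic damage term b, c = 0, ten identical countries) are illustrative assumptions, not from the slides:

```python
# Sketch of P(X, X') = a(X) - b(X, X') + c with assumed functional forms:
# a(X) = X, b = 0.05 * (total growth)^2, c = 0, n = 10 identical countries.

n = 10
GRID = [i / 20 for i in range(201)]   # candidate growth levels 0.00 .. 10.00

def payoff(own_x, total_x):
    return own_x - 0.05 * total_x ** 2     # own benefit minus shared damage

def best_response(others_total):
    # each country maximizes its own payoff, ignoring the harm its growth
    # imposes on the others
    return max(GRID, key=lambda x: payoff(x, others_total + x))

# symmetric Nash equilibrium: a growth level that is a best response to itself
x_eq = min(GRID, key=lambda x: abs(best_response((n - 1) * x) - x))

# social optimum: the common growth level maximizing the SUM of payoffs
x_soc = max(GRID, key=lambda x: n * payoff(x, n * x))

print(f"equilibrium growth per country: {x_eq:.2f}")
print(f"socially optimal growth:        {x_soc:.2f}")
```

Under these assumed forms each country grows ten times more in equilibrium than is socially optimal, since it bears only 1/n of the damage its own growth causes.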

Page 10: Unit III: The Evolution of Cooperation

Common Resource Game

Two fishermen fish from a single lake. Each year, there are a fixed number of fish in the lake and two periods during the year that they can be harvested, spring and fall. Each fisherman consumes all the fish he catches each period, and their identical preferences are described by the following consumption function:

Ui = Cs × Cf

where Cs = spring catch; Cf = fall catch.

Each spring, each fisherman decides how many fish to remove from the lake. In the fall, the remaining fish are equally divided between the two.

Page 11: Unit III: The Evolution of Cooperation

Common Resource Game

Consider two fishermen deciding how many fish to remove from a commonly owned pond. There are Y fish in the pond.
• Period 1: each fisherman chooses his consumption (c1, c2).
• Period 2: the remaining fish are equally divided: (Y – (c1 + c2))/2.

Ui = ct × ct+1, where ct = today’s consumption and ct+1 = tomorrow’s.

Best-response functions:
c1 = (Y – c2)/2
c2 = (Y – c1)/2

[Figure: the two best-response lines in (c1, c2) space, intersecting at c1 = c2 = Y/3.]

Page 12: Unit III: The Evolution of Cooperation

Common Resource Game

Consider two fishermen deciding how many fish to remove from a commonly owned pond. There are Y fish in the pond.
• Period 1: each fisherman chooses his consumption (c1, c2).
• Period 2: the remaining fish are equally divided: (Y – (c1 + c2))/2.

Best-response functions:
c1 = (Y – c2)/2
c2 = (Y – c1)/2

Social Optimality: c1 = c2 = Y/4

[Figure: the best-response lines, with Y/3 (NE) and Y/4 (social optimum) marked on each axis.]

Page 13: Unit III: The Evolution of Cooperation

Common Resource Game

Consider two fishermen deciding how many fish to remove from a commonly owned pond. There are Y fish in the pond.
• Period 1: each fisherman chooses his consumption (c1, c2).
• Period 2: the remaining fish are equally divided: (Y – (c1 + c2))/2.

Best-response functions:
c1 = (Y – c2)/2
c2 = (Y – c1)/2

[Figure: the best-response lines, with the NE at c1 = c2 = Y/3 and the social optimum at c1 = c2 = Y/4 marked on each axis.]

If there are 12 fish in the pond, each will consume (Y/3 =) 4 in the spring and 2 in the fall in a NE. Both would be better off consuming (Y/4 =) 3 in the spring, leaving 3 for each in the fall.

Page 14: Unit III: The Evolution of Cooperation

If there are 12 fish in the pond, each will consume (Y/3 =) 4 in the spring and 2 in the fall in a NE. Both would be better off consuming (Y/4 =) 3 in the spring, leaving 3 for each in the fall.

Common Resource Game

        C          D
  C   9, 9      7.5, 10
  D   10, 7.5   8, 8

C = consume 3 in the spring; D = consume 4 in the spring.

A Prisoner’s Dilemma. What would happen if the game were repeated?
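The numbers on these slides can be checked directly. A quick sketch for Y = 12, with the utility and best-response functions as given:

```python
# Common Resource Game with Y = 12 fish (the slides' example).
# Utility = spring catch * fall catch; the fall catch is an equal split
# of whatever remains after both spring catches.

Y = 12

def utility(own, other):
    return own * (Y - own - other) / 2

def best_response(other):
    # maximizing own * (Y - own - other)/2 over own gives (Y - other)/2
    return (Y - other) / 2

# Nash equilibrium: iterate the best-response map to its fixed point, Y/3
c = Y / 2
for _ in range(200):
    c = best_response(c)
print("NE spring catch:", c, "utility:", utility(c, c))

# Social optimum: the symmetric catch maximizing joint utility is Y/4
c_opt = max((i / 100 for i in range(601)), key=lambda x: 2 * utility(x, x))
print("Optimal spring catch:", c_opt, "utility:", utility(c_opt, c_opt))

# The payoff matrix on the slide: C = catch 3 in spring, D = catch 4
for mine, yours in [(3, 3), (3, 4), (4, 3), (4, 4)]:
    print(f"({mine}, {yours}) -> {utility(mine, yours)}")
```

This reproduces the matrix above: (C, C) gives 9 each, (D, D) gives 8 each, and the defector against a cooperator gets 10 while the cooperator gets 7.5.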

Page 15: Unit III: The Evolution of Cooperation

We Play a Game

At each round of the game, you will have the chance to contribute to a public good (e.g., national defense; public tv).

The game is repeated for several rounds, and payoffs are calculated as follows:

1 pt. for each contribution made by anyone. + 3 pts. for each round you don’t contribute.

See Holt and Laury, JEP 1997: 209-215.

Page 16: Unit III: The Evolution of Cooperation

We Play a Game

Payoff: 1 pt. for each contribution made by anyone; + 3 pts. for each round you don’t contribute.

Assume n = 30. Your payoff, given the contribution rate among the other n – 1 players:

                 0%   10%   …   50%   …   90%   100%
  Contribute      1     4        16        28     30
  Don’t           3     6        18        30     33

n-person Prisoner’s Dilemma: Don’t Contribute is a dominant strategy. But if none Contribute, the outcome is inefficient.
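The dominance argument is mechanical enough to verify in a few lines. In this sketch k is the exact number of other contributors (the slide’s table works in rounded percentages, so its entries can differ slightly):

```python
# Classroom public-goods payoffs: 1 point per contribution made by anyone,
# plus 3 points per round you keep your endowment (don't contribute).

def payoff(contribute, k):
    """k = number of contributions by the other n - 1 players."""
    total_contributions = k + (1 if contribute else 0)
    return total_contributions + (0 if contribute else 3)

for k in (0, 3, 15, 26, 29):
    print(f"{k:2d} others contribute: "
          f"contribute -> {payoff(True, k):2d}, don't -> {payoff(False, k):2d}")

# Don't Contribute is dominant: whatever the others do, withholding gains
# 3 points and forgoes only the 1 point your own contribution would add.
assert all(payoff(False, k) > payoff(True, k) for k in range(30))
```

Yet if everyone follows the dominant strategy, each earns 3 per round instead of the 30 available under full contribution — the inefficiency the slide points out.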

Page 17: Unit III: The Evolution of Cooperation

We Play a Game

Public Goods Games

Round   Contributions     Round   Contributions
  1     7                   8     17
  2     5                   9     10
  3     3                  10      3
  4     4                  11      1
  5     2                  12      …
  6     1                  13
  7     1                  14

Data from 2009. N = 20. Communication was allowed between rounds 7 and 8.

Page 18: Unit III: The Evolution of Cooperation

We Play a Game

Public Goods Games

Typically, contribution rates:

• 40-60% in one-shot games & first round of repeated games
• <30% on announced final round
• Decrease with group size
• Increase with “learning”

Page 19: Unit III: The Evolution of Cooperation

Repeated Games

Examples of Repeated Prisoner’s Dilemma

• Overfishing
• Transboundary pollution
• Cartel enforcement
• Labor union
• Public goods

The Tragedy of the Global Commons

Free-rider Problems

Page 20: Unit III: The Evolution of Cooperation

Repeated Games

Some Questions:

• What happens when a game is repeated?
• Can threats and promises about the future influence behavior in the present?
• Cheap talk
• Finitely repeated games: Backward induction
• Indefinitely repeated games: Trigger strategies

Page 21: Unit III: The Evolution of Cooperation

The Evolution of Cooperation

Under what conditions will cooperation emerge in a world of egoists without central authority?

Axelrod uses an experimental method – the indefinitely repeated PD tournament – to investigate a series of questions: Can a cooperative strategy gain a foothold in a population of rational egoists? Can it survive better than its uncooperative rivals? Can it resist invasion and eventually dominate the system?      

Page 22: Unit III: The Evolution of Cooperation

The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament
Axelrod (1980a, b, Journal of Conflict Resolution).

A group of scholars were invited to design strategies to play indefinitely repeated prisoner’s dilemmas in a round robin tournament.

Contestants submitted computer programs that select an action, Cooperate or Defect, in each round of the game, and each entry was matched against every other, itself, and a control, RANDOM.

Page 23: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament
Axelrod (1980a, b, Journal of Conflict Resolution).

Contestants did not know the length of the games. (The first tournament lasted 200 rounds; the second varied probabilistically with an average of 151.)

The first tournament had 14 entrants, including game theorists, mathematicians, psychologists, political scientists, and others.

Results were published and new entrants solicited. The second tournament included 62 entrants . . .

The Evolution of Cooperation

Page 24: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

TIT FOR TAT won both tournaments!

TFT cooperates in the first round, and then does whatever the opponent did in the previous round.

TFT “was the simplest of all submitted programs and it turned out to be the best!” (31).

TFT was submitted by Anatol Rapoport to both tournaments, even after contestants could learn from the results of the first.

The Evolution of Cooperation
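The rule is simple enough to state in a few lines of code. A minimal sketch, using the payoffs that appear throughout these slides (T=5, R=3, P=1, S=0):

```python
# TIT FOR TAT as described: cooperate in round 1, then do whatever the
# opponent did in the previous round. Here it plays ALWAYS DEFECT and itself.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(opponent_history):
    return 'C' if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return 'D'

def play(s1, s2, rounds):
    h1, h2, total1, total2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)          # each strategy sees the other's history
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        total1 += p1; total2 += p2
    return total1, total2

print(play(tit_for_tat, always_defect, 3))   # (2, 7): suckered once, then mutual P
print(play(tit_for_tat, tit_for_tat, 3))     # (9, 9): mutual cooperation throughout
```

Note that TFT never scores more than its opponent in any single match; it won the tournaments by accumulating near-cooperative payoffs against a wide range of partners.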

Page 25: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

TIT FOR TAT won both tournaments!

In addition, Axelrod provides a “theory of cooperation” based on his analysis of the repeated prisoner’s dilemma game.

In particular, if the “shadow of the future” looms large, then players may have an incentive to cooperate. A cooperative strategy such as TFT is “collectively stable.”

He also offers an evolutionary argument, i.e., TFT wins in an evolutionary competition in which payoffs play the role of reproductive rates.

The Evolution of Cooperation

Page 26: Unit III: The Evolution of Cooperation

The Indefinitely Repeated Prisoner’s Dilemma Tournament

This result has been so influential that “some authors use TIT FOR TAT as though it were a synonym for a self-enforcing, cooperative agreement” (Binmore, 1992, p. 433). And many have taken these results to have shown that TFT is the “best way to play” in the IRPD.

• While TFT won these, will it win every tournament?
• Is showing that TFT is collectively stable equivalent to predicting a winner in the computer tournaments?
• Is TFT evolutionarily stable?

The Evolution of Cooperation

Page 27: Unit III: The Evolution of Cooperation

The Evolution of Cooperation
Class Tournament

Imagine a population of strategies matched in pairs to play repeated PD, where outcomes determine the number of offspring each leaves to the next generation.

– In each generation, each strategy is matched against every other, itself, and RANDOM.

– Between generations, the strategies reproduce, where the chance of successful reproduction (“fitness”) is determined by the payoffs (i.e., payoffs play the role of reproductive rates).

 

Then, strategies that do better than average will grow as a share of the population and those that do worse than average will eventually die out.
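These dynamics can be sketched as a small discrete replicator simulation. Everything below except the payoff matrix and the strategy descriptions is an illustrative assumption (the strategy pool, round count, generation count, and random seed):

```python
# Class tournament as replicator dynamics: each generation, every strategy
# is matched against every strategy (itself and RANDOM included), and
# population shares grow in proportion to average payoff ("fitness").

import random
random.seed(1)

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tft(opp):   return 'C' if not opp else opp[-1]
def all_d(opp): return 'D'
def all_c(opp): return 'C'
def rnd(opp):   return random.choice('CD')

POOL = {'TFT': tft, 'ALL-D': all_d, 'ALL-C': all_c, 'RANDOM': rnd}

def avg_payoff(f, g, rounds=200):
    h1, h2, total = [], [], 0
    for _ in range(rounds):
        m1, m2 = f(h2), g(h1)
        total += PAYOFF[(m1, m2)][0]
        h1.append(m1); h2.append(m2)
    return total / rounds           # time-average payoff to f

shares = {name: 0.25 for name in POOL}
for generation in range(30):
    # fitness = population-weighted average payoff (payoffs as birth rates)
    fitness = {a: sum(shares[b] * avg_payoff(POOL[a], POOL[b]) for b in POOL)
               for a in POOL}
    mean = sum(shares[a] * fitness[a] for a in POOL)
    shares = {a: shares[a] * fitness[a] / mean for a in POOL}

print({name: round(s, 3) for name, s in shares.items()})
```

In runs of this sketch, ALL-D grows first by exploiting ALL-C and RANDOM; but as its prey dies out, TFT overtakes it — the evolutionary argument the slides describe.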

Page 28: Unit III: The Evolution of Cooperation

Repeated Games

Some Questions:

• What happens when a game is repeated?
• Can threats and promises about the future influence behavior in the present?
• Cheap talk
• Finitely repeated games: Backward induction
• Indefinitely repeated games: Trigger strategies

Page 29: Unit III: The Evolution of Cooperation

Can threats and promises about future actions influence behavior in the present? Consider the following game, played 2X:

Repeated Games

        C      D
  C   3,3    0,5
  D   5,0    1,1

See Gibbons: 82-104.

Page 30: Unit III: The Evolution of Cooperation

Repeated Games

Draw the extensive form game. The first-round payoffs are (3,3), (0,5), (5,0), (1,1); the sixteen terminal payoffs, summing both rounds, are:

After (C,C): (6,6) (3,8) (8,3) (4,4)
After (C,D): (3,8) (0,10) (5,5) (1,6)
After (D,C): (8,3) (5,5) (10,0) (6,1)
After (D,D): (4,4) (1,6) (6,1) (2,2)

Page 31: Unit III: The Evolution of Cooperation

Repeated Games

Now, consider three repeated game strategies:

D (ALWAYS DEFECT): Defect on every move.
C (ALWAYS COOPERATE): Cooperate on every move.
T (TRIGGER): Cooperate on the first move, then cooperate after the other cooperates. If the other defects, then defect forever.

Page 32: Unit III: The Evolution of Cooperation

Repeated Games

If the game is played twice, the V(alue) to a player using ALWAYS DEFECT (D) against an opponent using ALWAYS DEFECT (D) is:

V(D/D) = 1 + 1 = 2, and so on:
V(C/C) = 3 + 3 = 6
V(T/T) = 3 + 3 = 6
V(D/C) = 5 + 5 = 10
V(D/T) = 5 + 1 = 6
V(C/D) = 0 + 0 = 0
V(C/T) = 3 + 3 = 6
V(T/D) = 0 + 1 = 1
V(T/C) = 3 + 3 = 6

Page 33: Unit III: The Evolution of Cooperation

Repeated Games

And 3x:

V(D/D) = 1 + 1 + 1 = 3
V(C/C) = 3 + 3 + 3 = 9
V(T/T) = 3 + 3 + 3 = 9
V(D/C) = 5 + 5 + 5 = 15
V(D/T) = 5 + 1 + 1 = 7
V(C/D) = 0 + 0 + 0 = 0
V(C/T) = 3 + 3 + 3 = 9
V(T/D) = 0 + 1 + 1 = 2
V(T/C) = 3 + 3 + 3 = 9

Page 34: Unit III: The Evolution of Cooperation

Repeated Games

Time average payoffs, n = 3:

V(D/D) = (1 + 1 + 1)/3 = 1
V(C/C) = (3 + 3 + 3)/3 = 3
V(T/T) = (3 + 3 + 3)/3 = 3
V(D/C) = (5 + 5 + 5)/3 = 5
V(D/T) = (5 + 1 + 1)/3 = 7/3
V(C/D) = (0 + 0 + 0)/3 = 0
V(C/T) = (3 + 3 + 3)/3 = 3
V(T/D) = (0 + 1 + 1)/3 = 2/3
V(T/C) = (3 + 3 + 3)/3 = 3

Page 35: Unit III: The Evolution of Cooperation

Repeated Games

Time average payoffs, as n grows large:

V(D/D) = (1 + 1 + 1 + ...)/n = 1
V(C/C) = (3 + 3 + 3 + ...)/n = 3
V(T/T) = (3 + 3 + 3 + ...)/n = 3
V(D/C) = (5 + 5 + 5 + ...)/n = 5
V(D/T) = (5 + 1 + 1 + ...)/n = 1 + e
V(C/D) = (0 + 0 + 0 + ...)/n = 0
V(C/T) = (3 + 3 + 3 + ...)/n = 3
V(T/D) = (0 + 1 + 1 + ...)/n = 1 - e
V(T/C) = (3 + 3 + 3 + ...)/n = 3
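These values can be checked by direct simulation; a short sketch with the three strategies defined as on the slides:

```python
# V(row/column): total payoff to the row strategy over n rounds.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def D(opp): return 'D'                          # ALWAYS DEFECT
def C(opp): return 'C'                          # ALWAYS COOPERATE
def T(opp): return 'D' if 'D' in opp else 'C'   # TRIGGER: defect forever after any D

def V(row, col, n):
    h_row, h_col, total = [], [], 0
    for _ in range(n):
        m_row, m_col = row(h_col), col(h_row)   # each sees the other's history
        total += PAYOFF[(m_row, m_col)][0]
        h_row.append(m_row); h_col.append(m_col)
    return total

pairs = [(D, D), (C, C), (T, T), (D, C), (D, T), (C, D), (C, T), (T, D), (T, C)]
for n in (2, 3):
    print(f"n = {n}:")
    for a, b in pairs:
        print(f"  V({a.__name__}/{b.__name__}) = {V(a, b, n)}"
              f"  (time average {V(a, b, n) / n:.3g})")
```

As n grows, V(D/T)/n approaches 1 from above and V(T/D)/n approaches 1 from below — the 1 + e and 1 - e entries in the matrices that follow.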

Page 36: Unit III: The Evolution of Cooperation

Repeated Games

Now draw the matrix form of this game (1x):

        C      D      T
  T   3,3    0,5    3,3
  C   3,3    0,5    3,3
  D   5,0    1,1    5,0

Page 37: Unit III: The Evolution of Cooperation

Repeated Games

Time Average Payoffs

        C           D            T
  T   3,3       1-e, 1+e       3,3
  C   3,3         0,5          3,3
  D   5,0         1,1        1+e, 1-e

If the game is repeated, ALWAYS DEFECT is no longer dominant.

Page 38: Unit III: The Evolution of Cooperation

Repeated Games

        C           D            T
  T   3,3       1-e, 1+e       3,3
  C   3,3         0,5          3,3
  D   5,0         1,1        1+e, 1-e

… and TRIGGER achieves “a NE with itself.”

Page 39: Unit III: The Evolution of Cooperation

Repeated Games

Time Average Payoffs

T(emptation) > R(eward) > P(unishment) > S(ucker)

        C           D            T
  T   R,R       P-e, P+e       R,R
  C   R,R         S,T          R,R
  D   T,S         P,P        P+e, P-e

Page 40: Unit III: The Evolution of Cooperation

Discounting

The discount parameter, d, is the weight of the next payoff relative to the current payoff.

In an indefinitely repeated game, d can also be interpreted as the likelihood of the game continuing for another round (so that the expected number of moves per game is 1/(1-d)).

The V(alue) to someone using ALWAYS DEFECT (D) when playing with someone using TRIGGER (T) is the sum of T for the first move, dP for the second, d²P for the third, and so on (Axelrod: 13-4):

V(D/T) = T + dP + d²P + …

“The Shadow of the Future”

Page 41: Unit III: The Evolution of Cooperation

Discounting

Writing this as V(D/T) = T + dP + d²P + ..., we have the following:

V(D/D) = P + dP + d²P + … = P/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)
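The closed forms are just geometric series, and can be checked against brute-force discounted sums. The payoffs T=5, R=3, P=1, S=0 are the slides'; d = 0.9 is an illustrative choice:

```python
# Brute-force discounted sums vs. the closed forms on the slide.

T, R, P, S = 5, 3, 1, 0
d = 0.9

def discounted(stream):
    """Sum x_k * d**k over a long truncation of the payoff stream."""
    return sum(x * d ** k for k, x in enumerate(stream))

N = 5000  # 0.9**5000 is far below machine precision, so truncation is safe

v_dt = discounted([T] + [P] * (N - 1))   # defect against TRIGGER: T, then P forever
v_tt = discounted([R] * N)               # TRIGGER against itself: R forever
v_td = discounted([S] + [P] * (N - 1))   # TRIGGER suckered once, then P forever

print(v_dt, T + d * P / (1 - d))   # both ≈ 14
print(v_tt, R / (1 - d))           # both ≈ 30
print(v_td, S + d * P / (1 - d))   # both ≈ 9
```

At d = 0.9 mutual cooperation (30) is already worth far more than defecting against TRIGGER (14) — the "shadow of the future" at work.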

Page 42: Unit III: The Evolution of Cooperation

Discounting

Discounted Payoffs: T > R > P > S; 0 < d < 1

          C                        D                              T
  T   R/(1-d), R/(1-d)   S + dP/(1-d), T + dP/(1-d)   R/(1-d), R/(1-d)
  C   R/(1-d), R/(1-d)   S/(1-d), T/(1-d)             R/(1-d), R/(1-d)
  D   T/(1-d), S/(1-d)   P/(1-d), P/(1-d)             T + dP/(1-d), S + dP/(1-d)

Page 43: Unit III: The Evolution of Cooperation

Discounting

Discounted Payoffs: T > R > P > S; 0 < d < 1

          C                        D                              T
  T   R/(1-d), R/(1-d)   S + dP/(1-d), T + dP/(1-d)   R/(1-d), R/(1-d)
  C   R/(1-d), R/(1-d)   S/(1-d), T/(1-d)             R/(1-d), R/(1-d)
  D   T/(1-d), S/(1-d)   P/(1-d), P/(1-d)             T + dP/(1-d), S + dP/(1-d)

T weakly dominates C

Page 44: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1):

V(D/D) = P + dP + d²P + … = P/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)


Page 46: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1):

V(D/D) = P + dP + d²P + … = P + dP/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)

V(D/D) > V(T/D): D is a best response to D

Page 47: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1):

V(D/D) = P + dP + d²P + … = P + dP/(1-d)
V(C/C) = R + dR + d²R + … = R/(1-d)
V(T/T) = R + dR + d²R + … = R/(1-d)
V(D/C) = T + dT + d²T + … = T/(1-d)
V(D/T) = T + dP + d²P + … = T + dP/(1-d)
V(C/D) = S + dS + d²S + … = S/(1-d)
V(C/T) = R + dR + d²R + … = R/(1-d)
V(T/D) = S + dP + d²P + … = S + dP/(1-d)
V(T/C) = R + dR + d²R + … = R/(1-d)

Comparing V(D/T), V(D/D), and V(T/D): since T > P > S, V(D/T) > V(D/D) > V(T/D) for all values of d. How does V(T/T) compare?

Page 48: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1). For all values of d:

V(D/T) > V(D/D) > V(T/D)
V(T/T) > V(D/D) > V(T/D)

Is there a value of d s.t. V(D/T) = V(T/T)? Call this d*:

V(D/T) = V(T/T)
T + dP/(1-d) = R/(1-d)
T(1-d) + dP = R
T - dT + dP = R
T - R = d(T - P)
d* = (T - R)/(T - P)

If d < d*, the following ordering holds:

V(D/T) > V(T/T) > V(D/D) > V(T/D)

D is dominant: GAME SOLVED

Page 49: Unit III: The Evolution of Cooperation

Discounting

Now consider what happens to these values as d varies (from 0 to 1). For all values of d:

V(D/T) > V(D/D) > V(T/D)
V(T/T) > V(D/D) > V(T/D)

Is there a value of d s.t. V(D/T) = V(T/T)? Call this d*. d* = (T-R)/(T-P)

If d > d*, the following ordering holds:

V(T/T) > V(D/T) > V(D/D) > V(T/D)

D is a best response to D; T is a best response to T; multiple NE.
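With the slides' payoffs, d* = (5 - 3)/(5 - 1) = 1/2, and the two orderings can be verified on either side of it:

```python
# The critical discount parameter d* and the orderings around it,
# using the payoffs from these slides: T=5, R=3, P=1, S=0.

T, R, P, S = 5, 3, 1, 0
d_star = (T - R) / (T - P)
print("d* =", d_star)   # 0.5

values = {
    'T/T': lambda d: R / (1 - d),
    'D/T': lambda d: T + d * P / (1 - d),
    'D/D': lambda d: P / (1 - d),
    'T/D': lambda d: S + d * P / (1 - d),
}

for d in (0.25, 0.75):   # one value below d*, one above
    order = sorted(values, key=lambda k: values[k](d), reverse=True)
    print(f"d = {d}: " + " > ".join(order))
```

Below d* the future is too heavily discounted and defection dominates; above it, TRIGGER is a best response to itself and cooperation becomes sustainable.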

Page 50: Unit III: The Evolution of Cooperation

Discounting

Graphically: the V(alue) to a player using ALWAYS DEFECT (D) against TRIGGER (T), and V(T/T), as functions of the discount parameter (d):

V(T/T) = R/(1-d)
V(D/T) = T + dP/(1-d)

[Figure: V on the vertical axis, with intercepts T and R at d = 0; d on the horizontal axis from 0 to 1; the two curves cross at d*.]

Page 51: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

The payoff set of the repeated PD is the convex closure of the points [(T,S); (R,R); (S,T); (P,P)].

Page 52: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

The shaded area is the set of payoffs that Pareto-dominate the one-shot NE (P,P).

Page 53: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

Theorem: Any payoff that Pareto-dominates the one-shot NE can be supported in a SPNE of the repeated game, if the discount parameter is sufficiently high.

Page 54: Unit III: The Evolution of Cooperation

The Folk Theorem

[Figure: the feasible payoff set, with vertices (T,S), (R,R), (S,T), (P,P).]

In other words, in the repeated game, if the future matters “enough” (i.e., d > d*), there are zillions of equilibria!

Page 55: Unit III: The Evolution of Cooperation

• The theorem tells us that in general, repeated games give rise to a very large set of Nash equilibria. In the repeated PD, these are Pareto-rankable, i.e., some are efficient and some are not.

• In this context, evolution can be seen as a process that selects for repeated game strategies with efficient payoffs.

“Survival of the Fittest”

The Folk Theorem

Page 56: Unit III: The Evolution of Cooperation

Next Time

4/6 Evolutionary Games.
Axelrod, Ch. 5: 88-105.
Gintis, Chs. 7, 9: 148-163; 188-219.