Prisoner’s dilemma TEMPTATION>REWARD>PUNISHMENT>SUCKER.


Prisoner’s dilemma

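• TEMPTATION > REWARD > PUNISHMENT > SUCKER: the temptation to defect against a cooperator pays more than the reward for mutual cooperation, which pays more than the punishment for mutual defection, which pays more than the sucker’s payoff for cooperating against a defector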

Repeated Prisoner’s Dilemma

• Consider a prisoner’s dilemma game played many times
• A strategy specifies what you do in each stage game
  • Ex: cooperate in every stage game
  • Ex: cooperate in every odd-numbered stage game, defect in every even-numbered stage game
  • Etc…
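The two example strategies above can be written as small functions. This is only a sketch: the function names, the "C"/"D" move encoding, and the helper `play_moves` are illustrative assumptions, not part of the original slides.

```python
# A strategy maps the round number (1-based) and the history so far to a move.
# "C" = cooperate, "D" = defect. All names here are illustrative assumptions.

def always_cooperate(round_no, my_history, their_history):
    # Cooperate in every stage game.
    return "C"

def odd_cooperate_even_defect(round_no, my_history, their_history):
    # Cooperate in odd-numbered stage games, defect in even-numbered ones.
    return "C" if round_no % 2 == 1 else "D"

def play_moves(strategy, n_rounds):
    """Return the moves a history-independent strategy makes over n_rounds."""
    moves = []
    for round_no in range(1, n_rounds + 1):
        # The opponent's history is left empty because neither example uses it.
        moves.append(strategy(round_no, moves, []))
    return moves

print(play_moves(always_cooperate, 4))
print(play_moves(odd_cooperate_even_defect, 4))
```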

Axelrod’s tournament

• The game above is repeated 200 times
• 15 strategies submitted
  • Random strategy
  • Always defect
  • Always cooperate
  • Etc.
• Each strategy played against every other strategy, including itself
  • 15 × 15 = 225 games in total
• After all games were played, earnings were added up and the strategy with the most points was declared the winner
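The round-robin format can be sketched in a few lines. The payoff values T=5, R=3, P=1, S=0 are an assumption taken from Axelrod's usual setup (the payoff table itself is not in this transcript), and the three strategies, `play_game`, and `round_robin` are illustrative names rather than the tournament's actual entries.

```python
from itertools import product

# Assumed payoffs: T=5 > R=3 > P=1 > S=0, as (row player, column player).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_cooperate(mine, theirs):
    return "C"

def always_defect(mine, theirs):
    return "D"

def tit_for_tat(mine, theirs):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else "C"

def play_game(strat_a, strat_b, rounds=200):
    """Play one repeated game and return both players' total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def round_robin(strategies, rounds=200):
    """Every ordered pairing, including self-play: n * n games in total."""
    totals = {name: 0 for name in strategies}
    games = 0
    for name_a, name_b in product(strategies, repeat=2):
        score_a, _ = play_game(strategies[name_a], strategies[name_b], rounds)
        totals[name_a] += score_a
        games += 1
    return totals, games

STRATEGIES = {"always_cooperate": always_cooperate,
              "always_defect": always_defect,
              "tit_for_tat": tit_for_tat}
totals, games = round_robin(STRATEGIES)
print(games, totals)
```

With only these three entrants, always defect narrowly outscores tit-for-tat, because it can exploit always cooperate. That is a property of this toy field, not of the real tournament.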

Tournament results

• On average, no strategy scored above 600 points per game (what each player would get if both cooperated in all 200 rounds)
• The best-scoring strategies were nice (never the first to defect)
  • The 8 top-scoring strategies were all nice
• The worst-scoring strategies were nasty (sometimes the first to defect)
• Forgiving strategies did better than unforgiving ones
  • A forgiving strategy has a short memory. For example, it doesn’t punish forever
  • One of the 8 nice strategies punished a defection by defecting forever in response. This was the worst-scoring nice strategy
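To see why punishing forever can be costly, the sketch below pits a forgiving and an unforgiving nice strategy against the same opponent. Everything here is an illustrative assumption: the payoffs (T=5, R=3, P=1, S=0), the name `grim` for the punish-forever strategy, and the hypothetical opponent `slip_tit_for_tat`, which copies the other player's last move but defects once, unprovoked, in round 10.

```python
# Assumed payoffs: T=5 > R=3 > P=1 > S=0, as (row player, column player).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(mine, theirs):
    # Forgiving: only the opponent's most recent move matters.
    return theirs[-1] if theirs else "C"

def grim(mine, theirs):
    # Unforgiving: after any defection by the opponent, defect forever.
    return "D" if "D" in theirs else "C"

def slip_tit_for_tat(mine, theirs):
    # Hypothetical opponent: tit-for-tat with one unprovoked defection in round 10.
    if len(mine) == 9:
        return "D"
    return theirs[-1] if theirs else "C"

def play_game(strat_a, strat_b, rounds=200):
    """Play one repeated game and return both players' total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print("tit-for-tat:", play_game(tit_for_tat, slip_tit_for_tat)[0])
print("grim:       ", play_game(grim, slip_tit_for_tat)[0])
```

Against this particular opponent, grim locks both players into mutual defection after round 11, while tit-for-tat keeps collecting the temptation payoff every second round. Both fall short of the 600 points that uninterrupted cooperation would give.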

Tit-for-tat

• The winning strategy was called tit-for-tat
• This strategy starts off by cooperating and then copies the other player’s previous move
• Example: imagine tit-for-tat playing against naïve prober
  • Naïve prober is the same as tit-for-tat, except that it defects in 1 in 10 rounds, chosen at random
  • U(TFT,TFT) > U(NP,TFT) > U(NP,NP)
• Example: imagine tit-for-tat playing against remorseful prober
  • Remorseful prober is the same as naïve prober but allows “one free hit”: when its own unprovoked defection is retaliated against, it does not retaliate in turn
  • U(TFT,TFT) > U(RP,TFT) > U(NP,TFT)
• But is tit-for-tat an equilibrium?
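The tit-for-tat versus naïve prober comparison can be explored with a short simulation. This is a sketch under stated assumptions: the payoffs (T=5, R=3, P=1, S=0) follow Axelrod's usual setup, naïve prober is modelled as defecting with probability 0.1 in each round, and `play_game` and `average_score` are illustrative helpers. No particular numerical result is claimed for the randomised pairings.

```python
import random

# Assumed payoffs: T=5 > R=3 > P=1 > S=0, as (row player, column player).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(mine, theirs, rng):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else "C"

def naive_prober(mine, theirs, rng):
    # Tit-for-tat, except that with probability 0.1 it defects regardless.
    if rng.random() < 0.1:
        return "D"
    return theirs[-1] if theirs else "C"

def play_game(strat_a, strat_b, rounds=200, seed=0):
    """Play one repeated game with a seeded random source; return both scores."""
    rng = random.Random(seed)
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b, rng)
        move_b = strat_b(hist_b, hist_a, rng)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def average_score(strat_a, strat_b, trials=200):
    """Mean score of strat_a against strat_b over several seeded games."""
    total = sum(play_game(strat_a, strat_b, seed=s)[0] for s in range(trials))
    return total / trials

print("U(TFT,TFT) ~", average_score(tit_for_tat, tit_for_tat))
print("U(NP,TFT)  ~", average_score(naive_prober, tit_for_tat))
print("U(NP,NP)   ~", average_score(naive_prober, naive_prober))
```

One property holds in every run of this sketch: because tit-for-tat answers each defection exactly once, naïve prober can finish at most 5 points ahead of it in any single game.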

Tit-for-two-tats

• Same as tit-for-tat but allows two defections in a row before retaliating
• Axelrod found that if tit-for-two-tats had participated in his tournament, it would have won
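A minimal sketch of tit-for-two-tats, compared with tit-for-tat against an opponent that opens with a defection and then copies the other player's last move. The payoffs (T=5, R=3, P=1, S=0) and all function names are assumptions made for illustration.

```python
# Assumed payoffs: T=5 > R=3 > P=1 > S=0, as (row player, column player).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"

def tit_for_two_tats(mine, theirs):
    # Retaliate only after two consecutive defections by the opponent.
    return "D" if theirs[-2:] == ["D", "D"] else "C"

def suspicious_tit_for_tat(mine, theirs):
    # Defect on the first move, then copy the opponent's previous move.
    return theirs[-1] if theirs else "D"

def play_game(strat_a, strat_b, rounds=200):
    """Play one repeated game and return both players' total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play_game(tit_for_tat, suspicious_tit_for_tat))
print(play_game(tit_for_two_tats, suspicious_tit_for_tat))
```

In this pairing tit-for-tat gets stuck in alternating retaliation, while tit-for-two-tats lets the single opening defection pass and cooperation resumes from round 2.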

Second tournament

• More strategies (63)
• John Maynard Smith submitted tit-for-two-tats
• Random termination times for each game (“infinitely” repeated game)
• Tit-for-tat won again!
• One problem with these tournaments is that the winner depends on the strategies that were submitted

Third tournament (Evolution)

• Started with the same 63 strategies in equal proportion
• After the first round of repeated games was played, winnings were paid out in “offspring”
• New round with different proportions of strategies
• After 1000 rounds, no further changes in the population
  • Nasty strategies were driven out; tit-for-tat and some other nice strategies survived
• Note that tit-for-tat is not an ESS (evolutionarily stable strategy)
  • Can be invaded by always cooperate
  • Can be invaded by a mixture of tit-for-two-tats and suspicious tit-for-tat (which defects on the first move, otherwise behaves like tit-for-tat)
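The "winnings paid out in offspring" idea can be sketched as a proportional update: each strategy's share in the next round grows in proportion to its average score against the current population. This is an assumed, simplified model with only three strategies and assumed payoffs (T=5, R=3, P=1, S=0), not a reconstruction of the actual procedure; `evolve` and the other names are illustrative.

```python
# Assumed payoffs: T=5 > R=3 > P=1 > S=0, as (row player, column player).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_cooperate(mine, theirs):
    return "C"

def always_defect(mine, theirs):
    return "D"

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"

def play_game(strat_a, strat_b, rounds=200):
    """Play one repeated game and return both players' total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def evolve(strategies, generations=1000, rounds=200):
    """Start from equal shares; rescale each share by its relative score."""
    names = list(strategies)
    # table[a][b] is the score of strategy a in one game against strategy b.
    table = {a: {b: play_game(strategies[a], strategies[b], rounds)[0]
                 for b in names} for a in names}
    share = {name: 1.0 / len(names) for name in names}
    for _ in range(generations):
        fitness = {a: sum(share[b] * table[a][b] for b in names) for a in names}
        mean_fitness = sum(share[a] * fitness[a] for a in names)
        share = {a: share[a] * fitness[a] / mean_fitness for a in names}
    return share

STRATEGIES = {"always_cooperate": always_cooperate,
              "always_defect": always_defect,
              "tit_for_tat": tit_for_tat}
shares = evolve(STRATEGIES)
print(shares)
```

In this toy population always defect is driven out. Once it is gone, always cooperate and tit-for-tat earn identical scores against each other, so nothing in the model pushes the remaining mix either way.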

Collectively stable strategies

• If there are lots of nasty strategies, always defect does best
• If there are lots of nice strategies, tit-for-tat does best
• Consider a world where only these two strategies are played
• Can we argue that the system will tend toward tit-for-tat?
  • Kinship: related individuals live close together
  • Small clusters grow into large clusters
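For the two-strategy world, one can work out how common tit-for-tat must be before it outscores always defect. The sketch assumes a 200-round game, the payoffs T=5, R=3, P=1, S=0, and opponents drawn at random from the population; all of these are assumptions for illustration. Clustering, as in the kinship bullet, would raise the chance of meeting one's own type above the population share `p`.

```python
from fractions import Fraction

ROUNDS = 200
T, R, P, S = 5, 3, 1, 0  # assumed payoffs, T > R > P > S

# Total score of the first strategy in one 200-round game against the second.
u_tft_tft = R * ROUNDS               # mutual cooperation throughout: 600
u_tft_alld = S + P * (ROUNDS - 1)    # suckered once, then mutual defection: 199
u_alld_tft = T + P * (ROUNDS - 1)    # one temptation payoff, then defection: 204
u_alld_alld = P * ROUNDS             # mutual defection throughout: 200

def expected_payoffs(p):
    """Expected scores of (tit-for-tat, always defect) when a share p plays tit-for-tat."""
    tft = p * u_tft_tft + (1 - p) * u_tft_alld
    alld = p * u_alld_tft + (1 - p) * u_alld_alld
    return tft, alld

# Share of tit-for-tat at which both strategies score the same.
threshold = Fraction(u_alld_alld - u_tft_alld,
                     (u_tft_tft - u_alld_tft) + (u_alld_alld - u_tft_alld))
print(threshold)  # 1/397
```

Under these assumptions tit-for-tat does better than always defect once it makes up more than 1/397 of the population, and worse below that share.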

Examples of repeated games

• “Live and let live” in WWI
• Vampire bats
