Lecture 3
Dynamic games of complete information
- Part 2
Outline (part 2)
Questions/comments/observations are always encouraged, at any point during the lecture!!
• Repeated games
  – Finite
  – Infinite

Motivation
• Play the same normal-form game over and over
  – each round is called a “stage game”
• Example: the Prisoner’s Dilemma
Repeated games
• Repeated game is designed to examine the logic of long-term interaction
• It captures the idea that a player will take into account the effect of his current behavior on the other players’ future behavior, and aims to explain phenomena like cooperation, revenge, threats etc.
Finitely repeated games
• Everything is straightforward if we repeat a game a finite number of times
• We can write it as an extensive-form game with imperfect information
  – at each round players don’t know what the others have done; afterwards they do
  – overall payoff function is additive: the sum of payoffs in the stage games
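The additive overall payoff can be sketched in a few lines of code. This is a minimal illustration, not part of the lecture; the payoff values follow the Prisoner’s Dilemma matrix used later in these slides, and the function name is my own.

```python
# A minimal sketch of additive payoffs in a finitely repeated game.
# Payoff values follow the Prisoner's Dilemma matrix used later in
# these slides; `repeated_payoffs` is an illustrative name.
PD = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}

def repeated_payoffs(history):
    """Overall payoff of a finite history: the sum of stage-game payoffs."""
    totals = [0, 0]
    for action_pair in history:
        u1, u2 = PD[action_pair]
        totals[0] += u1
        totals[1] += u2
    return tuple(totals)

# Three rounds of mutual cooperation followed by mutual defection:
print(repeated_payoffs([("C", "C")] * 3 + [("D", "D")]))  # → (7, 7)
```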
Remarks
• Observe that the strategy space is much richer than it was in the normal-form setting
• Repeating a Nash strategy in each stage game will be an equilibrium in behavioral strategies (called a stationary strategy)
• We can apply backward induction in these games when the normal-form game has a dominant strategy.
Prisoner’s dilemma as repeated game
Infinitely repeated games
Strategies
Nash equilibria with no discounting
Can we get anything other than repetitions of the stage-game equilibrium?
Important points
What outcomes can be achieved as equilibria?
The folk theorem
Folk theorems
• Repeated games
  – Structure of the equilibrium strategies (more useful)
  – Determine the payoffs that can be sustained by equilibria → conditions under which this set consists of nearly all reasonable payoff profiles (rather than just existence of equilibria)
• “Folk theorems”
  – the focus of most of the formal development in repeated games
  – Socially desirable outcomes that cannot be sustained if players are myopic can be sustained if players are foresighted (i.e. have long-term objectives)
Folk Theorems
• When players are patient, repeated play allows virtually any payoff to be an equilibrium outcome
• The set of Nash equilibrium outcomes includes outcomes that are not repetitions of an equilibrium of the constituent game
• To support such an outcome, each player must be deterred from deviating by being “punished”
• Punishment may take many forms
  – One possibility: a “trigger strategy” (any deviation causes the opponent(s) to carry out a punitive action that lasts forever)
Repeated games – some preliminary conclusions
• Repeated games may introduce new equilibria and stimulate cooperation
  – Finitely repeated games (finite horizon, T finite): solved by backward induction; players have incentives to cheat
  – Infinitely repeated games (infinite horizon T)
• Infinite horizon: describes a game where players think the game extends one more period with high probability
• Finite horizon: the terminal date of the game is known
Nash equilibria with discounting
The folk theorem
The folk theorem with discounting
What about subgame perfect NE?
Strategy representation in repeated games – Automata
The following game is infinitely repeated with discount factor δ:

         C         D
  C    2, 2      0, 3
  D    3, 0      1, 1
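The discounted-average criterion used throughout the analysis below, (1 − δ) Σ δᵗ uₜ, can be sketched numerically. The truncation horizon and function name here are my own, for illustration only.

```python
# Numerical sketch of the discounted-average payoff criterion,
# (1 - δ) Σ_{t≥0} δ^t u_t, for a stream that repeats `cycle` forever.
# The truncation horizon of 10,000 periods is an illustrative choice.
def discounted_average(cycle, delta, periods=10_000):
    return (1 - delta) * sum(
        delta**t * cycle[t % len(cycle)] for t in range(periods)
    )

# Mutual cooperation forever at δ = 1/2 has discounted average 2:
print(round(discounted_average([2], 0.5), 6))     # → 2.0
# The alternating stream (3, 0, 3, 0, ...) gives 3/(1 + δ) = 2 at δ = 1/2:
print(round(discounted_average([3, 0], 0.5), 6))  # → 2.0
```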
Grim Trigger Strategy
Consider the repeated prisoner’s dilemma. The strategy prescribes that the player initially cooperates, and continues to do so if both players cooperated at all previous times:

s_i(a^1, . . . , a^T) = C if a^t = (C, C) for all t = 1, . . . , T.
s_i(a^1, . . . , a^T) = D otherwise.

Note that a player defects if either she or her opponent defected in the past.
Automaton of Grim Trigger Strategy
• There are two states: C, in which C is chosen, and D, in which D is chosen.
• The initial state (*) is C.
• If the play is not (C,C) in any period, the state changes to D.
• If the automaton is in state D, it remains there forever.
[Automaton diagram: initial state C (*) loops on (C,C); the action pairs (C,D), (D,C) and (D,D) lead to the absorbing state D]
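The automaton above can be sketched as a transition function. The encoding below is my own illustration, not notation from the slides.

```python
# The grim-trigger automaton as a transition function. States: "C"
# (play C) and "D" (play D); "D" is absorbing. Illustrative encoding.
def grim_trigger_transition(state, action_pair):
    if state == "D":
        return "D"  # once punishing, punish forever
    return "C" if action_pair == ("C", "C") else "D"

state = "C"  # initial state (*)
for pair in [("C", "C"), ("C", "D"), ("C", "C")]:
    state = grim_trigger_transition(state, pair)
print(state)  # → D: a single deviation triggers defection forever
```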
More formalism!
One-step deviation principle
Central questions
1. If players are patient, can we get cooperative outcomes, i.e. outcomes better than the stage-game NE for all players?
2. If players are patient, what else can we get?
Grim-Trigger is SGPE
Suppose that both players adopt the grim-trigger strategy.
There are two sets of histories. Those for which grim-trigger strategy prescribes that the players play (C,C) and those for which the grim trigger strategy prescribes that they play (D,D).
In the first set of histories, if player i plays grim-trigger, then the outcome is (C, C) in every period with payoffs (2, 2, . . .), whose discounted average is 2.
If i deviates only once, she plays D and then reverts to the grim-trigger strategy, which prescribes playing D at all subsequent periods.
• The opponent, playing grim trigger strategy, plays D forever as a consequence of i’s one-shot deviation.
• The OSD yields the stream of payoffs (3, 1, 1, . . .) with discounted average
  (1 − δ)[3 + δ + δ² + δ³ + · · ·] = 3(1 − δ) + δ.
• Thus player i cannot increase her payoff by deviating if and only if 2 ≥ 3(1 − δ) + δ, that is, δ ≥ 1/2.
• In the second set of histories, if player i plays grim-trigger, then the outcome is (D, D) in every period with payoffs (1, 1, . . .), whose discounted average is 1.
• If i deviates only once, she plays C and then reverts to the grim-trigger strategy, which prescribes playing D at all subsequent periods.
• The opponent, playing grim trigger strategy, plays D forever as a consequence of i’s one-shot deviation.
• The OSD yields the stream of payoffs (0, 1, 1, . . .) with discounted average
  (1 − δ)[0 + δ + δ² + δ³ + · · ·] = δ.
• Player i cannot increase her payoff by deviating: 1 ≥ δ, which always holds.
• We conclude that if δ ≥ 1/2 then the strategy pair in which each player’s strategy is the grim-trigger strategy is a subgame-perfect equilibrium of the infinitely repeated Prisoner’s Dilemma.
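The threshold δ ≥ 1/2 can be checked numerically; a sketch with illustrative names:

```python
# Numerical check of the grim-trigger no-deviation condition
# 2 ≥ 3(1 - δ) + δ, which should hold exactly when δ ≥ 1/2.
def cooperation_sustainable(delta):
    equilibrium_payoff = 2.0                     # (C, C) forever
    deviation_payoff = 3 * (1 - delta) + delta   # stream (3, 1, 1, ...)
    return equilibrium_payoff >= deviation_payoff

for delta in (0.25, 0.5, 0.75):
    print(delta, cooperation_sustainable(delta))
# → 0.25 False, 0.5 True, 0.75 True
```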
Tit-for-Tat
• The player initially cooperates.
• At subsequent rounds, she plays the action played by the opponent at the previous round.

s_i(a^1, . . . , a^T) = C if a_j^T = C.
s_i(a^1, . . . , a^T) = D if a_j^T = D.
[Automaton diagram: initial state C (*) loops on (·,C); (·,D) moves the state to D; from D, (·,C) moves back to C and (·,D) loops on D]
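As with grim trigger, the tit-for-tat automaton can be sketched as a transition function; the encoding is my own illustration.

```python
# The tit-for-tat automaton as a transition function: the next state,
# and hence the next action, is simply the opponent's last action,
# whatever one's own action was. Illustrative encoding.
def tit_for_tat_transition(state, action_pair):
    own, opponent = action_pair
    return opponent

state = "C"  # initially cooperate
for pair in [("C", "D"), ("D", "C"), ("C", "C")]:
    state = tit_for_tat_transition(state, pair)
print(state)  # → C: the automaton returns to cooperation
```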
Tit for Tat is SGPE
• Suppose that both players adopt the tit-for-tat strategy.
• There are four sets of histories. They prescribe, respectively, (C,C), (C,D), (D,C) and (D,D).
• In the first set of histories, if player i plays tit for tat, then the outcome is (C, C) in every period with payoffs (2, 2, . . .), whose discounted average is 2.
• If i deviates only once, she plays D. Then she reverts to tit for tat. Given that the opponent plays tit for tat, the induced play is {(D,C),(C,D),(D,C),(C,D)…}, with payoffs (3,0,3,0,…).
• Hence player i does not deviate if:
  2 ≥ (1 − δ)[3 + 0δ + 3δ² + 0δ³ + · · ·] = 3(1 − δ)/(1 − δ²) = 3/(1 + δ),
  that is to say, δ ≥ 1/2.
• In the set of histories prescribing (C,D), if players play tit for tat, then the outcome is {(C,D),(D,C),…}, which yields (0,3,0,3,…).
• If i deviates only once, she plays D. Then she reverts to tit for tat. Given that the opponent plays tit for tat, the induced play is (D,D) forever with payoffs 1.
• Hence player i does not deviate if:
  (1 − δ)[0 + 3δ + 0δ² + 3δ³ + · · ·] = 3δ(1 − δ)/(1 − δ²) = 3δ/(1 + δ) ≥ 1,
  that is to say, δ ≥ 1/2.
• In the set of histories prescribing (D,C), if players play tit for tat, then the outcome is {(D,C),(C,D),…}, which yields (3,0,3,0,…). If i deviates only once, the induced play is (C,C) forever with payoff 2. Player i does not deviate if:
  (1 − δ)[3 + 0δ + 3δ² + 0δ³ + · · ·] = 3(1 − δ)/(1 − δ²) = 3/(1 + δ) ≥ 2,
  that is to say, δ ≤ 1/2.
• In the set of histories prescribing (D,D), if players play tit for tat, then the outcome is (D,D) forever, with payoff 1.
• If i deviates only once, the induced play is {(C,D),(D,C),…}, which yields (0,3,0,3,…).
• Player i does not deviate if:
  1 ≥ (1 − δ)[0 + 3δ + 0δ² + 3δ³ + · · ·] = 3δ(1 − δ)/(1 − δ²) = 3δ/(1 + δ),
  that is to say, δ ≤ 1/2.
• We conclude that the strategy pair in which each player plays the tit-for-tat strategy is a subgame-perfect equilibrium of the infinitely repeated Prisoner’s Dilemma if and only if δ = 1/2.
• This underlines the inherent fragility of tit-for-tat: it works only in a knife-edge case.
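The knife-edge can be verified numerically: the four one-shot-deviation conditions derived above hold simultaneously only at δ = 1/2. The function name below is illustrative.

```python
# Check the knife-edge: the four one-shot-deviation conditions for
# tit-for-tat hold simultaneously only at δ = 1/2.
def tft_is_spe(d):
    c1 = 2 >= 3 / (1 + d)          # (C,C) histories: needs δ ≥ 1/2
    c2 = 3 * d / (1 + d) >= 1      # (C,D) histories: needs δ ≥ 1/2
    c3 = 3 / (1 + d) >= 2          # (D,C) histories: needs δ ≤ 1/2
    c4 = 1 >= 3 * d / (1 + d)      # (D,D) histories: needs δ ≤ 1/2
    return c1 and c2 and c3 and c4

for d in (0.4, 0.5, 0.6):
    print(d, tft_is_spe(d))
# → 0.4 False, 0.5 True, 0.6 False
```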
Other strategies for repeated games