On the Approximation Performance of Fictitious Play in Finite Games Paul W. GoldbergU. Liverpool...

Page 1

On the Approximation Performance of Fictitious Play in Finite Games

Paul W. Goldberg (U. Liverpool), Rahul Savani (U. Liverpool), Troels Bjerre Sørensen (U. Warwick), Carmine Ventre (U. Liverpool)

Page 2

Penalty-kick practice

Scenario: Every day two friends meet to practice penalty kicks.

Players: the kicker and the goalie
Actions (kicker): shoot on the right, left or center of the goal
Actions (goalie): dive on the right, left or center of the goal

Page 3

Penalty-kick game

        R      C      L
R      0,1    1,0    1,0
C      1,0    0,1    1,0
L      1,0    1,0    0,1

Q: How would a goalie “learn” to play this game?

Page 4

Fictitious play [Brown, 51]

FP rule: Best respond to the empirical distribution of play of the opponent.

        R      C      L
R      0,1    1,0    1,0
C      1,0    0,1    1,0
L      1,0    1,0    0,1

Days              1  2  3  4  5  6  7  8  9  10
Kicker's action   R  C  C  L  L  L  L  L  L  L

Empirical distribution of the kicker's play after day 10: R 1/10, C 2/10, L 7/10

Goalie dives on the L

Is it a “good” choice? I.e., is it a good algorithm strategically?
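The FP rule on this slide can be checked with a few lines of Python. This is a minimal sketch, assuming the goalie is the player whose payoff is the second entry of each cell (1 when the dive matches the shot, 0 otherwise); the function names are illustrative and do not come from the slides.

```python
from collections import Counter
from fractions import Fraction

ACTIONS = ["R", "C", "L"]

def goalie_payoff(dive, shot):
    # Assumption: the goalie scores 1 when the dive matches the shot.
    return 1 if dive == shot else 0

def empirical_distribution(history):
    # Relative frequency of each action in the observed history.
    counts = Counter(history)
    return {a: Fraction(counts[a], len(history)) for a in ACTIONS}

def best_response(dist):
    # Expected payoff of each dive against the empirical mixture.
    expected = {
        d: sum(p * goalie_payoff(d, s) for s, p in dist.items())
        for d in ACTIONS
    }
    # Ties are broken in favour of the first action in ACTIONS (arbitrary).
    return max(ACTIONS, key=lambda d: expected[d])

# Kicker's actions on days 1..10, as listed on the slide.
kicker_history = ["R", "C", "C", "L", "L", "L", "L", "L", "L", "L"]
dist = empirical_distribution(kicker_history)
print(best_response(dist))
```

With the ten-day history from the slide, the empirical distribution puts weight 7/10 on L, so the best response is to dive on the L.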

Page 5

Where does the name come from?

FP can also be seen as an algorithm for playing the game just once:
1. Simulate what would happen in the repeated version of the game up to some predetermined round r
2. Output the empirical distribution

In the above example for r=10, the empirical distribution of the kicker's play is: R wp 1/10, C wp 2/10, L wp 7/10

FP is a very simple iterative algorithm
Sometimes advocated to model bounded rationality
Is FP strategically “good”?
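The two-step procedure above (simulate up to round r, then output the empirical distribution) might be sketched as follows for a two-player game given by payoff matrices A (row player) and B (column player). The starting actions and the lowest-index tie-breaking rule are assumptions of this sketch; the slides do not specify them.

```python
from fractions import Fraction

def fictitious_play(A, B, r, first_row=0, first_col=0):
    """Simulate r rounds of FP and return the empirical mixed strategies."""
    m, n = len(A), len(A[0])
    row_counts = [0] * m
    col_counts = [0] * n
    i, j = first_row, first_col  # assumed arbitrary first actions
    for _ in range(r):
        row_counts[i] += 1
        col_counts[j] += 1
        # Best respond to the opponent's action counts so far; counts are
        # proportional to the empirical distribution, so the argmax is the same.
        row_scores = [sum(A[a][b] * col_counts[b] for b in range(n)) for a in range(m)]
        col_scores = [sum(B[a][b] * row_counts[a] for a in range(m)) for b in range(n)]
        i = max(range(m), key=lambda a: row_scores[a])  # lowest index wins ties
        j = max(range(n), key=lambda b: col_scores[b])
    x = [Fraction(c, r) for c in row_counts]
    y = [Fraction(c, r) for c in col_counts]
    return x, y

# Penalty-kick game: the row player gets 1 unless both pick the same side.
A = [[0 if a == b else 1 for b in range(3)] for a in range(3)]
B = [[1 - A[a][b] for b in range(3)] for a in range(3)]
x, y = fictitious_play(A, B, r=10)
print(x, y)
```

The exact sequence of play depends on the tie-breaking rule, so the output here need not match the ten-day history shown in the example.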

Page 6

Fictitious play and Nash equilibria

The empirical distribution of play defined by FP converges to Nash equilibria for
  constant-sum games [Robinson, 51]
  non-degenerate 2 × 2 games [Miyasawa, 61]
  2 × n games [Berger, 05]

... but it does not converge in general [Shapley, 64]

        R      C      L
R      0,1    1,0    1,0
C      1,0    0,1    1,0
L      1,0    1,0    0,1

        R        C        L
R      0,1+Ɛ    1,0      1,0
C      1,0      0,1+Ɛ    1,0
L      1,0      1,0      0,1+Ɛ

Page 7

Fictitious play and approximate NEs

Analysis of the strategic performance of FP is done by means of approximate NEs
  NE = no incentive to deviate
  Ɛ-NE = little (Ɛ) incentive to deviate

Concept which assumed relevance given the PPAD-hardness of computing exact NEs [Daskalakis, Goldberg & Papadimitriou, 06] + [Chen & Deng, 06]

Payoffs normalized to [0,1] and additive approximation

[Conitzer, 09] proves:
  For any game, Ɛ ≤ (r+1)/(2r) at round r
  There exists an infinite game for which Ɛ = (r+1)/(2r)
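To make the additive Ɛ-NE notion concrete, the sketch below computes the smallest Ɛ for which a pair of mixed strategies (x, y) is an Ɛ-NE: the largest gain either player could obtain by deviating to a pure best response. The function name and the matrix encoding are assumptions of this sketch, not notation from the slides.

```python
from fractions import Fraction

def epsilon(A, B, x, y):
    """Largest additive gain from a unilateral deviation at profile (x, y)."""
    m, n = len(A), len(A[0])
    # Expected payoff of each pure action against the opponent's mixture.
    row_payoffs = [sum(A[a][b] * y[b] for b in range(n)) for a in range(m)]
    col_payoffs = [sum(B[a][b] * x[a] for a in range(m)) for b in range(n)]
    # Expected payoff each player actually receives at (x, y).
    row_value = sum(x[a] * row_payoffs[a] for a in range(m))
    col_value = sum(y[b] * col_payoffs[b] for b in range(n))
    return max(max(row_payoffs) - row_value, max(col_payoffs) - col_value)

# Penalty-kick game with actions ordered R, C, L.
A = [[0 if a == b else 1 for b in range(3)] for a in range(3)]  # kicker
B = [[1 - A[a][b] for b in range(3)] for a in range(3)]          # goalie

third = Fraction(1, 3)
eps_uniform = epsilon(A, B, [third] * 3, [third] * 3)

# Kicker mixes R 1/10, C 2/10, L 7/10; goalie always dives on the L.
x = [Fraction(1, 10), Fraction(2, 10), Fraction(7, 10)]
eps_example = epsilon(A, B, x, [0, 0, 1])
print(eps_uniform, eps_example)
```

Uniform play by both players gives Ɛ = 0 in this game, while the second profile gives Ɛ = 7/10, because the kicker would gain 7/10 by never shooting on the L.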

Page 8

Approximation guarantee of FP

round               1     2     3     ...   i-1       i     ...   r-1       r
players' actions    a_1   a_2   a_3   ...   a_{i-1}   a_i   ...   a_{r-1}   a_r
                    s_1   s_2   s_3   ...   s_{i-1}   s_i   ...   s_{r-1}   s_r

By FP rule, s_i is a best response to the mixture of the first i-1 actions

Ɛ = 0 (against a_1, ..., a_{i-1})    Ɛ = 1 (against a_i, ..., a_r)

Ɛ for playing s_i is (r-i+1)/r^2

Ɛ of FP is

$\sum_{i=1}^{r} \frac{r-i+1}{r^2} = \frac{r+1}{2r}$
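The arithmetic behind this guarantee is easy to verify numerically: summing the per-round terms (r-i+1)/r^2 over i = 1, ..., r gives (r+1)/(2r), the bound attributed to [Conitzer, 09]. A small check, with an illustrative function name:

```python
from fractions import Fraction

def fp_worst_case_epsilon(r):
    # Sum of the per-round contributions (r - i + 1) / r^2 for i = 1..r.
    return sum(Fraction(r - i + 1, r * r) for i in range(1, r + 1))

for r in (1, 2, 10, 100):
    # The sum collapses to (r + 1) / (2r), which tends to 1/2 as r grows.
    assert fp_worst_case_epsilon(r) == Fraction(r + 1, 2 * r)
    print(r, fp_worst_case_epsilon(r))
```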

Page 9

Approximation for finite games?

Re-using strategies may guarantee a significantly better approximation of FP
Experimentally, Shapley’s game (for which FP does not converge) has Ɛ ≈ 1/4

round               1     2     3     ...   i-1       i     ...   r-1       r
players' actions    a_1   a_2   a_3   ...   a_{i-1}   a_i   ...   a_{r-1}   a_r
                    s_1   s_2   s_3   ...   s_{i-1}   s_i   ...   s_i       s_r

(figure labels: Ɛ = 1, Ɛ = 0)

Ɛ for playing s_i at round i is less than (r-i+1)/r^2

Page 10

Our contribution

We define a class of 4n × 4n symmetric games, n being a parameter, for which we show that FP fails to obtain any constant Ɛ < ½

Specifically, we prove a lower bound of ½ - O(1/n^{1-δ}) for any δ > 0

We also give a “matching” upper bound of ½ - O(1/n)

Page 11

The game: row player’s payoff matrix

[Figure: row player's payoff matrix for n=5]

α>1, β<1
Blank entries stand for a 0
Column player’s payoff matrix is the transpose of the above
Players share the same sequence of actions (simpler analysis)

Page 12

The role of α and β

[Figure: row player's payoff matrix]

α>1, β<1
Blank entries stand for a 0
Column player’s payoff matrix is the transpose of the above
Players share the same sequence of actions (simpler analysis)

Page 13

The last block and the induction

Page 14

Next ideas of the analysis

α and β govern the ratio between the probabilities of two consecutive actions
Ratios are such that probabilities increase in geometric progression
last n different actions played occupy all but an exponentially-small fraction of the probability mass
best response has payoff around β(1-1/2n) ≈ 1 - O(1/n^{1-δ})
probability distribution does not allocate much probability to any individual strategy
payoff from FP distribution is around α/2 ≈ 1/2

Page 15

Upper bound

Ɛ for an action is defined by its last occurrence in the sequence

The maximum Ɛ is given by the sequence m_1, ..., m_1, ..., m_n, ..., m_n (each action played for r/n consecutive rounds)

round               1     2     3     ...   i-1       i     ...   r-1       r
players' actions    a_1   a_2   a_3   ...   a_{i-1}   a_i   ...   a_{r-1}   a_r
                    s_1   s_2   s_3   ...   s_{i-1}   s_i   ...   s_i       s_r
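A back-of-the-envelope reading of this slide, not the authors' proof: suppose each of n actions is played in one block of r/n consecutive rounds, and that an action last played at round j has incentive to deviate 0 against the first j-1 opponent actions and at most 1 against the remaining r-j+1. Under these assumptions (which are this sketch's, as is the function name), the weighted sum works out to ½ - 1/(2n) + 1/r, in line with the ½ - O(1/n) upper bound for large r.

```python
from fractions import Fraction

def block_sequence_bound(n, r):
    """Bound on the incentive to deviate for the sequence m_1..m_1, ..., m_n..m_n."""
    assert r % n == 0, "sketch assumes n divides r"
    block = r // n
    total = Fraction(0)
    for k in range(1, n + 1):
        last = k * block                    # last round in which m_k is played
        regret = Fraction(r - last + 1, r)  # rounds last..r may be bad for m_k
        total += Fraction(1, n) * regret    # m_k has weight 1/n in the mixture
    return total

for n, r in ((5, 100), (10, 1000), (50, 10000)):
    bound = block_sequence_bound(n, r)
    # Closed form of the same sum: 1/2 - 1/(2n) + 1/r.
    assert bound == Fraction(1, 2) - Fraction(1, 2 * n) + Fraction(1, r)
    print(n, r, bound)
```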

Page 16

Conclusions

FP is not good from a strategic point of view in terms of approximation guarantee to NEs for finite games
There is a class of finite games for which a cyclic behavior persists, which leads to a poor guarantee (independently of the number of iterations)
I.e., a fully rational player would always beat his boundedly rational friend

Page 17

Open problems

Is ½ a limit to the approximation performance obtainable by simple or decentralized algorithms?
Cf. algorithm of [Daskalakis, Mehta & Papadimitriou, 09] vs more complex centralized algorithms achieving a ratio better than a half

Consider more general class of algorithms
E.g., uncoupled dynamics defined by [Hart & Mas-Colell, 03] + [Hart & Mas-Colell, 06]