Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses Ruta Mehta Indian Institute of Technology, Bombay Joint work with Jugal Garg and Albert X. Jiang

Page 1:

Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses

Ruta Mehta
Indian Institute of Technology, Bombay

Joint work with Jugal Garg and Albert X. Jiang

Page 2:

A Game: Rock-Paper-Scissor

Page 3:

Rock-Paper-Scissor: A Play

[Figure: one round of play. Winner gets $1.]

Page 6:

Rock-Paper-Scissor Payoffs

         R        P        C
  R     0, 0    -1, 1     1,-1
  P     1,-1     0, 0    -1, 1
  C    -1, 1     1,-1     0, 0

(Each cell lists the row player's payoff, then the column player's.)

Page 7:

Bimatrix Game

S1 = { R, P, C }
S2 = { R, P, C }

A (payoffs of player 1):
        R    P    C
  R     0   -1    1
  P     1    0   -1
  C    -1    1    0

B (payoffs of player 2):
        R    P    C
  R     0    1   -1
  P    -1    0    1
  C     1   -1    0

Steady State: No player gains by unilateral deviation

Page 8:

Bimatrix Game

(S1, S2, A and B as on Page 7.)

No Steady State in pure strategies: in every cell, one of the players gains by deviating.

Page 9:

Mixed Play

Each player plays R, P and C with probability 1/3 each (A and B as on Page 7).

S1 = { R, P, C }, ∆1 = {r1, p1, c1 ≥ 0; r1+p1+c1 = 1}
S2 = { R, P, C }, ∆2 = {r2, p2, c2 ≥ 0; r2+p2+c2 = 1}

Steady State
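The steady-state claim for the uniform mix can be checked directly. The following is a minimal sketch, assuming the matrices A and B from the slides; the helper names (`row_payoffs`, `col_payoffs`, `is_nash`) are illustrative and not part of the talk. It tests the standard condition that no pure deviation pays more than the profile itself.

```python
from fractions import Fraction

# Payoff matrices from the slides: A for the row player, B for the column player.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]

def row_payoffs(M, y):
    """Expected payoff of each pure row against the column mix y."""
    return [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]

def col_payoffs(M, x):
    """Expected payoff of each pure column against the row mix x."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def is_nash(A, B, x, y):
    """True if no pure deviation beats the payoff of the profile (x, y)."""
    u1 = row_payoffs(A, y)
    u2 = col_payoffs(B, x)
    v1 = sum(x[i] * u1[i] for i in range(len(x)))
    v2 = sum(y[j] * u2[j] for j in range(len(y)))
    return max(u1) <= v1 and max(u2) <= v2

third = Fraction(1, 3)
uniform = [third, third, third]
pure_rock = [Fraction(1), Fraction(0), Fraction(0)]

print(is_nash(A, B, uniform, uniform))      # the mixed profile from the slide
print(is_nash(A, B, pure_rock, pure_rock))  # a pure profile, for contrast
```

Exact fractions are used so that the equality of payoffs at 1/3 is not blurred by floating-point rounding.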

Page 10:

John Nash (1951)

Finite Game: Finitely many players, each with finitely many strategies.

Nash: Every finite game has a steady state in mixed strategies. Henceforth called Nash equilibrium (NE).

Proved using Kakutani fixed point theorem: Highly non-constructive.

Page 11:

Nash Equilibrium Computation

Papadimitriou (JCSS’94): PPAD-class. Problems where existence is guaranteed, like fixed point, Sperner’s Lemma, Nash equilibrium.

Chen and Deng (FOCS’06): Computing a NE of a two-player game is PPAD-hard.

CDT (FOCS’06): Even approximation is PPAD-hard.

Page 12:

Rank and Computation

Kannan and Theobald (SODA’07): Define rank of (A,B) as rank(A+B). FPTAS for fixed rank games.

Polynomial time algorithms for exact Nash:
  Dantzig (1963): Zero-sum (rank-0) is equivalent to LP.
  AGMS (STOC’11): Rank-1 games.
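To make the rank definition concrete, here is a small sketch that computes rank(A+B) by exact Gaussian elimination. The function names and the rank-1 example (vectors `a` and `b`) are made up for illustration; only the definition rank(A, B) = rank(A+B) comes from the slide.

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix by Gaussian elimination over Fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [u - factor * v for u, v in zip(rows[i], rows[r])]
        r += 1
    return r

def game_rank(A, B):
    """Rank of the game (A, B), defined on the slide as rank(A + B)."""
    return rank([[u + v for u, v in zip(ra, rb)] for ra, rb in zip(A, B)])

# Rock-Paper-Scissor is zero-sum, so A + B = 0.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B = [[-v for v in row] for row in A]
print(game_rank(A, B))

# Hypothetical rank-1 game: B2 = -A + a b^T for illustrative vectors a, b.
a, b = [1, 2, 3], [1, 0, 1]
B2 = [[-A[i][j] + a[i] * b[j] for j in range(3)] for i in range(3)]
print(game_rank(A, B2))
```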

Page 13:

Bilinear Games

Bimatrix game with polyhedral strategy sets.

Two players: 1 and 2
Polyhedral strategy sets: X = {x ∈ R^m | Ex = e; x ≥ 0}, Y = {y ∈ R^n | Fy = f; y ≥ 0}
Payoff matrices: A, B ∈ R^(m×n)
Bilinear payoff: (x, y) fetches x^T A y to player 1, and x^T B y to player 2.

Motivation: Koller et al. (STOC’94) for two-player extensive form games with perfect recall.

Page 14:

Nash Equilibrium in Bilinear Games

NE: No player gains by unilateral deviation.
Existence: Corollary of Glicksberg’s result.

Symmetric Game: B = A^T and Y = X.
(x, y) is a symmetric profile if y = x.
Existence of symmetric NE: An adaptation of Nash’s proof for symmetric bimatrix games.

Page 15:

Bilinear Contains: Bimatrix, Polymatrix, Bayesian, etc.

Bimatrix: X = ∆1, Y = ∆2

Polymatrix: N players. Each pair plays a bimatrix game.
  Player i: Si finite strategy set, ∆i mixed strategy set.
  Goal of i: Choose xi from ∆i to maximize total payoff.

[Figure: players i and j joined by an edge labelled Aij.]

Page 16:

Polymatrix to Bilinear

M = |S1| + … + |Sn|.
X = {(x1,…,xn) | xi in ∆i}, Y = X.
A ∈ R^(M×M) is the block matrix whose block (i, j) is Aij for i ≠ j and whose diagonal blocks are 0; B = A^T.

Symmetric NE of (A,B) maps to a NE of the polymatrix game.
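The block construction can be sketched in a few lines. This is an illustration only: the function name `polymatrix_to_bilinear`, the dictionary layout of the pairwise matrices, and the three-player example are assumptions, not taken from the talk.

```python
def polymatrix_to_bilinear(sizes, pair_payoffs):
    """Assemble the M x M matrix A with block (i, j) = A_ij and zero diagonal blocks.

    sizes[i] is |S_i|; pair_payoffs[(i, j)] is the |S_i| x |S_j| matrix A_ij.
    """
    offsets = [0]
    for size in sizes:
        offsets.append(offsets[-1] + size)
    total = offsets[-1]  # M = |S1| + ... + |Sn|
    big = [[0] * total for _ in range(total)]
    for (i, j), block in pair_payoffs.items():
        if i == j:
            raise ValueError("diagonal blocks must stay zero")
        for r, row in enumerate(block):
            for c, value in enumerate(row):
                big[offsets[i] + r][offsets[j] + c] = value
    return big

# Hypothetical 3-player example with two strategies per player.
sizes = [2, 2, 2]
coordination = [[1, 0], [0, 1]]
anti = [[0, 2], [2, 0]]
pair_payoffs = {(0, 1): coordination, (1, 0): coordination,
                (1, 2): anti, (2, 1): anti}

A_big = polymatrix_to_bilinear(sizes, pair_payoffs)
B_big = [list(col) for col in zip(*A_big)]  # B = A^T, as on the slide

for row in A_big:
    print(row)
```

With y = x, the product x^T A y sums xi^T Aij xj over all pairs, which is why a symmetric profile of the bilinear game carries the total payoffs of the polymatrix game.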

Page 17:

Best Response (Koller et al.)

Fix a strategy y of player 2. Player 1 solves

  Primal:  max: x^T (Ay)    s.t.  Ex = e, x ≥ 0
  Dual:    min: e^T p       s.t.  p^T E ≥ (Ay)^T

At optimal: p s.t. A_i y ≤ p^T E^i for all i, and x_i > 0 => A_i y = p^T E^i.

Given x ∈ X, for player 2 we get
At optimal: q s.t. x^T B^j ≤ q^T F^j for all j, and y_j > 0 => x^T B^j = q^T F^j.

(A_i is row i of A; B^j, E^i and F^j are columns.)
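For intuition, the primal-dual pair has a closed form in the bimatrix special case, where X is the simplex, E is a single row of ones and e = 1, so the dual variable p is one number. The sketch below covers only that special case (a general polyhedral X needs an LP solver), and the function name is illustrative.

```python
from fractions import Fraction

def best_response_simplex(A, y):
    """Solve max x^T (A y) over the simplex, and its dual min p s.t. p >= (A y)_i."""
    Ay = [sum(u * v for u, v in zip(row, y)) for row in A]
    p = max(Ay)            # dual optimum: smallest p dominating every row payoff
    best = Ay.index(p)     # one pure strategy attaining it
    x = [Fraction(int(i == best)) for i in range(len(A))]  # an optimal vertex
    return x, p, Ay

A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]            # Rock-Paper-Scissor, player 1
y = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]   # opponent mixes R and P

x, p, Ay = best_response_simplex(A, y)

# Complementary slackness from the slide: x_i > 0 implies A_i y = p.
assert all(Ay[i] == p for i in range(len(x)) if x[i] > 0)
# Strong duality: primal value x^T (A y) equals dual value p.
assert sum(xi * v for xi, v in zip(x, Ay)) == p

print(x, p)
```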

Page 18:

Best Response Polytopes (BRPs)

(x,y) is a NE iff
  ∃ p: Ay ≤ E^T p; x_i > 0 => A_i y = p^T E^i
  ∃ q: x^T B ≤ q^T F; y_j > 0 => x^T B^j = q^T F^j

P = {(y, p) | A_i y - p^T E^i ≤ 0 for all i; y_j ≥ 0 for all j; Fy = f}
Q = {(x, q) | x_i ≥ 0 for all i; x^T B^j - q^T F^j ≤ 0 for all j; Ex = e}

On P x Q: x^T(Ay - E^T p) ≤ 0 and (x^T B - q^T F)y ≤ 0, hence x^T(A+B)y - e^T p - f^T q ≤ 0.
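The combined inequality follows by adding the two complementarity terms and substituting Ex = e and Fy = f. A short derivation under the slide's definitions of P and Q (note that the last term pairs f with the dual variable q, since f and q both have one entry per row of F):

```latex
\begin{align*}
x^T (Ay - E^T p) &= x^T A y - (Ex)^T p = x^T A y - e^T p, \\
(x^T B - q^T F)\, y &= x^T B y - q^T (F y) = x^T B y - f^T q, \\
x^T (A+B)\, y - e^T p - f^T q &= x^T (Ay - E^T p) + (x^T B - q^T F)\, y \;\le\; 0
\quad \text{for } (y,p) \in P,\ (x,q) \in Q .
\end{align*}
```

Each summand is non-positive because x ≥ 0 multiplies Ay - E^T p ≤ 0 and y ≥ 0 multiplies x^T B - q^T F ≤ 0.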

Page 19:

Nash Equilibrium in BRPs

NE iff x^T(Ay - E^T p) = 0 and (x^T B - q^T F)y = 0, i.e. x^T(A+B)y - e^T p - f^T q = 0.

Assumption: P and Q are non-degenerate.
(u, v) of P x Q gives a NE => (u, v) is a vertex.

P = {(y, p) | A_i y - p^T E^i ≤ 0 for all i; y_j ≥ 0 for all j; Fy = f}
Q = {(x, q) | x_i ≥ 0 for all i; x^T B^j - q^T F^j ≤ 0 for all j; Ex = e}

Page 20:

QP Formulation

max: x^T(A+B)y - e^T p - f^T q
s.t. (y, p) ∈ P
     (x, q) ∈ Q

Optimal value 0. Only vertex solutions.
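The claim that the optimum is 0 and is attained exactly at equilibria can be checked numerically in the bimatrix special case, where p and q are scalars and the tightest feasible choices are p = max_i (Ay)_i and q = max_j (x^T B)_j. The sketch below assumes that special case; the function name and the 2x2 example game are illustrative and not from the talk.

```python
from fractions import Fraction

def qp_objective(A, B, x, y):
    """x^T (A+B) y - p - q with the tightest feasible scalars p and q (bimatrix case)."""
    m, n = len(A), len(A[0])
    Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    xB = [sum(x[i] * B[i][j] for i in range(m)) for j in range(n)]
    p, q = max(Ay), max(xB)  # smallest p, q keeping (y, p) in P and (x, q) in Q
    bilinear = sum(x[i] * Ay[i] for i in range(m)) + sum(xB[j] * y[j] for j in range(n))
    return bilinear - p - q

# Hypothetical non-zero-sum 2x2 game (a coordination game with unequal payoffs).
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]

F = Fraction
first, second = [F(1), F(0)], [F(0), F(1)]
x_mix, y_mix = [F(2, 3), F(1, 3)], [F(1, 3), F(2, 3)]

print(qp_objective(A, B, first, first))    # pure equilibrium
print(qp_objective(A, B, x_mix, y_mix))    # mixed equilibrium
print(qp_objective(A, B, first, second))   # not an equilibrium: strictly negative
```

The value is never positive, so any profile that reaches 0 is a maximizer and a NE.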

Page 21:

Our Results

Rank-1 games: rank(A+B)=1
  Extend Adsul et al. algorithm for exact NE.

Fixed rank games: rank(A+B)=k
  Extend FPTAS of Kannan et al.

Rank of A or B is constant
  Enumerate all NE in polynomial time.

Page 22:

Rank-1 Case

Zero-sum ~ rank(A+B)=0: LP formulation (Charnes’53)

rank(A+B)=1 then A+B = a.b^T

The QP formulation:
max: (x^T a)(b^T y) - e^T p - f^T q
s.t. (y, p) ∈ P
     (x, q) ∈ Q

Page 23:

Rank-1 Case

Replace (x^T a) by z. Recall B = -A + a.b^T.

Q  = {(x, q) | x_i ≥ 0; x^T B^j - q^T F^j ≤ 0; Ex = e}
Q’ = {(x, z, q) | x_i ≥ 0; x^T(-A^j) + z b_j - q^T F^j ≤ 0; Ex = e}

x^T(A+B)y - e^T p - f^T q = 0 becomes z(b^T y) - e^T p - f^T q = 0.

N = Points of P x Q’ with z(b^T y) - e^T p - f^T q = 0.
Forms paths and cycles, since z gives one degree of freedom.

NE of (A,B): Points in intersection of N and z - x^T a = 0.

Page 24:

Parameterized LP

LP(z) = max: z(b^T y) - e^T p - f^T q
        s.t. (y, p) ∈ P
             (x, z, q) ∈ Q’

Given any c,
  Optimal value of LP(c) is 0.
  OPT(c) lies on N, and
  Let N(c) = {Points of N with z=c}, then OPT(c) = N(c).

N is a single path on which z is monotonic.

Page 25:

Rank-1: The Algorithm

NE: Intersection of N and H: z - x^T a = 0.
c1 = a_min, c2 = a_max, where a_min = min_{x ∈ X} a^T x and a_max = max_{x ∈ X} a^T x.

[Figure: the path N running from N(c1) to N(c2) and crossing the hyperplane H at a NE; H separates the half-spaces H- and H+.]

Page 26:

Rank-1: Binary Search Algorithm

NE of (A,B): Points in intersection of N and H.
c = (c1+c2)/2.

[Figure: as on Page 25, with the point N(c) marked on N between N(c1) and N(c2).]

Page 27:

Rank-1: Binary Search Algorithm

NE of (A,B): Points in intersection of N and H.
c = (c1+c2)/2.
If N(c) in H-, then c1 = c else c2 = c.

[Figure: as on Page 26, after the update; N(c1) now sits where N(c) was, closer to the NE.]
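The update rule can be sketched as a plain bisection loop. This is an outline under stated assumptions, not the authors' implementation: `point_on_N` is a hypothetical callback standing in for "solve LP(c) and return a^T x at N(c)", H- is taken to be the side where z - x^T a < 0, and the loop stops on a numeric tolerance rather than by locating the exact edge of N that crosses H.

```python
def bisect_on_path(point_on_N, c_lo, c_hi, tol=1e-9, max_iter=200):
    """Bisection on z = c, following the update rule of the slide.

    point_on_N(c) is a hypothetical oracle: it returns a^T x for the
    point N(c) of the path N (in the talk, obtained by solving LP(c)).
    """
    for _ in range(max_iter):
        c = (c_lo + c_hi) / 2
        gap = c - point_on_N(c)   # value of z - x^T a at N(c)
        if abs(gap) <= tol:
            return c              # N(c) lies (numerically) on H
        if gap < 0:
            c_lo = c              # N(c) in H-: set c1 = c
        else:
            c_hi = c              # N(c) in H+: set c2 = c
    return (c_lo + c_hi) / 2

# Toy stand-in for the LP oracle (made up for illustration): a^T x falls
# linearly from 1 to 1/2 as c moves from a_min = 0 to a_max = 1.
toy_oracle = lambda c: 1 - c / 2

c_star = bisect_on_path(toy_oracle, 0.0, 1.0)
print(c_star)
```

The loop relies on z - x^T a changing sign only once between N(c1) and N(c2), which is what monotonicity of z on the single path N provides.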

Page 28:

Analysis

Terminates because,
  z is monotonic on N.
  Increase in z on each edge is lower bounded by 1/d where d is polynomial sized in the input.

Time complexity:
  Solve LP(c) to get N(c) in each pivot.
  log(d) * log(amax - amin) pivots.

Page 29:

Conclusions

Bilinear games:
  Bimatrix with polytopal strategy sets.
  Fairly general. Contains polymatrix, Bayesian, etc.
  Polynomial time algorithms for rank based subclasses.

Open problems:
  Designing a Lemke-Howson type algorithm.
  Degree, index, stability concepts.
  Computation of approximate equilibrium.

Page 30:

Thank You