Nesterov’s excessive gap technique and poker

Nesterov’s excessive gap technique and poker Andrew Gilpin CMU Theory Lunch Feb 28, 2007 Joint work with: Samid Hoda, Javier Peña, Troels Sørensen, Tuomas Sandholm

Page 1: Nesterov’s excessive gap technique and poker

Nesterov’s excessive gap technique and poker

Andrew Gilpin

CMU Theory Lunch

Feb 28, 2007

Joint work with: Samid Hoda, Javier Peña, Troels Sørensen, Tuomas Sandholm

Page 2: Nesterov’s excessive gap technique and poker

Outline

• Two-person zero-sum sequential games

• First-order methods for convex optimization

• Nesterov’s excessive gap technique (EGT)

• EGT for sequential games

• Heuristics for EGT

• Application to Texas Hold’em poker

Page 3: Nesterov’s excessive gap technique and poker

We want to solve:

If Q1 and Q2 are simplices, this is the Nash equilibrium problem for two-person zero-sum matrix games

If Q1 and Q2 are complexes, this is the Nash equilibrium problem for two-person zero-sum sequential games
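
The formula on this slide did not survive transcription. A plausible reconstruction, assuming the bilinear saddle-point form that is standard for zero-sum games: the matrix name A is taken from later slides, and the orientation (x minimizes, y maximizes) is an assumption chosen to agree with the weak-duality statement f(y) ≤ Φ(x) on a later slide.

```latex
\min_{x \in Q_1} \max_{y \in Q_2} \; x^{\top} A y
\;=\;
\max_{y \in Q_2} \min_{x \in Q_1} \; x^{\top} A y
```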

Page 4: Nesterov’s excessive gap technique and poker

What’s a complex?

It’s just like a simplex, but more complex.

Each player’s complex encodes her set of realization plans in the game

In particular, player 1’s complex is

where E and e depend on the game…
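
The definition itself is missing from the transcript. Assuming it is the usual sequence-form description Q1 = {x ≥ 0 : Ex = e}, the sketch below builds E and e for a small hypothetical player (the tree, the sequence ordering, and all function names are illustrative and not taken from the talk) and checks that a realization plan derived from behavioral probabilities lies in the complex.

```python
# Hypothetical player with two information sets:
#   at the root she picks L or R; after L she later picks l or r.
# Sequences (columns of E): [empty, L, R, Ll, Lr]
E = [
    [1, 0, 0, 0, 0],    # x_empty = 1
    [-1, 1, 1, 0, 0],   # x_L + x_R = x_empty
    [0, -1, 0, 1, 1],   # x_Ll + x_Lr = x_L
]
e = [1, 0, 0]


def realization_plan(p_L, p_l):
    """Convert behavioral probabilities into a realization plan."""
    x_L, x_R = p_L, 1.0 - p_L
    return [1.0, x_L, x_R, x_L * p_l, x_L * (1.0 - p_l)]


def in_complex(x, tol=1e-9):
    """Check x >= 0 and E x = e up to a tolerance."""
    if any(xi < -tol for xi in x):
        return False
    return all(
        abs(sum(a * xi for a, xi in zip(row, x)) - rhs) <= tol
        for row, rhs in zip(E, e)
    )


x = realization_plan(p_L=0.6, p_l=0.25)
print(x, in_complex(x))
```

With a single information set the constraints collapse to "entries sum to 1", which is the simplex; the extra rows of E are what make the complex "more complex".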

Page 5: Nesterov’s excessive gap technique and poker

[Figure: example with labels A–H]

Page 6: Nesterov’s excessive gap technique and poker

Recall our problem:

where Q1 and Q2 are complexes

Since Q1 and Q2 have a linear description, this problem can be solved as an LP. However, current LP solution methods do not scale

Page 7: Nesterov’s excessive gap technique and poker

(Un)scalability of LP solvers

• Rhode Island poker [Shi & Littman 01]
  – LP has 91 million rows and columns
  – Applying the GameShrink automated abstraction algorithm yields an LP with only 1.2 million rows and columns, and 50 million non-zeros [G. & Sandholm, 06a]
  – Solution requires 25 GB RAM and over a week of CPU time

• Texas Hold’em poker
  – ~10^18 nodes in game tree
  – Lossy abstractions need to be performed
  – Limitations of current solver technology are the primary obstacle to achieving expert-level strategies [G. & Sandholm 06b, 07a]

• Instead of standard LP solvers, what about a first-order method?

Page 8: Nesterov’s excessive gap technique and poker

Convex optimization

Suppose we want to solve

where f is convex.

For general f, convergence requires O(1/ε²) iterations (e.g., for subgradient methods)

For smooth, strongly convex f with Lipschitz-continuous gradient, can be done in O(1/√ε) iterations

Note that this formulation captures ALL convex optimization problems (can model feasible space using an indicator function)

Analysis based on black-box oracle access model. Can we do better by looking inside the box?

Page 9: Nesterov’s excessive gap technique and poker

Strong convexity

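A function d is strongly convex if there exists σ > 0 such that

d(αx + (1 − α)y) ≤ αd(x) + (1 − α)d(y) − ½ σ α(1 − α)‖x − y‖²

for all x, y in the domain of d and all α ∈ [0, 1]

σ is the strong convexity parameter of d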

Page 10: Nesterov’s excessive gap technique and poker

Recall our problem:

where Q1 and Q2 are complexes

Equivalently:

where

and
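
The definitions on this slide were lost in transcription. As a sketch, assume Φ(x) = max over y in Q2 of xᵀAy and f(y) = min over x in Q1 of xᵀAy, which is consistent with the weak-duality inequality f(y) ≤ Φ(x) used later. When Q1 and Q2 are simplices, both reduce to a best pure reply. The matching-pennies matrix and the function names are illustrative.

```python
def Phi(A, x):
    """Phi(x) = max_y x^T A y over the simplex: best pure column reply."""
    rows, cols = range(len(A)), range(len(A[0]))
    return max(sum(x[i] * A[i][j] for i in rows) for j in cols)


def f(A, y):
    """f(y) = min_x x^T A y over the simplex: best pure row reply."""
    rows, cols = range(len(A)), range(len(A[0]))
    return min(sum(A[i][j] * y[j] for j in cols) for i in rows)


# Matching pennies; x (the row player) minimizes x^T A y, y maximizes it.
A = [[1, -1],
     [-1, 1]]

x_eq, y_eq = [0.5, 0.5], [0.5, 0.5]
gap_eq = Phi(A, x_eq) - f(A, y_eq)        # zero at the equilibrium

x_bad, y_bad = [0.9, 0.1], [0.2, 0.8]
gap_bad = Phi(A, x_bad) - f(A, y_bad)     # positive away from it
print(gap_eq, gap_bad)
```

The difference Φ(x) − f(y) is the duality gap; it is non-negative for every feasible pair and zero exactly at a Nash equilibrium.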

Page 11: Nesterov’s excessive gap technique and poker

Unfortunately, Φ and f are non-smooth

Fortunately, they have a special structure

Let d1,d2 be smooth and strongly convex on Q1,Q2

These are called prox-functions

Now let μ > 0 and consider:

These are well-defined smooth functions
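
The two smoothed functions were shown as an image. Assuming Φ(x) = max over y of xᵀAy and f(y) = min over x of xᵀAy, smoothing in the sense of [Nesterov 05] would read as follows. The slide text uses a single μ; the heuristics slide later refers to more than one μ, so a separate parameter per function may have been intended.

```latex
\Phi_{\mu}(x) = \max_{y \in Q_2} \left\{ x^{\top} A y - \mu\, d_2(y) \right\},
\qquad
f_{\mu}(y) = \min_{x \in Q_1} \left\{ x^{\top} A y + \mu\, d_1(x) \right\}
```

Strong convexity of d1 and d2 makes each inner optimizer unique, which is what makes the smoothed functions differentiable.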

Page 12: Nesterov’s excessive gap technique and poker

Excessive gap condition

From weak duality, we have that f(y) ≤ Φ(x)

The excessive gap condition requires that the smoothed functions satisfy the reverse inequality

fμ(y) ≥ Φμ(x) (EGC)

The algorithm maintains (EGC), and gradually decreases μ

As μ decreases, the smoothed functions approach the non-smooth functions, and thus iterates satisfying (EGC) converge to optimal solutions

Page 13: Nesterov’s excessive gap technique and poker

Nesterov’s main theorem

Theorem [Nesterov 05]

There exists an algorithm such that after at most N iterations, the iterates have duality gap at most

Furthermore, each iteration only requires solving three problems of the form

and performing three matrix-vector product operations on A.
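
Both formulas on this slide are missing from the transcript. In the form usually quoted from [Nesterov 05], the gap bound and the subproblem look as follows. The symbol names are assumptions about the slide's notation: ‖A‖ is an operator norm of A, σ_i is the strong convexity parameter of d_i, and D_i is the maximum of d_i over Q_i.

```latex
0 \;\le\; \Phi(x^{N}) - f(y^{N})
\;\le\; \frac{4\,\|A\|}{N+1}\,\sqrt{\frac{D_1 D_2}{\sigma_1 \sigma_2}},
\qquad
\max_{x \in Q} \left\{ \langle g, x \rangle - d(x) \right\}
```

Read this way, reaching a gap of ε takes O(1/ε) iterations, compared with O(1/ε²) for subgradient methods.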

Page 14: Nesterov’s excessive gap technique and poker

Nice prox functions

A prox function d for Q is nice if:

1. d is strongly convex, continuous everywhere in Q, and differentiable in the relative interior of Q

2. The min of d over Q is 0

3. The following maps are easily computable:

Page 15: Nesterov’s excessive gap technique and poker

Nice simplex prox function 1: Entropy
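
The formulas for this slide are missing. A minimal sketch, assuming the usual entropy prox function d(x) = ln n + Σ x_i ln x_i on the n-simplex. The name sargmax appears on the next slide; the name smax and the definitions used here (sargmax(s) is the maximizer of ⟨s, x⟩ − d(x) over the simplex, smax(s) is the optimal value) are assumptions.

```python
import math


def entropy_prox(x):
    """d(x) = ln n + sum_i x_i ln x_i, with 0 ln 0 taken as 0."""
    return math.log(len(x)) + sum(xi * math.log(xi) for xi in x if xi > 0)


def sargmax_entropy(s):
    """Maximizer of <s, x> - d(x) over the simplex: the softmax of s."""
    shift = max(s)                       # subtract the max for stability
    w = [math.exp(si - shift) for si in s]
    total = sum(w)
    return [wi / total for wi in w]


def smax_entropy(s):
    """Optimal value of the same problem: log-sum-exp(s) - ln n."""
    shift = max(s)
    lse = shift + math.log(sum(math.exp(si - shift) for si in s))
    return lse - math.log(len(s))


s = [1.0, 2.0, 0.5]
x_star = sargmax_entropy(s)
value = sum(si * xi for si, xi in zip(s, x_star)) - entropy_prox(x_star)
print(x_star, value, smax_entropy(s))
```

Both maps cost O(n), and d attains its minimum of 0 at the uniform distribution, matching the requirements for a nice prox function.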

Page 16: Nesterov’s excessive gap technique and poker

Nice simplex prox function 2: Euclidean

sargmax can be computed in O(n log n) time
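
One way to reach the O(n log n) bound, assuming the Euclidean prox function is d(x) = ½ Σ (x_i − 1/n)² (the slide's formula is missing, so this choice is an assumption): maximizing ⟨s, x⟩ − d(x) over the simplex is the Euclidean projection of s + 1/n onto the simplex, and the well-known sort-and-threshold method computes that projection with a single sort. Function names are illustrative.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x >= 0, sum x = 1} using one sort."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:
            theta = t                   # the last j that passes sets the threshold
    return [max(vi - theta, 0.0) for vi in v]


def sargmax_euclidean(s):
    """Maximizer of <s, x> - 0.5 * ||x - (1/n) 1||^2 over the simplex."""
    n = len(s)
    return project_to_simplex([si + 1.0 / n for si in s])


print(sargmax_euclidean([0.0, 0.0, 0.0]))
print(sargmax_euclidean([10.0, 0.0, 0.0]))
print(sargmax_euclidean([0.2, 0.1, 0.0]))
```

Unlike the entropy case, the maximizer can land on the boundary of the simplex, so some coordinates become exactly zero.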

Page 17: Nesterov’s excessive gap technique and poker

From the simplex to the complex

Theorem [Hoda, G., Peña 06]

A nice prox function can be constructed for the complex via a recursive application of any nice prox function for the simplex

Page 18: Nesterov’s excessive gap technique and poker

Prox function example

Take any nice simplex prox function. The prox function for this matrix is:

Page 19: Nesterov’s excessive gap technique and poker

Solving

Page 20: Nesterov’s excessive gap technique and poker

(similar to b(i-vii))

Page 21: Nesterov’s excessive gap technique and poker

Heuristics [G., Hoda, Peña, Sandholm 07]

• Heuristic 1: Aggressive μ reduction
  – The μ given in the previous algorithm is a conservative choice guaranteeing convergence
  – In practice, we can do much better by aggressively pushing μ, while checking that the excessive gap condition is satisfied

• Heuristic 2: Balanced μ reduction
  – To prevent one μ from dominating the other, we also perform periodic adjustments to keep them within a small factor of one another

Page 22: Nesterov’s excessive gap technique and poker

Matrix-vector multiplication in poker [G., Hoda, Peña, Sandholm 07]

• The main time and space bottleneck of the algorithm is the matrix-vector product on A

• Instead of storing the entire matrix, we can represent it as a composition of Kronecker products

• We can also effectively take advantage of parallelization in the matrix-vector product to achieve near-linear speedup
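
To see why a Kronecker-product representation saves memory, note that (B ⊗ C)x can be computed from B and C alone through the identity (B ⊗ C) vec(X) = vec(B X Cᵀ), with vec taken row-major. The sketch below is generic: how the poker payoff matrix actually decomposes is not given on the slide, and B and C are placeholder factors.

```python
def kron_matvec(B, C, x):
    """Compute (B kron C) x without forming the Kronecker product.

    B is m-by-n, C is p-by-q, and x has length n*q (row-major n-by-q).
    """
    m, n, p, q = len(B), len(B[0]), len(C), len(C[0])
    X = [x[j * q:(j + 1) * q] for j in range(n)]             # n-by-q
    # T = X C^T  (n-by-p)
    T = [[sum(X[j][l] * C[k][l] for l in range(q)) for k in range(p)]
         for j in range(n)]
    # Y = B T    (m-by-p), flattened row-major
    return [sum(B[i][j] * T[j][k] for j in range(n))
            for i in range(m) for k in range(p)]


def kron_dense(B, C):
    """Explicit Kronecker product, used only to check the result."""
    return [[B[i][j] * C[k][l]
             for j in range(len(B[0])) for l in range(len(C[0]))]
            for i in range(len(B)) for k in range(len(C))]


B = [[1, 2], [3, 4]]
C = [[0, 1, 2], [1, 0, -1]]
x = [1, 2, 3, 4, 5, 6]
dense = kron_dense(B, C)
expected = [sum(a * b for a, b in zip(row, x)) for row in dense]
print(kron_matvec(B, C, x) == expected)
```

The factors need m·n + p·q stored entries instead of m·n·p·q for the dense matrix, and the rows of the intermediate products can be computed independently, which is one place parallelism can enter.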

Page 23: Nesterov’s excessive gap technique and poker

Memory usage comparison

Instance   CPLEX IPM   CPLEX Simplex   EGT
10k        0.082 GB    >0.051 GB       0.012 GB
160k       2.25 GB     >0.664 GB       0.035 GB
RI         25.2 GB     >3.45 GB        0.15 GB
Texas      >458 GB     >458 GB         2.49 GB

Page 24: Nesterov’s excessive gap technique and poker

Poker

• Poker is a recognized challenge problem in AI because (among other reasons)
  – the other players’ cards are hidden;
  – bluffing and other deceptive strategies are needed in a good player;
  – there is uncertainty about future events.

• Texas Hold’em: most popular variant of poker

• Two-player game tree has ~10^18 nodes

Page 25: Nesterov’s excessive gap technique and poker

Potential-aware automated abstraction [G., Sandholm, Sørensen 07]

• Most prior automated abstraction algorithms employ a myopic expected value computation as a similarity metric
  – This ignores hands like flush draws, where the probability of winning is small but the payoff could be high

• Our newest algorithm considers higher-dimensional spaces consisting of histograms over abstracted classes of states from later stages of the game

• This enables our bottom-up abstraction algorithm to automatically take into account positive and negative potential

Page 26: Nesterov’s excessive gap technique and poker

Solving the four-round model

• Computed abstraction with
  – 20 first-round buckets
  – 800 second-round buckets
  – 4800 third-round buckets
  – 28800 fourth-round buckets

• Algorithm using 30 GB RAM
  – Simply representing as an LP requires 32 TB
  – Outputs new, improved solution every 2.5 days

Page 27: Nesterov’s excessive gap technique and poker

[G., Sandholm, Sørensen 07]

Page 28: Nesterov’s excessive gap technique and poker

[G., Sandholm, Sørensen 07]

Page 29: Nesterov’s excessive gap technique and poker

[G., Sandholm, Sørensen 07]

Page 30: Nesterov’s excessive gap technique and poker

Future research

• Customizing second-order methods (e.g., interior-point methods) for the equilibrium problem

• Additional heuristics for improving practical performance of EGT algorithm

• Techniques for finding an optimal solution from an ε-solution

Page 31: Nesterov’s excessive gap technique and poker

Thank you ☺