Nesterov’s excessive gap technique and poker
Andrew Gilpin
CMU Theory Lunch
Feb 28, 2007
Joint work with: Samid Hoda, Javier Peña, Troels Sørensen, Tuomas Sandholm
Outline
• Two-person zero-sum sequential games
• First-order methods for convex optimization
• Nesterov’s excessive gap technique (EGT)
• EGT for sequential games
• Heuristics for EGT
• Application to Texas Hold’em poker
We want to solve:
    min_{x ∈ Q1} max_{y ∈ Q2} x^T A y = max_{y ∈ Q2} min_{x ∈ Q1} x^T A y
If Q1 and Q2 are simplices, this is the Nash equilibrium problem for two-person zero-sum matrix games
If Q1 and Q2 are complexes, this is the Nash equilibrium problem for two-person zero-sum sequential games
What’s a complex?
It’s just like a simplex, but more complex.
Each player’s complex encodes her set of realization plans in the game
In particular, player 1’s complex is
    Q1 = { x : Ex = e, x ≥ 0 }
where E and e depend on the game…
[Figure: example with labels A, B, C, D, E, F, G, H]
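As a small illustration of a complex, consider a hypothetical player (not the game in the figure) who first chooses L or R and, after L, chooses l or r. A minimal sketch, assuming the usual sequence-form constraints Ex = e, x ≥ 0 with one variable per sequence (empty, L, R, Ll, Lr). The names `is_realization_plan` and `behavioral_to_plan` are made up for this example.

```python
def is_realization_plan(E, e, x, tol=1e-9):
    """Return True if x >= 0 and E x = e hold up to the tolerance tol."""
    if any(xi < -tol for xi in x):
        return False
    for row, rhs in zip(E, e):
        if abs(sum(a * xi for a, xi in zip(row, x)) - rhs) > tol:
            return False
    return True


# Sequence order (assumed for this toy game): [empty, L, R, Ll, Lr]
E = [
    [1, 0, 0, 0, 0],    # x_empty = 1
    [-1, 1, 1, 0, 0],   # x_L + x_R = x_empty
    [0, -1, 0, 1, 1],   # x_Ll + x_Lr = x_L
]
e = [1, 0, 0]


def behavioral_to_plan(p_L, p_l):
    """Convert behavioral probabilities into realization weights.

    p_L is the probability of L at the first decision and p_l the
    probability of l at the second decision (reached only after L).
    """
    return [1.0, p_L, 1.0 - p_L, p_L * p_l, p_L * (1.0 - p_l)]


x = behavioral_to_plan(0.6, 0.5)
print(is_realization_plan(E, e, x))
```

With a single decision and no follow-up, the constraints reduce to the probability simplex, which is why matrix games are the special case.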
Recall our problem:
    min_{x ∈ Q1} max_{y ∈ Q2} x^T A y
where Q1 and Q2 are complexes
Since Q1 and Q2 have a linear description, this problem can be solved as an LP. However, current LP solution methods do not scale
(Un)scalability of LP solvers
• Rhode Island poker [Shi & Littman 01]
  – LP has 91 million rows and columns
  – Applying the GameShrink automated abstraction algorithm yields an LP with only 1.2 million rows and columns, and 50 million non-zeros [G. & Sandholm, 06a]
  – Solution requires 25 GB RAM and over a week of CPU time
• Texas Hold’em poker
  – ~10^18 nodes in game tree
  – Lossy abstractions need to be performed
  – Limitations of current solver technology are the primary obstacle to achieving expert-level strategies [G. & Sandholm 06b, 07a]
• Instead of standard LP solvers, what about a first-order method?
Convex optimization
Suppose we want to solve
    min_{x ∈ R^n} f(x)
where f is convex.
For general f, convergence requires O(1/ε^2) iterations (e.g., for subgradient methods)
For smooth, strongly convex f with Lipschitz-continuous gradient, can be done in O(1/ε^(1/2)) iterations
Note that this formulation captures ALL convex optimization problems (can model feasible space using an indicator function)
Analysis based on black-box oracle access model. Can we do better by looking inside the box?
Strong convexity
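A function d is strongly convex if there exists σ > 0 such that
    d(αx + (1−α)y) ≤ α d(x) + (1−α) d(y) − ½ σ α(1−α) ‖x − y‖^2
for all α ∈ [0, 1] and all x, y in the domain of d
σ is the strong convexity parameter of d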
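A short worked example may help. Taking the usual definition, d(αx + (1−α)y) ≤ α d(x) + (1−α) d(y) − ½ σ α(1−α) ‖x − y‖², the function d(x) = ½‖x‖₂² satisfies it with equality for σ = 1 in the Euclidean norm. The symbol σ is an assumption, since the slide's own symbol was lost.

```latex
\begin{aligned}
&\alpha\,\tfrac12\|x\|_2^2 + (1-\alpha)\,\tfrac12\|y\|_2^2
   - \tfrac12\|\alpha x + (1-\alpha) y\|_2^2 \\
&\quad= \tfrac12\Bigl[\alpha(1-\alpha)\|x\|_2^2 + \alpha(1-\alpha)\|y\|_2^2
   - 2\alpha(1-\alpha)\langle x, y\rangle\Bigr] \\
&\quad= \tfrac12\,\alpha(1-\alpha)\,\|x-y\|_2^2 .
\end{aligned}
```

The first step expands the squared norm of the convex combination and uses α − α² = (1−α) − (1−α)² = α(1−α).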
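Recall our problem:
    min_{x ∈ Q1} max_{y ∈ Q2} x^T A y
where Q1 and Q2 are complexes
Equivalently:
    min_{x ∈ Q1} Φ(x) = max_{y ∈ Q2} f(y)
where
    Φ(x) = max_{y ∈ Q2} x^T A y
and
    f(y) = min_{x ∈ Q1} x^T A y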
Unfortunately, Φ and f are non-smooth
Fortunately, they have a special structure
Let d1, d2 be smooth and strongly convex on Q1, Q2
These are called prox-functions
Now let μ > 0 and consider:
    Φμ(x) = max_{y ∈ Q2} { x^T A y − μ d2(y) }
    fμ(y) = min_{x ∈ Q1} { x^T A y + μ d1(x) }
These are well-defined smooth functions
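For intuition, here is a minimal sketch of the smoothing in the matrix-game case, where Q2 is a simplex. It assumes the entropy prox function d2(y) = ln n + Σ yj ln yj, which is a choice made for this example. Under that assumption the smoothed maximum has the closed form Φμ(x) = μ ln((1/n) Σj exp(sj/μ)) with s = A^T x, and it satisfies Φ(x) − μ ln n ≤ Φμ(x) ≤ Φ(x). The function names are hypothetical.

```python
import math


def payoffs_against(A, x):
    """Return s = A^T x, so that x^T A y = sum_j s[j] * y[j]."""
    rows, cols = len(A), len(A[0])
    return [sum(x[i] * A[i][j] for i in range(rows)) for j in range(cols)]


def phi(A, x):
    """Non-smooth Phi(x): the best pure response of player 2 on the simplex."""
    return max(payoffs_against(A, x))


def phi_mu(A, x, mu):
    """Smoothed Phi_mu(x) under the (assumed) entropy prox function.

    Uses a shifted log-sum-exp so that exp() never overflows.
    """
    s = payoffs_against(A, x)
    m = max(s)
    mean_exp = sum(math.exp((sj - m) / mu) for sj in s) / len(s)
    return m + mu * math.log(mean_exp)


A = [[1.0, -1.0], [-1.0, 1.0]]   # matching pennies, as a toy instance
x = [0.7, 0.3]
for mu in (1.0, 0.1, 0.01):
    print(mu, phi_mu(A, x, mu), phi(A, x))
```

As μ shrinks, the printed smoothed value approaches the non-smooth one from below.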
Excessive gap condition
From weak duality, we have that f(y) ≤ Φ(x)
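The excessive gap condition requires the reverse inequality for the smoothed functions:
Φμ(x) ≤ fμ(y) (EGC)
Since Φ(x) ≤ Φμ(x) + μ D2 and f(y) ≥ fμ(y) − μ D1, where Di is the max of di over Qi, (EGC) implies 0 ≤ Φ(x) − f(y) ≤ μ (D1 + D2)
The algorithm maintains (EGC), and gradually decreases μ
As μ decreases, the smoothed functions approach the non-smooth functions, and thus iterates satisfying (EGC) converge to optimal solutions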
Nesterov’s main theorem
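Theorem [Nesterov 05]
There exists an algorithm such that after at most N iterations, the iterates have duality gap at most
    (4 ‖A‖ / (N + 1)) · sqrt( D1 D2 / (σ1 σ2) )
where ‖A‖ is the operator norm of A, Di is the max of di over Qi, and σi is the strong convexity parameter of di
Furthermore, each iteration only requires solving three problems of the form
    max_{x ∈ Qi} { g^T x − di(x) }
and performing three matrix-vector product operations on A.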
Nice prox functions
A prox function d for Q is nice if:
1. d is strongly convex and continuous everywhere in Q, and differentiable in the relative interior of Q
2. The min of d over Q is 0
3. The following maps are easily computable:
    smax(d, s) = max_{x ∈ Q} { s^T x − d(x) }
    sargmax(d, s) = argmax_{x ∈ Q} { s^T x − d(x) }
Nice simplex prox function 1: Entropy
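A minimal sketch of the entropy case, assuming the standard form d(x) = ln n + Σ xi ln xi on the simplex (the slide's formula was lost, so this form is an assumption). With this d, the maximizer of s^T x − d(x) over the simplex is the softmax of s. The function names are hypothetical.

```python
import math


def entropy_prox(x):
    """d(x) = ln n + sum_i x_i ln x_i, with the convention 0 ln 0 = 0."""
    return math.log(len(x)) + sum(xi * math.log(xi) for xi in x if xi > 0)


def entropy_sargmax(s):
    """argmax over the simplex of <s, x> - d(x); this is softmax(s)."""
    m = max(s)                       # shift to keep exp() from overflowing
    w = [math.exp(si - m) for si in s]
    total = sum(w)
    return [wi / total for wi in w]


def objective(s, x):
    """Value of <s, x> - d(x), the quantity being maximized."""
    return sum(si * xi for si, xi in zip(s, x)) - entropy_prox(x)


s = [1.0, 2.0, 3.0]
x_star = entropy_sargmax(s)
print(x_star, objective(s, x_star))
```

The uniform distribution gives d = 0, matching the requirement that the min of d over Q is 0.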
Nice simplex prox function 2: Euclidean
sargmax can be computed in O(n log n) time
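One way to reach the O(n log n) bound, sketched under the assumption that the Euclidean prox function is d(x) = ½ Σ (xi − 1/n)² (the slide's formula was lost). Completing the square shows that maximizing s^T x − d(x) over the simplex is the same as projecting s + (1/n)·1 onto the simplex, and a well-known sort-based method computes that projection. The function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex.

    Sort-based method: find the threshold theta such that
    sum_i max(v_i - theta, 0) = 1. Sorting dominates the cost.
    """
    u = sorted(v, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for k, uk in enumerate(u, start=1):
        cumsum += uk
        t = (cumsum - 1.0) / k
        if uk - t > 0:        # holds for a prefix of k; keep the last one
            theta = t
    return [max(vi - theta, 0.0) for vi in v]


def euclidean_sargmax(s):
    """argmax over the simplex of <s, x> - 0.5 * ||x - uniform||^2."""
    n = len(s)
    return project_to_simplex([si + 1.0 / n for si in s])


print(euclidean_sargmax([0.2, 0.0, -0.2]))
print(euclidean_sargmax([10.0, 0.0, 0.0]))
```

Unlike the entropy case, the Euclidean maximizer can land on the boundary of the simplex, as the second call shows.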
From the simplex to the complex
Theorem [Hoda, G., Peña 06]
A nice prox function can be constructed for the complex via a recursive application of any nice prox function for the simplex
Prox function example
Given any nice simplex prox function, the prox function for this matrix is:
Solving
(similar to b(i-vii))
Heuristics [G., Hoda, Peña, Sandholm 07]
• Heuristic 1: Aggressive μ reduction
  – The μ given in the previous algorithm is a conservative choice guaranteeing convergence
  – In practice, we can do much better by aggressively pushing μ down, while checking that the excessive gap condition is satisfied
• Heuristic 2: Balanced μ reduction
  – To prevent one player's μ from dominating the other's, we also perform periodic adjustments to keep them within a small factor of one another
Matrix-vector multiplication in poker [G., Hoda, Peña, Sandholm 07]
• The main time and space bottleneck of the algorithm is the matrix-vector product on A
• Instead of storing the entire matrix, we can represent it as a composition of Kronecker products
• We can also effectively take advantage of parallelization in the matrix-vector product to achieve near-linear speedup
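To illustrate why a Kronecker representation saves memory, here is a generic sketch of multiplying by B ⊗ C without ever forming it, using the identity (B ⊗ C) vec(X) = vec(B X C^T) for row-major vec. This is the textbook identity only. The actual block structure of the poker payoff matrix is not reproduced here, and the function names are hypothetical.

```python
def kron(B, C):
    """Explicit Kronecker product B (x) C as dense rows (for checking only)."""
    return [[b * c for b in row_b for c in row_c] for row_b in B for row_c in C]


def matvec(M, x):
    """Ordinary dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]


def kron_matvec(B, C, x):
    """Compute (B (x) C) x using only B and C.

    B is m x p, C is n x q, and x has length p * q. Reshape x row-major
    into X (p x q) and return B X C^T flattened row-major (length m * n).
    """
    p, q = len(B[0]), len(C[0])
    n = len(C)
    X = [x[j * q:(j + 1) * q] for j in range(p)]
    # T = X C^T has shape p x n
    T = [[sum(C[k][l] * X[j][l] for l in range(q)) for k in range(n)]
         for j in range(p)]
    # Y = B T has shape m x n
    Y = [[sum(row_b[j] * T[j][k] for j in range(p)) for k in range(n)]
         for row_b in B]
    return [y for row in Y for y in row]


B = [[1, 2], [3, 4]]
C = [[0, 1, 2], [1, 0, 1]]
x = [1, 2, 3, 4, 5, 6]
print(kron_matvec(B, C, x))
print(matvec(kron(B, C), x))
```

Only B and C are stored, so memory grows with the sizes of the factors rather than with their product. The rows of Y are also independent of one another, which is one place where parallel work could be split.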
Memory usage comparison
Instance   CPLEX IPM   CPLEX Simplex   EGT
10k        0.082 GB    >0.051 GB       0.012 GB
160k       2.25 GB     >0.664 GB       0.035 GB
RI         25.2 GB     >3.45 GB        0.15 GB
Texas      >458 GB     >458 GB         2.49 GB
Poker
• Poker is a recognized challenge problem in AI because (among other reasons)
  – the other players’ cards are hidden;
  – bluffing and other deceptive strategies are needed in a good player;
  – there is uncertainty about future events.
• Texas Hold’em: most popular variant of poker
• Two-player game tree has ~10^18 nodes
Potential-aware automated abstraction [G., Sandholm, Sørensen 07]
• Most prior automated abstraction algorithms employ a myopic expected value computation as a similarity metric
  – This ignores hands like flush draws where although the probability of winning is small, the payoff could be high
• Our newest algorithm considers higher-dimensional spaces consisting of histograms over abstracted classes of states from later stages of the game
• This enables our bottom-up abstraction algorithm to automatically take into account positive and negative potential
Solving the four-round model
• Computed abstraction with
  – 20 first-round buckets
  – 800 second-round buckets
  – 4800 third-round buckets
  – 28800 fourth-round buckets
• Algorithm using 30 GB RAM
  – Simply representing as an LP requires 32 TB
  – Outputs new, improved solution every 2.5 days
[G., Sandholm, Sørensen 07]
Future research
• Customizing second-order methods (e.g., interior-point methods) for the equilibrium problem
• Additional heuristics for improving practical performance of EGT algorithm
• Techniques for finding an optimal solution from an ε-solution
Thank you ☺