Sparse Binary Zero-Sum Games [ACML 2014]

David Auger (1), Jialin Liu (2), Sylvie Ruette (3), David L. St-Pierre (4), Olivier Teytaud (2)

(1) AlCAAP, Laboratoire PRiSM, Université de Versailles Saint-Quentin-en-Yvelines, France

(2) TAO, INRIA-CNRS-LRI, Université Paris-Sud, France

(3) Laboratoire de Mathématiques, CNRS, Université Paris-Sud, France

(4) Université du Québec à Trois-Rivières, Canada

Jialin Liu (INRIA-TAO)

Thanks to reviewers for very fruitful comments.

Introduction

Two-person zero-sum game given by a K × K matrix M

Nash Equilibrium → O(K^{2α}) with α > 3

If the Nash is sparse → k × k submatrix

→ O(k^{3k} K log K) with probability 1 − δ (provable)

Zero-sum matrix games

Game defined by matrix M

I choose (privately) i

Simultaneously, you choose j

I earn M_{i,j}

You earn −M_{i,j}

So this is zero-sum.

Or you earn 1 − M_{i,j} (so this is 1-sum, equivalent).

OK, I earn M_{i,j}, you earn −M_{i,j}

Nash Equilibrium

Nash Equilibrium (NE)

Zero-sum matrix game M

My strategy = probability distrib. on rows = x

Your strategy = probability distrib. on cols = y

Expected reward = x^T M y

There exists (x*, y*) such that ∀x, y,

x^T M y* ≤ (x*)^T M y* ≤ (x*)^T M y.

(x*, y*) is a Nash Equilibrium (not necessarily unique).
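As a small illustration of the saddle-point inequalities above, the sketch below checks them on matching pennies. The function names and the example matrix are illustrative assumptions, not taken from the slides. Because x^T M y is linear in each argument, it is enough to test deviations to pure strategies.

```python
from fractions import Fraction

def expected_reward(x, M, y):
    """Return x^T M y for mixed strategies x (rows) and y (columns)."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def is_nash(x, M, y):
    """Check the saddle-point inequalities against pure deviations only."""
    v = expected_reward(x, M, y)
    n_rows, n_cols = len(M), len(M[0])
    # Row player: no pure row earns more than v against y.
    best_row = max(sum(M[i][j] * y[j] for j in range(n_cols))
                   for i in range(n_rows))
    # Column player: no pure column pushes the row player's reward below v.
    worst_col = min(sum(x[i] * M[i][j] for i in range(n_rows))
                    for j in range(n_cols))
    return best_row <= v <= worst_col

half = Fraction(1, 2)
matching_pennies = [[1, -1], [-1, 1]]  # hypothetical example game
print(is_nash([half, half], matching_pennies, [half, half]))
print(is_nash([Fraction(1), Fraction(0)], matching_pennies, [half, half]))
```

The first call uses the uniform pair, which satisfies both inequalities; the second uses a pure row strategy, which the column player can exploit.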

Nash: OK, I play i with probability x*_i

How to compute x*?

Solving Nash

Solution 1: Linear Programming (LP)

1. M ← M + C so that it is positive (without loss of generality)

2. LP: find u ≥ 0 minimizing Σ_i u_i such that M^T u ≥ 1

3. x* = u / Σ_i u_i

⇒ classical, provably exact, polynomial time
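The reason step 2 recovers a Nash strategy can be sketched as follows. The symbols v (value of the shifted game) and t (an auxiliary lower bound on the reward) are introduced here for illustration and do not appear on the slide.

```latex
% Assumption: after the shift M <- M + C every entry is positive, so v > 0.
\begin{align*}
v &= \max_{x \ge 0,\ \sum_i x_i = 1}\ \min_j\, (M^T x)_j
   = \max\Bigl\{\, t \;:\; M^T x \ge t\,\mathbf{1},\ x \ge 0,\ \sum_i x_i = 1 \Bigr\}, \\
% Substitute u = x / t, which is valid because t > 0; then sum_i u_i = 1 / t.
\frac{1}{v} &= \min\Bigl\{\, \sum_i u_i \;:\; M^T u \ge \mathbf{1},\ u \ge 0 \Bigr\},
\qquad
x^* = \frac{u}{\sum_i u_i}.
\end{align*}
```

Maximizing t is the same as minimizing 1/t = Σ_i u_i, which is why the LP minimizes the sum of the u_i and why normalizing u gives a probability vector.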

Solving Nash

Solution 2: Approximate Nash Equilibrium

Approximate ε-NE

(x∗, y∗) such that

x^T M y* − ε ≤ (x*)^T M y* ≤ (x*)^T M y + ε.

Solution 1: LP (computationally expensive)

Computing approximate Nash Equilibrium

Assuming the matrix is of size K × K ...

LP (see reduction from Nash to linear programming in [Von Stengel (2002)]): O(K^{2α}) with 3 < α ≤ 4

[Grigoriadis and Khachiyan (1995)]:

ε-Nash with expected time O(K log(K) / ε²), i.e. less than the size of the matrix!

Parallel: O(log²(K) / ε²) if using K / log(K) processors

Other algorithms: similar complexity, approximate solution + fixed time with probability 1 − δ

EXP3 ([Auer et al. (1995)])

Inf ([Audibert and Bubeck (2009)])
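To give a feel for how iterative methods reach an ε-Nash, here is a minimal self-play sketch. It is an assumption-laden stand-in: it uses Hedge (full-information exponential weights), not EXP3, Inf, or the Grigoriadis and Khachiyan algorithm, and it reads the whole matrix at every step, so it does not have the sublinear cost quoted above. The function names, the learning rate, and the example matrix are all illustrative choices.

```python
import math

def softmax(scores):
    """Exponential weights, shifted by the max for numerical stability."""
    m = max(scores)
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    return [v / z for v in w]

def hedge_self_play(M, T):
    """Return time-averaged strategies after T rounds of Hedge vs Hedge."""
    n_rows, n_cols = len(M), len(M[0])
    # Learning rates tuned for payoffs in [-1, 1] (an assumption of this sketch).
    eta_r = math.sqrt(2.0 * math.log(n_rows) / T)
    eta_c = math.sqrt(2.0 * math.log(n_cols) / T)
    gain = [0.0] * n_rows   # cumulative expected reward of each row
    loss = [0.0] * n_cols   # cumulative expected reward conceded by each column
    avg_x, avg_y = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(T):
        x = softmax([eta_r * g for g in gain])
        y = softmax([-eta_c * l for l in loss])
        for i in range(n_rows):
            gain[i] += sum(M[i][j] * y[j] for j in range(n_cols))
            avg_x[i] += x[i] / T
        for j in range(n_cols):
            loss[j] += sum(x[i] * M[i][j] for i in range(n_rows))
            avg_y[j] += y[j] / T
    return avg_x, avg_y

def nash_gap(x, M, y):
    """Smallest eps for which (x, y) is an eps-Nash in the sense used above."""
    n_rows, n_cols = len(M), len(M[0])
    v = sum(x[i] * M[i][j] * y[j] for i in range(n_rows) for j in range(n_cols))
    best_row = max(sum(M[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
    worst_col = min(sum(x[i] * M[i][j] for i in range(n_rows)) for j in range(n_cols))
    return max(best_row - v, v - worst_col)

M = [[1, -1], [-1, 0]]  # hypothetical 2 x 2 game with entries in {-1, 0, 1}
x, y = hedge_self_play(M, 5000)
print(x, y, nash_gap(x, M, y))
```

The averaged strategies, not the last iterates, are what carry the approximation guarantee in this kind of scheme.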

Other tools 1: Hadamard determinant

Hadamard determinant bound ([Hadamard (1893)], [Brenner and Cummings (1972)])

Given a k × k matrix M with coefficients in {−1, 0, 1}, M has determinant at most k^{k/2}, i.e.

|det M| ≤ k^{k/2}.
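The bound can be checked by brute force for very small k. The sketch below enumerates every k × k matrix with entries in {−1, 0, 1}; the helper names are illustrative, and the enumeration is only practical for k ≤ 3.

```python
from itertools import product

def det(m):
    """Determinant by Laplace expansion along the first row (tiny k only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        if a == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def max_abs_det(k):
    """Largest |det M| over all k x k matrices with entries in {-1, 0, 1}."""
    best = 0
    for entries in product((-1, 0, 1), repeat=k * k):
        m = [list(entries[r * k:(r + 1) * k]) for r in range(k)]
        best = max(best, abs(det(m)))
    return best

for k in (1, 2, 3):
    d = max_abs_det(k)
    # Compare squares to stay in integers: d^2 <= k^k  <=>  d <= k^(k/2).
    print(k, d, d * d <= k ** k)
```

For k = 2 the bound is attained (the maximum is 2 = 2^{2/2}); for k = 3 the maximum is 4, below 3^{3/2} ≈ 5.2.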

Other tools 2: Linear programming

min a · x

subject to Mx ≤ c

x ∈ R^d

If there is a finite optimum, then there is a finite optimum x such that, for some E with |E| = d,

∀i ∈ E, M_i x = c_i

the M_i for i in E are linearly independent

(⇒ i.e. d linearly independent constraints are active)
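The property can be illustrated in dimension d = 2 by enumerating every pair of linearly independent constraints, solving them as equalities, and keeping the best feasible point. This is only a sketch of the vertex property, not a practical LP solver; the function name and the example constraints are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def solve_2d_lp_by_vertices(a, M, c):
    """Minimize a.x subject to Mx <= c in R^2 by scanning candidate vertices.

    Returns (value, x, active_pair) or None if no feasible vertex exists.
    """
    best = None
    for i, j in combinations(range(len(M)), 2):
        det = M[i][0] * M[j][1] - M[i][1] * M[j][0]
        if det == 0:
            continue  # rows i and j are not linearly independent
        # Cramer's rule for M_i x = c_i, M_j x = c_j, in exact arithmetic.
        x = (Fraction(c[i] * M[j][1] - M[i][1] * c[j], det),
             Fraction(M[i][0] * c[j] - c[i] * M[j][0], det))
        feasible = all(r[0] * x[0] + r[1] * x[1] <= b for r, b in zip(M, c))
        if feasible:
            value = a[0] * x[0] + a[1] * x[1]
            if best is None or value < best[0]:
                best = (value, x, (i, j))
    return best

# Hypothetical example: x1 <= 2, x2 <= 3, x1 + x2 <= 4, x1 >= 0, x2 >= 0.
M = [[1, 0], [0, 1], [1, 1], [-1, 0], [0, -1]]
c = [2, 3, 4, 0, 0]
a = [-1, -2]
value, x, active = solve_2d_lp_by_vertices(a, M, c)
print(value, x, active)
```

The optimum is found at a point where exactly d = 2 independent constraints hold with equality, which is the set E of the statement above.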

Why is this relevant?

Nash = solution of linear programming problem

x*: Nash Equilibrium of the K × K matrix M

Let us assume that x* is unique and has at most k non-zero components (sparsity)

⇒ x* = also NE of a k × k submatrix M′

⇒ x* = solution of LP in dimension k

⇒ x* = solution of k lin. eq. with coefficients in {−1, 0, 1}

⇒ x* = inv-matrix ∗ vector

⇒ x* = obtained by “cofactors / det matrix”

⇒ x* has denominator at most k^{k/2}
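The chain above says that, on its support, x* solves a small linear system with integer coefficients and is therefore rational. A minimal sketch under stated assumptions: the system used here (x^T M′ = v·1 together with Σ_i x_i = 1, with the game value v as an extra unknown) is one common way to write those equations and is an assumption of this example, as are the function names. It also assumes the system is nonsingular.

```python
from fractions import Fraction

def solve_exact(A, b):
    """Gauss-Jordan elimination over the rationals.

    Assumes A is square and nonsingular (no error handling in this sketch).
    """
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def row_strategy_on_support(M):
    """Solve x^T M = v 1^T and sum(x) = 1 for a k x k payoff matrix M."""
    k = len(M)
    # One equation per column j: sum_i x_i M[i][j] - v = 0.
    A = [[M[i][j] for i in range(k)] + [-1] for j in range(k)]
    A.append([1] * k + [0])   # probabilities sum to one
    b = [0] * k + [1]
    sol = solve_exact(A, b)
    return sol[:k], sol[k]

# Rock-paper-scissors as a hypothetical 3 x 3 submatrix with entries in {-1, 0, 1}.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
x, v = row_strategy_on_support(rps)
print(x, v)
```

Because all arithmetic is done with `Fraction`, the result is the exact rational solution rather than a floating-point approximation.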

How to realise this?

Under the assumption that the Nash is sparse

x* is rational with “small” denominator

So let us compute an ε-Nash (sublinear time!)

And let us compute its closest approximation with “small denominator” (Hadamard)

Variants for ε-Nash ⇒ exact Nash:

Rounding: switch to closest approximation

Truncation: remove small components and work on the remaining submatrix (exact solving)
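The two variants can be sketched on a made-up approximate strategy. Everything below is a hedged illustration: the numbers in `x_approx`, the per-coordinate rounding, the cap floor(k^{k/2}) on the denominator, and the truncation threshold 0.05 are assumptions chosen for the example, not values from the paper.

```python
import math
from fractions import Fraction

def round_to_small_denominator(x_approx, max_den):
    """Rounding variant: nearest fraction with denominator <= max_den,
    applied to each coordinate independently."""
    return [Fraction(p).limit_denominator(max_den) for p in x_approx]

def truncate_support(x_approx, threshold):
    """Truncation variant: indices kept for the exact solve on the submatrix."""
    return [i for i, p in enumerate(x_approx) if p > threshold]

# Hypothetical output of an eps-Nash solver on a game with K = 4 and k = 3.
x_approx = [0.3329, 0.0012, 0.3341, 0.3318]
k = 3
max_den = math.isqrt(k ** k)   # floor(k^(k/2)) = 5 for k = 3

rounded = round_to_small_denominator(x_approx, max_den)
kept = truncate_support(x_approx, 0.05)
print(rounded)
print(kept)
```

Rounding each coordinate separately does not guarantee that the result sums to one in general; truncation avoids this by re-solving exactly on the kept rows and columns.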

Evil in the details

||y − y*||∞ ≥ ε does not imply V(y) ≥ V(y*) + ε;

indeed V(y) ≥ V(y*) + ||y − y*||∞ / k

Results (if using Grigoriadis and Khachiyan for the ε-Nash step):

For a K × K matrix with a k-sparse Nash:

Exact solution in time O(poly(k) + (K log K) k^{3k}) with the truncation algorithm

Experimental results: two card games

Previous results: ingaming of Urban Rivals

New results: metagaming of Pokemon

Ingaming results (Urban Rivals)

Previous work: [Flory and Teytaud (2011)], implementation of Truncated-EXP3, without proof

Urban Rivals AI = Monte Carlo Tree Search ([Coulom (2006)]), using zero-sum matrix games as a key component

Results don’t look impressive (∼ 56%), but the game is highly randomized ⇒ reaching 55% is far from being negligible

New experiments

Test on Pokemon Deck choice (“metagaming”)

Based on EXP3+truncation

Various versions of EXP3 (different parameters)

Code available at https://www.lri.fr/~teytaud/games.html

New experiments

With a poorly tuned EXP3 : truncation brings a huge improvement

[Plot: two panels, “TEXP3 vs EXP3” and “TEXP3 vs Uniform / EXP3 vs Uniform”; horizontal axis from 10^0 to 10^4.]

Figure: Performance in terms of budget T with a poorly tuned EXP3 for the game of Pokemon using 2 cards.

New experiments

With a well-tuned EXP3, truncation brings a significant improvement

[Plot: two panels, “TEXP3 vs EXP3” and “TEXP3 vs Uniform / EXP3 vs Uniform”; horizontal axis from 10^0 to 10^4.]

Figure: Performance in terms of budget T with a well-tuned EXP3 for the game of Pokemon using 2 cards.

Conclusions & further work

Proved small improvement, experimentally big improvement. Improving the bound?

We don’t know k (sparsity level). Adaptive algorithms?

Proved only with unique Nash (x*, y*). Necessary?

Jean-Yves Audibert and Sébastien Bubeck. Minimax policies for adversarial and stochastic bandits. In 22nd Annual Conference on Learning Theory, 2009.

Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. Gambling in a rigged casino: the adversarial multi-armed bandit problem. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science. IEEE Computer Society Press, 1995.

Rémi Coulom. Efficient selectivity and backup operators in Monte-Carlo tree search. In Computers and Games, 2006.

Joel Brenner and Larry Cummings. The Hadamard maximum determinant problem. In Amer. Math. Monthly, 1972.

Sébastien Flory and Olivier Teytaud. Upper confidence trees with short term partial information. In Proceedings of EvoGames, 2011.

Michael D. Grigoriadis and Leonid G. Khachiyan. A sublinear-time randomized approximation algorithm for matrix games. In Operations Research Letters, 1995.

Jacques Hadamard. Résolution d’une question relative aux déterminants. In Bull. Sci. Math., 1893.

Bernhard Von Stengel. Computing equilibria for two-person games. In Handbook of Game Theory with Economic Applications, 2002.

Thank you for your attention!

