Uri Zwick Tel Aviv University
![Page 1: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/1.jpg)
Uri Zwick
Tel Aviv University
Simple Stochastic Games
Mean Payoff Games
Parity Games
![Page 2: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/2.jpg)
Zero sum games
1 2 –3
0 –5 2
1 7 –2
Mixed strategies
Max-min theorem
…
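A minimal sketch of why mixed strategies are needed for the matrix above. It assumes the row player maximizes and the column player minimizes (the slide does not say which is which); the function names are illustrative only. With pure strategies the two guarantees do not meet, so there is no pure saddle point.

```python
# Payoff matrix from the slide. Assumption: entry A[i][j] is the amount the
# column player pays the row player when row i and column j are chosen.
A = [
    [1, 2, -3],
    [0, -5, 2],
    [1, 7, -2],
]

def pure_maximin(matrix):
    """Best payoff the row player can guarantee with a pure strategy."""
    return max(min(row) for row in matrix)

def pure_minimax(matrix):
    """Smallest cap the column player can enforce with a pure strategy."""
    columns = zip(*matrix)
    return min(max(col) for col in columns)

lower = pure_maximin(A)   # row player's pure guarantee
upper = pure_minimax(A)   # column player's pure guarantee
print(lower, upper)
```

Here the pure guarantees differ (`lower < upper`); the max-min theorem says the gap closes once both players may randomize.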
![Page 3: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/3.jpg)
Stochastic games [Shapley (1953)]
1 2 –3
0 –5 2
1 7 –2
3 –7 –3
2 –4 –1
4
–1
7
Mixed positional (memoryless) optimal strategies
![Page 4: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/4.jpg)
Simple Stochastic games (SSGs)
2
–5
7
2 –4 –1
4
–1
7
Every game has only one row or column
Pure positional (memoryless) optimal strategies
![Page 5: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/5.jpg)
Simple Stochastic games (SSGs)
Graphic representation
[Figure legend: M = MAX vertex, m = min vertex, R = RAND vertex]
The players construct an (infinite) path e0,e1,…
Terminating version
Non-terminating version
Discounted version
Fixed duration games easily solved using dynamic programming
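The dynamic programming for fixed duration games can be sketched as follows. The toy game, its weights, and the function name are hypothetical, and the sketch assumes that RAND vertices pick an outgoing edge uniformly at random and that the payoff is the sum of the weights of the edges traversed.

```python
from fractions import Fraction

# Hypothetical toy game: each vertex has a kind and a list of
# (edge weight, successor) pairs.
GAME = {
    "a": ("MAX", [(2, "b"), (0, "c")]),
    "b": ("MIN", [(-1, "a"), (3, "c")]),
    "c": ("RAND", [(4, "a"), (-2, "b")]),
}

def fixed_duration_values(game, steps):
    """Value of every vertex when the game lasts exactly `steps` moves."""
    values = {v: Fraction(0) for v in game}          # zero moves left
    for _ in range(steps):
        new_values = {}
        for v, (kind, edges) in game.items():
            # Payoff of each edge: its weight plus the value of the shorter game.
            outcomes = [w + values[t] for w, t in edges]
            if kind == "MAX":
                new_values[v] = max(outcomes)
            elif kind == "MIN":
                new_values[v] = min(outcomes)
            else:                                     # RAND: uniform average
                new_values[v] = sum(outcomes) / len(outcomes)
        values = new_values
    return values

print(fixed_duration_values(GAME, 2))
```

Each round costs time linear in the number of edges, so a game of duration t is solved in O(t · |E|) arithmetic operations.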
![Page 6: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/6.jpg)
Simple Stochastic games (SSGs)
Graphic representation – example
[Figure: example game graph with a marked start vertex and MAX (M), min (m) and RAND (R) vertices]
![Page 7: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/7.jpg)
Simple Stochastic games (SSGs)
Reachability version [Condon (1992)]
[Figure legend: M = MAX vertex, m = min vertex, R = RAND vertex; two terminal vertices, the 0-sink and the 1-sink]
Objective: Max / Min the prob. of getting to the 1-sink
Technical assumption: Game halts with prob. 1
No weights
All prob. are ½
![Page 8: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/8.jpg)
Simple Stochastic games (SSGs)
Basic properties
Every vertex in the game has a value v
Both players have positional optimal strategies
Positional strategy for MAX: choice of an outgoing edge from each MAX vertex
Decision version: Is the value ≥ v?
![Page 9: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/9.jpg)
“Solving” binary SSGs
The values vi of the vertices of a game are the unique solution of the following equations:
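The equations did not survive in this transcript. For a binary reachability SSG in which each non-sink vertex i has two successors j and k and all probabilities are ½, the standard system (given here as an assumption, not a copy of the slide) would read:

```latex
v_i =
\begin{cases}
\max\{v_j, v_k\} & \text{if } i \text{ is a MAX vertex,}\\
\min\{v_j, v_k\} & \text{if } i \text{ is a min vertex,}\\
\tfrac{1}{2}\,(v_j + v_k) & \text{if } i \text{ is a RAND vertex,}\\
0 & \text{if } i \text{ is the 0-sink,}\\
1 & \text{if } i \text{ is the 1-sink.}
\end{cases}
```

Uniqueness of the solution relies on the technical assumption that the game halts with probability 1.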
Corollary: Decision version in NP ∩ co-NP
The values are rational numbers requiring only a linear number of bits
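As a small illustration of rational values, the sketch below computes the values of a hypothetical acyclic binary SSG by repeatedly applying the usual local conditions: maximum at MAX vertices, minimum at min vertices, average at RAND vertices, and 0 or 1 at the sinks. The game and all names are assumptions made for the example. Because the toy game is acyclic, a few rounds reach the exact values; in general this iteration only converges in the limit.

```python
from fractions import Fraction

# Hypothetical toy game: kind and the two successors of each non-sink vertex.
GAME = {
    "x": ("MAX", ("r1", "r2")),
    "y": ("MIN", ("r1", "r2")),
    "r1": ("RAND", ("one", "zero")),
    "r2": ("RAND", ("one", "r1")),
}
SINKS = {"zero": Fraction(0), "one": Fraction(1)}

def update(values):
    """Apply the local max / min / average condition once at every vertex."""
    new_values = dict(SINKS)
    for v, (kind, (j, k)) in GAME.items():
        if kind == "MAX":
            new_values[v] = max(values[j], values[k])
        elif kind == "MIN":
            new_values[v] = min(values[j], values[k])
        else:  # RAND: both successors have probability 1/2
            new_values[v] = (values[j] + values[k]) / 2
    return new_values

values = {**SINKS, **{v: Fraction(0) for v in GAME}}
for _ in range(len(GAME)):   # enough rounds for this acyclic example
    values = update(values)
print(values)
```

The resulting values are a fixed point of `update`, and every one of them is a fraction with a small denominator.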
![Page 10: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/10.jpg)
Markov Decision Processes (MDPs)
Theorem [Derman (1970)]: Values and optimal strategies of an MDP can be found by solving an LP
[Figure legend: M = MAX vertex, m = min vertex, R = RAND vertex]
![Page 11: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/11.jpg)
NP ∩ co-NP – Another proof
Deciding whether the value of a game is at least (at most) v is in NP ∩ co-NP
To show that the value is ≥ v, guess an optimal strategy for MAX
Find an optimal counter-strategy for min by solving the resulting MDP.
Is the problem in P ?
![Page 12: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/12.jpg)
Mean Payoff Games (MPGs) [Ehrenfeucht, Mycielski (1979)]
[Figure legend: M = MAX vertex, m = min vertex, R = RAND vertex]
Non-terminating version
Discounted version
MPGs → Reachability SSGs (PZ’96)
Pseudo-polynomial algorithm (PZ’96)
![Page 13: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/13.jpg)
Mean Payoff Games (MPGs) [Ehrenfeucht, Mycielski (1979)]
Value – average of the cycle
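A minimal sketch of this statement: once both players fix positional strategies, every vertex has a single chosen outgoing edge, so the play from any start vertex eventually enters a cycle, and the mean payoff equals the average weight on that cycle. The toy graph and the function name are hypothetical.

```python
from fractions import Fraction

# Hypothetical toy game after both players fixed positional strategies:
# each vertex maps to its single chosen edge, given as (weight, successor).
CHOSEN_EDGE = {
    "s": (5, "a"),
    "a": (1, "b"),
    "b": (-4, "c"),
    "c": (5, "a"),
}

def mean_payoff(chosen_edge, start):
    """Average edge weight of the cycle that the play from `start` ends in."""
    first_visit = {}   # vertex -> number of edges traversed before reaching it
    weights = []       # weights of the edges traversed so far
    vertex = start
    while vertex not in first_visit:
        first_visit[vertex] = len(weights)
        weight, vertex = chosen_edge[vertex]
        weights.append(weight)
    # The play has returned to `vertex`; everything since its first visit is the cycle.
    cycle = weights[first_visit[vertex]:]
    return Fraction(sum(cycle), len(cycle))

print(mean_payoff(CHOSEN_EDGE, "s"))
```

The finite prefix leading into the cycle (the edge of weight 5 out of `s` here) does not affect the long-run average.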
![Page 14: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/14.jpg)
Parity Games (PGs)
[Figure: EVEN and ODD vertices; the numbers on the vertices (3 and 8 in the example) are priorities]
EVEN wins if the largest priority seen infinitely often is even
Equivalent to many interesting problems in automata and verification:
Non-emptiness of ω-tree automata
Modal μ-calculus model checking
![Page 15: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/15.jpg)
Parity Games (PGs)
[Figure: EVEN and ODD vertices with priorities 3 and 8]
Reduction from Parity Games to Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)]
Change priority k to payoff (−n)^k
Move payoffs to outgoing edges
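The reduction can be sanity-checked with a short sketch. It assumes that n is the number of vertices and that a play is decided by a simple cycle of length at most n. On such a cycle the largest priority dominates the sum of the payoffs (−n)^k, so the sign of the cycle's total payoff matches the parity of its largest priority. The function names are illustrative only.

```python
from itertools import product

def priority_to_payoff(priority, n):
    """Payoff given to a vertex of this priority in an n-vertex game."""
    return (-n) ** priority

def even_wins_parity(cycle_priorities):
    """Parity condition: the largest priority on the cycle is even."""
    return max(cycle_priorities) % 2 == 0

def even_wins_mean_payoff(cycle_priorities, n):
    """Mean payoff condition: the total (hence the average) payoff is positive."""
    total = sum(priority_to_payoff(p, n) for p in cycle_priorities)
    return total > 0

# Exhaustive check for n = 4: every cycle of length 1..4 with priorities 0..5.
n = 4
agree = all(
    even_wins_parity(cycle) == even_wins_mean_payoff(cycle, n)
    for length in range(1, n + 1)
    for cycle in product(range(6), repeat=length)
)
print(agree)
```

Because payoffs grow like n^k, they need about k·log n bits each, so the reduction remains polynomial even though the resulting payoffs are exponentially large.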
![Page 16: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/16.jpg)
Simple Stochastic games (SSGs)
Additional properties
An SSG is said to be binary if the outdegree of every non-sink vertex is 2
A switch is a change of a strategy at a single vertex
A strategy is optimal iff no switch is profitable
A switch is profitable for MAX if it increases the value of the game (sum of values of all vertices)
![Page 17: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/17.jpg)
A randomized subexponential algorithm for binary SSGs
[Ludwig (1995)] [Kalai (1992), Matousek-Sharir-Welzl (1992)]
Start with an arbitrary strategy σ for MAX
Choose a random vertex i ∈ V_MAX
Find the optimal strategy σ′ for MAX in the game in which the only outgoing edge from i is (i, σ(i))
If switching σ′ at i is not profitable, then σ′ is optimal
Otherwise, let σ ← (σ′)^i, the strategy σ′ switched at i, and repeat
![Page 18: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/18.jpg)
A randomized subexponential algorithm for binary SSGs
[Ludwig (1995)] [Kalai (1992), Matousek-Sharir-Welzl (1992)]
There is a hidden order of MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1,2,…,i
[Figure: the MAX vertices in this order; vertices 1,2,…,i are marked "All correct! Would never be switched!"]
![Page 19: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/19.jpg)
Exponential algorithm for PGs [McNaughton (1993)] [Zielonka (1998)]
[Figure: the set A of vertices of highest priority (even); the vertices from which EVEN can force the game to enter A; the regions handled by the first and second recursive calls]
In the worst case, both recursive calls are on games of size n−1
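The recursive scheme can be sketched in Python as follows. This is a textbook-style rendering of the recursive algorithm and not code from the talk. It assumes players are encoded as 0 (EVEN) and 1 (ODD) and that every vertex has at least one successor; the toy game and all names are hypothetical.

```python
def attractor(vertices, owner, succ, target, player):
    """Vertices of the subgame from which `player` can force a visit to `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            nexts = [w for w in succ[v] if w in vertices]
            if owner[v] == player:
                pulled = any(w in attr for w in nexts)   # player picks the edge
            else:
                pulled = all(w in attr for w in nexts)   # opponent cannot avoid
            if pulled:
                attr.add(v)
                changed = True
    return attr

def zielonka(vertices, owner, priority, succ):
    """Return (W0, W1), the winning regions of EVEN (0) and ODD (1)."""
    if not vertices:
        return set(), set()
    top = max(priority[v] for v in vertices)
    player = top % 2                                     # who likes the top priority
    top_vertices = {v for v in vertices if priority[v] == top}
    a = attractor(vertices, owner, succ, top_vertices, player)
    won = list(zielonka(vertices - a, owner, priority, succ))      # first call
    if not won[1 - player]:
        won[player], won[1 - player] = set(vertices), set()
        return tuple(won)
    opp = attractor(vertices, owner, succ, won[1 - player], 1 - player)
    won = list(zielonka(vertices - opp, owner, priority, succ))    # second call
    won[1 - player] = won[1 - player] | opp
    return tuple(won)

# Hypothetical toy game: EVEN owns "a"; ODD owns "b" and "c".
OWNER = {"a": 0, "b": 1, "c": 1}
PRIORITY = {"a": 2, "b": 1, "c": 3}
SUCC = {"a": ["b", "a"], "b": ["a", "c"], "c": ["c", "b"]}
W0, W1 = zielonka(set(OWNER), OWNER, PRIORITY, SUCC)
print(W0, W1)
```

In the toy game EVEN wins from `a` by looping on priority 2, while ODD wins from `b` and `c` by steering the play into the loop at `c`, which has priority 3.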
![Page 20: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/20.jpg)
Deterministic subexponential algorithm for PGs [Jurdzinski, Paterson, Z (2006)]
[Figure: a dominion and the region handled by the second recursive call]
Idea: Look for small dominions!
Dominion: a (small) set from which one of the players can win without the play ever leaving this set
Dominions of size s can be found in O(n^s) time
![Page 21: Uri Zwick Tel Aviv University](https://reader035.fdocuments.in/reader035/viewer/2022062322/5681433c550346895dafb189/html5/thumbnails/21.jpg)
Open problems
● Polynomial algorithms?
● Faster subexponential algorithms for parity games?
● Deterministic subexponential algorithms for MPGs and SSGs?
● Faster pseudo-polynomial algorithms for MPGs?