Skill and Billiards: Game Theory in Complex Domains
Chris Archibald
Department of Computing Science
University of Alberta
January 31, 2013
Chris Archibald (Alberta CS) Skill and Billiards January, 31, 2013 1 / 44
AI and Game Theory
Artificial Intelligence: Rational decision-making
Game Theory: Strategic rational decision-making
Game Theory
Matching Pennies
        b1      b2
a1     1,-1    -1,1
a2    -1,1     1,-1

(mixed) strategy: a probability distribution over actions, e.g. σA = (0.9, 0.1)
best response: a strategy which yields the highest expected payoff against a given opponent strategy, e.g. br(σA) = (0.0, 1.0), since
- b1 ↦ (−1 ∗ 0.9) + (1 ∗ 0.1) = −0.8
- b2 ↦ (+1 ∗ 0.9) − (1 ∗ 0.1) = +0.8
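
The best-response calculation above can be reproduced in a few lines. This is a minimal sketch, not code from the talk: the names `PAYOFF_B`, `expected_payoffs_b` and `best_response_b` are made up for illustration, and the matrix holds the second number of each cell in the Matching Pennies table (player B's payoff).

```python
# Payoffs to the column player B in Matching Pennies, indexed [a][b].
# The row player A receives the negation, since the game is zero-sum.
PAYOFF_B = [
    [-1.0, 1.0],   # A plays a1
    [1.0, -1.0],   # A plays a2
]


def expected_payoffs_b(sigma_a):
    """Expected payoff to B of each pure action (b1, b2) against sigma_a."""
    return [
        sum(p * PAYOFF_B[a][b] for a, p in enumerate(sigma_a))
        for b in range(2)
    ]


def best_response_b(sigma_a):
    """A pure best response for B, returned as a probability vector."""
    values = expected_payoffs_b(sigma_a)
    best = max(range(2), key=lambda b: values[b])
    return [1.0 if b == best else 0.0 for b in range(2)]


values = expected_payoffs_b([0.9, 0.1])   # roughly [-0.8, 0.8]
response = best_response_b([0.9, 0.1])    # puts all weight on b2
```

Against σA = (0.9, 0.1) the two expected payoffs match the −0.8 and +0.8 on the slide, so B's best response puts all probability on b2.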
Game Theory: Equilibrium
Matching Pennies
        b1      b2
a1     1,-1    -1,1
a2    -1,1     1,-1

(Nash) equilibrium: pair of strategies (σA, σB), such that
- σA = br(σB)
- σB = br(σA)

In Matching Pennies, the (only) equilibrium is ((0.5, 0.5), (0.5, 0.5)):
- b1 ↦ (−1 ∗ 0.5) + (1 ∗ 0.5) = 0.0
- b2 ↦ (+1 ∗ 0.5) − (1 ∗ 0.5) = 0.0
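
One way to check the equilibrium claim numerically is to test whether either player can gain by switching to a pure action. The sketch below assumes that standard deviation test; `PAYOFF_A`, `payoff_a` and `is_equilibrium` are illustrative names, not part of the talk.

```python
# A's payoffs in Matching Pennies, indexed [a][b]; B receives the negation.
PAYOFF_A = [
    [1.0, -1.0],
    [-1.0, 1.0],
]


def payoff_a(sigma_a, sigma_b):
    """Expected payoff to A when both players use mixed strategies."""
    return sum(
        pa * pb * PAYOFF_A[a][b]
        for a, pa in enumerate(sigma_a)
        for b, pb in enumerate(sigma_b)
    )


def is_equilibrium(sigma_a, sigma_b, tol=1e-9):
    """True if neither player gains by deviating to a pure action."""
    current = payoff_a(sigma_a, sigma_b)
    pure = ([1.0, 0.0], [0.0, 1.0])
    # A wants to raise A's payoff; B wants to lower it (zero-sum).
    a_can_gain = any(payoff_a(d, sigma_b) > current + tol for d in pure)
    b_can_gain = any(payoff_a(sigma_a, d) < current - tol for d in pure)
    return not (a_can_gain or b_can_gain)


uniform_ok = is_equilibrium([0.5, 0.5], [0.5, 0.5])
skewed_ok = is_equilibrium([0.9, 0.1], [0.0, 1.0])
```

Checking pure deviations is enough because a mixed deviation's payoff is a weighted average of the pure ones. The uniform pair passes, while the earlier pair (σA = (0.9, 0.1), B on b2) fails because A would rather switch to a2.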
Game Theory and AI
Milind Tambe
Game Theory and AI
Kit Chen & Michael Bowling
Today’s domain: billiards
8-ball
- Each player has 7 balls
- First to sink all 7 balls and then the 8-ball wins
- To keep turn, called ball must be sunk in called pocket
- Not striking own ball first or pocketing cue ball gives ball in hand to opponent

Computational pool
- Software agents compete in virtual game
- A deterministic physics simulator is used
- Noise from a known distribution added to shots
Why billiards?
- Continuous state space.
- Continuous action space.
- Actions taken at discrete times.
- Unique turn-taking structure.
- Results of actions are stochastic.
Outline
Game theory and billiards
  Modeling Billiards Games
  Chris Archibald and Yoav Shoham
  AAMAS 2009

AI and billiards
  Analysis of a Winning Computational Billiards Player
  Chris Archibald, Alon Altman, and Yoav Shoham
  IJCAI 2009

Skill and billiards
  Success, Strategy, and Skill: An Experimental Study
  Chris Archibald, Alon Altman, and Yoav Shoham
  AAMAS 2010
In search of a model
Does 8-ball have an equilibrium?
Results from previous game-theoretic models can’t be applied
- Fundamental dependence on finite number of states or actions
- Restrictions on payoff function which don’t match billiards

We need a model that is more specific to billiards.
A Two-player Zero-sum Billiards Game
Tuple (S, A, λ, p, s_0, C, r), where
- S ⊂ R^n is a compact n-dimensional state space.
- A ⊂ R^m is the compact m-dimensional action space. a_t ∈ A is the action chosen at time step t.
- λ : S ↦ {1, 2} is the turn function. λ(s_t) indicates the player whose turn it is to play in state s_t.
- p : S × A ↦ ∆(S) is the transition function, where ∆(S) is the set of all probability distributions over S.
- s_0 is the starting state. (λ(s_0) is the player who gets the first turn of the game.)
- C ⊆ S is the set of terminating states, which is a closed subset of the state space.
- r : C ↦ R is the reward function.
Required assumption 1
Assumption
$\int_S f(\cdot)\, dp(\cdot \mid s, a)$ is continuous in A for any $f \in B(S)$ and any $s \in S \setminus C$, where $B(S)$ is the set of all bounded real-valued functions on S.

Continuous functions on compact sets are guaranteed to have a maximum value and a minimum value.
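Main lemma

Lemma
Let $v^*(s)$ be the unique fixed point to the value iteration equation

$$
v'(s) =
\begin{cases}
\max_a \left[ \int_S v(\cdot)\, dp(\cdot \mid s, a) \right] & \text{if } \lambda(s) = 1 \\
\min_a \left[ \int_S v(\cdot)\, dp(\cdot \mid s, a) \right] & \text{if } \lambda(s) = 2
\end{cases}
$$

Then the strategies

$$
\sigma_1(s) = \arg\max_a \left[ \int_S v^*(\cdot)\, dp(\cdot \mid s, a) \right]
$$

and

$$
\sigma_2(s) = \arg\min_a \left[ \int_S v^*(\cdot)\, dp(\cdot \mid s, a) \right]
$$

form a stationary pure strategy Markov perfect Nash equilibrium in the game.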
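
To make the lemma concrete, here is a sketch of the same max/min value iteration on a tiny finite game. This is only an illustration: the talk's model has continuous state and action spaces, where the integral replaces the sum below. The states, actions, probabilities and rewards are invented, and the terminal rewards are simply held fixed.

```python
# Hypothetical finite stand-in for the continuous billiards game.
TERMINAL_REWARD = {"W1": 1.0, "W2": -1.0}   # r on terminal states C
TURN = {"s1": 1, "s2": 2}                   # lambda(s) for non-terminal states
# TRANSITIONS[state][action] -> {next_state: probability}
TRANSITIONS = {
    "s1": {
        "safe": {"W1": 0.6, "s2": 0.4},
        "risky": {"W1": 0.8, "W2": 0.2},
    },
    "s2": {
        "safe": {"W2": 0.5, "s1": 0.5},
        "risky": {"W2": 0.55, "W1": 0.45},
    },
}


def expected_value(v, dist):
    """Discrete analogue of the integral of v against p(.|s, a)."""
    return sum(prob * v[nxt] for nxt, prob in dist.items())


def value_iteration(tol=1e-12, max_iters=10_000):
    """Iterate the max/min update until the values stop changing."""
    v = {s: 0.0 for s in TRANSITIONS}
    v.update(TERMINAL_REWARD)
    for _ in range(max_iters):
        new_v = dict(v)
        for s, actions in TRANSITIONS.items():
            choose = max if TURN[s] == 1 else min
            new_v[s] = choose(expected_value(v, d) for d in actions.values())
        delta = max(abs(new_v[s] - v[s]) for s in TRANSITIONS)
        v = new_v
        if delta < tol:
            break
    return v


def greedy_strategy(v):
    """Player 1 picks an arg max, player 2 an arg min, with respect to v."""
    strategy = {}
    for s, actions in TRANSITIONS.items():
        choose = max if TURN[s] == 1 else min
        strategy[s] = choose(actions, key=lambda a: expected_value(v, actions[a]))
    return strategy


v_star = value_iteration()
sigma = greedy_strategy(v_star)
```

In this toy game the iteration settles at roughly v*(s1) = 0.6 and v*(s2) = −0.2, with player 1 choosing "risky" in s1 and player 2 choosing "safe" in s2. Convergence here relies on every action reaching a terminal state with positive probability. The continuous case needs the assumptions stated in the talk.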
Required assumptions 2 & 3
Main result
Theorem
If the transition function obeys all three assumptions, then a stationary pure strategy Markov perfect Nash equilibrium exists in billiards games.
8-ball has an equilibrium!
Outline
Game theory and billiards
  Modeling Billiards Games
  Chris Archibald and Yoav Shoham
  AAMAS 2009

AI and billiards
  Analysis of a Winning Computational Billiards Player
  Chris Archibald, Alon Altman, and Yoav Shoham
  IJCAI 2009

Skill and billiards
  Success, Strategy, and Skill: An Experimental Study
  Chris Archibald, Alon Altman, and Yoav Shoham
  AAMAS 2010
Computational pool
A shot is specified by five real-valued parameters: ϕ, θ, a, b, V
State evaluation
State value = (1.0 · 0.95) + (0.33 · 0.82) + (0.15 · 0.63)
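
The arithmetic on this slide is a weighted sum, which can be checked directly. The slide text does not label the factors; the sketch below assumes the first number in each product is a weight and the second an estimated success probability for a candidate shot.

```python
# (weight, estimated success probability) pairs copied from the slide.
# The interpretation of the two factors is an assumption.
terms = [(1.0, 0.95), (0.33, 0.82), (0.15, 0.63)]

state_value = sum(weight * probability for weight, probability in terms)
```

With these numbers the state value comes to about 1.3151 (0.95 + 0.2706 + 0.0945).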
Computational pool results
We entered CueCard in the 2008 Computational Pool Tournament in Beijing, China
- 20 CPUs used
- Each shot sampled 25-100 times

We won the gold medal, winning 82% of our games.
Of the games that we broke, our agent won almost 75% off-the-break, meaning the other agent never got a turn
Let’s watch a game.
Analysis of success
A sampling of the results
Component            Test                     Win %
                     20 CPUs vs 1 CPU         55 %
                     1 CPU vs Pickpocket      77 %
Break shot           CC|CC vs PP|CC           69 %
                     CC|PP vs PP|PP           65 %
Sampling/Clustering  1 CPU vs < 30 samples    61 %
Outline
Game theory and billiards
  Modeling Billiards Games
  Chris Archibald and Yoav Shoham
  AAMAS 2009

AI and billiards
  Analysis of a Winning Computational Billiards Player
  Chris Archibald, Alon Altman, and Yoav Shoham
  IJCAI 2009

Skill and billiards
  Success, Strategy, and Skill: An Experimental Study
  Chris Archibald, Alon Altman, and Yoav Shoham
  AAMAS 2010
Skill in games
Skill: the ability of a player to perform the mental and physical tasks necessary to succeed in a particular game or undertaking

Billiards exposes two facets of skill: strategic and execution

Strategic skill in billiards
The method an agent uses to select a shot, (ϕ, θ, a, b, V), for execution

Execution skill in billiards
The amount and type of noise added to the shot parameters by the server before it is executed in the game
Motivation
Main questions:
- How do different agents respond to changing execution skill?
- At which execution skill level is strategic skill most important?
- To identify most strategically skilled agent, what execution skill level should be used?
  - Does it matter?
Experimental setup
Idea:
Vary both strategic and execution skill, see how success in computational pool is impacted

To vary strategic skill:
- Four different agent strategies
- Vary computing time

To vary execution skill:
- Vary the noise added
Experimental setup: varying strategic skill
Four different agents
- CueCard (CC)
  - 2008 Champion (Single CPU)
- SingleLevel (SL)
  - One level look-ahead
- OptimisticPlanner (OP)
  - Noiseless shot planner
  - Success estimation with lookup table
- MachineGunner (MG)
  - Random trial and error
  - Noise robustness (50 samples)
  - No state evaluation
Experimental setup: varying strategic skill
Varying computation time
- Each agent has a time limit
- More computing time ⇒ more shots to consider ⇒ improvement in chosen shot
- Time limits between 2 minutes and 6 minutes per game
- Consistent time management
Experimental setup: varying execution skill
Varying the noise distribution
- Independent zero-mean Gaussian distribution for each shot parameter
- Scale all standard deviations by the same factor between 0 and 5
  - 0 = perfect execution skill
  - 5 = very poor execution skill
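
The noise model described above could be sketched as follows. The five parameter names come from the talk, but the `Shot` class, the `perturb` function and every value in `BASE_SIGMA` are hypothetical; the talk does not list the actual standard deviations.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    phi: float
    theta: float
    a: float
    b: float
    V: float


# Hypothetical base standard deviations, one per shot parameter.
BASE_SIGMA = Shot(phi=0.1, theta=0.1, a=0.5, b=0.5, V=0.05)


def perturb(shot, noise_factor, rng):
    """Add independent zero-mean Gaussian noise to each shot parameter.

    noise_factor scales every standard deviation: 0 means perfect
    execution, 5 means very poor execution.
    """
    if not 0.0 <= noise_factor <= 5.0:
        raise ValueError("noise_factor must lie between 0 and 5")

    def noisy(value, sigma):
        return value + rng.gauss(0.0, noise_factor * sigma)

    return Shot(
        phi=noisy(shot.phi, BASE_SIGMA.phi),
        theta=noisy(shot.theta, BASE_SIGMA.theta),
        a=noisy(shot.a, BASE_SIGMA.a),
        b=noisy(shot.b, BASE_SIGMA.b),
        V=noisy(shot.V, BASE_SIGMA.V),
    )


rng = random.Random(0)
intended = Shot(phi=45.0, theta=5.0, a=0.0, b=0.0, V=2.0)
perfect = perturb(intended, 0.0, rng)   # identical to the intended shot
sloppy = perturb(intended, 5.0, rng)    # each parameter is displaced
```

Scaling all five standard deviations by one factor gives a single knob for execution skill, which is what makes it easy to vary in an experiment.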
Experimental setup: the experiment
Each agent participated in the same process to generate our data:
1. Randomly generate:
   - Noise level
   - Time limit
2. Agent breaks
   - Single break shot
   - Rebreak until successful
3. Agent continues game
4. Win-off-the-break?
5. Repeat around 20,000 times
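
The data-generation process above might look roughly like this loop. It is a sketch under stated assumptions: the talk says the noise level and time limit are generated randomly but not from which distribution, so uniform sampling is assumed, and `toy_agent` is a made-up stand-in for breaking and playing out a simulated game.

```python
import random


def run_trial(agent_plays, rng):
    """One trial: draw settings, let the agent break and play, record the result."""
    noise_level = rng.uniform(0.0, 5.0)   # scale-factor range from the talk
    time_limit = rng.uniform(2.0, 6.0)    # minutes per game, range from the talk
    won = agent_plays(noise_level, time_limit, rng)
    return {"noise": noise_level, "time": time_limit, "won_off_break": won}


def collect(agent_plays, trials, seed=0):
    """Repeat the trial many times with a reproducible random stream."""
    rng = random.Random(seed)
    return [run_trial(agent_plays, rng) for _ in range(trials)]


def toy_agent(noise_level, time_limit, rng):
    """Hypothetical placeholder for a real agent and physics simulator."""
    p_win = max(0.0, 0.9 - 0.15 * noise_level)
    return rng.random() < p_win


records = collect(toy_agent, trials=2000)
```

Each record pairs the sampled settings with a win-off-the-break flag, which is the shape of data that a per-agent contour over noise level and time limit needs.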
Experimental results: raw data
Example raw data shown for CueCard agent.
(g) CueCard’s wins (h) CueCard’s non-wins
Experimental results: processed data
(i) Contour for CC (j) Contour for SL
(k) Contour for OP (l) Contour for MG
The value of execution skill
Maximum win-off-the-break percentage for each agent
The value of strategic skill: time
Question
Does extra computation time benefit an agent the same amount at each noise level?
To answer this, we looked at the difference in win-off-the-breakpercentage that extra time made for each agent at each noise level.
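
The talk does not say how this difference was computed. One plausible reading, sketched below, bins games by noise level and subtracts the win rate of short-time games from that of long-time games in each bin. The function name, the bin width, the 4-minute split and the handful of records are all hypothetical.

```python
from collections import defaultdict


def win_rate_gain_from_time(records, noise_bin_width=1.0, time_split=4.0):
    """Per noise bin: win rate with time >= time_split minus win rate below it."""
    # Each entry holds [wins, games] for the long-time and short-time groups.
    tallies = defaultdict(lambda: {"long": [0, 0], "short": [0, 0]})
    for noise, time_limit, won in records:
        noise_bin = int(noise // noise_bin_width)
        group = "long" if time_limit >= time_split else "short"
        tallies[noise_bin][group][0] += int(won)
        tallies[noise_bin][group][1] += 1

    gains = {}
    for noise_bin, groups in sorted(tallies.items()):
        (long_wins, long_games) = groups["long"]
        (short_wins, short_games) = groups["short"]
        if long_games and short_games:   # skip bins missing one group
            gains[noise_bin] = long_wins / long_games - short_wins / short_games
    return gains


# Hand-made illustrative records, not data from the experiment:
# (noise level, time limit in minutes, won off the break)
records = [
    (0.2, 5.0, True), (0.4, 5.5, True), (0.3, 2.5, True), (0.6, 3.0, True),
    (2.1, 5.0, True), (2.7, 4.5, False), (2.3, 2.0, False), (2.9, 3.5, False),
]
gains = win_rate_gain_from_time(records)
```

On these made-up records, extra time changes nothing in the lowest noise bin and adds 0.5 to the win rate in the bin covering noise levels 2 to 3.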
The value of strategic skill: time
Difference in win-off-the-break percentage for each agent due to time
Conclusion
Superior strategic skill is most identifiable when agents have imperfectexecution skill
Other Interests
Execution skill and game theory
  Hustling in Repeated Zero-Sum Games with Imperfect Information
  Chris Archibald and Yoav Shoham
  IJCAI 2011

Search in densely stochastic domains
  Sparse Sampling for Adversarial Games
  Marc Lanctot, Abdallah Saffidine, Joel Veness, and Chris Archibald
  Computer Games Workshop 2012, expanded version under submission.

Agent evaluation
  Baseline: Practical Control Variates for Agent Evaluation
  Josh Davidson, Chris Archibald, and Michael Bowling
  AAMAS 2013 (to appear)

  Rating Players in Games with Real-Valued Outcomes (EXTENDED ABSTRACT)
  Chris Archibald, Matthew Rutherford, Neil Burch, and Michael Bowling
  AAMAS 2013 (to appear)

  Automating Collusion Detection in Sequential Games
  Parisa Mazrooei, Chris Archibald, and Michael Bowling
  under submission
Thank you!
Questions?
Chris [email protected]