COEVOLVING ROBUST STRATEGIES FOR Real-Time Strategy Games
Evolutionary Computing Systems Lab (ECSL), University of Nevada, Reno
COEVOLVING ROBUST STRATEGIES FOR REAL-TIME STRATEGY GAMES
Christopher Ballinger http://www.cse.unr.edu/~caballinger
Outline
Artificial Intelligence
Game AI
Board Games
RTS Games
StarCraft
WaterCraft
Motivation
Prior Work
Methodology
Evolutionary Methods
Representation
Encoding
AI Behavior
Current Progress
Conclusions
Future Work
Artificial Intelligence
(Broadly) understanding and building intelligent agents. (Russell & Norvig)
Intelligent Agent
An autonomous entity which observes through sensors, acts upon an environment using actuators, and directs its activity towards achieving goals. (Russell & Norvig)
Computational Intelligence
A set of nature-inspired computational methodologies and approaches to address complex problems. (Kahraman)
Game AI
Decision-making process of computer-controlled opponents/NPCs (Ponsen et al.)
Board Games
Present challenging problems
Complex state space
Adversarial planning
Checkers State Space - (10^20)
Chess State Space - (10^50)
Go State Space - (10^170)
A lot of AI research in the past used board games
Board Game AIs play competitively against humans
RTS games present even more difficult challenges
Real-Time Strategy
Much more complex than board games
State space is orders of magnitude larger: (10^50)^36,000 to (10^200)^36,000 for an entire game match
MUCH more than the number of protons in the observable universe

RTS Games              Board Games
Simultaneous Moves     Turn-based Moves
Durative Actions       Instant Actions
Partially Observable   Fully Observable
Non-deterministic      Deterministic
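The size comparison on this slide can be checked with exponent arithmetic. This is a minimal sketch, assuming the slide's figures read as (10^50)^36,000 and (10^200)^36,000. The figure of roughly 10^80 protons is a commonly quoted order of magnitude and is not from the slides.

```python
# Work in log10 space: (10**a)**n == 10**(a * n), so only exponents are needed.
OUTER_EXPONENT = 36_000               # outer exponent shown on the slide
LOW_LOG10 = 50 * OUTER_EXPONENT       # log10 of (10^50)^36,000
HIGH_LOG10 = 200 * OUTER_EXPONENT     # log10 of (10^200)^36,000
PROTONS_LOG10 = 80                    # assumption: ~10^80 protons, not from the slides

print(f"lower bound: 10^{LOW_LOG10:,}")
print(f"upper bound: 10^{HIGH_LOG10:,}")
print(f"lower bound / protons: 10^{LOW_LOG10 - PROTONS_LOG10:,}")
```

Even the lower bound has an exponent in the millions, while Go's state space on the previous slide has an exponent of 170.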
Real-Time Strategy
Several categories of challenges for Game AI
Resource Management
Decisions under uncertainty
Spatial/Temporal reasoning
Collaboration
Opponent modeling/learning
Adversarial real-time planning
Remains a challenge for AI, but not an impossible problem
Human players are capable of overcoming these challenges
Humans can adapt to these difficult challenges so well, professional RTS players can make a living playing "e-sports"
Most well known pro-league is for StarCraft
StarCraft
Objectives:
Manage economy
To build army
Many types of units
Each type has strengths and weaknesses
Getting the right mix is key
Research upgrades/abilities
To destroy enemy
StarCraft
Development Problems
StarCraft
3rd-Party API can be used for AI development
Runs (relatively) slow
Hard to run multiple instances in parallel
StarCraft II
No API
WaterCraft†
Modeled after StarCraft II
Easy to run in parallel
Runs quicker by disabling graphics
† Source code can be found on Christopher Ballinger's website
Motivation
RTS games are good testbeds for AI research
Present many challenging aspects
Intransitive relationships between strategies, similar to rock-paper-scissors
No one optimal strategy
Robustness of strategies
We believe designing a good RTS game player will advance AI research significantly (like chess and checkers did)
[Figure: diagram of three strategies S1, S2, and S3]
Previous Work
Case-based reasoning (Ontañón, 2006)
Genetic Algorithms + Case-based Reasoning (Miles, 2005)
Reinforcement Learning (Spronck, 2007)
Studies on specific aspects
Combat (Churchill, 2012)
Economy (Chan, 2007)
Coordination (Keaveney, 2011)
Case-Injection into population (Miles & Louis, 2005)
What we've done
Focus on build-orders (our 'strategy')
Robustness against multiple opponents
Defeat known/common strategies
Case-injection
Compare two evolutionary methods
Genetic Algorithm (GA)
Coevolutionary Algorithm (CA)
Evolutionary Methods
Terminology
Chromosome
A possible solution
Population
A set of chromosomes
Typically, initial population contains completely random chromosomes
Fitness
A measure of how well a solution/chromosome solves a problem
Generation
One iteration of the process; we repeat it for a set number of generations
[Figure: the evolutionary cycle (Evaluation, Selection, Crossover, Mutation, New Population) applied to a population of chromosomes C0 ... Cn; a plot of fitness over time; an example chromosome C0 = 1 0 0 1 0, where each position is a gene with alleles 0 and 1]
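The terminology can be made concrete with a small sketch of a random initial population of bitstring chromosomes. The function names are hypothetical. The size of 50 and the length of 15 bits are borrowed from the parameter and experiment slides later in the deck.

```python
import random

def random_chromosome(length, rng):
    """Return a chromosome: a list of `length` genes, each with allele 0 or 1."""
    return [rng.randint(0, 1) for _ in range(length)]

def random_population(size, length, rng):
    """Return `size` completely random chromosomes."""
    return [random_chromosome(length, rng) for _ in range(size)]

rng = random.Random(0)  # seeded so the sketch is repeatable
population = random_population(size=50, length=15, rng=rng)
print(len(population), "chromosomes of", len(population[0]), "bits each")
```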
Evaluation
Assign a fitness to all chromosomes in the population
[Figure: each chromosome C0 ... Cn in the population is passed to the evaluator, which evaluates it and assigns a fitness F0 ... Fn]
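A minimal sketch of the evaluation step. In the deck the evaluator is a game match in WaterCraft; here `count_ones` is a hypothetical stand-in so the example runs on its own.

```python
def evaluate_population(population, evaluator):
    """Return one fitness value per chromosome, in population order."""
    return [evaluator(chromosome) for chromosome in population]

def count_ones(chromosome):
    """Hypothetical toy evaluator: fitness is the number of 1 bits."""
    return sum(chromosome)

population = [[1, 0, 0, 1, 0], [1, 0, 1, 0, 0], [1, 0, 1, 1, 1]]
fitnesses = evaluate_population(population, count_ones)
print(fitnesses)  # [2, 2, 4]
```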
Selection
A method for choosing which chromosomes take part in crossover, and how often.
Roulette Wheel Selection
[Figure: Roulette Wheel Selection over the fitness values F0, F1, F2, ... Fn]
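Roulette wheel selection picks each chromosome with probability proportional to its fitness. A minimal sketch, assuming non-negative fitness values; the function name is hypothetical. The parameter slide later in the deck lists CHC selection for the actual experiments, so this only illustrates the method named here.

```python
import random

def roulette_wheel_select(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)   # no usable fitness: fall back to uniform
    spin = rng.random() * total         # a point in [0, total)
    running = 0.0
    for chromosome, fitness in zip(population, fitnesses):
        running += fitness
        if spin < running:
            return chromosome
    return population[-1]               # guard against floating-point round-off

rng = random.Random(0)
population = ["C0", "C1", "C2"]
fitnesses = [1.0, 3.0, 0.0]
picks = [roulette_wheel_select(population, fitnesses, rng) for _ in range(1000)]
print({name: picks.count(name) for name in population})
```

With these fitness values C1 should be picked about three times as often as C0, and C2 should never be picked.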
Crossover
A method to exchange information between two chromosomes (parents), attempting to produce more effective chromosomes (children)
[Figure: a randomly selected index splits the bitstrings of Parent 1 and Parent 2, and the segments are exchanged to form Child 1 and Child 2]
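The "Randomly Select Index" step on this slide corresponds to one-point crossover, sketched below with a hypothetical function name. The parameter slide later lists Uniform Crossover for the experiments, so this illustrates the slide's diagram rather than the experimental setup.

```python
import random

def one_point_crossover(parent1, parent2, rng):
    """Swap the tails of two equal-length parents at a random index."""
    if len(parent1) != len(parent2) or len(parent1) < 2:
        raise ValueError("parents must have equal length of at least 2")
    point = rng.randint(1, len(parent1) - 1)   # keeps both segments non-empty
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

rng = random.Random(0)
parent1 = [1] * 10
parent2 = [0] * 10
child1, child2 = one_point_crossover(parent1, parent2, rng)
print(child1)
print(child2)
```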
Mutation
A method to make sure certain patterns/capabilities do not permanently go extinct in the entire population
[Figure: an example population of bitstrings (1 0 0 1 0, 1 0 1 0 0, 1 0 0 1 1, ..., 1 0 1 1 1) with a bit flipped between 0 and 1]
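A minimal sketch of bit-flip mutation, assuming each bit flips independently with a small probability. The function name and the rate are hypothetical; the slides do not state a mutation rate.

```python
import random

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]

rng = random.Random(0)
original = [1, 0, 0, 1, 0]
print(mutate(original, rate=0.1, rng=rng))
```

If every chromosome in a population has a 0 at some position, crossover alone can never produce a 1 there; mutation is what can reintroduce it.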
New Population
All Children become the Parents for the start of the next generation
[Figure: the old population (parents, with fitness F0 ... Fn) is replaced by the new population (children C0 ... Cn), and the cycle of Evaluation, Selection, Crossover, Mutation, and New Population repeats]
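Putting the steps together, one generation of the cycle might look like the sketch below. It is a toy: fitness is the built-in `sum` (number of 1 bits), selection is fitness-proportional, crossover is one-point, and the mutation rate is a hypothetical value. None of these choices are claimed to match the deck's experiments.

```python
import random

def next_generation(population, fitness_fn, rng, mutation_rate=0.01):
    """Run Evaluation, Selection, Crossover, Mutation; return the children."""
    fitnesses = [fitness_fn(c) for c in population]               # Evaluation
    weights = [f + 1e-9 for f in fitnesses]                       # avoid all-zero weights
    children = []
    while len(children) < len(population):
        p1, p2 = rng.choices(population, weights=weights, k=2)    # Selection
        point = rng.randint(1, len(p1) - 1)                       # Crossover
        for child in (p1[:point] + p2[point:], p2[:point] + p1[point:]):
            mutated = [1 - b if rng.random() < mutation_rate else b
                       for b in child]                            # Mutation
            children.append(mutated)
    return children[:len(population)]                             # New Population

rng = random.Random(1)
population = [[rng.randint(0, 1) for _ in range(15)] for _ in range(50)]
for generation in range(50):
    # All children become the parents of the next generation.
    population = next_generation(population, sum, rng)
print("best fitness after 50 generations:", max(sum(c) for c in population))
```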
Evolutionary Methods
Differences between our GA and CA?
GA: Population plays against the same hand-tuned baselines every generation
[Figure: in Generations 1, 2, and 3, the population C0 ... Cn plays against the same teachset (evaluators): Baseline 1, Baseline 2, Baseline 3]
CA: Population plays against chromosomes from previous generations
[Figure: in Generations 1, 2, and 3, the population C0 ... Cn plays against a teachset (evaluators) of Parent 1, Parent 2, Parent 3]
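The difference between the two methods comes down to where the teachset comes from each generation. A simplified sketch with hypothetical function names: the GA teachset is fixed, while the CA teachset here is a plain random sample of the previous generation. The deck's CA also uses a Hall of Fame and shared fitness, which this sketch leaves out.

```python
import random

def ga_teachset(baselines):
    """GA: the same hand-tuned baselines every generation."""
    return list(baselines)

def ca_teachset(previous_population, size, rng):
    """CA (simplified): opponents drawn from the previous generation."""
    return rng.sample(previous_population, k=min(size, len(previous_population)))

rng = random.Random(0)
baselines = ["Baseline 1", "Baseline 2", "Baseline 3"]
parents = [f"Parent {i}" for i in range(1, 11)]

ga_sets = [ga_teachset(baselines) for _ in range(3)]
ca_set = ca_teachset(parents, size=3, rng=rng)
print(ga_sets[0], ca_set)
```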
Evolutionary Methods
Identical parameters
Pop. Size 50, 50 generations
Scaled Fitness, CHC selection, Uniform Crossover, Mutation
Teachset
Shared Fitness
$f_i^{shared} = \sum_{j \in D_i} \left[ F_{ij} \left( \frac{1}{l_j} \right) \right]$
CA
Teachset (8 Opponents)
Hall of Fame (HOF)
Shared Selection
GA
Teachset (3 Opponents)
Opponents never change
Baseline strategies
$F_{ij} = SR_i + 2 \sum_{k \in UD_j} UC_k + 3 \sum_{k \in BD_j} BC_k$
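The two fitness formulas on this slide can be sketched in code under one plausible reading of the symbols. The reading is an assumption, since the slide does not define them: SR_i is the resources spent by player i, UC_k and BC_k are the costs of opponent j's destroyed units and buildings, D_i is the set of opponents that i defeated, and l_j is the number of losses recorded for opponent j. All numbers below are made up for illustration.

```python
def match_fitness(spent_resources, destroyed_unit_costs, destroyed_building_costs):
    """F_ij = SR_i + 2 * sum(UC_k) + 3 * sum(BC_k), under the assumed reading."""
    return (spent_resources
            + 2 * sum(destroyed_unit_costs)
            + 3 * sum(destroyed_building_costs))

def shared_fitness(scores_vs_defeated, losses):
    """f_i^shared = sum over defeated opponents j of F_ij / l_j.

    scores_vs_defeated maps each defeated opponent j to F_ij;
    losses maps each opponent j to l_j (at least 1 for any defeated opponent).
    """
    return sum(score / losses[opponent]
               for opponent, score in scores_vs_defeated.items())

f_a = match_fitness(1000, [50, 50], [150])   # 1000 + 2*100 + 3*150
shared = shared_fitness({"A": f_a, "B": 900}, {"A": 1, "B": 3})
print(f_a, shared)  # 1650 1950.0
```

Dividing by l_j means a win over an opponent that few others beat counts for more than a win over an opponent that almost everyone beats.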
Metric - Baseline Build-Orders
Provide a diverse set of challenges
Fast Build
Quickly builds 5 Marines and attacks
Doesn't need much infrastructure
Medium Build
Builds 10 Marines and attacks
Slow Build
Builds 5 Vultures and attacks
Slow, requires a lot of infrastructure
Encoded as a chromosome
Representation - Encoding
Bitstring (example: 0 1 0 1 0 0)
3 bits per action
Decoded sequentially
Inserts required prerequisites

Bit Sequence   Action                 Prereq.
000-001        Build SCV (Minerals)   None
010            Build Marine           Barracks
011-100        Build Firebat          Barracks, Refinery, Academy
101            Build Vulture          Barracks, Refinery, Factory
110            Build SCV (Gas)        Refinery
111            Attack                 None
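The encoding table can be turned into a small decoder. This is a sketch with hypothetical names, assuming each prerequisite building is inserted once, the first time an action needs it; the slides say prerequisites are inserted but do not spell out that detail.

```python
# Bit patterns and prerequisites copied from the encoding table.
ACTIONS = {
    "000": "Build SCV (Minerals)", "001": "Build SCV (Minerals)",
    "010": "Build Marine",
    "011": "Build Firebat", "100": "Build Firebat",
    "101": "Build Vulture",
    "110": "Build SCV (Gas)",
    "111": "Attack",
}
PREREQS = {
    "Build Marine": ["Barracks"],
    "Build Firebat": ["Barracks", "Refinery", "Academy"],
    "Build Vulture": ["Barracks", "Refinery", "Factory"],
    "Build SCV (Gas)": ["Refinery"],
}

def decode(bits):
    """Decode a bitstring 3 bits at a time into an ordered action queue."""
    if len(bits) % 3 != 0:
        raise ValueError("bitstring length must be a multiple of 3")
    built = set()   # buildings already queued (assumption: each is built once)
    queue = []
    for start in range(0, len(bits), 3):
        action = ACTIONS[bits[start:start + 3]]
        for building in PREREQS.get(action, []):
            if building not in built:
                built.add(building)
                queue.append(f"Build {building}")
        queue.append(action)
    return queue

print(decode("010100"))
# ['Build Barracks', 'Build Marine', 'Build Refinery', 'Build Academy', 'Build Firebat']
```

The 6-bit example from the slide decodes to two actions, a Marine and a Firebat, and the decoder adds the three buildings they depend on.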
Representation - AI Behavior
Execute actions in the queue as quickly as possible
Do not skip ahead in the queue
"Attack" action
All Marines, Firebats, and Vultures move to attack opponent's Command Center
Attack any other opponent units/buildings along the way
If nearby ally-unit is attacked, assist it by attacking opponent's unit
If Command Center is attacked, send SCVs to defend
Once all threats have been eliminated, SCVs return to their tasks
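The "do not skip ahead" rule can be illustrated with a toy queue runner. Everything here is a simplification with hypothetical names and values: one resource, a fixed income per tick, illustrative costs, and no build times. It is not the WaterCraft implementation.

```python
from collections import deque

def run_queue(actions, costs, income_per_tick, max_ticks):
    """Return (tick, action) pairs in the order the actions were started."""
    queue = deque(actions)
    minerals = 0
    started = []
    for tick in range(max_ticks):
        minerals += income_per_tick
        # Only the head of the queue is considered, so a cheap action
        # further back must wait for an expensive one in front of it.
        while queue and costs[queue[0]] <= minerals:
            action = queue.popleft()
            minerals -= costs[action]
            started.append((tick, action))
    return started

costs = {"Build Barracks": 150, "Build Marine": 50}   # illustrative values
order = ["Build Barracks", "Build Marine", "Build Marine"]
schedule = run_queue(order, costs, income_per_tick=50, max_ticks=10)
print(schedule)
# [(2, 'Build Barracks'), (3, 'Build Marine'), (4, 'Build Marine')]
```

A Marine is affordable at tick 0, but it still waits until the Barracks ahead of it has been started.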
Experiment #1
Want to show that GAs and CAs find good build-orders
Ran GA and CA 10 times
Evolved 15-bit (5 action) build-orders
CA never trained against the baselines
Ran multiple times to see if results could be repeated reliably
GA always found the same two build-orders
CA always found the same single build-order
Exhaustive Search
Ballinger, C.; Louis, S., "Comparing Heuristic Search Methods for Finding Effective Real-Time Strategy Game Plans"
Ballinger, C.; Louis, S., "Comparing Coevolution, Genetic Algorithms, and Hill-Climbers for Finding Real-Time Strategy Game Plans"
Exhaustive Search
Exhaustive Search against all three baselines
15 bits was the maximum solution length we could exhaustively search.
Takes 20 hrs to do all evaluations
Baselines encoded in 24-39 bits, providing them with a large advantage
Shows how frequently the best solutions occur
Ranks all solutions by how many baselines they defeat
[Figure: each of Solution 1, Solution 2, ... Solution N is evaluated against Baseline 1, Baseline 2, and Baseline 3]
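The exhaustive search amounts to enumerating every 15-bit string and counting how many baselines each one defeats. A sketch with hypothetical names: `toy_beats` and the three baseline bitstrings are made-up stand-ins for running an actual game match.

```python
from itertools import product

def rank_all_solutions(num_bits, baselines, beats):
    """Enumerate all bitstrings of num_bits and rank them by baselines defeated."""
    results = []
    for bits in product("01", repeat=num_bits):
        solution = "".join(bits)
        wins = sum(1 for baseline in baselines if beats(solution, baseline))
        results.append((wins, solution))
    results.sort(key=lambda pair: pair[0], reverse=True)   # most wins first
    return results

def toy_beats(solution, baseline):
    """Hypothetical stand-in for a match: more 1 bits wins."""
    return solution.count("1") > baseline.count("1")

toy_baselines = ["000000000000111", "000000011111111", "111111111111111"]
ranked = rank_all_solutions(15, toy_baselines, toy_beats)
print(len(ranked), "solutions; best defeats", ranked[0][0], "baselines")
```

If the 20 hours covers 2^15 x 3 = 98,304 matches, that works out to roughly 0.73 seconds per match, which suggests why longer bitstrings could not be searched exhaustively.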
Results
Exhaustive Search
32,768 (2^15) possible solutions
80% of possible solutions lose to all three baselines
19.9% of possible solutions beat only one of the three baselines
Only 30 solutions (0.1% of possible solutions) can defeat two baselines
Zero solutions could beat all three
[Figure: number of chromosomes (log scale, 1 to 100,000) versus number of wins (0 to 3)]
[Figure: avg. score difference (0 to -1800) for Exhaustive]
Results
CA
Always found the same solution
Four Vultures and a Firebat
Never defeats any baselines
Doesn't plan for opponents that take more than 5 actions
Still improves score
Beats many other 15-bit strategies
[Figure: avg. score difference (0 to -1800) for Exhaustive and CA]
Results
GA
Found solutions that could beat two baselines 100% of the time
Strategy 1
Two SCVs, Two Firebats, One Vulture
Quick but weak defense
Strategy 2
Four Firebats, One Vulture
Strong but slow defense
[Figure: avg. score difference (0 to -1800) for Exhaustive, CA, and GA]
Discussion #1
GA reliably produces high-quality solutions
CA improves against baselines not seen during training
15 bits is very limited
Huge disadvantage against the baselines
Experiment #2
Increased bit-string length to 39
Same length as our longest baseline
Will CA perform better on a level playing field?
Ran GA and CA 10 times
GA found one build-order
CA found 3 build-orders
Selected 3 random Hall of Fame (HOF) build-orders
Generated 10 random build-orders
All GA, CA, HOF, Random, and Baseline build-orders competed against each other
Ballinger, C.; Louis, S., "Robustness of Coevolved Strategies in a Real-Time Strategy Game"
Results - Score

Set           Baseline(3)   GA(1)   CA(3)   HOF(3)   Rand(10)
Baseline(3)   1733          2341    1975    1775     2935
GA(1)         4591          2875    2175    2833     3573
CA(3)         2600          3925    2830    3322     3775
HOF(3)        2611          3533    2355    2877     3379
Rand(10)      1124          2017    1456    1498     1851

GA fitness highest against Baselines
CA fitness highest against all other build-orders
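The two observations on this slide can be checked directly against the score table. A short sketch, assuming each row is the set doing the scoring and each column is the opponent set; the slide does not state the orientation, but it is the one consistent with the observations.

```python
SETS = ["Baseline", "GA", "CA", "HOF", "Rand"]
# Rows: scoring set. Columns: opponent set, in the order of SETS.
SCORES = {
    "Baseline": [1733, 2341, 1975, 1775, 2935],
    "GA":       [4591, 2875, 2175, 2833, 3573],
    "CA":       [2600, 3925, 2830, 3322, 3775],
    "HOF":      [2611, 3533, 2355, 2877, 3379],
    "Rand":     [1124, 2017, 1456, 1498, 1851],
}

def best_scorer_per_opponent(scores, sets):
    """For each opponent column, return the row with the highest score."""
    best = {}
    for col, opponent in enumerate(sets):
        best[opponent] = max(scores, key=lambda row: scores[row][col])
    return best

best = best_scorer_per_opponent(SCORES, SETS)
print(best)
# {'Baseline': 'GA', 'GA': 'CA', 'CA': 'CA', 'HOF': 'CA', 'Rand': 'CA'}
```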
Results - Wins

Set           Baseline(3)   GA(1)   CA(3)   HOF(3)   Rand(10)
Baseline(3)   33%           33%     44%     33%      90%
GA(1)         100%          100%    0%      33%      80%
CA(3)         66%           100%    44%     66%      100%
HOF(3)        66%           66%     33%     44%      100%
Rand(10)      10%           40%     0%      0%       45%

GA always wins against the baselines
CA beats two of the three baselines
Never appeared during training
Results - Command Centers

Set           Baseline(3)   GA(1)   CA(3)   HOF(3)   Rand(10)
Baseline(3)   22%           33%     44%     33%      86%
GA(1)         100%          0%      0%      33%      60%
CA(3)         44%           66%     11%     44%      66%
HOF(3)        44%           33%     0%      11%      60%
Rand(10)      0%            0%      0%      0%       0%

Percentages of C.C. destroyed were very similar
Only two of the three CA build-orders attack
Discussion #2
GA produces high-quality solutions for known opponents
Highest score against the opponents used for training
CA produces more robust solutions
Defeats opponents not seen during training
How difficult are these strategies to a human player?
Can we bias a CA to defeat a human?
Experiment #3
Recorded actions of a human player against a previously coevolved strategy.
Coevolved strategy was 39 bits (13 actions)
Human (me) selected which units to build in real-time
Unit actions were determined by the same rules used by the GA and CA
Human strategies took 75 bits (25 actions) to encode
Very hard to find winning 39-bit strategies without "peeking"
39-bit strategies can still defeat 75-bit strategies
Ballinger, C.; Louis, S., "Finding Robust Strategies to Defeat Specific Opponents Using Case-Injected Coevolution"
Metric - Human Cases
We used two strategies for picking actions
Easy Human (EH) Strategy (75 bits, 25 actions)
Quickly build 2 Marines, attack, repeat
Slows down opponent and chips away at the base
Hard Human (HH) Strategy (75 bits, 25 actions)
Build 9 SCVs, then build Firebats and Vultures in parallel until opponent sends attack force
Defend Command Center and send remaining units to destroy opponent's defenseless base
Slow, requires a lot of infrastructure
Case-Injection
Injected human replays into CA teachset
2 of the 8 teachset spaces are permanently replaced with the human cases
Not injecting human cases into the population (yet)
GA only trains against the human cases
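Case injection into the teachset can be sketched as reserving fixed slots for the human cases and filling the rest from evolved candidates. The function name is hypothetical, and the evolved slots are filled by plain random sampling for simplicity, which is not claimed to be how the deck's CA chooses them.

```python
import random

def build_ca_teachset(candidates, human_cases, size, rng):
    """Return a teachset whose first slots always hold the human cases."""
    free_slots = size - len(human_cases)
    if free_slots < 0:
        raise ValueError("more human cases than teachset slots")
    evolved = rng.sample(candidates, k=min(free_slots, len(candidates)))
    return list(human_cases) + evolved

rng = random.Random(0)
human_cases = ["Easy Human (EH)", "Hard Human (HH)"]
candidates = [f"evolved-{i}" for i in range(20)]

# The evolved slots change from generation to generation; the human cases do not.
teachsets = [build_ca_teachset(candidates, human_cases, size=8, rng=rng)
             for _ in range(3)]
print(teachsets[0])
```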
Results
Ran GA and CA 10 times
GA always found one build-order
CA always found one build-order
Averaged the GA's and CA's population performance against each human strategy for each generation
Results - Score
[Figure: avg. score (1000 to 5500) versus generation (0 to 45) for CA vs EH, CA vs HH, GA vs EH, and GA vs HH]
GA got the highest scores against the EH strategy
CA got the highest scores against the HH strategy
Results - Wins
[Figure: avg. wins (0 to 50) versus generation (0 to 45) for CA vs EH, CA vs HH, GA vs EH, and GA vs HH]
Trivial to beat the EH strategy
GA never learns to defeat HH
Over-specializes against EH strategy
CA quickly learns to defeat HH
Still defeats the EH strategy as often as the GA, though the score isn't as high
Discussion #3
GA with fitness sharing can be misled by a large difficulty gap
CA produces high-quality robust solutions
Can be biased towards known opponents
Less prone to being misled
Conclusion
CAs are suitable for finding RTS strategies
Produce robust strategies
Can defeat multiple opponents
Can defeat opponents not seen during training
Can learn to defeat known opponents without becoming over-specialized
Future Work
Case-Injection into population
Learn to play like a known player/strategy
Strategy identification and counter-strategy selection
What strategies might the current opponent be using?
What strategies in my case database might be useful to learn from to defeat the current opponent?
System for perpetual Coevolution and Case-Injection
The more people play, the more new and useful strategies we can coevolve
Future-Future Work
More Flexible Encoding
Complete game player
Better opponent modeling
Acknowledgements
This research is supported by ONR grant N000014-12-c-0522.
More information (papers, movies)
http://www.cse.unr.edu/~caballinger
http://www.cse.unr.edu/~sushil
Publications
In Preparation:
Identifying Pro StarCraft II players and strategies (IEEE T-CIAIG)
• Liu, S.; Ballinger, C.; Louis, S.; "Player Identification from RTS Game Replays", Computers and Their Applications (CATA), 2013 28th International Conference on, 4-6 March 2013
• Ballinger, C.; Louis, S., "Comparing Heuristic Search Methods for Finding Effective Real-Time Strategy Game Plans", IEEE Symposium Series on Computational Intelligence (SSCI) 2013, 16-19 April 2013
• Ballinger, C.; Louis, S., "Comparing Coevolution, Genetic Algorithms, and Hill-Climbers for Finding Real-Time Strategy Game Plans", Genetic and Evolutionary Computation Conference (GECCO) 2013, 6-10 July 2013
• Ballinger, C.; Louis, S., "Robustness of Coevolved Strategies in a Real-Time Strategy Game", IEEE Congress on Evolutionary Computation (CEC) 2013, 20-23 June 2013
• Ballinger, C.; Louis, S., "Finding Robust Strategies to Defeat Specific Opponents Using Case-Injected Coevolution", IEEE Conference on Computational Intelligence and Games (CIG) 2013, 11-13 August 2013