Evolving Heuristics for Searching Games
![Page 1: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/1.jpg)
Evolving Heuristics for Searching Games
Evolutionary Computation and Artificial Life
Supervisor: Moshe Sipper
Achiya Elyasaf
June, 2010
![Page 2: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/2.jpg)
Overview
Searching Games State-Graphs
• Representation
• Uninformed Search
• Heuristics
• Informed Search
Rush Hour
• Domain Specific Heuristic
• Evolving Heuristics
• Coevolving Game Boards
• Results
Freecell
• Domain Specific Heuristic
• Coevolving Game Boards
• Learning Methods
• Results
![Page 3: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/3.jpg)
Searching Games State-Graphs: Representation
Every puzzle/game can be represented as a state graph:
• Single-player games such as puzzles and board games: every piece move leads to a different state
• Multi-player games such as chess and Robocode: the positions of the player and the enemy, together with the remaining parameters (health, shield, …), define a state
![Page 4: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/4.jpg)
Searching Games State-Graphs: Representation
Rush Hour:
![Page 5: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/5.jpg)
Searching Games State-Graphs: Representation
Blocksworld:
![Page 6: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/6.jpg)
Searching Games State-Graphs: Uninformed Search
BFS: time and memory exponential in the search depth
DFS: memory linear in the length of the current search path. BUT:
• We might “never” track down the right path
• Games usually contain cycles
Iterative Deepening: a combination of BFS & DFS
• In each iteration, a DFS with a depth limit is performed
• The limit grows from one iteration to the next
• Worst case: traverse the entire graph
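The iterative-deepening scheme described above can be sketched as follows. This is a minimal illustration on a hypothetical toy graph (`GRAPH` and the function names are assumptions, not part of the talk); it checks for cycles only along the current path, which keeps memory linear in the path length.

```python
# Hypothetical toy state graph with a cycle (A <-> B, B <-> D).
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["E"],
    "D": ["B"],
    "E": ["F"],
    "F": [],
}

def depth_limited_dfs(state, is_goal, successors, limit, path):
    """DFS that stops `limit` moves below `state`; skips states already on the path."""
    if is_goal(state):
        return list(path)
    if limit == 0:
        return None
    for nxt in successors(state):
        if nxt in path:  # cycle check along the current path only
            continue
        path.append(nxt)
        found = depth_limited_dfs(nxt, is_goal, successors, limit - 1, path)
        path.pop()
        if found is not None:
            return found
    return None

def iterative_deepening(start, is_goal, successors, max_depth):
    """Run depth-limited DFS with limits 0, 1, ..., max_depth."""
    for limit in range(max_depth + 1):
        found = depth_limited_dfs(start, is_goal, successors, limit, [start])
        if found is not None:
            return found
    return None

solution = iterative_deepening("A", lambda s: s == "F", GRAPH.__getitem__, 10)
```

Because the limit grows by one each round, the first solution found uses the fewest moves, at the price of re-expanding the shallow levels in every iteration.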
![Page 7: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/7.jpg)
Searching Games State-Graphs: Uninformed Search
Most of the game domains are PSPACE-Complete!
Worst case: traverse the entire graph
We need an informed search!
![Page 8: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/8.jpg)
Searching Games State-Graphs: Heuristics
h: states → ℝ
• For every state s, h(s) is an estimate of the minimal distance/cost from s to a solution
• If h is perfect, an informed search that tries the states with the lowest h-score first will simply stroll to the solution
• A bad heuristic means the search might never reach an answer
• For hard problems, finding h is hard
We need a good heuristic function to guide the informed search
![Page 9: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/9.jpg)
Searching Games State-Graphs: Informed Search (Cont.)
IDA*: Iterative Deepening with A*
• The expanded nodes are pushed onto the DFS stack in descending order of heuristic value
• Let g(s) be the minimal depth of state s: only nodes with f(s) = g(s) + h(s) < depth-limit are visited
Near-optimal solution (depends on the path limit)
The heuristic needs to be admissible
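A textbook IDA* with unit move costs can be sketched as below. This is not the enhanced IDA* used in the talk: the toy domain (reach `GOAL` from 0 with moves +3, +1, -1), the heuristic `h`, and all names are assumptions chosen for illustration. The sketch uses the common convention of pruning when f(s) exceeds the bound, and raises the bound to the smallest f-value that was pruned.

```python
import math

GOAL = 10  # hypothetical target state

def successors(s):
    """Hypothetical moves: add 3, add 1, or subtract 1 (each costs one step)."""
    return [s + 3, s + 1, s - 1]

def h(s):
    """Admissible estimate: each move covers at most 3 units of distance."""
    return (abs(GOAL - s) + 2) // 3

def ida_star(start, is_goal, successors, h):
    bound = h(start)
    path = [start]

    def search(g, bound):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f  # pruned: report the f-value that exceeded the bound
        if is_goal(state):
            return True
        minimum = math.inf
        # Try the children with the lowest heuristic value first.
        for nxt in sorted(successors(state), key=h):
            if nxt in path:
                continue
            path.append(nxt)
            result = search(g + 1, bound)
            if result is True:
                return True
            minimum = min(minimum, result)
            path.pop()
        return minimum

    while True:
        result = search(0, bound)
        if result is True:
            return list(path)
        if result == math.inf:
            return None  # nothing left to expand
        bound = result  # next iteration: smallest f that was pruned

solution = ida_star(0, lambda s: s == GOAL, successors, h)
```

With an admissible `h`, the first solution found under this scheme is a shortest one; here it takes four moves.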
![Page 10: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/10.jpg)
![Page 11: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/11.jpg)
Rush Hour: Domain Specific Heuristic
GP-Rush [Hauptman et al, 2009]
Hand-crafted heuristics:
• Goal distance: Manhattan distance
• Blocker estimation: lower bound (admissible)
• Hybrid blockers distance: combines the two above
• Is Move To Secluded: did the car enter a secluded area?
• Is Releasing Move
![Page 12: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/12.jpg)
GA/GP
Given building blocks H1, … , Hn, how should we choose the fittest heuristic?
• Minimum? Maximum? Linear combination?
GA/GP may be used for:
1. Building new heuristics from existing building blocks
2. Finding weights for each heuristic (for applying a linear combination)
3. Finding conditions for applying each heuristic
• Probably, H should fit the stage of the search
• E.g. “goal” heuristics when assuming we’re close
![Page 13: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/13.jpg)
GA/GP (Cont.)
[Figure: example expression tree, reconstructed here in prefix notation]
If
  Condition: And (≤ H1 0.4) (≥ H2 0.7)
  True: + H3 (* H1 0.5)
  False: * H5 (/ H1 0.1)
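One reading of the tree on this slide is: if H1 ≤ 0.4 and H2 ≥ 0.7, return H3 + H1 * 0.5, otherwise return H5 * (H1 / 0.1). The sketch below evaluates such a tree, stored as nested tuples, against a dictionary of heuristic values. The tuple encoding, the function names, and the protected division (returning 1.0 on a zero denominator, a common GP convention) are assumptions made for illustration.

```python
import operator

def protected_div(a, b):
    """Division that returns 1.0 when the denominator is zero (assumed GP convention)."""
    return a / b if b != 0 else 1.0

FUNCTIONS = {
    "+": operator.add,
    "*": operator.mul,
    "/": protected_div,
    "<=": operator.le,
    ">=": operator.ge,
    "and": lambda a, b: a and b,
}

def evaluate(node, values):
    """Evaluate a tree node: tuples are function calls, strings are heuristics, numbers are constants."""
    if isinstance(node, tuple):
        op, *args = node
        if op == "if":
            condition, when_true, when_false = args
            branch = when_true if evaluate(condition, values) else when_false
            return evaluate(branch, values)
        return FUNCTIONS[op](*(evaluate(arg, values) for arg in args))
    if isinstance(node, str):
        return values[node]
    return node

TREE = (
    "if",
    ("and", ("<=", "H1", 0.4), (">=", "H2", 0.7)),
    ("+", "H3", ("*", "H1", 0.5)),
    ("*", "H5", ("/", "H1", 0.1)),
)

# Hypothetical heuristic values for one state; the condition holds here.
score = evaluate(TREE, {"H1": 0.25, "H2": 0.75, "H3": 0.5, "H5": 0.2})
```

Keeping the tree as plain data is what makes subtree crossover and mutation straightforward: an operator only needs to swap or replace a nested tuple.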
![Page 14: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/14.jpg)
GA/GP (Cont.): Back to Rush Hour
Functions & Terminals:

| | Conditions | Results |
|---|---|---|
| Terminals | IsMoveToSecluded, isReleasingMove, g, PhaseByDistance, PhaseByBlockers, NumberOfSyblings, DifficultyLevel, BlockersLowerBound, GoalDistance, Hybrid, 0, 0.1, … , 0.9, 1 | BlockersLowerBound, GoalDistance, Hybrid, 0, 0.1, … , 0.9, 1 |
| Function sets | If, AND, OR, ≤, ≥ | +, * |

Genetic Operators: crossover & mutation on trees, as Koza describes
![Page 15: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/15.jpg)
GA/GP (Cont.): Policies

| Condition | Result |
|---|---|
| Condition 1 | Heuristics Weights 1 |
| Condition 2 | Heuristics Weights 2 |
| ⋮ | ⋮ |
| Condition n | Heuristics Weights n |
| Default | Heuristics Weights |

Fitness measure? Cross-over? Mutation?
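A policy of this shape can be sketched as an ordered list of (condition, weights) rules plus default weights. The sketch assumes that the first condition that holds selects the weights and that the result is the weighted sum of the named heuristics; the rule, the weights, and the feature values are hypothetical.

```python
def policy_value(rules, default_weights, features):
    """Return the weighted sum of heuristics chosen by the first rule whose condition holds."""
    for condition, weights in rules:
        if condition(features):
            break
    else:
        weights = default_weights  # no condition held
    return sum(w * features[name] for name, w in weights.items())

# Hypothetical policy: near the goal, rely on goal distance and blockers;
# otherwise fall back to the hybrid heuristic.
RULES = [
    (lambda f: f["GoalDistance"] <= 2, {"GoalDistance": 0.5, "BlockersLowerBound": 0.25}),
]
DEFAULT_WEIGHTS = {"Hybrid": 0.5}

near = {"GoalDistance": 2, "BlockersLowerBound": 4, "Hybrid": 6}
far = {"GoalDistance": 5, "BlockersLowerBound": 4, "Hybrid": 6}

near_value = policy_value(RULES, DEFAULT_WEIGHTS, near)
far_value = policy_value(RULES, DEFAULT_WEIGHTS, far)
```

Under this encoding, a genetic algorithm could mutate individual weights while genetic programming evolves the condition trees, which is one way to read the questions posed on the slide.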
![Page 16: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/16.jpg)
Co-Evolving Difficult Solvable 8x8 Boards
Our enhanced IDA* search solved over 90% of the 6x6 problems
We wanted to demonstrate our method’s scalability to larger boards
![Page 17: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/17.jpg)
Co-Evolving Difficult Solvable 8x8 Boards
Fitness measure? Cross-over? Mutation?
[Figure: Rush Hour board diagrams with lettered cars]
![Page 18: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/18.jpg)
Rush Hour Results
Average percentage of nodes required to solve test problems, with respect to the number of nodes scanned by a blind search:
![Page 19: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/19.jpg)
Rush Hour Results (Cont.)
Time (in seconds) required to solve problems JAM01 . . . JAM40:
![Page 20: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/20.jpg)
![Page 21: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/21.jpg)
Freecell: Intro
FreeCell remained relatively obscure until it shipped with Windows 95
The Microsoft 32K set contains 32,000 problems, all of them solvable except for game #11982, which has eluded solution so far
![Page 22: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/22.jpg)
Freecell: Intro (Cont.)
[Figure: Freecell board layout, labelled Freecells, Foundations, and Cascades]
![Page 23: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/23.jpg)
Freecell: Heuristics
• Lowest card at Foundations
• Number of well placed cards
• Number of cards not at Foundations
• Number of Freecells and free Cascades
• Sum of the Cascades bottom cards
• Highest home card – lowest home card
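A few of the heuristics listed on this slide can be sketched over a simple state representation. The `FreecellState` layout, the function names, and the example position are assumptions for illustration, and "free" is read here as "empty"; the talk does not specify how states were encoded. The example position is deliberately partial and does not contain a full deck.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Card = Tuple[int, str]  # (rank 1..13, suit letter)

@dataclass
class FreecellState:
    foundations: Dict[str, int]      # suit -> highest rank already placed (0 if none)
    freecells: List[Optional[Card]]  # four slots, None when empty
    cascades: List[List[Card]]       # eight columns of cards

def lowest_foundation_card(s: FreecellState) -> int:
    return min(s.foundations.values())

def home_card_spread(s: FreecellState) -> int:
    """Highest home card minus lowest home card."""
    return max(s.foundations.values()) - min(s.foundations.values())

def cards_not_at_foundations(s: FreecellState) -> int:
    return 52 - sum(s.foundations.values())

def free_cells_and_cascades(s: FreecellState) -> int:
    empty_cells = sum(cell is None for cell in s.freecells)
    empty_cascades = sum(not column for column in s.cascades)
    return empty_cells + empty_cascades

# Hypothetical partial position, used only to exercise the functions.
state = FreecellState(
    foundations={"S": 2, "H": 0, "D": 1, "C": 3},
    freecells=[None, (5, "H"), None, None],
    cascades=[[(13, "S"), (4, "D")], [], [(7, "C")], [],
              [(9, "H")], [(2, "H")], [(12, "D")], [(6, "S")]],
)
```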
![Page 24: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/24.jpg)
Freecell: Learning methods
As opposed to Rush Hour, blind search could not solve even one problem
The best solver to date solves 89% of Microsoft 32K
Reasons:
• High branching factor
• Hard to generate a good heuristic
![Page 25: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/25.jpg)
Freecell: Learning methods
In Rush Hour:
• Hyper-heuristics population
• Each generation: all individuals solve 5 different randomly selected instances
• Test set: 20% of the problems
• Training set: the rest
In Freecell:
• This method failed
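The Rush Hour training setup described on this slide (a 20% test set, the rest for training, and 5 distinct random instances per generation) can be sketched as follows. The instance names reuse the JAM01 … JAM40 labels from the results slide purely as placeholders; the function names, the seeds, and the assumption that all individuals share the same 5 instances within a generation are hypothetical.

```python
import random

def split_instances(instances, test_fraction=0.2, seed=0):
    """Shuffle the instances and hold out `test_fraction` of them as the test set."""
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, test)

def generation_batch(training, rng, k=5):
    """Pick k different training instances for all individuals in one generation."""
    return rng.sample(training, k)

boards = [f"JAM{i:02d}" for i in range(1, 41)]  # placeholder instance names
training, test = split_instances(boards)
batch = generation_batch(training, random.Random(1))
```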
![Page 26: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/26.jpg)
Freecell: Learning methods
First try:
• Sort the problems by difficulty
• Gradually learn the whole training set
FAILED:
• Days of training
• Overfitting and forgetting
![Page 27: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/27.jpg)
Freecell: Learning methods
Second try:
Co-evolution:
• First population: hyper-heuristics
• Second population: game boards with Hillis “Hall of Fame”
FAILED:
• Ambiguous reason for low fitness
![Page 28: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/28.jpg)
Freecell: Learning methods
Third try:
Co-evolution:
• First population: hyper-heuristics
• Second population: groups of 8 game boards
SUCCESS:
• Fast learning process
• No ambiguity
• We create the right competition
![Page 29: Evolving Heuristics for Searching Games](https://reader036.fdocuments.in/reader036/viewer/2022062501/56816964550346895de1208e/html5/thumbnails/29.jpg)
Freecell Results

| Run | Node reduction | Time reduction | Solution length reduction | % of solved problems |
|---|---|---|---|---|
| HSD | 100% | 100% | 100% | 89% |
| GA-1 | 23% | 31% | 1% | 71% |
| GA-2 | 23% | 30% | -3% | 70% |
| GP | - | - | - | - |
| Policy | 28% | 36% | 6% | 74% |
| GA with Co-Evolution | 60% | 69% | 37% | 98% |
| Policy with Co-Evolution | 59% | 69% | 30% | 99% |