Slides: csci561b/slides/session08_gamePlaying_short.pdf
Last time: Simulated annealing algorithm
Idea: Escape local extrema by allowing "bad moves," but gradually decrease their size and frequency.
Note: the goal here is to maximize E.
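A minimal Python sketch of the idea above, for the maximizing case. The acceptance rule exp(ΔE/T) and the geometric cooling schedule are the usual textbook choices and are assumptions here, since the slide's own pseudocode did not survive extraction. The names `simulated_annealing`, `energy`, `neighbors`, and `schedule` are hypothetical.

```python
import math
import random

def simulated_annealing(start, energy, neighbors, schedule, max_steps=1000, rng=random):
    """Maximize `energy` with a random walk that sometimes accepts bad moves."""
    current = start
    for t in range(max_steps):
        temperature = schedule(t)
        if temperature <= 0:
            break  # fully cooled: stop searching
        candidate = rng.choice(neighbors(current))
        delta = energy(candidate) - energy(current)
        # Good moves (delta > 0) are always accepted. Bad moves are accepted
        # with probability exp(delta / T), which shrinks as T decreases and
        # as the move gets worse: smaller and less frequent bad moves.
        if delta > 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
    return current

# Toy problem (assumed for illustration): integers, single peak at x = 3.
def energy(x):
    return -(x - 3) ** 2

def neighbors(x):
    return [x - 1, x + 1]

def schedule(t):
    return 10.0 * 0.95 ** t  # assumed geometric cooling

result = simulated_annealing(0, energy, neighbors, schedule,
                             max_steps=500, rng=random.Random(0))
print(result, energy(result))
```

To minimize E instead, one option is to pass the negated energy function and leave the rest unchanged.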
[Slide: the same algorithm, stated for the case where the goal is to minimize E.]
This time: Outline
Game playing
The minimax algorithm
Resource limitations
Alpha-beta pruning
Elements of chance
What kind of games?
Abstraction: To describe a game we must capture every relevant aspect of the game, such as: chess, tic-tac-toe, …
Accessible environments: Such games are characterized by perfect information.
Search: Game playing then consists of a search through possible game positions.
Unpredictable opponent: Introduces uncertainty, thus game playing must deal with contingency problems.
Searching for the next move
Complexity: many games have a huge search space.
Chess: b = 35, m = 100 ⇒ nodes = 35^100 ≈ 10^154.
If each node takes about 1 ns to explore, then each move will take about 10^135 millennia to calculate (10^154 ns ≈ 10^145 s, and one millennium is about 3 × 10^10 s).
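The estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch. It assumes a full tree of b^m nodes, exactly 1 ns per node, and a 365.25-day year; the function name is hypothetical.

```python
import math

SECONDS_PER_MILLENNIUM = 1000 * 365.25 * 24 * 3600  # about 3.16e10 s

def log10_search_time_millennia(b, m, seconds_per_node=1e-9):
    """Return log10 of the time (in millennia) needed to visit b**m nodes."""
    nodes = b ** m  # exact integer node count
    log10_seconds = math.log10(nodes) + math.log10(seconds_per_node)
    return log10_seconds - math.log10(SECONDS_PER_MILLENNIUM)

nodes = 35 ** 100
print(len(str(nodes)))                              # 155 digits, i.e. about 10^154
print(round(log10_search_time_millennia(35, 100)))  # 135, i.e. about 10^135 millennia
```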
Searching for the next move
Resource (e.g., time, memory) limit: optimal solution not feasible/possible, thus must approximate.
1. Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result.
2. Evaluation functions: heuristics to evaluate the utility of a state without exhaustive search.
Two-player games
A game formulated as a search problem:
Initial state: ?
Operators: ?
Terminal state: ?
Utility function: ?
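One possible way to fill in these four elements, sketched for tic-tac-toe. The board encoding (a tuple of nine cells plus the player to move), the function names, and the +1/0/-1 utility from X's point of view are all assumptions made for illustration, not the slides' own answers.

```python
# A state is (board, player_to_move); board is a tuple of 9 cells
# holding 'X', 'O', or None, indexed row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def initial_state():
    """Initial state: empty board, X moves first."""
    return (None,) * 9, 'X'

def operators(state):
    """Operators: the legal moves, here the indices of the empty cells."""
    board, _ = state
    return [i for i, cell in enumerate(board) if cell is None]

def apply_move(state, move):
    """Return the state reached when the player to move marks cell `move`."""
    board, player = state
    new_board = board[:move] + (player,) + board[move + 1:]
    return new_board, ('O' if player == 'X' else 'X')

def winner(board):
    """Return 'X' or 'O' if a line is complete, else None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def is_terminal(state):
    """Terminal state: someone has won or the board is full."""
    board, _ = state
    return winner(board) is not None or all(cell is not None for cell in board)

def utility(state):
    """Utility function from X's point of view: win = 1, draw = 0, loss = -1."""
    return {'X': 1, 'O': -1, None: 0}[winner(state[0])]
```

For example, playing cells 0, 3, 1, 4, 2 in turn from `initial_state()` completes the top row for X, so the resulting state is terminal with utility 1.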
The minimax algorithm
Perfect play for deterministic environments with perfect information.
Basic idea: choose the move with the highest minimax value = best achievable payoff against best play.
The minimax algorithm
Algorithm:
1. Generate the game tree completely.
2. Determine the utility of each terminal state.
3. Propagate the utility values upward in the tree by applying MIN and MAX operators on the nodes in the current level.
4. At the root node, use the minimax decision to select the move with the max (of the mins) utility value.
Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.
A subtree
[Figure: a subtree of the tic-tac-toe game tree, with terminal positions labeled win, lose, or draw.]
What is a good move?
[Figure: the same tic-tac-toe subtree, with terminal positions labeled win, lose, or draw.]
1. Move evaluation without complete search
Complete search is too complex and impractical.
Evaluation function: evaluates the value of a state using heuristics and cuts off the search.
New MINIMAX:
CUTOFF-TEST: cutoff test to replace the termination condition (e.g., deadline, depth limit, etc.)
EVAL: evaluation function to replace the utility function (e.g., number of chess pieces taken)
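A minimal sketch of the modified MINIMAX, using a depth limit as the cutoff test. The `Node` class with a stored heuristic `estimate`, the function names, and the example values are assumptions made for illustration. A real program would compute the estimate from the position.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    estimate: float                      # heuristic value of this position (assumed given)
    children: list = field(default_factory=list)

def cutoff_test(node, depth, depth_limit):
    """Replaces the terminal test: stop at the depth limit or at a leaf."""
    return depth >= depth_limit or not node.children

def eval_fn(node):
    """Replaces the utility function: return a heuristic estimate."""
    return node.estimate

def h_minimax(node, depth_limit, depth=0, maximizing=True):
    """Minimax that searches only down to `depth_limit` plies."""
    if cutoff_test(node, depth, depth_limit):
        return eval_fn(node)
    values = [h_minimax(child, depth_limit, depth + 1, not maximizing)
              for child in node.children]
    return max(values) if maximizing else min(values)

a = Node(5, [Node(3), Node(9)])
b = Node(4, [Node(7), Node(8)])
root = Node(0, [a, b])
print(h_minimax(root, depth_limit=1))  # 5: compares the estimates of a and b
print(h_minimax(root, depth_limit=2))  # 7: looks one ply deeper, so b is preferred
```

The two calls disagree on the best move, which illustrates that the result of a cut-off search depends on both the depth limit and the quality of the estimates.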
Evaluation functions
Weighted linear evaluation function: to combine n heuristics, f = w_1 f_1 + w_2 f_2 + … + w_n f_n
E.g.,
w's could be the values of pieces (1 for pawn, 3 for bishop)
f's could be the number of each type of piece on the board
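A small sketch of such a weighted linear evaluation for chess material. The pawn and bishop weights come from the example above. The knight, rook, and queen weights are the conventional values and are an assumption here, as is the choice to use each feature as the difference between own and opponent piece counts. The function and variable names are hypothetical.

```python
# Weights w_i: pawn and bishop as in the slide; the others are conventional values.
PIECE_WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_eval(own_counts, opponent_counts, weights=PIECE_WEIGHTS):
    """f = w_1 f_1 + ... + w_n f_n, with f_i = own count minus opponent count."""
    return sum(w * (own_counts.get(piece, 0) - opponent_counts.get(piece, 0))
               for piece, w in weights.items())

white = {"pawn": 8, "bishop": 2, "rook": 2, "queen": 1}
black = {"pawn": 7, "bishop": 1, "rook": 2, "queen": 1}
print(material_eval(white, black))  # 4: one extra pawn (1) plus one extra bishop (3)
```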