Last time: Simulated annealing algorithmcsci561b/slides/session08_gamePlaying_short.pdf · Last...

Last time: Simulated annealing algorithm

Idea: Escape local extrema by allowing “bad moves,” but gradually decrease their size and frequency.

Note: goal here is to maximize E.
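As a reminder of how the idea can look in code, here is a minimal Python sketch for the maximization case. The names `simulated_annealing`, `energy`, and `neighbor`, the geometric cooling schedule, and the toy objective are all assumptions made for illustration; they are not taken from the slides.

```python
import math
import random

def simulated_annealing(energy, neighbor, start, t0=1.0, cooling=0.995,
                        steps=5000, seed=0):
    """Maximize energy(state). Hypothetical helper; geometric cooling is assumed."""
    rng = random.Random(seed)
    current = best = start
    t = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = energy(candidate) - energy(current)
        # Good moves are always accepted. A bad move (delta <= 0) is accepted
        # with probability exp(delta / t), which shrinks as t drops and as the
        # move gets worse.
        if delta > 0 or rng.random() < math.exp(delta / t):
            current = candidate
        if energy(current) > energy(best):
            best = current
        t = max(t * cooling, 1e-9)  # lower the temperature gradually
    return best

# Toy objective (assumed): a single peak at x = 3 on the integers.
def energy(x):
    return -(x - 3) ** 2

def neighbor(x, rng):
    return x + rng.choice([-1, 1])

best = simulated_annealing(energy, neighbor, start=-20)
print(best)
```

The temperature `t` controls both the frequency and the size of the bad moves that get through, which is the "gradually decrease" part of the idea.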

Last time: Simulated annealing algorithm

Idea: Escape local extrema by allowing “bad moves,” but gradually decrease their size and frequency.

Algorithm when goal is to minimize E.

This time: Outline

Game playing
The minimax algorithm
Resource limitations
Alpha-beta pruning
Elements of chance

What kind of games?

Abstraction: To describe a game we must capture every relevant aspect of the game, such as:

Chess
Tic-tac-toe
…

Accessible environments: Such games are characterized by perfect information.

What kind of games?

Search: game-playing then consists of a search through possible game positions.

Unpredictable opponent: introduces uncertainty, thus game-playing must deal with contingency problems.

Searching for the next move

Complexity: many games have a huge search space.

Chess: b = 35, m = 100 ⇒ nodes = 35^100 ≈ 10^154

If each node takes about 1 ns to explore, then each move will take about 10^135 millennia to calculate (10^154 nodes × 1 ns ≈ 10^145 s).
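A quick order-of-magnitude check of the chess estimate, done in log space so the numbers stay manageable. This is only a sketch: the variable names are illustrative, and a year is assumed to be 365.25 days. With b = 35, m = 100, and 1 ns per node, it gives roughly 10^154 nodes and on the order of 10^135 millennia.

```python
import math

b, m = 35, 100                 # branching factor and depth from the slide
seconds_per_node = 1e-9        # 1 ns per node
seconds_per_millennium = 1000 * 365.25 * 24 * 3600  # assumes 365.25-day years

log10_nodes = m * math.log10(b)                          # log10(35^100)
log10_seconds = log10_nodes + math.log10(seconds_per_node)
log10_millennia = log10_seconds - math.log10(seconds_per_millennium)

print(f"nodes     ~ 10^{log10_nodes:.1f}")
print(f"seconds   ~ 10^{log10_seconds:.1f}")
print(f"millennia ~ 10^{log10_millennia:.1f}")
```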

Searching for the next move

Resource (e.g., time, memory) limit: optimal solution not feasible/possible, thus must approximate.

1. Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result.

2. Evaluation functions: heuristics to evaluate the utility of a state without exhaustive search.

Two-player games

A game formulated as a search problem:

Initial state: ?
Operators: ?
Terminal state: ?
Utility function: ?
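One possible way to fill in the four components, sketched in Python for tic-tac-toe. Everything here is an illustrative assumption rather than the slides' own answer: the board encoding (a tuple of 9 cells), the function names, X moving first, and the utility values of +1, 0, and -1 from X's point of view.

```python
# Board: tuple of 9 cells, each "X", "O", or None (assumed encoding).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def initial_state():
    """Initial state: the empty board."""
    return (None,) * 9

def player(state):
    """X moves when an even number of cells is filled (X is assumed to start)."""
    filled = sum(cell is not None for cell in state)
    return "X" if filled % 2 == 0 else "O"

def operators(state):
    """Operators: the legal moves, i.e. indices of the empty cells."""
    return [i for i, cell in enumerate(state) if cell is None]

def result(state, move):
    """Apply a move for the player whose turn it is."""
    board = list(state)
    board[move] = player(state)
    return tuple(board)

def winner(state):
    for a, b, c in LINES:
        if state[a] is not None and state[a] == state[b] == state[c]:
            return state[a]
    return None

def terminal_test(state):
    """Terminal state: somebody has won or the board is full."""
    return winner(state) is not None or all(c is not None for c in state)

def utility(state):
    """Utility function: +1 if X won, -1 if O won, 0 for a draw."""
    w = winner(state)
    return 1 if w == "X" else -1 if w == "O" else 0

s = initial_state()
for mv in (0, 3, 1, 4, 2):      # X takes the top row
    s = result(s, mv)
print(terminal_test(s), utility(s))
```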

Game vs. search problem


Example: Tic-Tac-Toe

Question:
1. b (branching factor) = ?
2. m (max depth) = ?

Type of games


The minimax algorithm

Perfect play for deterministic environments with perfect information.

Basic idea: choose the move with the highest minimax value = best achievable payoff against best play.

The minimax algorithm

Algorithm:

1. Generate game tree completely
2. Determine utility of each terminal state
3. Propagate the utility values upward in the tree by applying MIN and MAX operators on the nodes in the current level
4. At the root node use minimax decision to select the move with the max (of the mins) utility value

Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.
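The four steps can be sketched in Python on a small, fully generated tree. The nested-list encoding and the function names are assumptions for illustration. The leaf values are those of the standard two-ply textbook example, and grouping them as three MIN nodes with three leaves each is also an assumption.

```python
def minimax_value(node, maximizing):
    """Minimax value of a tree given as nested lists; leaves are numbers."""
    if isinstance(node, (int, float)):
        return node                      # step 2: utility of a terminal state
    values = [minimax_value(child, not maximizing) for child in node]
    # Step 3: propagate upward with MAX or MIN depending on the level.
    return max(values) if maximizing else min(values)

def minimax_decision(root):
    """Step 4: pick the root move with the max (of the mins) value."""
    values = [minimax_value(child, maximizing=False) for child in root]
    best_move = max(range(len(values)), key=values.__getitem__)
    return best_move, values[best_move]

# Step 1: the complete game tree (MAX at the root, MIN one level below).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
move, value = minimax_decision(tree)
print(move, value)
```

The MIN level models an opponent who always picks the reply that is worst for MAX, which is the "opponent plays perfectly" assumption stated above.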

Generate Game Tree

[Figure: tic-tac-toe game tree, with levels labeled "1 ply" and "1 move"]

A subtree

[Figure: tic-tac-toe subtree with terminal positions labeled win, lose, and draw]

What is a good move?

[Figure: tic-tac-toe subtree with terminal positions labeled win, lose, and draw]

Minimax

[Figure: game tree with leaf values 3, 12, 8, 2, 4, 6, 14, 5, 2]

• Minimize opponent’s chance
• Maximize your chance

minimax = maximum of the minimum

1st ply

2nd ply

Java applet

Minimax Java applet

Minimax: Recursive implementation

Complete: ?
Optimal: ?
Time complexity: ?
Space complexity: ?

1. Move evaluation without complete search

Complete search is too complex and impractical.

Evaluation function: evaluates the value of a state using heuristics and cuts off search.

New MINIMAX:

CUTOFF-TEST: cutoff test to replace the termination condition (e.g., deadline, depth-limit, etc.)

EVAL: evaluation function to replace utility function (e.g., number of chess pieces taken)
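A minimal Python sketch of minimax with the two replacements, assuming a depth limit as the cutoff test. The toy game is a hypothetical stand-in chosen only to keep the example short: players alternately take 1 to 3 stones, and whoever takes the last stone wins. The function names and the heuristic inside `eval_fn` are likewise assumptions.

```python
def minimax_cutoff(state, depth, maximizing, successors, cutoff_test, eval_fn):
    """Minimax where CUTOFF-TEST replaces the terminal test and EVAL the utility."""
    if cutoff_test(state, depth):
        return eval_fn(state, maximizing)
    values = [minimax_cutoff(s, depth + 1, not maximizing,
                             successors, cutoff_test, eval_fn)
              for s in successors(state)]
    return max(values) if maximizing else min(values)

# Toy game (assumed): the state is the number of stones left.
def successors(stones):
    return [stones - k for k in (1, 2, 3) if stones - k >= 0]

def make_cutoff_test(depth_limit):
    def cutoff_test(stones, depth):
        # Stop at true terminal states or when the depth limit is reached.
        return stones == 0 or depth >= depth_limit
    return cutoff_test

def eval_fn(stones, maximizing):
    """Value from MAX's point of view; `maximizing` says whether MAX is to move."""
    if stones == 0:
        mover_value = -1.0       # the player to move has already lost
    elif stones % 4 == 0:
        mover_value = -0.5       # heuristic guess: bad position for the mover
    else:
        mover_value = 0.5        # heuristic guess: good position for the mover
    return mover_value if maximizing else -mover_value

shallow = minimax_cutoff(5, 0, True, successors, make_cutoff_test(2), eval_fn)
full = minimax_cutoff(5, 0, True, successors, make_cutoff_test(100), eval_fn)
print(shallow, full)
```

With a depth limit of 2 the search returns a heuristic estimate. With a limit deeper than the game it reduces to ordinary minimax and returns the true utility.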

Do We Have To Do All That Work?

[Figure: MAX root above a MIN node with leaf values 3, 12, 8]
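The question points toward alpha-beta pruning, which is listed in the outline. As a hedged preview, the Python sketch below runs alpha-beta on a small tree and records which leaves it actually looks at. The nested-list tree, its grouping into three MIN nodes, and the `visited` bookkeeping are assumptions made for illustration.

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf"),
              visited=None):
    """Minimax value with alpha-beta pruning; leaves are numbers."""
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)         # record every leaf that is evaluated
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:            # MIN above will never allow this branch
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:                # MAX above already has something better
            break
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
value = alphabeta(tree, True, visited=visited)
print(value, visited)
```

On this tree the leaves 4 and 6 are never evaluated: once the second MIN node sees the 2, it cannot beat the 3 that MAX already has from the first branch.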

Evaluation functions

Weighted linear evaluation function: to combine n heuristics:

f = w_1 f_1 + w_2 f_2 + … + w_n f_n

E.g.,
w’s could be the values of pieces (1 for pawn, 3 for bishop)
f’s could be the number of each type of piece on the board
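A small Python sketch of such a weighted linear evaluation. Only the pawn and bishop weights come from the slide. The other weights are conventional values added for illustration, and taking each f_i as the difference between one's own and the opponent's piece counts is an assumption about how the features might be defined.

```python
# Piece weights: pawn = 1 and bishop = 3 are from the slide; the rest are
# conventional values assumed here for illustration.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def evaluate(own_counts, opponent_counts, weights=WEIGHTS):
    """f = sum of w_i * f_i, with f_i = own count minus opponent count (assumed)."""
    return sum(w * (own_counts.get(piece, 0) - opponent_counts.get(piece, 0))
               for piece, w in weights.items())

own = {"pawn": 8, "bishop": 2, "rook": 2, "queen": 1}
opponent = {"pawn": 7, "bishop": 1, "rook": 2, "queen": 1}
score = evaluate(own, opponent)   # one extra pawn and one extra bishop
print(score)
```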

Note: exact values do not matter

Ordering is preserved


Minimax with cutoff: viable algorithm?

Assume we have 100 seconds and can evaluate 10^4 nodes/s: we can evaluate 10^6 nodes/move.
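To see what this node budget buys, the short Python calculation below solves b^m = 10^6 for the depth m, using the chess branching factor b = 35 from the earlier complexity slide. The variable names are illustrative. The result is a little under 4 plies of lookahead.

```python
import math

time_per_move_s = 100
nodes_per_second = 10**4
b = 35                                   # chess branching factor (earlier slide)

nodes_per_move = time_per_move_s * nodes_per_second
reachable_depth = math.log(nodes_per_move) / math.log(b)   # solves b^m = nodes

print(nodes_per_move)
print(f"{reachable_depth:.2f} plies")
```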
