L8 Adversarial Search

8/3/2019 L8 Adversarial Search

1/46

241-320 Design Architecture and Engineering for Intelligent System

Suntorn Witosurapot

Contact Address: Phone: 074 287369 or

Email: [email protected]

November 2009


2/46

Lecture 8:

Adversarial Search

(Playing and Solving Games)


3/46


Outline

Games

Adversarial search problems

MiniMax algorithm

Alpha-beta pruning

Heuristics for adversarial search, making imperfect decisions


4/46



Types of Games (informal)

                        Deterministic                       Chance
Perfect information     Chess, Checkers, Go, tic-tac-toe    Backgammon, Monopoly
Imperfect information   Battleship                          Bridge, Poker, Scrabble, War games

The study of games is called game theory

A branch of economics


5/46





6/46



Games of Interest: Definitions

We'll consider a special kind of game, defined by these properties:

Two-player game: Players A and B. Player A starts.

Deterministic: None of the moves/states are subject to chance (no random draws).

Perfect information: Fully observable environment. Both players see all the states and decisions. Each decision is made sequentially.

Zero-sum: Player A's gain is exactly equal to player B's loss. The utility values at the end of the game total 0, e.g. +1 for winning, -1 for losing, 0 for a tie (both gains are equal).


7/46



Games vs. search problems

Unpredictable opponent => solution is a strategy specifying a move for every possible opponent reply

Time limits => unlikely to find goal, must approximate


8/46



Outline

Games

Adversarial search problems

MiniMax algorithm

Alpha-beta pruning

Heuristics for adversarial search, making imperfect decisions


9/46



Adversarial search: Assumptions & aims

Multiplayer

Formal, well-defined problems, where perfect decisions are possible

At any position in the game, the player wants to know what move to make

Success depends on the opponent's moves

Search forward: exhaustive or heuristic


10/46



Game Search Problem Formulation

Initial state: Initial board position, player to move

Successor function: tells which actions can be executed in each state and gives the successor state for each action

MAX's and MIN's actions alternate, with MAX playing first in the initial state

Terminal test: tells if a state is terminal and, if yes, whether it is a win or a loss for MAX, or a draw

Utility function: Numeric value for terminal states representing their utility for a given player.

E.g. in chess, the outcome is a win, loss or draw, with values +1, -1 or 0


11/46





12/46



Relation to Previous Lecture

Here, uncertainty is caused by the actions of another agent (MIN), who competes with our agent (MAX)

MIN wants MAX to lose (and vice versa)

No plan exists that guarantees MAX's success regardless of which actions MIN executes (the same is true for MIN)

At each turn, the choice of which action to perform must be made within a specified time limit

The state space is enormous: only a tiny fraction of this space can be explored within the time limit


13/46



Game Tree

[Figure: game tree whose levels alternate between MAX's play (MAX nodes) and MIN's play (MIN nodes), down to a terminal state (win for MAX)]

Here, symmetries have been used to reduce the branching factor


14/46



Game Tree

[Figure: game tree whose levels alternate between MAX's play (MAX nodes) and MIN's play (MIN nodes), down to a terminal state (win for MAX)]

In general, the branching factor and the depth of terminal states are large

Chess:

Number of states: ~10^40

Branching factor: ~35

Number of total moves in a game: ~100


15/46



Recall: Optimal decisions in games

Consider

Two players - MIN and MAX

MAX moves first

Take turns moving until the game is over

At the end of the game, points are awarded to the winning player and penalties are given to the loser


16/46



Game tree Solution: Example (Grundy's game)

Initially a stack of pennies stands between two players

Each player divides one of the current stacks into two unequal stacks

The game ends when every stack contains one or two pennies

The first player who cannot play loses
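The rules above are small enough to search exhaustively. The sketch below is one possible encoding, assumed here for illustration: a position is a sorted tuple of stack sizes, and the function names are hypothetical.

```python
from functools import lru_cache

def successors(stacks):
    """All positions reachable by splitting one stack into two unequal stacks."""
    result = set()
    for i, n in enumerate(stacks):
        rest = stacks[:i] + stacks[i + 1:]
        # a < n - a guarantees the two new stacks are unequal and non-empty
        for a in range(1, (n - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (a, n - a))))
    return result

@lru_cache(maxsize=None)
def player_to_move_wins(stacks):
    """True if the player to move can force a win under optimal play.

    A position with no legal move is lost for the player to move, which is
    the rule "the first player who cannot play loses".
    """
    return any(not player_to_move_wins(s) for s in successors(stacks))

print(sorted(successors((7,))))     # [(1, 6), (2, 5), (3, 4)]
print(player_to_move_wins((7,)))    # False
```

With a starting stack of 7 pennies the player who moves first loses against optimal play, so in that game the second player has a winning strategy.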


17/46



Scenario


18/46


Search Problem

States: Board configuration + next player to move

Successor: List of states that can be reached from the current state through legal moves

Terminal state: States at which the game ends

Utility: Numerical value assigned to each terminal state

Example: U(s) = +1 for A win, -1 for B win, 0 for draw

Game value: The value of the terminal state that will be reached assuming optimal strategies from both players (minimax value)

Search: Find the move that maximizes the game value from the current state


19/46


Scenario

[Figure legend: MAX nodes and MIN nodes]


20/46


Choosing an Action: Basic Idea

1) Using the current state as the initial state, build the game tree uniformly to the maximal depth h (called horizon) feasible within the time limit

2) Evaluate the states of the leaf nodes

3) Back up the results from the leaves to the root and pick the best action assuming the worst from MIN

Minimax algorithm


21/46


Evaluation Function

Function e: state s → number e(s)

e(s) is a heuristic that estimates how favorable s is for MAX

e(s) > 0 means that s is favorable to MAX (the larger the better)

e(s) < 0 means that s is favorable to MIN

e(s) = 0 means that s is neutral


22/46


Example: Tic-tac-Toe

e(s) = (number of rows, columns, and diagonals open for MAX) - (number of rows, columns, and diagonals open for MIN)

Example positions from the figure: 8-8 = 0, 6-4 = 2, 3-3 = 0
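The evaluation function above can be sketched as follows, assuming (for illustration only) that a board is a 9-character string read row by row, with "X" for MAX, "O" for MIN and "." for an empty square. A line counts as open for a player when it contains none of the opponent's marks.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def open_lines(board, opponent):
    """Number of rows, columns and diagonals containing no mark of `opponent`."""
    return sum(all(board[i] != opponent for i in line) for line in LINES)

def e(board):
    """Lines open for MAX ("X") minus lines open for MIN ("O")."""
    return open_lines(board, "O") - open_lines(board, "X")

print(e("........."))   # 0, i.e. 8 - 8 on the empty board
print(e("O...X...."))   # 1, i.e. 5 - 4 with X in the center and O in a corner
```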


23/46


Backing up Values

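[Figure: Tic-Tac-Toe tree at depth (horizon) = 2, MAX's play at the root. Leaf evaluations shown include 6-4=2, 5-4=1, 6-5=1, 5-5=0, 6-6=0, 5-6=-1, 4-5=-1 and 4-6=-2. The MIN nodes back up the values -1, -2 and 1, so the root gets the value 1; the move leading to the MIN node with value 1 is marked "Best move".]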


24/46


Continuation

[Figure: continuation of the tic-tac-toe game tree, with leaf evaluations and the values backed up through the MIN and MAX levels]


25/46


Minimax Algorithm

1. Expand the game tree uniformly from the current state (where it is MAX's turn to play) to depth h

2. Compute the evaluation function at every leaf of the tree

3. Back up the values from the leaves to the root of the tree as follows:

a. A MAX node gets the maximum of the evaluation of its successors

b. A MIN node gets the minimum of the evaluation of its successors

4. Select the move toward a MIN node that has the largest backed-up value


26/46


Minimax Algorithm


Horizon: Needed to return a decision within the allowed time
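A minimal sketch of depth-limited minimax, run on a small hand-made tree. The tree, its leaf values and all names below are made up for illustration; in a real game, successors and evaluate would come from the game rules and the evaluation function e(s).

```python
# Hypothetical toy tree: the root is a MAX node with three MIN children.
TREE = {"root": ["a", "b", "c"],
        "a": ["a1", "a2", "a3"],
        "b": ["b1", "b2", "b3"],
        "c": ["c1", "c2"]}
LEAF_VALUES = {"a1": 1, "a2": 0, "a3": -1,
               "b1": 0, "b2": -2, "b3": -1,
               "c1": 2, "c2": 1}

def successors(state):
    return TREE.get(state, [])

def evaluate(state):
    # Interior nodes default to a neutral 0 in this toy example.
    return LEAF_VALUES.get(state, 0)

def minimax(state, depth, max_to_move):
    """Back up evaluations from the horizon (depth 0) or from leaves."""
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    values = [minimax(child, depth - 1, not max_to_move) for child in children]
    return max(values) if max_to_move else min(values)

def best_move(state, depth):
    """Step 4: pick the successor (a MIN node) with the largest backed-up value."""
    return max(successors(state),
               key=lambda child: minimax(child, depth - 1, False))

print(minimax("root", 2, True))   # 1
print(best_move("root", 2))       # c
```

The MIN nodes a, b and c back up -1, -2 and 1, so MAX moves toward c.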


27/46


Game Playing (for MAX)

Repeat until a terminal state is reached

1. Select move using Minimax

2. Execute move

3. Observe MIN's move

Note that at each cycle the large game tree built to the depth (horizon) h is used to select only one move
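The loop above can be sketched on a deliberately tiny game that is not from the lecture and is assumed here only to keep the example short: players alternately take 1 or 2 objects from a pile, and whoever takes the last object wins. MIN is modeled as a hypothetical min_policy function. Note how select_move rebuilds the search tree at every cycle and keeps only one move from it.

```python
def moves(pile):
    return [k for k in (1, 2) if k <= pile]

def minimax(pile, max_to_move, depth):
    if pile == 0:
        # The player who just moved took the last object and won.
        return -1 if max_to_move else 1
    if depth == 0:
        return 0  # horizon reached: neutral estimate
    values = [minimax(pile - k, not max_to_move, depth - 1) for k in moves(pile)]
    return max(values) if max_to_move else min(values)

def select_move(pile, depth):
    """Search to the given horizon and return a single move for MAX."""
    return max(moves(pile), key=lambda k: minimax(pile - k, False, depth - 1))

def play(pile, min_policy, depth=4):
    """Repeat select / execute / observe until a terminal state is reached."""
    history = []
    while True:
        k = select_move(pile, depth)   # 1. select move using minimax
        pile -= k                      # 2. execute move
        history.append(("MAX", k))
        if pile == 0:
            return "MAX", history
        k = min_policy(pile)           # 3. observe MIN's move
        pile -= k
        history.append(("MIN", k))
        if pile == 0:
            return "MIN", history

print(play(4, lambda pile: 1))
# ('MAX', [('MAX', 1), ('MIN', 1), ('MAX', 2)])
```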


28/46


Question

Which path will you take?


29/46


Answer

Which path will you take?


30/46


Can we do better?

Yes ! Much better !

[Figure: game tree with a MAX root, backed-up node values 3 and -1, and a subtree marked "Pruning"]

This part of the tree can't have any effect on the value that will be backed up to the root
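The pruning idea can be sketched as alpha-beta search over a tree written as nested lists. The tree shape, the leaf values and the visited argument are assumptions added here for illustration. In this sketch alpha is the best value MAX can already guarantee and beta is the best value MIN can already guarantee; a node is abandoned as soon as alpha >= beta.

```python
import math

def alphabeta(node, max_to_move, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, skipping branches that cannot change the result.

    A node is either a number (leaf evaluation) or a list of child nodes.
    If `visited` is a list, every evaluated leaf is appended to it.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if max_to_move:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break   # MIN above will never let the game reach this node
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break       # MAX above already has a better alternative
    return value

seen = []
print(alphabeta([[3, 5], [-1, 4]], True, visited=seen))   # 3
print(seen)                                               # [3, 5, -1]
```

After the first MIN node backs up 3, the second MIN node is cut off as soon as it sees -1, so the leaf 4 is never evaluated. With the second subtree ordered as [4, -1] instead, all four leaves are evaluated and the result is still 3.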


31/46


Example


32/46


Example


33/46


Example


34/46


Example


35/46


Example


36/46


Why is it called Alpha-Beta Pruning?


37/46


How much do we gain?

Consider these two cases:

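[Figure: two orderings of the successors of a MIN node, under a MAX root whose first child has backed-up value 3. Case 1: the successor -1 is examined first and the successor 4 is shown in parentheses. Case 2: the successor 4 is examined first, then -1.]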
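The gain depends on the order in which successors are examined. As a rough guide, a standard result for a uniform tree can be stated as follows, where the branching factor b, the depth d and the leaf counts N are symbols assumed here rather than taken from the slide:

```latex
N_{\text{minimax}} = b^{d}, \qquad
N_{\alpha\beta}^{\text{best}} = b^{\lceil d/2 \rceil} + b^{\lfloor d/2 \rfloor} - 1, \qquad
N_{\alpha\beta}^{\text{worst}} = b^{d}
```

With the worst ordering nothing is pruned. With the best ordering and, for example, b = 35 and d = 4, about 35^2 + 35^2 - 1 = 2449 leaves are evaluated instead of 35^4 = 1,500,625, so the same time budget allows roughly twice the search depth.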


38/46


Other Improvements

Adaptive horizon + iterative deepening

Extended search: Retain k>1 best paths, instead of just one, and extend the tree at greater depth below their leaf nodes (to help deal with the horizon effect)

Singular extension: If a move is obviously better than the others in a node at horizon h, then expand this node along this move

Use transposition tables to deal with repeated states

Null-move search


39/46


Summary

Game playing is best modeled as a search problem

Game trees represent alternate computer/opponent moves

Evaluation functions estimate the quality of a given board configuration for each player

Minimax is a procedure which chooses moves by assuming that the opponent will always choose the move which is best for them

Alpha-Beta is a procedure which can prune large parts of the search tree and allow search to go deeper

For many well-known games, computer algorithms based on heuristic search match or out-perform human world experts.


40/46


Suggestion:

Useful web site (with Interactive) :

http://202.28.94.51/users/ai/mainContents.php

Reading:

3 ( 3.4) 4


41/46


http://202.28.94.51/users/ai/tictactoe/TicTacToeApplet.html

