L8 Adversarial Search

  • 8/3/2019 L8 Adversarial Search

    241-320 Design Architecture and Engineering for Intelligent System

    Suntorn Witosurapot

    Contact Address: Phone: 074 287369 or

    Email: [email protected]

    November 2009

    Lecture 8:

    Adversarial Search

    (Playing and Solving Games)

    Outline

    Games

    Adversarial search problems

    Minimax algorithm

    Alpha-beta pruning

    Heuristics for adversarial search, making imperfect decisions

    Types of Games (informal)

                             Deterministic                       Chance

    Perfect information      Chess, Checkers, Go, tic-tac-toe    Backgammon, Monopoly

    Imperfect information    Battleship                          Bridge, Poker, Scrabble, War games

    The study of games is called game theory

    A branch of economics

    Games of Interest: Definitions

    We'll consider a special kind of game with the following properties:

    Two-player game: Player A and Player B. Player A starts.

    Deterministic: None of the moves/states are subject to chance (no random draws).

    Perfect information: Fully observable environment. Both players see all the states and decisions. Each decision is made sequentially.

    Zero-sum: Player A's gain is exactly equal to Player B's loss. The utility values at the end of the game total 0, e.g. +1 for winning, -1 for losing, 0 for a tie (both gains are equal).

    Games vs. search problems

    Unpredictable opponent => solution is a strategy specifying a move for every possible opponent reply

    Time limits => unlikely to find goal, must approximate

    Adversarial search: Assumptions & aims

    Multiplayer

    Formal, well-defined problems, where perfect decisions are possible

    At any position in the game, the player wants to know what move to make

    Success depends on the opponent's moves

    Search forward:

    Exhaustive

    Heuristic

    Game Search Problem Formulation

    Initial state: Initial board position, player to move

    Successor function: tells which actions can be executed in each state and gives the successor state for each action

    MAX's and MIN's actions alternate, with MAX playing first in the initial state

    Terminal test: tells if a state is terminal and, if yes, whether it's a win or a loss for MAX, or a draw

    Utility function: numeric value for terminal states representing their utility for a given player.

    E.g. in chess, the outcome is a win, loss, or draw, with values +1, -1, or 0
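
The four components above can be sketched in code. The following is a minimal illustration for tic-tac-toe rather than chess (an assumption made to keep the example small); the board encoding and all function names are hypothetical, with MAX playing "X".

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]          # 9 cells, each "X", "O", or " "
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def initial_state() -> Board:
    """Initial state: empty board; MAX ("X") moves first."""
    return (" ",) * 9


def player_to_move(board: Board) -> str:
    """MAX and MIN alternate, so the mover follows from the piece counts."""
    return "X" if board.count("X") == board.count("O") else "O"


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def terminal_test(board: Board) -> bool:
    """A state is terminal when someone has won or the board is full."""
    return winner(board) is not None or " " not in board


def successors(board: Board) -> List[Tuple[int, Board]]:
    """Successor function: (action, resulting state) for every legal action."""
    if terminal_test(board):
        return []
    mark = player_to_move(board)
    return [(i, board[:i] + (mark,) + board[i + 1:])
            for i, cell in enumerate(board) if cell == " "]


def utility(board: Board) -> int:
    """Utility of a terminal state for MAX: +1 win, -1 loss, 0 draw."""
    return {"X": 1, "O": -1, None: 0}[winner(board)]
```

Because the player to move is derived from the board, a state here is just the board itself; a game where that is not possible would need to store the mover explicitly.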

    Relation to Previous Lecture

    Here, uncertainty is caused by the actions of another agent (MIN), who competes with our agent (MAX)

    MIN wants MAX to lose (and vice versa)

    No plan exists that guarantees MAX's success regardless of which actions MIN executes (the same is true for MIN)

    At each turn, the choice of which action to perform must be made within a specified time limit

    The state space is enormous: only a tiny fraction of this space can be explored within the time limit

    Game Tree

    [Figure: game tree with alternating levels of MAX nodes (MAX's play) and MIN nodes (MIN's play), ending in a terminal state (win for MAX)]

    Here, symmetries have been used to reduce the branching factor

    In general, the branching factor and the depth of terminal states are large

    Chess:

    Number of states: ~10^40

    Branching factor: ~35

    Number of total moves in a game: ~100
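
To get a feel for these numbers, the short sketch below estimates the size of a full chess game tree, assuming a uniform tree with the branching factor and game length quoted above (both are rough figures, and the uniform-tree model is a simplification).

```python
import math

BRANCHING_FACTOR = 35   # approximate legal moves per position (figure from the slide)
GAME_LENGTH = 100       # approximate total moves in a game (figure from the slide)

# A uniform tree with branching factor b and depth d has b**d leaves.
leaves = BRANCHING_FACTOR ** GAME_LENGTH
exponent = GAME_LENGTH * math.log10(BRANCHING_FACTOR)

print(f"35^100 is about 10^{exponent:.0f}")
print(f"decimal digits in 35^100: {len(str(leaves))}")
```

The tree is far larger than the number of distinct states because the same position can be reached along many different move sequences.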

    Recall: Optimal decisions in games

    Consider

    Two players - MIN and MAX

    MAX moves first

    Take turns moving until the game is over

    At the end of the game, points are awarded to the winning player and penalties are given to the loser

    Game tree Solution: Example (Grundy's game)

    Initially a stack of pennies stands between two players

    Each player divides one of the current stacks into two unequal stacks

    The game ends when every stack contains one or two pennies

    The first player who cannot play loses
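
The rules above are small enough to solve exhaustively. The sketch below is one possible encoding (the tuple representation and function names are assumptions, not part of the lecture): a position is a sorted tuple of stack sizes, and a player wins if some split leaves the opponent in a losing position.

```python
from functools import lru_cache
from typing import List, Tuple

Stacks = Tuple[int, ...]   # stack sizes, kept sorted so equal positions coincide


def successors(stacks: Stacks) -> List[Stacks]:
    """All positions reachable by splitting one stack into two unequal stacks."""
    result = set()
    for i, size in enumerate(stacks):
        rest = stacks[:i] + stacks[i + 1:]
        # small < size - small guarantees the two new stacks are unequal
        for small in range(1, (size + 1) // 2):
            result.add(tuple(sorted(rest + (small, size - small))))
    return sorted(result)


@lru_cache(maxsize=None)
def player_to_move_wins(stacks: Stacks) -> bool:
    """True if the player to move can force the opponent to get stuck."""
    # With no legal split, any() over an empty list is False: the mover loses.
    return any(not player_to_move_wins(s) for s in successors(stacks))


if __name__ == "__main__":
    for n in range(1, 11):
        outcome = "wins" if player_to_move_wins((n,)) else "loses"
        print(f"{n} pennies: first player {outcome}")
```

Stacks of one or two pennies have no successors, which matches the end condition stated above.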

    Search Problem

    States: Board configuration + next player to move

    Successor: List of states that can be reached from the current state through legal moves

    Terminal state: States at which the game ends

    Utility: Numerical value assigned to each terminal state

    Example: U(s) = +1 for A win, -1 for B win, 0 for draw

    Game value: The value of the terminal state that will be reached assuming optimal strategies from both players (minimax value)

    Search: Find the move that maximizes the game value from the current state

    Scenario

    [Figure: game tree for the scenario, with a legend distinguishing Max nodes from Min nodes]

    Choosing an Action: Basic Idea

    1) Using the current state as the initial state, build the game tree uniformly to the maximal depth h (called the horizon) feasible within the time limit

    2) Evaluate the states of the leaf nodes

    3) Back up the results from the leaves to the root and pick the best action assuming the worst from MIN

    This is the Minimax algorithm

    Evaluation Function

    Function e: state s -> number e(s)

    e(s) is a heuristic that estimates how favorable s is for MAX

    e(s) > 0 means that s is favorable to MAX (the larger the better)

    e(s) < 0 means that s is favorable to MIN

    e(s) = 0 means that s is neutral

    Example: Tic-Tac-Toe

    e(s) = (number of rows, columns, and diagonals open for MAX) - (number of rows, columns, and diagonals open for MIN)

    [Figure: three example boards with e(s) = 8 - 8 = 0, 6 - 4 = 2, and 3 - 3 = 0]
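
The evaluation function above can be written in a few lines. This sketch assumes a board stored as a sequence of nine cells ("X" for MAX, "O" for MIN, " " for empty) and treats a line as open for a player when it contains none of the opponent's marks; the names are illustrative.

```python
from typing import Sequence

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def open_lines(board: Sequence[str], opponent: str) -> int:
    """Count lines with no mark of `opponent`, i.e. still open for the other side."""
    return sum(all(board[i] != opponent for i in line) for line in LINES)


def evaluate(board: Sequence[str]) -> int:
    """e(s) = (lines open for MAX) - (lines open for MIN), with MAX playing "X"."""
    return open_lines(board, "O") - open_lines(board, "X")


if __name__ == "__main__":
    empty = [" "] * 9
    print(evaluate(empty))          # 8 - 8

    board = [" "] * 9
    board[4] = "X"                  # MAX in the centre
    board[1] = "O"                  # MIN on the top edge
    print(evaluate(board))          # 6 - 4
```

With MAX in the centre and MIN on an edge, six lines remain open for MAX and four for MIN, which reproduces the 6 - 4 = 2 case.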

    Backing up Values

    [Figure: Tic-Tac-Toe tree at depth (horizon) = 2, MAX's play. The e(s) values at the leaves are backed up to the three MIN nodes as -1, -2, and 1, and the MAX root gets 1, which marks the best move.]

    Continuation

    [Figure: continuation of the Tic-Tac-Toe search, with e(s) values at the leaves backed up through the MIN and MAX nodes]

    Minimax Algorithm

    1. Expand the game tree uniformly from the current state (where it is MAX's turn to play) to depth h

    2. Compute the evaluation function at every leaf of the tree

    3. Back up the values from the leaves to the root of the tree as follows:

    a. A MAX node gets the maximum of the evaluation of its successors

    b. A MIN node gets the minimum of the evaluation of its successors

    4. Select the move toward a MIN node that has the largest backed-up value

    Horizon: Needed to return a decision within the allowed time
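
The four steps can be sketched as a short recursive procedure. This is a minimal illustration, not a tuned implementation: the `successors` and `evaluate` callables are assumed to be supplied by the game, and the demo tree is a hand-made nested list whose leaf values were chosen to give the backed-up values -1, -2, and 1.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")


def minimax(state: S, depth: int, max_to_move: bool,
            successors: Callable[[S], List[S]],
            evaluate: Callable[[S], float]) -> float:
    """Backed-up value of `state`, searching `depth` plies below it."""
    children = successors(state)
    if depth == 0 or not children:        # horizon reached or terminal state
        return evaluate(state)
    values = [minimax(c, depth - 1, not max_to_move, successors, evaluate)
              for c in children]
    return max(values) if max_to_move else min(values)


def best_move(state: S, h: int,
              successors: Callable[[S], List[S]],
              evaluate: Callable[[S], float]) -> int:
    """Index of the root's child (a MIN node) with the largest backed-up value.

    Assumes it is MAX's turn and that the root has at least one successor.
    """
    children = successors(state)
    values = [minimax(c, h - 1, False, successors, evaluate) for c in children]
    return max(range(len(children)), key=values.__getitem__)


# Demo: a list is an inner node, a number is a leaf holding its e(s) value.
tree = [[1, 0, 0, 1, -1], [-1, 0, -2, -1], [1, 2]]


def succ(node):
    return node if isinstance(node, list) else []


def leaf_value(node):
    return node


if __name__ == "__main__":
    print(minimax(tree, 2, True, succ, leaf_value))   # value backed up to the root
    print(best_move(tree, 2, succ, leaf_value))       # index of the chosen move
```

Passing the game as two callables keeps the search independent of any particular game; the horizon h is just the `depth` argument.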

    Game Playing (for MAX)

    Repeat until a terminal state is reached:

    1. Select move using Minimax

    2. Execute move

    3. Observe MIN's move

    Note that at each cycle the large game tree built to the depth (horizon) h is used to select only one move
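
The loop above can be sketched with a toy game. The game used here is an assumption chosen for brevity and does not come from the lecture: players alternately remove 1 or 2 objects from a pile, and whoever takes the last object wins. The tree is small enough to search to the end, so no horizon is needed, and `min_policy` is a hypothetical stand-in for the opponent.

```python
from typing import Callable, List


def moves(pile: int) -> List[int]:
    """Legal moves: remove 1 or 2 objects, never more than remain."""
    return [k for k in (1, 2) if k <= pile]


def value(pile: int, max_to_move: bool) -> int:
    """Minimax value for MAX: +1 if MAX takes the last object, else -1."""
    if pile == 0:
        # The previous mover took the last object and won.
        return -1 if max_to_move else 1
    results = [value(pile - k, not max_to_move) for k in moves(pile)]
    return max(results) if max_to_move else min(results)


def select_move(pile: int) -> int:
    """Step 1: pick MAX's move with the largest backed-up value."""
    return max(moves(pile), key=lambda k: value(pile - k, False))


def play(pile: int, min_policy: Callable[[int], int]) -> str:
    """Repeat select / execute / observe until a terminal state is reached."""
    while True:
        pile -= select_move(pile)      # steps 1-2: select with minimax, execute
        if pile == 0:
            return "MAX"
        pile -= min_policy(pile)       # step 3: observe MIN's move
        if pile == 0:
            return "MIN"


if __name__ == "__main__":
    # A deliberately weak opponent that always removes one object.
    print(play(7, lambda pile: 1))
```

Note that `select_move` searches again from scratch on every turn, which mirrors the remark that each tree is used to select only one move.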

    Question

    Which path will you take?

    Can we do better?

    Yes! Much better!

    [Figure: pruning example below a MAX root, with node values 3 and -1 and one subtree marked as pruned]

    This part of the tree can't have any effect on the value that will be backed up to the root
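
The pruning idea can be sketched as alpha-beta search over a small explicit tree. This is a minimal illustration under stated assumptions: trees are nested lists with numeric leaves, the leaf values are made up, and plain minimax is included only to check that pruning does not change the result.

```python
import math
from typing import Union

Tree = Union[float, list]   # a leaf value, or a list of subtrees


def minimax(node: Tree, max_to_move: bool) -> float:
    """Reference implementation that visits every leaf."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not max_to_move) for child in node]
    return max(values) if max_to_move else min(values)


def alphabeta(node: Tree, max_to_move: bool,
              alpha: float = -math.inf, beta: float = math.inf) -> float:
    """Minimax value of `node`, skipping subtrees that cannot affect the root.

    alpha: best value MAX can already guarantee on the path to the root.
    beta:  best (lowest) value MIN can already guarantee on that path.
    """
    if not isinstance(node, list):
        return node
    if max_to_move:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:      # MIN above will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:          # MAX above already has something better
            break
    return value


if __name__ == "__main__":
    # The first MIN node is worth 3; once the second one shows -1,
    # its remaining leaf (4) is never examined.
    small = [[3, 5], [-1, 4]]
    deeper = [[[3, 12], [8, 2]], [[4, 6], [14, 5]]]
    print(alphabeta(small, True), minimax(small, True))
    print(alphabeta(deeper, True), minimax(deeper, True))
```

Alpha-beta returns the same root value as minimax; only the amount of work differs.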

    Example

    Why is it called Alpha-Beta Pruning?

    How much do we gain?

    Consider these two cases:

    [Figure: two cases with the same leaf values in a different order. Case 1: 3, then -1, then (4). Case 2: 3, then 4, then -1.]
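
One way to read the two cases is that the gain depends on the order in which successors are examined. The sketch below counts evaluated leaves for two orderings of the same tree; the tree shape and the extra leaf 7 are assumptions made for illustration. As a general rule of thumb, with the best ordering alpha-beta examines on the order of b^(d/2) leaves instead of b^d, and with the worst ordering it prunes nothing.

```python
import math
from typing import Tuple, Union

Tree = Union[float, list]   # a leaf value, or a list of subtrees


def count_leaves_examined(node: Tree, max_to_move: bool = True,
                          alpha: float = -math.inf,
                          beta: float = math.inf) -> Tuple[float, int]:
    """Return (backed-up value, number of leaves evaluated) under alpha-beta."""
    if not isinstance(node, list):
        return node, 1
    value = -math.inf if max_to_move else math.inf
    examined = 0
    for child in node:
        child_value, n = count_leaves_examined(child, not max_to_move, alpha, beta)
        examined += n
        if max_to_move:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:          # remaining children cannot matter
            break
    return value, examined


if __name__ == "__main__":
    good_order = [[3, 7], [-1, 4]]   # -1 is seen first, so 4 is pruned
    bad_order = [[3, 7], [4, -1]]    # 4 is seen first, so -1 must still be checked
    print(count_leaves_examined(good_order))
    print(count_leaves_examined(bad_order))
```

Both orderings back up the same value to the root; only the number of leaves examined changes.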

    Other Improvements

    Adaptive horizon + iterative deepening

    Extended search: Retain k>1 best paths, instead of just one, and extend the tree at greater depth below their leaf nodes (to help deal with the horizon effect)

    Singular extension: If a move is obviously better than the others in a node at horizon h, then expand this node along this move

    Use transposition tables to deal with repeated states

    Null-move search
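
As an illustration of one item from this list, the sketch below shows a transposition table as a plain dictionary that stores the value of each state already searched. The toy game is an assumption (remove 1, 2, or 3 objects from a pile; taking the last object wins), and the `visits` counter exists only to show how many states are expanded with and without the table.

```python
from typing import Dict, List, Optional, Tuple

State = Tuple[int, bool]   # (objects left in the pile, is it MAX's turn?)


def value(state: State, table: Optional[Dict[State, int]],
          visits: List[int]) -> int:
    """Minimax value for MAX; `table` caches values of states already searched."""
    if table is not None and state in table:
        return table[state]                 # repeated state: reuse stored value
    visits[0] += 1                          # count states actually expanded
    pile, max_to_move = state
    if pile == 0:
        result = -1 if max_to_move else 1   # previous mover took the last object
    else:
        children = [value((pile - k, not max_to_move), table, visits)
                    for k in (1, 2, 3) if k <= pile]
        result = max(children) if max_to_move else min(children)
    if table is not None:
        table[state] = result
    return result


plain_visits, cached_visits = [0], [0]
plain_value = value((12, True), None, plain_visits)
cached_value = value((12, True), {}, cached_visits)

if __name__ == "__main__":
    print("without table:", plain_value, plain_visits[0], "states expanded")
    print("with table:   ", cached_value, cached_visits[0], "states expanded")
```

The table never changes the result; it only avoids searching again below a state that was reached earlier through a different move order.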

    Summary

    Game playing is best modeled as a search problem

    Game trees represent alternate computer/opponent moves

    Evaluation functions estimate the quality of a given board configuration for each player

    Minimax is a procedure which chooses moves by assuming that the opponent will always choose the move which is best for them

    Alpha-Beta is a procedure which can prune large parts of the search tree and allow search to go deeper

    For many well-known games, computer algorithms based on heuristic search match or out-perform human world experts.

    Suggestion:

    Useful web site (interactive):

    http://202.28.94.51/users/ai/mainContents.php

    Reading:

    3 ( 3.4) 4

    http://202.28.94.51/users/ai/tictactoe/TicTacToeApplet.html
