ITCS 3153 Artificial Intelligence

ITCS 3153 Artificial Intelligence, Lecture 9: Adversarial Search (Chapter 6)

Description

ITCS 3153 Artificial Intelligence. Lecture 9: Adversarial Search (Chapter 6). Games: “Shall we play a game?” Let’s play tic-tac-toe. Minimax. What data do we need to play? Initial State: How does the game start? Successor Function: A list of legal (move, state) pairs for each state.

Transcript of ITCS 3153 Artificial Intelligence

Page 1: ITCS 3153 Artificial Intelligence

ITCS 3153 Artificial Intelligence

Lecture 9

Adversarial Search

Chapter 6


Page 2: ITCS 3153 Artificial Intelligence

Games

“Shall we play a game?”

Let’s play tic-tac-toe


Page 3: ITCS 3153 Artificial Intelligence

Minimax

Page 4: ITCS 3153 Artificial Intelligence

What data do we need to play?

Initial State

• How does the game start?

Successor Function

• A list of legal (move, state) pairs for each state

Terminal Test

• Determines when game is over

Utility Function

• Provides numeric value for all terminal states (see the sketch below)

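These four components are everything a search-based player needs from the game itself. Below is a minimal Python sketch of such an interface; the class and method names are illustrative, not a standard API, and a concrete game (e.g. tic-tac-toe) would subclass it.

# Minimal sketch of the game description a minimax player works with.

class Game:
    def initial_state(self):
        """How does the game start? Return the starting state."""
        raise NotImplementedError

    def successors(self, state):
        """Return a list of legal (move, resulting_state) pairs."""
        raise NotImplementedError

    def terminal_test(self, state):
        """Return True when the game is over."""
        raise NotImplementedError

    def utility(self, state):
        """Numeric value of a terminal state, e.g. +1 win, 0 draw, -1 loss."""
        raise NotImplementedError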

Page 5: ITCS 3153 Artificial Intelligence

Minimax Strategy

Optimal Strategy

• Leads to outcomes at least as good as any other strategy when playing an infallible opponent

• Pick the option that best limits (minimizes) the damage your opponent can do

– maximize the worst-case outcome

– because a skillful opponent will certainly find the most damaging move


Page 6: ITCS 3153 Artificial Intelligence

Minimax

Algorithm (see the code sketch below)

• MinimaxValue(n) =

Utility(n), if n is a terminal state

max of MinimaxValue(s) over all successors s, if n is a MAX node

min of MinimaxValue(s) over all successors s, if n is a MIN node

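The recursion above translates almost line for line into Python. The sketch below is illustrative, not from the lecture: it assumes the hypothetical Game interface sketched earlier and that MAX and MIN strictly alternate.

# Minimax value of a node n, following the recursion on this slide.
# Assumes the hypothetical Game interface from the earlier sketch.

def minimax_value(game, state, maximizing):
    if game.terminal_test(state):
        return game.utility(state)
    values = [minimax_value(game, s, not maximizing)
              for _, s in game.successors(state)]
    return max(values) if maximizing else min(values)

def minimax_decision(game, state):
    # The root is a MAX node: pick the move whose successor (a MIN node)
    # has the highest minimax value.
    return max(game.successors(state),
               key=lambda ms: minimax_value(game, ms[1], False))[0]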

Page 7: ITCS 3153 Artificial Intelligence

Minimax

Page 8: ITCS 3153 Artificial Intelligence

Minimax Algorithm

We wish to identify the minimax decision at the root

• Recursive evaluation of all nodes in the game tree

• Time complexity = O(b^m), where b is the branching factor and m is the maximum depth of the tree


Page 9: ITCS 3153 Artificial Intelligence

Feasibility of minimax?

How about a nice game of chess?

• Avg branching = 35 and avg # moves = 50 for each player

– O(35^100) time complexity ≈ 10^154 nodes

about 10^40 distinct nodes

Minimax is impractical if directly applied to chess (a quick check of these numbers follows below)

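A quick back-of-the-envelope check of the 10^154 figure, assuming branching factor 35 and 100 plies (50 moves per player):

# Rough size of the chess game tree as a power of ten.
import math

branching_factor, plies = 35, 100
exponent = plies * math.log10(branching_factor)
print(f"35^100 is roughly 10^{exponent:.0f}")   # roughly 10^154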

Page 10: ITCS 3153 Artificial Intelligence

Pruning minimax tree

Are there times when you know you need not explore a particular move?

• When the move is poor?

• Poor compared to what?

• Poor compared to what you have explored so far


Page 11: ITCS 3153 Artificial Intelligence

Alpha-beta pruning

Page 12: ITCS 3153 Artificial Intelligence

Alpha-beta pruning

– α: the value of the best (highest) choice found so far in the search for MAX

– β: the value of the best (lowest) choice found so far in the search for MIN

• Order of considering successors matters (look at step f in the previous slide)

– If possible, consider best successors first (see the sketch below)

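One way this looks in code (a sketch under the same assumed Game interface): alpha holds the best value found so far for MAX, beta the best found so far for MIN, and a branch is abandoned as soon as its value can no longer influence the choice made higher in the tree.

# Alpha-beta pruning sketch. Call from the root as
# alphabeta(game, state, float('-inf'), float('inf'), True).

def alphabeta(game, state, alpha, beta, maximizing):
    if game.terminal_test(state):
        return game.utility(state)
    if maximizing:
        value = float('-inf')
        for _, s in game.successors(state):
            value = max(value, alphabeta(game, s, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:   # MIN above will never let play reach this branch
                break
        return value
    else:
        value = float('inf')
        for _, s in game.successors(state):
            value = min(value, alphabeta(game, s, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:   # MAX above will never let play reach this branch
                break
        return value

As the slide notes, examining the likely-best successors first makes these cutoffs fire much earlier.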

Page 13: ITCS 3153 Artificial Intelligence

Realtime decisions

What if you don’t have enough time to explore the entire search tree?

• We cannot search all the way down to a terminal state for all decision sequences

• Use a heuristic to approximate (guess) the eventual terminal state


Page 14: ITCS 3153 Artificial Intelligence

Evaluation Function

The heuristic that estimates expected utility

• Cannot take too long (otherwise recurse to get answer)

• It should preserve the ordering among terminal states

– otherwise it can cause bad decision making

• Define features of game state that assist in evaluation

– what are features of chess? (a simple material-count sketch follows below)

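For chess, a classic (and much simplified) choice is a weighted linear sum of features such as material balance. The sketch below uses material only; the board representation, feature set, and weights are illustrative placeholders, not any real engine's evaluation function.

# Illustrative material-count evaluation. `board` is assumed to map squares
# to piece codes like 'wQ' or 'bP'; `player` is 'w' or 'b'.

PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def evaluate(board, player):
    score = 0
    for piece in board.values():
        value = PIECE_VALUES.get(piece[1], 0)   # kings get no material value
        score += value if piece[0] == player else -value
    return score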

Page 15: ITCS 3153 Artificial Intelligence

Truncating minimax search

When do you recurse or use evaluation function?

• Cutoff-Test (state, depth) returns 1 or 0

– When 1 is returned, use evaluation function

• Cutoff beyond a certain depth

• Cutoff if state is stable (more predictable)

• Cutoff moves you know are bad (forward pruning); a possible cutoff test is sketched below

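One possible shape for such a test, combining a depth limit with a crude stability (quiescence) check; MAX_DEPTH and the is_stable helper are illustrative placeholders, not part of the lecture.

# Sketch of Cutoff-Test(state, depth): 1 means stop and apply the evaluation
# function, 0 means keep recursing.

MAX_DEPTH = 4

def cutoff_test(game, state, depth, is_stable=lambda s: True):
    if game.terminal_test(state):
        return 1        # the true utility is available here anyway
    if depth >= MAX_DEPTH and is_stable(state):
        return 1        # deep enough, and the position looks quiet
    return 0            # otherwise keep searching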

Page 16: ITCS 3153 Artificial Intelligence

Benefits of truncation

Comparing Chess

• Using minimax: 5 ply

• Average human: 6-8 ply

• Using alpha-beta: 10 ply

• Intelligent pruning: 14 ply


Page 17: ITCS 3153 Artificial Intelligence

Games with chance

How to include chance in the game tree?

• Add chance nodes


Page 18: ITCS 3153 Artificial Intelligence

Expectiminimax

Expectiminimax(n) =

• utility(n), if n is a terminal state

• max of Expectiminimax(s) over all successors s, if n is a MAX node

• min of Expectiminimax(s) over all successors s, if n is a MIN node

• sum of P(s) * Expectiminimax(s) over all successors s, if n is a chance node

(a code sketch follows below)
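Written as code, this adds a single chance-node case to the minimax sketch. game.node_type(state) and game.chance_outcomes(state), which returns (probability, state) pairs, are assumed helpers, not a standard API.

# Expectiminimax sketch: chance nodes take the probability-weighted average
# of their successors' values.

def expectiminimax(game, state):
    if game.terminal_test(state):
        return game.utility(state)
    kind = game.node_type(state)          # 'MAX', 'MIN', or 'CHANCE'
    if kind == 'MAX':
        return max(expectiminimax(game, s) for _, s in game.successors(state))
    if kind == 'MIN':
        return min(expectiminimax(game, s) for _, s in game.successors(state))
    return sum(p * expectiminimax(game, s)
               for p, s in game.chance_outcomes(state))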

Page 19: ITCS 3153 Artificial Intelligence

Pruning

Can we prune search in games of chance?

• Think about alpha-beta pruning

– don’t explore nodes that you know are worse than what you have

– but with chance nodes we don’t know what we have

– chance node values are the average of their successors


Page 20: ITCS 3153 Artificial Intelligence

History of Games

Chess: Deep Blue

• IBM: 30 RS/6000 computers with 480 custom VLSI chess chips

• Deep Thought design came from Campbell and Hsu at CMU

• 126 million nodes per second

• 30 billion positions per move

• routinely reaching a depth of 14

• iterative-deepening alpha-beta search


Page 21: ITCS 3153 Artificial Intelligence

Deep Blue

• evaluation function had 8000 features

• 4000 opening moves in memory

• 700,000 grandmaster games from which recommendations were extracted

• many endgames solved for all five-piece combinations


Page 22: ITCS 3153 Artificial Intelligence

Checkers

Arthur Samuel of IBM, 1952

• program learned by playing against itself

• beat a human in 1962 (but the human clearly made an error)

• 19 KB of memory

• 0.000001 GHz processor


Page 23: ITCS 3153 Artificial Intelligence

Chinook, Jonathan Schaeffer, 1990

• Alpha-beta search on regular PCs

• database of all 444 billion endgame positions with 8 pieces

• Played against Marion Tinsley

– world champion for over 40 years

– lost only 3 games in 40 years

– Chinook won two games, but lost the match

• Rematch with Tinsley was left incomplete for health reasons

– Chinook became world champion


Page 24: ITCS 3153 Artificial Intelligence

Othello

Smaller search space (5 to 15 legal moves)

Humans are no match for computers


Page 25: ITCS 3153 Artificial Intelligence

Backgammon

Gerald Tesauro, TD-Gammon, 1992

• Reliably ranked among the top three players in the world


Page 26: ITCS 3153 Artificial Intelligence

Discussion

How reasonable is minimax?

• perfectly performing opponent

• perfect knowledge of leaf node evaluations

• strong assumptions


Page 27: ITCS 3153 Artificial Intelligence

Building alpha-beta tree

Can we restrict the size of the game tree?

• alpha-beta will blindly explore the tree in depth-first fashion even if only one move is possible from the root

• even if multiple moves are possible, can we use a quick search to eliminate some entirely?

• utility vs. time tradeoff to decide when to explore new branches or to stay with what you have


Page 28: ITCS 3153 Artificial Intelligence

Metareasoning

Reasoning about reasoning

• alpha-beta is one example

– think before you think

– think about the utility of thinking about something before you think about it

– don’t think about choices you don’t have to think about


Page 29: ITCS 3153 Artificial Intelligence

Goal-directed reasoning / planning

Minimax starts from the root and moves forward using combinatorial search

What about starting at the goal and working backward?

• We talked about the difficulty of identifying goal states in bidirectional search

• We do not know how to combine the two in a practical way


Page 30: ITCS 3153 Artificial Intelligence