Game Playing
Towards Intelligence?
Many researchers attacked “intelligent behavior” by looking to strategy games involving deep thought.
After 40 years, best chess player is now a computer. (Kasparov vs. Deep Blue, 1997) http://en.wikipedia.org/wiki/Computer_chess
Is this intelligence?
How can computers play games of strategy?
Play against a partner
[Figure: game tree with four levels of choices, top to bottom: player W, player L, player L, player W. Leaves, left to right: L L W W L L W W W L W L W W W W]
Demo: http://www.siivoset.com/gradu/impothello/appletpage.html
Many two-person games can be viewed as tree search
Game assumptions:
– no randomness
– 2-player
– each player plays best strategy
Tic-Tac-Toe TREE
nodes are board positions
edges are valid moves
Some nodes are WIN, some are LOSE, some are TIE, some are indeterminate
GOAL: get to a node that is a WIN
Why can’t we use search techniques we’ve seen already to find a WIN node and head towards it?
– opponent gets to move every other turn… may decide to move a different way!
TOP level - we choose. NEXT level opponent chooses, then us, etc.
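A minimal Python sketch of the tree described above, for illustration only. The board encoding (a 9-character string read row by row, with a space for an empty square) and the names `successors` and `status` are assumptions of this sketch, not something the slides specify.

```python
# The eight winning lines of a 3x3 board, as index triples into the string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def successors(board, player):
    """Edges of the tree: every position reachable by one valid move."""
    return [board[:i] + player + board[i + 1:]
            for i, cell in enumerate(board) if cell == " "]

def status(board, me="X"):
    """Label a node WIN, LOSE, TIE or indeterminate from our point of view."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return "WIN" if board[a] == me else "LOSE"
    return "TIE" if " " not in board else "indeterminate"

root = " " * 9                       # empty board: the top node
first_level = successors(root, "X")  # TOP level: we choose
second_level = [successors(b, "O") for b in first_level]  # NEXT level: opponent
print(len(first_level), len(second_level[0]))
```

Calling `successors` again on each new position, with the players alternating, generates the tree one level at a time.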
Who wins?
[Figure: game tree with one level of choice: our choice. Leaves, left to right: W L]
Must be a move for us so that: we win.
Who wins?
[Figure: game tree with two levels of choices, top to bottom: our choice, his choice. Node labels shown: W, W]
What must be true for us to win?
Who wins?
[Figure: game tree with two levels of choices, top to bottom: our choice, his choice. Leaves, left to right: W L L L]
What must be true for us to WIN?
Must be a move for us so that: no matter what our opponent chooses: we win.
Who wins?
[Figure: game tree with three levels of choices, top to bottom: our choice, his choice, our choice. Leaves, left to right: W W L L W L L W]
There must exist a move for us so that: no matter what our opponent chooses: there is a move for us: so that we win.
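The "there exists a move for us / no matter what the opponent chooses" pattern maps directly onto Python's `any` and `all`. The sketch below is illustrative only: the nested-list encoding, the binary tree shapes, and the name `we_win` are assumptions, and the trees are built from the leaf rows shown on the slides under the assumption that each node has exactly two children.

```python
def we_win(node, our_choice=True):
    """True if we can force a win from this node.

    A leaf is the string "W" or "L"; an internal node is a list of children.
    Levels are assumed to alternate strictly: us, opponent, us, ...
    """
    if isinstance(node, str):
        return node == "W"
    outcomes = [we_win(child, not our_choice) for child in node]
    # Our turn: ONE winning move is enough. His turn: EVERY reply must win for us.
    return any(outcomes) if our_choice else all(outcomes)

one_level = ["W", "L"]                                  # our choice
two_levels = [["W", "L"], ["L", "L"]]                   # our choice, his choice
three_levels = [[["W", "W"], ["L", "L"]],
                [["W", "L"], ["L", "W"]]]               # our, his, our

for tree in (one_level, two_levels, three_levels):
    print(we_win(tree))
```

In the two-level tree the opponent can answer either of our moves with an L leaf, so no move of ours works; in the three-level tree our second move does.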
Who wins?
[Figure: game tree with four levels of choices, top to bottom: our choice, his choice, his choice, our choice. Leaves, left to right: L L W W L L W W W L W L W W W W]
Labeling the tree
[Figure: game tree with four levels of choices, top to bottom: our choice, his choice, his choice, our choice. Leaves, left to right: L L W W L L W W W L W L W W W W. Every internal node is labeled W or L.]
Label the nodes as W/L depending on whether we win or lose if we get to that node
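The labeling can be sketched bottom-up, one level at a time. This is an illustration under stated assumptions: the tree is taken to be a complete binary tree over the 16 leaves, the order of choosers is read off the figure (our, his, his, our), and the name `label_levels` is made up for this sketch.

```python
def label_levels(leaves, choosers):
    """Return all levels of W/L labels, root level first.

    `choosers` names who picks at each internal level, top to bottom.
    Each node is assumed to have exactly two children.
    """
    levels = [list(leaves)]
    for chooser in reversed(choosers):          # work upwards from the leaves
        below = levels[0]
        pairs = [below[i:i + 2] for i in range(0, len(below), 2)]
        if chooser == "our":
            # We pick: a node is W if at least one child is W.
            row = ["W" if "W" in pair else "L" for pair in pairs]
        else:
            # He picks: a node is L if at least one child is L.
            row = ["L" if "L" in pair else "W" for pair in pairs]
        levels.insert(0, row)
    return levels

leaves = "L L W W L L W W W L W L W W W W".split()
levels = label_levels(leaves, ["our", "his", "his", "our"])
for row in levels:
    print(" ".join(row))
```

The first printed row is the label of the root, which answers "Who wins?" for the assumed tree shape.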
Issue
PROBLEM
– In practice, don’t have entire tree - it is likely way way big.
– quick calculation: size of chess tree?
SOLUTION 1
– look ahead fixed distance (construct tree only to certain depth)
– label nodes based on heuristic / guess of whether “win” or “lose”
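One way to do the quick calculation, as a rough sketch. The inputs are assumptions, not figures from the slides: about 35 legal moves per position and about 80 plies (40 moves per side) are commonly quoted ballpark averages for chess, and `game_tree_size` is a name invented here.

```python
def game_tree_size(branching_factor, plies):
    """Number of leaves in a uniform tree: b choices at each of d levels."""
    return branching_factor ** plies

# Assumed ballpark figures for chess; real games vary widely.
size = game_tree_size(35, 80)
exponent = len(str(size)) - 1      # order of magnitude, computed exactly
print(f"roughly 10^{exponent} leaf positions")
```

Whatever the exact inputs, the result is far beyond anything that can be enumerated, which is why the tree is only built to a fixed depth.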
Solution 2
Extension of solution 1
– look ahead fixed distance (construct tree only to certain depth)
– label nodes with NUMBERS which measure how good the position is. Numbers come from an evaluation function
EXAMPLE: Chess eval function: 9q + 5r + 3n + 3b + p
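A small sketch of the chess formula above. It assumes that q, r, n, b and p are counts of queens, rooks, knights, bishops and pawns, and that a position is scored as our material minus the opponent's; the slide does not spell out either point, and the function names are made up for this example.

```python
# Weights taken from the formula 9q + 5r + 3n + 3b + p.
PIECE_VALUES = {"q": 9, "r": 5, "n": 3, "b": 3, "p": 1}

def material(pieces):
    """Weighted piece count for one side; `pieces` maps piece letter to count."""
    return sum(PIECE_VALUES[kind] * count for kind, count in pieces.items())

def evaluate(ours, theirs):
    """Assumed convention: positive means the position favors us."""
    return material(ours) - material(theirs)

start = {"q": 1, "r": 2, "n": 2, "b": 2, "p": 8}
without_queen = dict(start, q=0)
print(material(start), evaluate(start, without_queen))
```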
Many complex evaluation functions are used
example: Othello Game Eval Function...
20.0*([0,0]+[0,7]+[7,0]+[7,7])+
-7.0*([1,1]+[1,6]+[6,1]+[6,6])+
2.0*([2,2]+[2,5]+[5,2]+[5,5])+
-3.0*([3,3]+[3,4]+[4,3]+[4,4])+
-3.0*([1,0]+[0,1]+[6,0]+[0,6]+[7,1]+[1,7]+[7,6]+[6,7])+
11.0*([2,0]+[0,2]+[5,0]+[0,5]+[7,2]+[2,7]+[7,5]+[5,7])+
8.0*([3,0]+[0,3]+[4,0]+[0,4]+[7,3]+[3,7]+[7,4]+[4,7])+
-4.0*([2,1]+[1,2]+[5,1]+[1,5]+[6,2]+[2,6]+[6,5]+[5,6])+
1.0*([3,1]+[1,3]+[4,1]+[1,4]+[6,3]+[3,6]+[6,4]+[4,6])+
2.0*([3,2]+[2,3]+[4,2]+[2,4]+[5,3]+[3,5]+[5,4]+[4,5])
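The formula above assigns one weight to each of the 64 squares, and the weights are symmetric across the board, so they can be stored for one octant and looked up by folding. The sketch below makes two assumptions the slide does not state: `[r,c]` is read as +1 for our disc, -1 for the opponent's and 0 for empty, and the helper names are invented for illustration.

```python
# Weights from the formula, keyed by (smaller, larger) distance to the nearest edge.
FOLDED_WEIGHTS = {
    (0, 0): 20.0, (1, 1): -7.0, (2, 2): 2.0, (3, 3): -3.0,
    (0, 1): -3.0, (0, 2): 11.0, (0, 3): 8.0,
    (1, 2): -4.0, (1, 3): 1.0, (2, 3): 2.0,
}

def square_weight(row, col):
    """Weight of square [row,col] on an 8x8 board, using the symmetry."""
    a = min(row, 7 - row)
    b = min(col, 7 - col)
    return FOLDED_WEIGHTS[(min(a, b), max(a, b))]

def evaluate(board):
    """Sum of weight * occupant, with occupant assumed to be +1, -1 or 0."""
    return sum(square_weight(r, c) * board[r][c]
               for r in range(8) for c in range(8))

empty = [[0] * 8 for _ in range(8)]
corner = [row[:] for row in empty]
corner[0][0] = 1                     # we hold one corner
print(evaluate(corner))
```

Corners carry the largest weight and the squares diagonally next to them the most negative one, which is what the first two terms of the formula express.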
New goal: get to the highest valued node (opponent tries to get to the lowest valued node)
[Figure: game tree with three levels of choices, top to bottom: our choice, his choice, our choice. Leaf values, left to right: 0 7 1 3 6 0 4 5]
What is best we can do? Why?
Play against a partner
[Figure: game tree with four levels of choices, top to bottom: player Max, player Min, player Min, player Max. Leaf values, left to right: 1 4 8 7 6 4 3 9 2 4 5 3 7 2 6 8]
MIN-MAX labeling
Start at bottom (depth depends on lookahead)
Compute evaluation function for the leaf nodes
Work upwards. Label a node with largest of children if it is our choice, or smallest of children if opponent choice
[Figure: labeled game tree with three levels of choices, top to bottom: our choice, his choice, our choice.
Leaf values, left to right: 0 7 1 3 6 0 4 5
Labels at the lower "our choice" level: 7 3 6 5
Labels at the "his choice" level: 3 5
Label at the root (our choice): 5]
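The three steps above can be sketched recursively. This is a minimal illustration, not a full game engine: the nested-list tree encoding and the names `minimax` and `best_move` are assumptions, levels are taken to alternate strictly, and the leaf numbers stand in for evaluation-function values.

```python
def minimax(node, maximizing=True):
    """Label a node: largest child value on our turn, smallest on his."""
    if not isinstance(node, list):       # leaf: already scored
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def best_move(node):
    """Index of the child we should move to from a node where we choose."""
    values = [minimax(child, maximizing=False) for child in node]
    return values.index(max(values))

# Leaves 0 7 1 3 6 0 4 5 under: our choice, his choice, our choice.
tree = [[[0, 7], [1, 3]], [[6, 0], [4, 5]]]
print(minimax(tree), best_move(tree))
```

For this tree the root label is 5 and the move that achieves it is the right-hand branch (index 1), matching the labels in the figure.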
alpha/beta pruning
[Figure: game tree; levels top to bottom: our choice, his choice, our choice. Node values shown: 5 3 7 1, and one node marked "?"]
Why doesn’t this matter?
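A minimal sketch of alpha/beta pruning, in its textbook form. The tree below is hypothetical: it reuses the four values from the slide, but their arrangement is an assumption, as are the nested-list encoding and the `visited` list, which is there only to show which leaves get evaluated.

```python
import math

def alphabeta(node, maximizing, alpha, beta, visited):
    """Minimax value of `node`, skipping branches that cannot change it."""
    if not isinstance(node, list):           # leaf: record that we scored it
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)        # best we can already guarantee
            if alpha >= beta:                # opponent will never allow this node
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)              # best the opponent can guarantee
        if alpha >= beta:                    # we will never choose this node
            break
    return value

tree = [[5, 7], [3, 1]]                      # our choice, then his choice
visited = []
value = alphabeta(tree, True, -math.inf, math.inf, visited)
print(value, visited)
```

After the left branch guarantees us 5, seeing the 3 in the right branch is enough: the opponent can hold us to at most 3 there, so the remaining leaf is never looked at and its value does not matter.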
General game-playing programs:
– have large databases of beginning, ending move combinations
– look-ahead more in the direction of most promising board positions
– use advanced techniques for pruning parts of the tree that are not likely to lead to good positions
Summary
Many two-person games of skill can be modeled as a tree, with vertices representing board positions, and edges representing moves.
An evaluation function is used to label how good a particular board position is.
A game-playing program combines special knowledge of beginning and end games, with look-ahead based on evaluation function, and careful pruning, to determine what the likely best next move is.
Some games will remain too complex for computers to play as well as humans until new methods are proved effective.