Game Playing. Towards Intelligence? Many researchers attacked “intelligent behavior” by looking...

Game Playing

Towards Intelligence?

Many researchers attacked “intelligent behavior” by looking to strategy games involving deep thought.

After 40 years, the best chess player is now a computer (Kasparov vs. Deep Blue, 1997). http://en.wikipedia.org/wiki/Computer_chess

Is this intelligence?

How can computers play games of strategy?

Play against a partner

L L W W L L W W W L W L W W W W

player W

player L

player L

player W

Demo: http://www.siivoset.com/gradu/impothello/appletpage.html

Many two-person games can be viewed as tree search

Game assumptions:
– no randomness
– 2-player
– each player plays best strategy

Tic-Tac-Toe TREE

nodes are board positions

edges are valid moves

Some nodes are WIN, some are LOSE, some are TIE, some are indeterminate

GOAL: get to a node that is a WIN

Why can’t we use search techniques we’ve seen already to find a WIN node and head towards it?
– opponent gets to move every other turn… may decide to move a different way!

TOP level - we choose. NEXT level opponent chooses, then us, etc.

Who wins?

W L

our choice

Must be a move for us so that: we win.

Who wins?

W

our choice

his choice

What must be true for us to win?

W

Who wins?

W L L L

our choice

his choice

What must be true for us to WIN?

There must be a move for us so that, no matter what our opponent chooses, we win.

Who wins?

W W L L W L L W

our choice

his choice

our choice

There must exist a move for us so that, no matter what our opponent chooses, there is a move for us so that we win.
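The "there exists a move for us / for every move of the opponent" pattern can be sketched as a short recursion. This is a minimal sketch, assuming the tree is written as nested Python lists with "W"/"L" leaves and that every node has two children (the original figures are not recoverable, so the grouping of the leaves is an assumption). The name `we_win` is hypothetical.

```python
def we_win(node, our_choice):
    """Return True if we can force a win from this node.

    node: "W" or "L" for a leaf, otherwise a list of child nodes.
    our_choice: True when it is our turn to pick the move.
    """
    if isinstance(node, str):
        return node == "W"
    results = [we_win(child, not our_choice) for child in node]
    # Our turn: one winning move is enough. Opponent's turn: every move must still win.
    return any(results) if our_choice else all(results)


# Leaves W L L L, levels: our choice, his choice (assumed pairing).
two_level = [["W", "L"], ["L", "L"]]
# Leaves W W L L W L L W, levels: our choice, his choice, our choice (assumed pairing).
three_level = [[["W", "W"], ["L", "L"]], [["W", "L"], ["L", "W"]]]

print(we_win(two_level, True))
print(we_win(three_level, True))
```

Under these assumptions the first tree is a loss for us (the opponent can always steer to an L), while the second is a win if we take the right-hand branch.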

Who wins?


our choice

his choice

his choice

our choice

Labeling the tree


our choice

his choice

his choice

our choice

Labeling the tree


our choice

his choice

his choice

our choice

W

L WWW LL

L

L W

W W L W

W

Label the nodes as W/L depending on whether we win or lose if we get to that node

Issue

PROBLEM
– In practice, don’t have entire tree - it is likely way way big.
– quick calculation: size of chess tree?
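One way to do the quick calculation: a tree with branching factor b and depth d has about b^d leaves. The numbers below (roughly 35 legal moves per position and roughly 80 half-moves per game) are commonly quoted rough estimates, not values from these slides, so treat the result as an order of magnitude only. The helper name `tree_size_log10` is hypothetical.

```python
import math


def tree_size_log10(branching, depth):
    """log10 of branching**depth, i.e. the exponent of the leaf count."""
    return depth * math.log10(branching)


# Assumed rough figures for chess: about 35 moves per position, about 80 plies per game.
chess_exponent = tree_size_log10(35, 80)
print(f"chess tree: about 10^{chess_exponent:.0f} leaves")

# For comparison, tic-tac-toe has at most 9! move sequences.
print("tic-tac-toe upper bound:", math.factorial(9))
```

Even with generous rounding, the chess number is far beyond anything that can be stored or enumerated, which is why the full tree is never built.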

SOLUTION 1
– look ahead fixed distance (construct tree only to certain depth)
– label nodes based on heuristic / guess of whether “win” or “lose”

Solution 2

Extension of solution 1
– look ahead fixed distance (construct tree only to certain depth)
– label nodes with NUMBERS which measure how good the position is. Numbers come from an evaluation function

EXAMPLE: Chess eval function: 9q + 5r + 3n + 3b + p
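The chess formula can be sketched in a few lines. The slide does not say what q, r, n, b and p count; this sketch assumes they are piece counts and that the score is our material minus the opponent's. The names `PIECE_VALUES`, `material` and `evaluate` are hypothetical.

```python
# Weights taken from the formula 9q + 5r + 3n + 3b + p.
PIECE_VALUES = {"q": 9, "r": 5, "n": 3, "b": 3, "p": 1}


def material(counts):
    """Weighted piece count for one side; counts maps piece letter -> number on board."""
    return sum(value * counts.get(piece, 0) for piece, value in PIECE_VALUES.items())


def evaluate(ours, theirs):
    """Assumed convention: positive means we are ahead in material."""
    return material(ours) - material(theirs)


start = {"q": 1, "r": 2, "n": 2, "b": 2, "p": 8}
without_queen = dict(start, q=0)

print(material(start))                 # 9 + 10 + 6 + 6 + 8 = 39
print(evaluate(without_queen, start))  # down a queen: -9
```

A function this simple ignores position entirely, which is one reason practical programs use more elaborate evaluation functions.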

Many complex evaluation functions are used

example: Othello Game Eval Function...

20.0*([0,0]+[0,7]+[7,0]+[7,7])+
-7.0*([1,1]+[1,6]+[6,1]+[6,6])+
2.0*([2,2]+[2,5]+[5,2]+[5,5])+
-3.0*([3,3]+[3,4]+[4,3]+[4,4])+
-3.0*([1,0]+[0,1]+[6,0]+[0,6]+[7,1]+[1,7]+[7,6]+[6,7])+
11.0*([2,0]+[0,2]+[5,0]+[0,5]+[7,2]+[2,7]+[7,5]+[5,7])+
8.0*([3,0]+[0,3]+[4,0]+[0,4]+[7,3]+[3,7]+[7,4]+[4,7])+
-4.0*([2,1]+[1,2]+[5,1]+[1,5]+[6,2]+[2,6]+[6,5]+[5,6])+
1.0*([3,1]+[1,3]+[4,1]+[1,4]+[6,3]+[3,6]+[6,4]+[4,6])+
2.0*([3,2]+[2,3]+[4,2]+[2,4]+[5,3]+[3,5]+[5,4]+[4,5])
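The Othello formula assigns one weight to every square and is symmetric under flipping the board, so the ten coefficients can be stored once and mirrored onto all 64 squares. The sketch below assumes that [r,c] stands for the contents of the square at row r, column c, encoded as +1 for our disc, -1 for the opponent's and 0 for empty; the slide does not state the encoding. The names `QUADRANT_WEIGHTS`, `weight`, `WEIGHTS` and `evaluate` are hypothetical.

```python
# One entry per symmetry class, keyed by (smaller, larger) distance from the nearest edges.
QUADRANT_WEIGHTS = {
    (0, 0): 20.0, (1, 1): -7.0, (2, 2): 2.0, (3, 3): -3.0,
    (0, 1): -3.0, (0, 2): 11.0, (0, 3): 8.0,
    (1, 2): -4.0, (1, 3): 1.0, (2, 3): 2.0,
}


def weight(row, col):
    """Weight of a square, found by folding it into the top-left quadrant."""
    a = min(row, 7 - row)
    b = min(col, 7 - col)
    return QUADRANT_WEIGHTS[(min(a, b), max(a, b))]


WEIGHTS = [[weight(r, c) for c in range(8)] for r in range(8)]


def evaluate(board):
    """board[r][c] is +1 (ours), -1 (opponent's) or 0 (empty); an assumed encoding."""
    return sum(WEIGHTS[r][c] * board[r][c] for r in range(8) for c in range(8))


board = [[0] * 8 for _ in range(8)]
board[0][0] = 1    # we hold a corner (+20)
board[1][1] = -1   # opponent sits diagonally next to a corner (-7 for them)
print(evaluate(board))
```

The table makes the intent of the formula easier to see: corners and edges are rewarded, while the squares next to a corner are penalised.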

New goal: get to the highest valued node (opponent tries to get to the lowest valued node)

0 7 1 3 6 0 4 5

our choice

his choice

our choice

What is best we can do? Why?

Play against a partner

1 4 8 7 6 4 3 9 2 4 5 3 7 2 6 8

player Max

player Min

player Min

player Max

MIN-MAX labeling

Start at the bottom (depends on lookahead)
Compute the evaluation function for the leaf nodes
Work upwards. Label a node with the largest of its children if it is our choice, or the smallest of its children if it is the opponent's choice

0 7 1 3 6 0 4 5

our choice

his choice

our choice

MIN-MAX labeling

Start at the bottom (depends on lookahead)
Compute the evaluation function for the leaf nodes
Work upwards. Label a node with the largest of its children if it is our choice, or the smallest of its children if it is the opponent's choice

0 7 1 3 6 0 4 5

7 3 6 5

3 5

5

our choice

his choice

our choice
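The labeling procedure can be sketched as a recursive function. This sketch assumes the tree is given as nested Python lists with numeric leaves, paired two by two as the labels 7 3 6 5, then 3 5, then 5 suggest. The name `minimax` is hypothetical.

```python
def minimax(node, our_choice):
    """Label a node: largest child value on our turn, smallest on the opponent's."""
    if not isinstance(node, list):
        return node  # leaf: value from the evaluation function
    values = [minimax(child, not our_choice) for child in node]
    return max(values) if our_choice else min(values)


# Leaves 0 7 1 3 6 0 4 5; levels from the top: our choice, his choice, our choice.
tree = [[[0, 7], [1, 3]], [[6, 0], [4, 5]]]

print([minimax(pair, True) for half in tree for pair in half])  # bottom "our choice" row
print([minimax(half, False) for half in tree])                  # "his choice" row
print(minimax(tree, True))                                      # root
```

The three printed rows reproduce the labels 7 3 6 5, then 3 5, then 5.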

alpha/beta pruning

5 3 7 1

our choice

his choice

our choice?

Why doesn’t this matter?
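A minimal sketch of alpha/beta pruning follows. The slide's figure is not recoverable, so the small tree used here is a hypothetical example rather than the one drawn on the slide; the function name `alphabeta` and the `visited` list (used only to show which leaves were looked at) are also assumptions.

```python
import math


def alphabeta(node, our_choice, alpha=-math.inf, beta=math.inf, visited=None):
    """Min-max value of node, skipping children that cannot change the result."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record each leaf that is actually evaluated
        return node
    if our_choice:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never let play reach this node
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best


# Hypothetical tree: our choice at the root, his choice below, leaves at the bottom.
tree = [[5, 6], [3, 7, 1]]
seen = []
print(alphabeta(tree, True, visited=seen), seen)
```

In this example the left branch guarantees us 5. Once the opponent's right branch shows a 3, that branch is worth at most 3 to us, so the remaining leaves 7 and 1 cannot matter and are never evaluated.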

General game-playing programs:

have large databases of beginning, ending move combinations

look-ahead more in the direction of most promising board positions

use advanced techniques for pruning parts of the tree that are not likely to lead to good positions

Summary

Many two-person games of skill can be modeled as a tree, with vertices representing board positions, and edges representing moves.

An evaluation function is used to label how good a particular board position is.

A game-playing program combines special knowledge of beginning and end games, with look-ahead based on evaluation function, and careful pruning, to determine what the likely best next move is.

Some games will remain too complex for computers to play as well as humans until new methods are proved effective.
