Artificial Intelligence. Ian Gent (ipg@cs.st-and.ac.uk). Games 1: Game Tree Search.

Game Tree Search

Part I: Game Trees
Part II: MiniMax
Part III: A bit of Alpha-Beta

Perfect Information Games

Unlike Bridge, we consider 2-player perfect information games.

Perfect information: both players know everything there is to know about the game position.
  no hidden information (e.g. opponents’ hands in bridge)
  no random events (e.g. draws in poker)
  the two players need not have the same set of moves available
  examples are Chess, Go, Checkers, O’s and X’s

Ginsberg made Bridge a 2-player perfect information game by assuming specific random locations of cards.
  the two players were North-South and East-West

Game Trees

A game tree is like a search tree.
  nodes are search states, with full details about a position
    e.g. chessboard + castling/en passant information
  edges between nodes correspond to moves
  leaf nodes correspond to determined positions
    e.g. Win/Lose/Draw, or number of points for or against player
  at each node it is one or other player’s turn to move

Game Trees vs. Search Trees

Strong similarities with 8-puzzle search trees:
  there may be loops/infinite branches
  typically no equivalent of a variable ordering heuristic
    the “variable” is always what move to make next

One major difference from the 8-puzzle: you have an opponent!

Call the two players Max and Min.
  Max wants a leaf node with the maximum possible score, e.g. Win = +∞
  Min wants a leaf node with the minimum score, e.g. Lose = -∞

The problem with Game trees

Game trees are huge.
  O’s and X’s: not bad, just 9! = 362,880
  Checkers/Draughts: about 10^40
  Chess: about 10^120
  Go: utterly ludicrous, e.g. 361! ≈ 10^768
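The two factorial figures can be checked with a few lines of Python. This is only a sanity check of the arithmetic, not part of the lecture; the variable names are illustrative.

```python
import math

# 9! bounds the number of move sequences in O's and X's
# (nine choices for the first move, eight for the second, and so on).
oxo_sequences = math.factorial(9)

# 361! is the analogous bound for a 19x19 Go board.
# Take log10 to get the order of magnitude instead of printing hundreds of digits.
go_log10 = math.log10(math.factorial(361))

print(oxo_sequences)         # 362880
print(math.floor(go_log10))  # exponent of the power of ten for 361!
```

Both numbers overcount real games (many games end early, and Go allows captures), so they are rough size indicators only.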

Recall from the Search 1 lecture:
  it is not good enough to find a route to a win; we have to find a winning strategy
  unlike 8-puzzle/SAT/TSP, we can’t just look for one leaf node
    typically need lots of different winning leaf nodes
  much more of the tree needs to be explored

Coping with impossibility

It is usually impossible to solve games completely.
  Connect 4 has been solved
  Checkers has not been
    we’ll see a brave attempt later

This means we cannot search the entire game tree.
  we have to cut off search at a certain depth
    like depth-bounded depth-first search, we lose completeness

Instead we have to estimate the cost of internal nodes.
Do so using a static evaluation function.

Static evaluation

A static evaluation function should estimate the true value of a node.
  true value = value of node if we performed exhaustive search
  need not just be +∞/0/-∞, even if those are the only final scores
  can indicate the degree of advantage in a position
    e.g. nodes might evaluate to +1, 0, -10

Children learn a simple evaluation function for chess.
  P = 1, N = B = 3, R = 5, Q = 9, K = 1000
  static evaluation is the difference in the sum of scores
  chess programs have much more complicated functions
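The piece-count evaluation can be sketched in a few lines. The piece values are the ones on the slide; the input format (a string of piece letters, uppercase for Max and lowercase for Min) and the names `PIECE_VALUES` and `material_score` are assumptions made for this illustration.

```python
# Piece values from the slide: P = 1, N = B = 3, R = 5, Q = 9, K = 1000.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 1000}

def material_score(pieces):
    """Sum of Max's piece values minus sum of Min's piece values.

    Assumed encoding: one letter per piece on the board,
    uppercase for Max's pieces, lowercase for Min's.
    Any other character is ignored.
    """
    score = 0
    for piece in pieces:
        value = PIECE_VALUES.get(piece.upper())
        if value is None:
            continue  # not a piece letter
        score += value if piece.isupper() else -value
    return score

# Full starting armies cancel out.
start = "KQRRBBNNPPPPPPPP" + "kqrrbbnnpppppppp"
print(material_score(start))          # 0
# Max has K, Q, R (1014); Min has k, r, r (1010).
print(material_score("KQR" + "krr"))  # 4
```

A positive score favours Max and a negative score favours Min, which matches the sign convention used for leaf scores.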

O’s and X’s

A simple evaluation function for O’s and X’s is:
  count lines still open for maX, subtract number of lines still open for min
  evaluation at start of game is 0
  after X moves in the center, score is +4

Evaluation functions are only heuristics.
  e.g. might have score -2 but maX can win at next move:

    O - X
    - O X
    - - -

Use a combination of evaluation function and search.
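The open-lines heuristic is small enough to write out. The sketch below assumes X is Max and O is Min, and encodes the board as a 9-character string in row order with "-" for an empty square; that encoding and the function names are choices made for this example only.

```python
# The eight winning lines of a 3x3 board, as indices into a row-major string.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def open_lines(board, player):
    """Count lines that contain no piece of the opponent of `player`."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))

def evaluate(board):
    """Lines still open for X (Max) minus lines still open for O (Min)."""
    return open_lines(board, "X") - open_lines(board, "O")

print(evaluate("---------"))  # 0: all eight lines are open for both players
print(evaluate("----X----"))  # 4: the centre closes four of O's lines
print(evaluate("O-X-OX---"))  # -2: rows "O-X", "-OX", "---" from the slide
```

In the last position X completes the right-hand column on the next move despite the negative score, which is the slide's point that the evaluation is only a heuristic.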

MiniMax

Assume that both players play perfectly.
  therefore we cannot optimistically assume a player will miss the winning response to our moves

E.g. consider Min’s strategy:
  wants lowest possible score, ideally -∞
  but must account for Max aiming for +∞
  Min’s best strategy is: choose the move that minimises the score that will result when Max chooses the maximising move
    hence the name MiniMax

Max does the opposite.

Minimax procedure

Statically evaluate positions at depth d.
From then on work upwards:
  score of max nodes is the max of child nodes
  score of min nodes is the min of child nodes
Doing this from the bottom up eventually gives the score of the possible moves from the root node,
  hence the best move to make.

Can still do this depth first, so space efficient.
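A minimal depth-first sketch of the procedure follows. The function names, the callback-style interface (`children`, `evaluate`) and the toy tree with its scores are all assumptions made for illustration; the lecture does not prescribe an implementation.

```python
def minimax(state, depth, maximizing, children, evaluate):
    """Minimax score of `state`, searching at most `depth` moves ahead."""
    successors = children(state)
    if depth == 0 or not successors:
        return evaluate(state)  # static evaluation at the cut-off or at a leaf
    scores = [
        minimax(child, depth - 1, not maximizing, children, evaluate)
        for child in successors
    ]
    return max(scores) if maximizing else min(scores)

def best_move(state, depth, children, evaluate):
    """Max's best child of `state`: the one with the highest minimax score."""
    return max(
        children(state),
        key=lambda child: minimax(child, depth - 1, False, children, evaluate),
    )

# Hypothetical toy tree: Max moves at the root, Min moves at "L" and "R".
TREE = {"root": ["L", "R"], "L": ["L1", "L2"], "R": ["R1", "R2", "R3"]}
STATIC = {"L1": 3, "L2": 2, "R1": 1, "R2": 9, "R3": 7}  # made-up static scores

def toy_children(state):
    return TREE.get(state, [])

def toy_evaluate(state):
    return STATIC.get(state, 0)

print(minimax("root", 2, True, toy_children, toy_evaluate))  # 2
print(best_move("root", 2, toy_children, toy_evaluate))      # L
```

Because the recursion finishes one child before starting the next, only the current path is held in memory at any time, which is the space efficiency the slide mentions.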

What’s wrong with MiniMax

Minimax is horrendously inefficient.

If we go to depth d with branching rate b, we must explore b^d nodes,
  but many nodes are wasted.
We needlessly calculate the exact score at every node,
  but at many nodes we don’t need to know the exact score
  e.g. the outlined nodes are irrelevant

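[Figure: game tree. Root Max node: score = 2, best move = Left. Left Min node: score = 2, best move = Right, with Max children scoring 3 and 2. Right Min node: score = ? < 2, best move = ?, with Max children scoring 1, ? and ?; the two "?" nodes are the outlined ones.]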

Alpha-Beta search

Alpha-Beta uses the same insight as branch and bound:
  when we cannot do better than the best so far, we can cut off search in this part of the tree

More complicated because of the opposite score functions.

To implement this we will manipulate alpha and beta values, and store them on internal nodes in the search tree.

Alpha and Beta values

At a Max node we will store an alpha value.
  the alpha value is a lower bound on the exact minimax score
  the true value might be ≥ alpha
  if we know Min can choose moves with score < alpha,
    then Min will never choose to let Max go to a node where the score will be alpha or more

At a Min node we will store a beta value.
  the beta value is an upper bound on the exact minimax score
  the true value might be ≤ beta

Alpha-Beta search uses these values to cut search.

Alpha Beta in Action

Why can we cut off search?
  beta = 1 < alpha = 2, where the alpha value is at an ancestor node
  at the ancestor node, Max had a choice to get a score of at least 2 (maybe more)
  Max is not going to move right to let Min guarantee a score of 1 (maybe less)

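[Figure: the same game tree annotated with alpha and beta values. Root Max node: score = 2, best move = Left, alpha = 2. Left Min node: score = 2, best move = Right, beta = 2, with Max children scoring 3 and 2. Right Min node: score = ? < 2, best move = ?, beta = 1, with Max children scoring 1, ? and ?.]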
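The cut-off in this example can be reproduced with a short alpha-beta sketch. The tree is written as nested lists with leaf scores 3, 2 and 1 taken from the figure; the scores 9 and 7 stand in for the two "?" nodes and are invented for the demonstration, as are the function name and the `visited` list. The code cuts off when beta <= alpha, the usual textbook condition, which covers the slide's case beta = 1 < alpha = 2.

```python
import math

def alphabeta(node, maximizing, alpha, beta, visited):
    """Minimax score of `node` with alpha-beta cut-offs.

    A node is either a number (a leaf score) or a list of child nodes.
    Every leaf that actually gets evaluated is appended to `visited`.
    """
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)  # Max is guaranteed at least alpha
            if alpha >= beta:
                break                  # Min, higher up, will avoid this node
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)        # Min can hold Max to at most beta
        if beta <= alpha:
            break                      # Max, higher up, will avoid this node
    return value

# Left Min node has leaves 3 and 2; right Min node has 1 and two "?" leaves.
tree = [[3, 2], [1, 9, 7]]  # 9 and 7 are placeholder scores, never evaluated
visited = []
score = alphabeta(tree, True, -math.inf, math.inf, visited)
print(score)    # 2
print(visited)  # [3, 2, 1]
```

After the left subtree returns 2, the root holds alpha = 2. The first leaf on the right gives beta = 1, so the remaining two leaves are skipped whatever their scores are.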

Summary and Next Lecture

Game trees are similar to search trees, but have opposing players.

Minimax characterises the value of nodes in the tree, but is horribly inefficient.

Use static evaluation when the tree is too big.

Alpha-beta can cut off nodes that need not be searched.

Next Time: More details on Alpha-Beta