Search Can we apply the recently discussed search methods to all problems? Why or why not? How does...

Search

Can we apply the recently discussed search methods to all problems? Why or why not?

How does the search change when there are multiple actors involved?

Game Playing

• A second player (or more) is an adversary.
• Examples: chess, checkers, go, tic-tac-toe, connect four, backgammon, bridge, poker.
• Adversarial search.
• Players take turns or alternate play.
• Unpredictable opponent.
• Solution specifies a move for every possible opponent move.
• Turn time limits: how might they affect the search process?

Types of Games

                         deterministic                            chance
perfect information      chess, checkers, connect four, othello   backgammon
imperfect information                                             bridge, poker, scrabble

Example Game: Nim

Deterministic, opposite turns

• Two players.
• One pile of tokens in the middle of the table.
• At each move, the player divides a pile into two non-empty piles of different sizes.
• The player who cannot move loses the game.
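The move rule above can be sketched in a few lines of Python. This is a minimal sketch, not part of the slides: the tuple encoding of a state and the names `successors` and `is_terminal` are assumptions made for illustration.

```python
def successors(state):
    """Return every state reachable in one move.

    A state is a tuple of pile sizes in descending order, e.g. (4, 2, 1).
    A move splits one pile into two non-empty piles of different sizes.
    """
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # small < pile - small guarantees the two parts differ in size.
        for small in range(1, (pile - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (pile - small, small), reverse=True)))
    return sorted(result, reverse=True)


def is_terminal(state):
    """A state is terminal when no pile can be split any further."""
    return not successors(state)


print(successors((7,)))  # [(6, 1), (5, 2), (4, 3)]
```

Piles of size 1 or 2 can never be split, so a state made only of such piles is terminal and the player to move there loses.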

Example Game: Nim

Game tree for a pile of seven tokens (one level per line, labeled with the player to move):

min: 7
max: 6-1, 5-2, 4-3
min: 5-1-1, 4-2-1, 3-2-2, 3-3-1
max: 4-1-1-1, 3-2-1-1, 2-2-2-1
min: 3-1-1-1-1, 2-2-1-1-1
max: 2-1-1-1-1-1

Assume a pile of seven tokens.

Max attempts to maximize the advantage and win.

Min always tries to move to a state that is the worst for Max.

Perfect Decision, Two-Person Games

Two players: MAX and MIN.

MAX moves first; players alternate turns thereafter.

Formal definition of a game:
• Initial state
• Successor function
• Terminal test
• Utility function

No one player has full control, so each must develop a strategy.

Minimax Algorithm: Basics

[Figure: game tree with levels MAX, MIN, MAX, MIN; leaf values are backed up level by level, giving the root a value of 3.]

Process of Backing up: minimax decision

Assumption: Both players are knowledgeable and play the best possible move

Minimax Algorithm

• Generate game tree to all terminal states.
• Apply utility function to each terminal state.
• Propagate utility of terminal states up one level.
• MAX's turn: MAX tries to maximize utility value.
• MIN's turn: MIN minimizes utility value.
• Continue backing up values toward the root, one layer at a time.
• At the root node, MAX chooses the move with the highest backed-up value.
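The backing-up steps above can be sketched as a short recursive function. This is a minimal sketch under stated assumptions: the tree is given explicitly as nested lists, the leaf values are made up for illustration, and the names `minimax` and `best_move` are hypothetical.

```python
def minimax(node, maximizing):
    """Return the backed-up value of `node`.

    A node is either a number (terminal utility) or a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(root):
    """Index of the child that MAX should choose at the root."""
    values = [minimax(child, maximizing=False) for child in root]
    return values.index(max(values))


# Hypothetical tree: MAX at the root, three MIN nodes, nine terminal utilities.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, maximizing=True))  # 3
print(best_move(tree))                 # 0
```

Each MIN node backs up the smallest value among its leaves (3, 2 and 2), and MAX then picks the largest of those.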

Minimax Search

What is the time complexity of the minimax algorithm?

• Minimax decision: maximizes utility for a player, assuming that the opponent will play perfectly and minimize that utility.
• Time complexity is $O(b^m)$: b moves per state, looking ahead m moves in the game tree.
• Search can be done in a depth-first manner, i.e., traverse down a path until a terminal node is reached, then back up the value using the minimax decision function.

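Minimax: Applied to Nim

Backed-up value of each state in parentheses (1 = win for MAX, 0 = win for MIN); each level is labeled with the player to move:

min: 7 (1)
max: 6-1 (1), 5-2 (1), 4-3 (1)
min: 5-1-1 (0), 4-2-1 (1), 3-2-2 (0), 3-3-1 (1)
max: 4-1-1-1 (0), 3-2-1-1 (1), 2-2-2-1 (0)
min: 3-1-1-1-1 (0), 2-2-1-1-1 (1)
max: 2-1-1-1-1-1 (0)

What do we observe from the tree?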
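The backed-up values for the seven-token game can be checked mechanically. The sketch below is an illustration, not the slide author's code: it assumes that MIN makes the first move, that a state is a tuple of pile sizes, and that utility is 1 for a MAX win and 0 for a MIN win. The names `successors` and `nim_value` are hypothetical.

```python
from functools import lru_cache


def successors(state):
    """All states reachable by splitting one pile into two unequal piles."""
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        for small in range(1, (pile - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (pile - small, small), reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def nim_value(state, max_to_move):
    """1 if MAX wins from `state` with best play by both sides, else 0."""
    children = successors(state)
    if not children:
        # The player to move cannot move and therefore loses.
        return 0 if max_to_move else 1
    values = [nim_value(child, not max_to_move) for child in children]
    return max(values) if max_to_move else min(values)


print(nim_value((7,), max_to_move=False))  # 1
```

Under these assumptions the root value is 1: whichever of 6-1, 5-2 or 4-3 MIN chooses, MAX has a reply that leads to a win. Calling `nim_value((7,), max_to_move=True)` gives 0, so with seven tokens the player who moves second wins.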

Partial Search

What were our concerns with the other search algorithms we have discussed?

• How do those concerns apply to games?

• Full game tree may be too large to generate.

Example: Partial Search for Tic-Tac-Toe

• Generate tree to intermediate depth.

• Calculate utility/evaluation function for leaf nodes.

• Back up values to the root node so MAX player can make a decision.

Example: Partial Search for Tic-Tac-Toe

Position p:

win for MAX, u(p) = +∞

loss for MAX, u(p) = –∞

otherwise, u(p) = (# of complete rows, cols, and diags still open for MAX) – (# of complete rows, cols, and diags still open for MIN)
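The evaluation function above can be sketched directly. This is a minimal sketch under stated assumptions: the board is a 9-character string with '.' for an empty square, MAX plays 'X', MIN plays 'O', and a win or loss for MAX is scored as +∞ or –∞. The name `evaluate` is hypothetical.

```python
import math

LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def evaluate(board, max_mark="X", min_mark="O"):
    """u(p) for a board given as a 9-character string, '.' for empty."""
    def wins(mark):
        return any(all(board[i] == mark for i in line) for line in LINES)

    def open_lines(opponent):
        # A line is still open for a player if the opponent has no mark on it.
        return sum(all(board[i] != opponent for i in line) for line in LINES)

    if wins(max_mark):
        return math.inf
    if wins(min_mark):
        return -math.inf
    return open_lines(min_mark) - open_lines(max_mark)


print(evaluate("....X...."))  # 4
```

With X alone in the center, all 8 lines are open for MAX and only the 4 lines that avoid the center are open for MIN, so u(p) = 8 – 4 = 4.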

Tic-Tac-Toe Move 2

MAX Player: 3rd move

Question

If we apply an evaluation function to non-terminal nodes of a game tree, why generate a partial tree at all?

Why not just apply the evaluation function to all states generated by applying valid operators (moves) to the current state, and select the one with the highest utility value?

Assumption: An evaluation function applied to successor nodes produces more reliable (accurate) values.

In other words, these configurations contain more information on how the game will likely end.

Each level of the game tree is termed a ply.

Complex Games

What happens if minimax is applied to large, complex games? What happens to the search space?

Example, chess: decent amateur program
• 1000 moves/second
• 150 seconds/move (tournament play)
• Looks at approx. 150,000 moves
• Chess branching factor of 35, generating trees that are 3-4 ply
• Resultant play: pure amateur
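The 3-4 ply figure follows from the numbers on the slide: if a full tree of depth d has about 35^d positions, then d is the base-35 logarithm of the number of positions examined. A small sketch of the arithmetic (variable names are illustrative only):

```python
import math

moves_per_second = 1000
seconds_per_move = 150
branching_factor = 35

positions = moves_per_second * seconds_per_move  # 150,000 positions per move
depth = math.log(positions, branching_factor)    # solve 35 ** depth == positions

print(positions)
print(round(depth, 2))
```

Since 35^3 = 42,875 and 35^4 = 1,500,625, a budget of 150,000 positions falls between 3 and 4 ply.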

Complex Games

• Can we make the search process more intelligent, and explore more promising parts of the tree in greater depth?

MAX Player: 3rd move

Node A generated first

Back up value immediately

MIN

Do not need to generate these nodes!

Why?

Search starts depth first

The alpha-beta pruning procedure

Alpha-Beta Pruning

Definitions
• α value: lower bound on a MAX node
• β value: upper bound on a MIN node

Observations
• α values on MAX nodes never decrease
• β values on MIN nodes never increase

Application
• Search is discontinued below any MIN node with min-value v ≤ α: α cut-off
• Search is discontinued below any MAX node with max-value v ≥ β: β cut-off

Alpha-Beta Search Example

[Figure: game tree with levels MAX (a), MIN (b, c), MAX (d, e, f, g), and MIN leaves.]

Node   Alpha   Beta
a      -∞      ∞
b      -∞      ∞
d      -∞      ∞
d      1       ∞
d      2       ∞
d      3       ∞
b      -∞      3
e      -∞      3
e      4       3      CUT-OFF

Node   Alpha   Beta
a      3       ∞
c      3       ∞
f      3       ∞
c      3       3      CUT-OFF

Completed

Function  Node  α          β        V          Return
Max       A     -∞, 3      +∞       -∞, 3      3, 2
Min       B     -∞         +∞, 3    +∞, 3      3, 4
Max       D     -∞,1,2,3   +∞       -∞,1,2,3   1,2,3
Min       1     -∞         +∞
Min       2     1          +∞
Min       3     2          +∞
Max       E     -∞         3        -∞, 4      4        Cutoff 5 & 7
Min       4     -∞         3
Min       C     3          +∞, 6    +∞, 6      6
Max       F     3,4,5,6    +∞       -∞,4,5,6   4,5,6
Min       4     3          +∞
Min       5     4          +∞
Min       6     5          +∞
Max       G     3          6        -∞, 6      6        Cutoff 1 & 5
Min       6     3          6

Key: -∞ = negative infinity; +∞ = positive infinity. The last value in a square is the final value assigned to the specific variable, i.e., at the end of the search Node A's α = 3.

Alpha-Beta Search Algorithm

Computing α and β:
• α value of a MAX node = current largest final backed-up value of its successors.
• β value of a MIN node = current smallest final backed-up value of its successors.

Alpha-Beta Search Algorithm

Start with AB(n; -∞, +∞)

Alpha-Beta-Search(state) returns an action
    v = MAX-VALUE(state, -∞, +∞)
    return the action in SUCCESSORS(state) with value v

MAX-VALUE(state, α, β)
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = -∞
    for a, s in SUCCESSORS(state) do
        v = MAX(v, MIN-VALUE(s, α, β))
        if v >= β then return v
        α = MAX(α, v)
    return v (a utility value)

MIN-VALUE(state, α, β)
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = +∞
    for a, s in SUCCESSORS(state) do
        v = MIN(v, MAX-VALUE(s, α, β))
        if v <= α then return v
        β = MIN(β, v)
    return v (a utility value)
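The MAX-VALUE and MIN-VALUE procedures can be written as one runnable Python function. This is a sketch under stated assumptions, not the slides' own code: the tree is an explicit nested list rather than a SUCCESSORS function, the leaf values are hypothetical, and the optional `visited` list is an addition used only to show which leaves get examined.

```python
import math


def alpha_beta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Backed-up value of `node` using alpha-beta pruning.

    A node is a number (terminal utility) or a list of child nodes.
    `visited`, if given, collects the terminal values actually examined.
    """
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alpha_beta(child, alpha, beta, False, visited))
            if v >= beta:
                return v  # beta cut-off: the MIN parent will never allow this
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alpha_beta(child, alpha, beta, True, visited))
        if v <= alpha:
            return v  # alpha cut-off: the MAX parent already has better
        beta = min(beta, v)
    return v


# Hypothetical tree: MAX root, two MIN nodes, four MAX nodes, twelve leaves.
tree = [[[1, 2, 3], [4, 5, 7]], [[4, 5, 6], [6, 1, 5]]]
seen = []
value = alpha_beta(tree, visited=seen)
print(value)  # 6
print(seen)   # [1, 2, 3, 4, 4, 5, 6, 6]
```

On this tree only 8 of the 12 leaves are examined. The second MAX node is cut off after its first leaf (4 ≥ β = 3), and the fourth after its first leaf (6 ≥ β = 6).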

Alpha-Beta Search Example 2

[Figure: game tree with levels MAX, MIN, MAX, MIN, MAX; nodes a; b, c; d, e, f, g; h, i, j, k, l, m, n. Values shown in the tree: 5 87 7 and 4 2 5 1 20 3 0 1 2.]

Alpha-Beta Search Example 3

[Figure: game tree with levels MAX, MIN, MAX, MIN, MAX; nodes a; b, c; d, e, f, g; h, i, j, k, l, m, n, o. Values shown in the tree: 0 -15 0; 1 1 2 -1 0-5 3 1 1 4; -4 -3.]

Alpha-Beta Search Performance

• Uses a depth-first search procedure.
• Search effectiveness depends on node ordering.
• Worst case occurs if the worst moves are generated first.
• Best case occurs if the best moves are generated first: min value first for MIN nodes, and max value first for MAX nodes. (Is this always possible?)

Alpha-Beta Search Performance

Search tree of depth d, b moves per node
• $b^d$ tip nodes: minimax search is $O(b^d)$.
• Best case: look at only $O(b^{d/2})$ tip nodes. Effective branching factor reduces to $\sqrt{b}$, so can double the depth of the search.
• Average case: number of nodes examined is $O(b^{3d/4})$; search depth increases by a factor of 4/3.

Pruning does not affect the final result.
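To get a feel for these bounds, the sketch below plugs in the chess branching factor of 35 used earlier in the slides. The helper names are hypothetical, and the results are order-of-magnitude figures rather than exact node counts.

```python
import math


def tip_nodes(b, d):
    """Tip nodes examined by plain minimax: b^d."""
    return b ** d


def best_case_tip_nodes(b, d):
    """Order-of-magnitude best case for alpha-beta: b^(d/2)."""
    return b ** (d / 2)


b, d = 35, 4
print(tip_nodes(b, d))            # 1500625
print(best_case_tip_nodes(b, d))  # about 1225
print(math.sqrt(b))               # effective branching factor, about 5.9
```

With perfect ordering, a depth-8 alpha-beta search costs about as much as a depth-4 minimax search, which is the sense in which the search depth can be doubled.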

Games with Chance

Example: games with dice.

Backgammon: white can decide what his/her legal moves are, but can't determine black's because that depends on what black rolls.

Game tree must include chance nodes.

How to pick the best move? Cannot apply minimax directly.

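Games with Chance

For each possible move, compute an average or expected value.

$$
\mathrm{Expectiminimax}(n) =
\begin{cases}
\mathrm{UTILITY}(n) & \text{if } n \text{ is a terminal state} \\
\max_{s \in \mathrm{Successors}(n)} \mathrm{Expectiminimax}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \mathrm{Successors}(n)} \mathrm{Expectiminimax}(s) & \text{if } n \text{ is a MIN node} \\
\sum_{s \in \mathrm{Successors}(n)} P(s) \cdot \mathrm{Expectiminimax}(s) & \text{if } n \text{ is a chance node}
\end{cases}
$$

Complexity: $O(b^m n^m)$, where n is the number of distinct values (e.g., dice rolls).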
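The four cases of the definition map onto a short recursive function. The sketch below is illustrative only: the tagged-tuple encoding of nodes, the example tree and its probabilities are assumptions, not taken from the slides.

```python
def expectiminimax(node):
    """Value of a node in a game tree that includes chance nodes.

    Node encodings (assumptions of this sketch):
      number                         -> terminal utility
      ("max", [children])            -> MAX node
      ("min", [children])            -> MIN node
      ("chance", [(p, child), ...])  -> chance node, outcome probabilities p
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")


# Hypothetical tree: MAX picks a move, a fair coin is flipped, then MIN replies.
tree = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),
    ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [1, 3]))]),
])
print(expectiminimax(tree))  # 4.0
```

The first move has expected value 0.5·2 + 0.5·6 = 4.0 and the second 0.5·0 + 0.5·1 = 0.5, so MAX prefers the first.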

Expectimax example

Types of Games

                         deterministic                            chance
perfect information      chess, checkers, connect four, othello   backgammon
imperfect information                                             bridge, poker, scrabble

Game AI Model

[Figure: the game AI model]
• Execution management: AI is given processor time
• World interface: AI receives information
• Group AI: strategy
• Character AI: decision making, movement
• Animation, physics: AI is turned into on-screen action
• Content creation, scripting: AI has implications for related technologies

Artificial Intelligence for Games – Ian Millington 2006

Game AI Schematic

[Figure: game AI schematic]
• Content construction: modeling package, level design tool, AI-specific tools
• Package level data
• Main game engine: level loader; AI data is used to construct characters
• Per-frame processing: game engine calls AI each frame
• World interface extracts relevant game data
• AI receives data from the game and from its internal information
• Results of AI are written back to game data
• AI engine: AI behavior manager, world interface, behavior database

AI in Games

Types of AI in games
• Hacks
• Heuristics
  Common heuristics: most constrained; do the most difficult thing first; try the most promising thing first
• Algorithms


AI in Games

• Path following
• Collision avoidance
• Finding paths
• World representations
• Movement planning
• Decision making
• Tactical analysis
• Terrain analysis
• Coordinating action
• Learning


Board Games

• Best AI agents for Chess, Backgammon and Reversi employ dedicated hardware, algorithms and optimizations.
• Basic underlying algorithms are common across games.
• Most popular AI for board games is the minimax family.
• Recently, memory-enhanced test driver (MTD) algorithms are gaining favor.

Artificial Intelligence for Games – Ian Millington 2006

Transposition Tables

Transposition tables (memory)
• The same board position can be reached by multiple move sequences.
• Table contains board positions and results of a search from that position.


Transposition Tables

Hash game states
• Any hashing algorithm can work but…
• Zobrist keys: a set of fixed-length random bit patterns stored for each possible state of each possible location on the board.
• Chess: 64 x 2 x 6 = 768 entries.
• For each non-empty square, the Zobrist key is looked up and XORed with a running hash total.
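A minimal sketch of Zobrist hashing for the chess case above. The board encoding (a dict from square index to a color and piece), the 64-bit key width, the fixed seed, and all names are assumptions made for illustration.

```python
import random

PIECES = "PNBRQK"          # six piece types
COLORS = "wb"              # two colors
rng = random.Random(2006)  # fixed seed so the keys are reproducible

# One 64-bit key per (square, color, piece): 64 x 2 x 6 = 768 entries.
ZOBRIST = {
    (square, color, piece): rng.getrandbits(64)
    for square in range(64)
    for color in COLORS
    for piece in PIECES
}


def zobrist_hash(board):
    """Hash a board given as {square: (color, piece)} for non-empty squares."""
    h = 0
    for square, (color, piece) in board.items():
        h ^= ZOBRIST[(square, color, piece)]
    return h


def move_piece(h, color, piece, src, dst):
    """Incrementally update the hash for a non-capturing move."""
    return h ^ ZOBRIST[(src, color, piece)] ^ ZOBRIST[(dst, color, piece)]


board = {4: ("w", "K"), 12: ("w", "P"), 60: ("b", "K")}
h = zobrist_hash(board)
```

Because XOR is its own inverse, moving a piece only needs two XORs (remove it from the source square, add it at the destination) instead of rehashing the whole board.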


Transposition Tables

What to store
• Minimax: store the value and the depth used to calculate the value.
• Alpha-beta: store the value, and whether the value is accurate or a "fail-soft" bound (because of pruning).
  Alpha pruning: fail-low value
  Beta pruning: fail-high value
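One way to represent such entries is a small record with a bound flag. This is a sketch of a common design rather than a specific library's API: the names `Bound`, `Entry`, `classify` and `usable` are hypothetical, and the reuse rule shown is one conventional choice.

```python
from dataclasses import dataclass
from enum import Enum


class Bound(Enum):
    EXACT = "exact"          # value is accurate for the stored depth
    FAIL_LOW = "fail-low"    # search failed low: value is an upper bound
    FAIL_HIGH = "fail-high"  # search failed high: value is a lower bound


@dataclass
class Entry:
    value: float
    depth: int
    bound: Bound


def classify(value, alpha, beta):
    """Label a fail-soft alpha-beta result relative to its search window."""
    if value <= alpha:
        return Bound.FAIL_LOW
    if value >= beta:
        return Bound.FAIL_HIGH
    return Bound.EXACT


def usable(entry, depth, alpha, beta):
    """Can a stored entry replace a search to `depth` with window (alpha, beta)?"""
    if entry.depth < depth:
        return False  # stored result came from a shallower search
    if entry.bound is Bound.EXACT:
        return True
    if entry.bound is Bound.FAIL_LOW:
        return entry.value <= alpha
    return entry.value >= beta


table = {0xABC: Entry(value=5, depth=3, bound=classify(5, alpha=2, beta=4))}
```

A bound is only reused when it would cause the same cut-off again; otherwise the position has to be searched with the new window.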


Transposition Tables

Issues: path dependency
• Some games have scores dependent upon the sequence of moves.
• Repetitions will be identically scored, so the algorithm may disregard a winning move.
• Fix by using a Zobrist key for "number of repeats".


Transposition Tables

Issues: instability
• The stored values fluctuate during the same search, since values may be overwritten.
• Not guaranteed to get the same value on each lookup.
• In rare cases, could have an oscillating value situation.


Memory-Enhanced Test (MT) Algorithm

• Requires an efficient transposition table that acts as memory.
• Implemented as a zero-width AB negamax with a transposition table.
• Negamax swaps and inverts the alpha and beta values, and checks and prunes only against the beta value.
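The swap-and-invert step is easiest to see in code. The sketch below is a plain negamax with alpha-beta pruning on an explicit tree. The transposition table is left out for brevity, the tree and its values are hypothetical, and leaf values are assumed to be given from the point of view of the player to move at that leaf.

```python
import math


def negamax(node, alpha=-math.inf, beta=math.inf):
    """Value of `node` from the point of view of the player to move.

    A node is a number (terminal value for the player to move there)
    or a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    best = -math.inf
    for child in node:
        # The child's value is from the opponent's view, so negate it,
        # and pass the window swapped and negated.
        best = max(best, -negamax(child, -beta, -alpha))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # prune: only the beta bound is ever tested
    return best


# Hypothetical two-ply tree; leaves are valued for the player at the root.
print(negamax([[3, 12, 8], [2, 4, 6], [14, 5, 2]]))  # 3
```

One routine serves both players: each level maximizes the negation of its children's values, so no separate MIN case is needed.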


MTD Algorithm

The driver routine repeatedly calls the MT algorithm to "zoom in" on the correct minimax value and determine the next move (memory-enhanced test drivers).


MTD Algorithm

• Track the upper bound on the score value: gamma.
• Let gamma be the first guess of the score.
• Calculate another guess by calling the MT algorithm for the current board position, the maximum depth, zero for the current depth, and the gamma value.
• If the returned guess is not the same as gamma, then calculate another guess in order to confirm that the guess is correct (usually limit the number of iterations).
• Return the guess as the score.
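The zoom-in loop can be sketched as follows. Note what is swapped in: this is the widely published MTD(f) formulation, which keeps both a lower and an upper bound rather than the single gamma bound described above. It calls a plain fail-soft alpha-beta instead of a table-backed MT routine, so it repeats work a real implementation would look up. The tree, its integer leaf values and all names are hypothetical.

```python
import math


def alpha_beta(node, alpha, beta, maximizing=True):
    """Fail-soft alpha-beta on a tree of nested lists with integer leaves."""
    if isinstance(node, int):
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alpha_beta(child, alpha, beta, False))
            if v >= beta:
                break
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alpha_beta(child, alpha, beta, True))
        if v <= alpha:
            break
        beta = min(beta, v)
    return v


def mtd(root, first_guess=0, max_iterations=100):
    """Zoom in on the minimax value with repeated zero-width searches."""
    guess = first_guess
    lower, upper = -math.inf, math.inf
    for _ in range(max_iterations):
        if lower >= upper:
            break
        beta = guess + 1 if guess == lower else guess
        guess = alpha_beta(root, beta - 1, beta)  # zero-width window test
        if guess < beta:
            upper = guess  # failed low: the true value is at most guess
        else:
            lower = guess  # failed high: the true value is at least guess
    return guess


tree = [[[1, 2, 3], [4, 5, 7]], [[4, 5, 6], [6, 1, 5]]]
print(mtd(tree, first_guess=0))  # 6
```

A good first guess matters: starting from 6 on this tree, two zero-width searches are enough to pin the value between matching lower and upper bounds, while starting from 0 takes seven.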


Opening Books

An opening book is a set of moves combined with information regarding how good the average outcome is when using the moves. Typically used at the start of the game.

Play books: certain games have stages where plays can be employed.

State of the Art: Games

Traditional games

Chess: HITECH (Berliner, CMU): alpha-beta, horizon effect, 10 million positions per move, accurate evaluation functions; defeated a human grandmaster in 1987.

Deep Thought (Hsu, Anantharaman, and others at CMU): generated 720,000 chess positions per second; won the World Computer Chess Championship in 1989.

Hsu, Campbell and others, CMU to IBM: more focus on fast computation and parallel processing to solve complex problems; evolution from Deep Thought to Deep Blue (1993).

Deep Blue: massively parallel 32-node RISC system with distributed memory, built by IBM. Each node has 8 VLSI chess processors, therefore 256 processors working in tandem. First played Kasparov in 1996, won game 1, but eventually lost the match to Kasparov.

State of the Art: Games

May 1997 (Deeper Blue) rematch.
• Processor speed doubled; evaluation functions improved with help from a multitude of grandmasters (chief advisor: Joel Benjamin, US chess champion).
• Deep(er) Blue was generating and evaluating up to 200 million chess positions per second, i.e., about 36 billion positions every 3 minutes. A typical grandmaster considers only about 500 moves in 3 minutes.
• Result: Deep Blue won 2, lost 1, drew 3.
• Question: Was Deep Blue really intelligent?

State of the Art: Games

Chess (continued) … Quote from Feng-hsiung Hsu:

“The previous version of Deep Blue, lost a match to Gary Kasparov in Philadelphia in 1996. But 2/3rds of the way into that match, we had played to a tie with Kasparov. That old version of Deep Blue was already faster than the machine I had conjectured in 1985, and yet it was not enough. There was more to solving the Computer Chess Problem than just increasing the hardware speed. Since that match, we rebuilt Deep Blue from scratch, going through every match problem, and engaging grandmasters extensively in our preparations. Somehow, all that work cause Grandmaster Joel Benjamin, our chess advisor, to say, “You know, sometimes Deep Blue plays chess.” Joel could not distinguish with certainty Deep Blue’s moves from the moves of top grandmasters.”

State of the Art: Games

Checkers: big breakthrough in AI. Samuel in 1952 at IBM: his program played itself thousands of times and learned its own evaluation function. The evaluation function was a weighted sum of a set of functions. Best checkers programs are now as good as human champions.

Othello: human players refuse to play against computer programs, which can play perfectly.

Go: branching factor of game > 300, so straight search doesn't work. Have to use pattern knowledge bases. Does not match up to human champions.

State of the Art: Games

Backgammon: 2 dice (21 distinct rolls), 20 legal moves (can be 6000 with a 1-1 roll).

depth 4 = $20 \times (21 \times 20)^3 \approx 1.5 \times 10^9$

As depth increases, the probability of reaching nodes decreases, so the value of lookahead is diminished (alpha-beta not a great idea).

Best backgammon program: uses depth-2 search plus a very good evaluation function.

Plays at world championship level.
