Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation...

23
Adversarial Search Seung-Hoon Na 1 1 Department of Computer Science Chonbuk National University 2017.9.25 Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 1 / 23

Transcript of Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation...

Page 1: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Adversarial Search

Seung-Hoon Na1

1Department of Computer ScienceChonbuk National University

2017.9.25

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 1 / 23

Page 2: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Minimax value

MINIMAX (s): Minimax value of a node

The utility of being in the corresponding state, assuming that bothplayers play optimally from there to the end of the game.s is a terminal state: Its utility defined.

MINIMAX (s) = UTILITY (s) if TERMINAL-TEST (s)maxa∈Actions(s)MINIMAX (RESULT (s, a)) if PLAYER(s) = MAXmina∈Actions(s)MINIMAX (RESULT (s, a)) if PLAYER(s) = MIN

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 2 / 23

Page 3: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Minimax algorithm

The algorithm first recurses down to the leaves of the tree, and

Then the minimax values are backed up through the tree as therecursion unwinds.

The time complexity: O(bm)

The space complexity: O(bm) or O(m)

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 3 / 23

Page 4: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Minimax algorithm: A two-play game tree

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 4 / 23

Page 5: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Optimal decision in multiplayer games

Use a vector of values for each node

〈vA, vB , vC 〉 in three-player game with A, B, and C .For terminal state, the vector gives the utility of the state from eachplayer’s viewpoint.

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 5 / 23

Page 6: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Alpha-beta pruning

Returns the same move as minimax would, but prunes awaybranches that cannot possibly influence the final decision.

Use two parameters – alpha and beta – that describes bounds on thebacked-up values

α = the value of the best (i.e. highest-value) choice we have found sofar at any choice point along the path for MAX .β = the value of the best (i.e. lowest-value) choice we have found sofar at any choice point along the path for MIN.

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 6 / 23

Page 7: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Alpha-beta pruning

If m is better than n for Player, n will never be reached in actual play.

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 7 / 23

Page 8: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Alpha-beta search

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 8 / 23

Page 9: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Alpha-beta pruning: An example

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 9 / 23

Page 10: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Alpha-beta pruning: An example

MINIMAX (root) = max (min(3, 12, 8),min(2, x , y))

= max (3, z) where z = min(2, x , y) ≤ 2

= 3 (1)

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 10 / 23

Page 11: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Alpha-beta pruning: Move ordering

We couldn’t prune any successors of D, because the worst successors weregenerated first.

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 11 / 23

Page 12: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Imperfect real-time decision

Replace the utility function by a heuristic evaluation function EVAL,which estimates the position’s utility.

Replace the terminal test by a cutoff test that decides when to applyEVAL, resulting in:

H-MINIMAX (s, d) = EVAL(s) if CUTOFF -TEST (s, d)maxa∈Actions(s)H-MINIMAX (RESULT (s, a), d + 1) if PLAYER(s) = MAXmina∈Actions(s)H-MINIMAX (RESULT (s, a), d + 1) if PLAYER(s) = MIN

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 12 / 23

Page 13: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Evaluation function

In most cases, use various features of the state:

E.g.) in chess, features can include: the num of white pawns, blackpawns, white queens, black queens, etc.

These features usually define various categories or equivalenceclasses of states

The states in each category have the same values for all the features

Any category will contain some states, that lead to wins, some thatlead to draws, and some that lead to losses.

E.g.) in chess, one category contains all two-pawn vs. one-pawnendgames. Among the states of the category, 72%: win, 20%: lose, 8%:draw- A reasonable evaluation for states is the expected value:(0.72)× 1 + (0.20)× 0 + 0.08× 1/2 = 0.76

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 13 / 23

Page 14: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Evaluation function

Mathematically, it is a weighed linear function:

EVAL(s) = w1f1(s) + · · ·wnfn(s) =n∑

i=1

wi fi (s) (2)

where fi is a feature of the position and wi is a weight.

E.g.)material value for each piece: fi : the num of each kind of pieceon the board, wi : the value of the pieces (1 for pawn, 3 for bishop, 5for rook, 9 for queen, etc.).

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 14 / 23

Page 15: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Cutting off search: The simple approach

CUTOFF -TEST (s, depth) returns true all depth greater than somefixed depth d

The depth d is chosen so that a move is selected within the allocatedtime, or an iterative deepening could be applied.

But lead to approximate errors

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 15 / 23

Page 16: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Quiescence search

For a more sophisticated cutoff test, we apply the evaluation functiononly to quiescent positions.

that is, unlikely to exhibit wild swings in value in the near future.E.g.) in chess, positions in which favorable captures can be made arenot quiescent for an evaluation function that just counts material.

Quiescence search: The extra search that expands nonquiescentpositions until quiescent positions are reached.

This can also be used for dealing with the horizon problem

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 16 / 23

Page 17: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Horizon effect

Horizon effect: the program is facing an opponent’s move that causesserious damage and is ultimately unavoidable, but can be temporarilyavoided by delaying tactics

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 17 / 23

Page 18: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Singular extension

Another possible strategy to mitigate the horizon effect

Singular extension: move that is “clearly better” than all othermoves in a given position.

Once discovered anywhere in the tree in the course of a search, thissingular move is remembered.The algorithm checks to see if the singular extension is a legal move; ifit is, the algorithm allows the move to be considered.Makes the tree deeper, but because there will be only few singularextensions, it does not add many total nodes to the tree.

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 18 / 23

Page 19: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Stochastic game

Stochastic game: The games that include unpredictable external eventsas random elements such as the throwing of dice

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 19 / 23

Page 20: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Stochastic game tree

chance nodes are further included in addition to MAX and MIN nodes.

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 20 / 23

Page 21: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Expecti-minimax value

E -MINIMAX (s) =UTILITY (s) if TERMINAL-TEST (s)maxa∈Actions(s)E -MINIMAX (RESULT (s, a)) if PLAYER(s) = MAXmina∈Actions(s)E -MINIMAX (RESULT (s, a)) if PLAYER(s) = MIN∑

r P(r)E -MINIMAX (RESULT (s, r)) if PLAYER(s) = CHANCE

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 21 / 23

Page 22: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Partially observable games

Kriesgpiel: Partially observable chess

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 22 / 23

Page 23: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features

Partially observable games: Card games

Card games: Gives stochastic partial observability – missinginformation is generated randomly

Suppose each deal s occurs with probability P(s).

argmaxa

∑s

P(s)MINIMAX (RESULT (s, a)) (3)

where we run either MINIMAX or H-MINIMAX .Monte Carlo approximation:

argmaxa

1

N

N∑i=1

MINIMAX (RESULT (si , a)) (4)

Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 23 / 23