CSE473 Winter 1998 1 02/04/98 State-Space Search Administrative –Next topic: Planning. Reading,...


1 CSE473 Winter 1998

02/04/98 State-Space Search

• Administrative

  – Next topic: Planning. Reading, Chapter 7, skip 7.3 through 7.5

  – Office hours/review after class today, Thursday 2:30

• Last time

  – informed search, satisficing and optimizing (A*)

• This time

  – adversarial (game-tree) search

  – introduction to Planning


Search in Adversarial Games

• Non-adversarial game: you make a sequence of moves, and at the end you get a payoff depending on the state you are in

  – games of perfect information: deterministic moves (FreeCell)

  – games against nature: you make a move, “nature” changes the world

    • same as perfect information if nature is perfectly predictable, but more generally probabilistic (stochastic next-state generator)

    • but we assume that nature is dispassionate: her choice of move is not meant to minimize your payoff

• Adversarial game: you make a move, then an opponent makes a move, then both get a payoff (possibly negative)

  – both you and the opponent are attempting to maximize an individual payoff function

  – often maximizing one means minimizing the other

    • zero-sum game

  – perfect information: everybody knows all payoff functions


Example: The Game of Chicken

                        Him
               Straight       Left         Right
You  Straight  (-100, -100)   (10, -10)    (10, -10)
     Left      (-10, 10)      (-50, -50)   (-5, -5)
     Right     (-10, -10)     (-5, -5)     (-50, -50)

What is your optimal strategy if:

• actions are chosen simultaneously

• you get to choose first
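One way to explore the two questions is to compute a worst-case (maximin) choice for the simultaneous case and a commit-then-reply choice for the case where you move first. The sketch below assumes each table cell is ordered (your payoff, his payoff), which the slide does not state, and it treats maximin as the meaning of "optimal" under simultaneous play, which is only one possible solution concept. All function names are illustrative.

```python
# Payoff table from the slide, read as PAYOFF[you][him] = (your payoff, his payoff).
# ASSUMPTION: the (you, him) ordering is not labeled on the slide.
MOVES = ["Straight", "Left", "Right"]
PAYOFF = {
    "Straight": {"Straight": (-100, -100), "Left": (10, -10), "Right": (10, -10)},
    "Left": {"Straight": (-10, 10), "Left": (-50, -50), "Right": (-5, -5)},
    "Right": {"Straight": (-10, -10), "Left": (-5, -5), "Right": (-50, -50)},
}


def security_level(you):
    """Worst payoff you can receive after committing to move `you`."""
    return min(PAYOFF[you][him][0] for him in MOVES)


def maximin_moves():
    """Moves that maximize your worst-case payoff, plus that payoff."""
    best = max(security_level(m) for m in MOVES)
    return [m for m in MOVES if security_level(m) == best], best


def his_best_reply(you):
    """His payoff-maximizing reply once he has seen your move."""
    return max(MOVES, key=lambda him: PAYOFF[you][him][1])


def best_opening_move():
    """Your best move when you choose first and he replies optimally."""
    return max(MOVES, key=lambda you: PAYOFF[you][his_best_reply(you)][0])


if __name__ == "__main__":
    print("maximin (simultaneous):", maximin_moves())
    print("best move when choosing first:", best_opening_move())
```

Under these assumptions the two settings give different answers: the cautious simultaneous choice is to swerve, while moving first rewards committing to Straight, because his best reply is then to swerve.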


General Approach to Game Playing by Search

• Expand the tree some fixed number of moves

• Apply a heuristic evaluation function to the (incomplete) state

• Apply MINIMAX to compute the best first move

• Example: TIC-TAC-TOE

  – players are MAX (drawing X’s) and MIN (drawing O’s)

  – e(p) is

    • +∞ if p is a win for MAX

    • -∞ if p is a win for MIN

    • otherwise, (number of rows/columns/diagonals still available to MAX) - (number of rows/columns/diagonals still available to MIN)
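The evaluation function e(p) can be written down directly. The following is a minimal sketch, assuming a board stored as a 3x3 list of "X", "O", or None, and reading "available" as "contains none of the opponent's marks"; the board encoding and helper names are illustrative choices, not part of the slides.

```python
import math

# All eight winning lines of the 3x3 board, each a list of (row, col) cells.
LINES = (
    [[(r, c) for c in range(3)] for r in range(3)]      # rows
    + [[(r, c) for r in range(3)] for c in range(3)]    # columns
    + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]  # diagonals
)


def has_won(board, player):
    """True if `player` occupies every cell of some line."""
    return any(all(board[r][c] == player for r, c in line) for line in LINES)


def open_lines(board, opponent):
    """Count lines that contain none of `opponent`'s marks."""
    return sum(all(board[r][c] != opponent for r, c in line) for line in LINES)


def e(board):
    """Static evaluation from MAX's (X's) point of view."""
    if has_won(board, "X"):
        return math.inf
    if has_won(board, "O"):
        return -math.inf
    # Lines still available to MAX minus lines still available to MIN.
    return open_lines(board, "O") - open_lines(board, "X")


if __name__ == "__main__":
    board = [["X", None, None],
             [None, None, "O"],
             [None, None, None]]
    print(e(board))  # 6 lines open for X, 5 for O
```

With X in a corner and O on a non-adjacent edge, six lines remain open for MAX and five for MIN, so the position evaluates to 6 - 5 = 1.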


MINIMAX search, cutoff depth = 2

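[Figure: tic-tac-toe game tree with a MAX root, a MIN level, and a MAX level of leaves. Leaves carry e(p) values (for example 6-5=1 and 5-5=0); the three MIN nodes back up -2, 1, and -1, and the MAX root backs up 1.]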
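The depth-limited MINIMAX procedure can be sketched as follows. This is a minimal illustration, not the course's code: the `successors` and `evaluate` callbacks and the toy tree are hypothetical, and the leaf values are made up for the example.

```python
def minimax(state, depth, maximizing, successors, evaluate):
    """Backed-up value of `state` when searched `depth` plies deep."""
    children = successors(state)
    if depth == 0 or not children:
        # Cutoff or terminal position: fall back on the static evaluation.
        return evaluate(state)
    values = [
        minimax(child, depth - 1, not maximizing, successors, evaluate)
        for child in children
    ]
    return max(values) if maximizing else min(values)


def best_first_move(state, depth, successors, evaluate):
    """MAX's successor with the highest backed-up value."""
    return max(
        successors(state),
        key=lambda child: minimax(child, depth - 1, False, successors, evaluate),
    )


# Hypothetical two-ply tree: MAX moves at the root, MIN replies.
TREE = {"root": ["a", "b", "c"],
        "a": ["a1", "a2"], "b": ["b1", "b2"], "c": ["c1", "c2"]}
LEAF_VALUE = {"a1": -1, "a2": 0, "b1": 1, "b2": 2, "c1": -2, "c2": 0}


def toy_successors(state):
    return TREE.get(state, [])


def toy_evaluate(state):
    return LEAF_VALUE.get(state, 0)


if __name__ == "__main__":
    print(minimax("root", 2, True, toy_successors, toy_evaluate))
    print(best_first_move("root", 2, toy_successors, toy_evaluate))
```

In the toy tree the MIN nodes back up -1, 1, and -2, so MAX chooses the middle move with value 1.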


Early Pruning: The ALPHA-BETA Procedure

• The previous algorithm (implicitly)

  – generates the tree

  – evaluates the leaves

  – backs up values to choose the optimal first action

• Interleaving evaluation with generation means that some paths need not be generated at all

• Cache partial evaluation information at each node

  – A MAX node has an α value, which is the best (greatest) choice so far. It can never decrease.

  – A MIN node has a β value, which is the best (least) choice so far. It can never increase.


Cached Values

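[Figure: two game-tree fragments of alternating MAX and MIN nodes annotated with cached values. The surviving labels are 10, 10, 4, 4 in the first fragment and -1, -1, 3, 3 in the second.]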


Two sorts of pruning

• Search can be discontinued below any MIN node having a β value less than or equal to the α value of any of its MAX node ancestors.

• Search can be discontinued below any MAX node having an α value greater than or equal to the β value of any of its MIN node ancestors.

• This can have an order-of-magnitude impact on the search

  – provided you choose the first alternative(s) well!
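The two cutoff rules can be sketched as a recursive procedure that passes the best ancestor bounds down the tree. This is a minimal illustration under assumptions: the tree is a nested Python list whose leaves are static values, and the optional `visited` list exists only to show which leaves were actually evaluated.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node`, skipping branches that cannot matter."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record each leaf that gets evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)  # alpha never decreases
            if alpha >= beta:
                break  # a MIN ancestor already has a choice at least as good
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)  # beta never increases
        if beta <= alpha:
            break  # a MAX ancestor already has a choice at least as good
    return value


if __name__ == "__main__":
    good_order = [[1, 2], [-1, 0], [-2, 0]]   # best MAX move listed first
    bad_order = [[-2, 0], [-1, 0], [1, 2]]    # best MAX move listed last
    for tree in (good_order, bad_order):
        seen = []
        print(alphabeta(tree, True, visited=seen), "leaves evaluated:", len(seen))
```

Both orderings return the same value, but the well-ordered tree evaluates four of the six leaves while the badly ordered one evaluates all six, which is the point of choosing the first alternative well.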


State-Space Search: Summary

• A very abstract characterization of problem solving

  – non-deterministic graph search

• An interesting split between domain-dependent and domain-independent aspects of the process

  – the domain-independent part can be a library

• Extensions to optimizing, adversarial search, continuous spaces

• Disadvantages

  – the “direction” of the search may be wrong (progression versus regression)

  – the domain-independent components are “black boxes”

    • perhaps state generation, goal recognition could be further automated


Planning: The “Neutral” Problem Description

• Inputs

  – a set of states S = {s1, s2, ..., sn}

  – a set of actions A = {a1, a2, ..., am}

    • each action is a partial function ai: S → S

  – a unique initial state si

  – a goal region G ⊆ S

• Output

  – a sequence of actions <b1, b2, ..., bk> such that bk(...b3(b2(b1(si)))...) ∈ G
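The output condition can be checked mechanically. The sketch below assumes each action is stored as a dictionary from states to successor states, so that a missing key models a state where the partial function is undefined; the state and action names are placeholders.

```python
def run_plan(plan, actions, state):
    """Apply the named actions in order; return None if one is undefined."""
    for name in plan:
        transition = actions[name]
        if state not in transition:
            return None  # the partial function has no value at this state
        state = transition[state]
    return state


def solves(plan, actions, initial, goal):
    """True if executing `plan` from `initial` ends inside the goal region."""
    final = run_plan(plan, actions, initial)
    return final is not None and final in goal


# Hypothetical instance: four states, two actions.
ACTIONS = {
    "a1": {"s1": "s2", "s2": "s3"},
    "a2": {"s3": "s4"},
}
INITIAL = "s1"
GOAL = {"s4"}

if __name__ == "__main__":
    print(solves(["a1", "a1", "a2"], ACTIONS, INITIAL, GOAL))
    print(solves(["a2"], ACTIONS, INITIAL, GOAL))
```

The first plan reaches s4 and is a solution; the second fails because a2 is not defined at s1.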


Planning as Search

• Search: can easily implement a planner using the standard search code/algorithms

• But we would like to

  – have a declarative representation for states and actions

    • ease in specification (move generator, goal checker)

    • could support explanation and learning tasks

  – exploit the goal better using a regression algorithm

    • we believe fan-out is worse than fan-in

  – further exploit the nature of the goal

    • goal is a conjunction of subgoals

    • common solution technique is “divide and conquer”

      – to solve G = G1 ∧ G2 ∧ ..., solve the Gi subgoals separately, and merge the solutions
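To illustrate the first point, a planner over the neutral problem description can be built from ordinary breadth-first search. This is a sketch under assumptions, not the course's search library: actions are dictionaries from states to successor states, and the instance at the bottom is hypothetical. It searches forward (progression) and uses none of the goal structure discussed above.

```python
from collections import deque


def plan_bfs(actions, initial, goal):
    """Shortest action sequence from `initial` into `goal`, or None."""
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, plan = frontier.popleft()
        if state in goal:
            return plan
        for name, transition in actions.items():
            nxt = transition.get(state)  # None where the action is undefined
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [name]))
    return None


# Hypothetical instance.
ACTIONS = {
    "a1": {"s1": "s2", "s2": "s3"},
    "a2": {"s1": "s3", "s3": "s4"},
}

if __name__ == "__main__":
    print(plan_bfs(ACTIONS, "s1", {"s4"}))
    print(plan_bfs(ACTIONS, "s1", {"s9"}))
```

The move generator here is just a table lookup; the wish list above is about replacing such opaque tables with declarative state and action descriptions.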


Planning States and Operators

• Example:

  – goal is to be at B and fuel tank full

  – truck is currently at A and fuel tank half

  – A and B are connected

  – you can only refuel at B

• State:

  – S0 = { at(TRUCK, A), fuel(HALF), connected(A,B), refuel-at(B) }

  – everything is false unless explicitly stated true
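One simple encoding of such a state is a set of ground facts, with absence from the set meaning false. The sketch below assumes facts are tuples of strings and that the goal is the pair at(TRUCK, B), fuel(FULL) taken from the example; the helper names are illustrative.

```python
# The initial state S0 as a set of ground facts.
S0 = frozenset({
    ("at", "TRUCK", "A"),
    ("fuel", "HALF"),
    ("connected", "A", "B"),
    ("refuel-at", "B"),
})

# Goal from the example: be at B with a full tank.
GOAL = {("at", "TRUCK", "B"), ("fuel", "FULL")}


def holds(fact, state):
    """Closed-world test: a fact is true only if it is listed in the state."""
    return fact in state


def satisfies_goal(state, goal=GOAL):
    """True if every goal fact is listed in the state."""
    return goal <= state


if __name__ == "__main__":
    print(holds(("at", "TRUCK", "A"), S0))
    print(holds(("fuel", "FULL"), S0))   # not listed, so false
    print(satisfies_goal(S0))
```

Because of the closed-world convention, fuel(FULL) is false in S0 without the state having to say so.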


States versus State Descriptions

• A state is a set of formulas that describes a single state of the world

  – by convention, we include only positive formulas and assume everything else is false

• We also need to represent sets of states

  – the goal is to be at B and have half a tank of gas, which describes a set of states

  – there might be other formulas that describe the world, but we don’t care whether they are true or false

• A state description is a set of formulas that describes a set of states

  – both positive and negative formulas are allowed in the set

  – any formula not mentioned is a “don’t care”
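The difference between a state and a state description can be made concrete with a matching test. This is a minimal sketch assuming a description is split into a set of facts that must be true and a set that must be false; the flat-tire fact is a hypothetical addition used only to show a negative formula.

```python
def matches(state, positive, negative):
    """True if `state` belongs to the set of states the description denotes.

    Every positive formula must be listed in the state, no negative formula
    may be listed, and unmentioned formulas are "don't care".
    """
    return positive <= state and not (negative & state)


# Description: at B with half a tank, and (hypothetically) no flat tire.
POSITIVE = {("at", "TRUCK", "B"), ("fuel", "HALF")}
NEGATIVE = {("flat-tire",)}

STATE_OK = frozenset({("at", "TRUCK", "B"), ("fuel", "HALF"),
                      ("connected", "A", "B")})
STATE_FLAT = STATE_OK | {("flat-tire",)}
STATE_AT_A = frozenset({("at", "TRUCK", "A"), ("fuel", "HALF")})

if __name__ == "__main__":
    for state in (STATE_OK, STATE_FLAT, STATE_AT_A):
        print(matches(state, POSITIVE, NEGATIVE))
```

The fact connected(A,B) is not mentioned in the description, so its presence in the first state does not affect the match.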