CS 188: Artificial Intelligence Fall 2008 Lecture 7: Expectimax Search 9/18/2008 Dan Klein – UC...

CS 188: Artificial IntelligenceFall 2008

Lecture 7: Expectimax Search9/18/2008

Dan Klein – UC Berkeley

Many slides over the course adapted from either Stuart Russell or Andrew Moore

Recap: Resource Limits

Cannot search to leaves

Depth-limited search Instead, search a limited depth

of tree Replace terminal utilities with

an eval function for non-terminal positions

Guarantee of optimal play is gone

Replanning agents: Search to choose next action Replan each new turn in

response to new state? ? ? ?

-1 -2 4 9

min min

Evaluation for Pacman

[DEMO: thrashing, smart ghosts]

Iterative DeepeningIterative deepening uses DFS as a subroutine:

1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2)

2. If “1” failed, do a DFS which only searches paths of length 2 or less.

3. If “2” failed, do a DFS which only searches paths of length 3 or less.

….and so on.

This works for single-agent search as well!

Why do we want to do this for multiplayer games?

- Pruning Example

- Pruning

General configuration is the best value that

MAX can get at any choice point along the current path

If n becomes worse than , MAX will avoid it, so can stop considering n’s other children

Define similarly for MIN

Player

Opponent

Player

Opponent

- Pruning Pseudocode

- Pruning Properties Pruning has no effect on final result

Good move ordering improves effectiveness of pruning

With “perfect ordering”: Time complexity drops to O(bm/2) Doubles solvable depth Full search of, e.g. chess, is still hopeless!

A simple example of metareasoning, here reasoning about which computations are relevant

Expectimax Search Trees What if we don’t know what the

result of an action will be? E.g., In solitaire, next card is unknown In minesweeper, mine locations In pacman, the ghosts act randomly

Can do expectimax search Chance nodes, like min nodes,

except the outcome is uncertain Calculate expected utilities Max nodes as in minimax search Chance nodes take average

(expectation) of value of children

Later, we’ll learn how to formalize the underlying problem as a Markov Decision Process

10 4 5 7

chance

[DEMO: minVsExp]

Maximum Expected Utility Why should we average utilities? Why not minimax?

Principle of maximum expected utility: an agent should chose the action which maximizes its expected utility, given its knowledge

General principle for decision making Often taken as the definition of rationality We’ll see this idea over and over in this course!

Let’s decompress this definition…

Reminder: Probabilities A random variable represents an event whose outcome is unknown A probability distribution is an assignment of weights to outcomes

Example: traffic on freeway? Random variable: T = whether there’s traffic Outcomes: T in {none, light, heavy} Distribution: P(T=none) = 0.25, P(T=light) = 0.55, P(T=heavy) = 0.20

Some laws of probability (more later): Probabilities are always non-negative Probabilities over all possible outcomes sum to one

As we get more evidence, probabilities may change: P(T=heavy) = 0.20, P(T=heavy | Hour=8am) = 0.60 We’ll talk about methods for reasoning and updating probabilities later

What are Probabilities? Objectivist / frequentist answer:

Averages over repeated experiments E.g. empirically estimating P(rain) from historical observation Assertion about how future experiments will go (in the limit) New evidence changes the reference class Makes one think of inherently random events, like rolling dice

Subjectivist / Bayesian answer: Degrees of belief about unobserved variables E.g. an agent’s belief that it’s raining, given the temperature E.g. pacman’s belief that the ghost will turn left, given the state Often learn probabilities from past experiences (more later) New evidence updates beliefs (more later)

Uncertainty Everywhere Not just for games of chance!

I’m snuffling: am I sick? Email contains “FREE!”: is it spam? Tooth hurts: have cavity? 60 min enough to get to the airport? Robot rotated wheel three times, how far did it advance? Safe to cross street? (Look both ways!)

Why can a random variable have uncertainty? Inherently random process (dice, etc) Insufficient or weak evidence Ignorance of underlying processes Unmodeled variables The world’s just noisy!

Compare to fuzzy logic, which has degrees of truth, or rather than just degrees of belief

Expectations Real valued functions of random variables:

Expectation of a function of a random variable

Example: Expected value of a fair die roll

X P f1 1/6 1

2 1/6 2

3 1/6 3

4 1/6 4

5 1/6 5

6 1/6 6 15

Utilities Utilities are functions from outcomes (states of the world)

to real numbers that describe an agent’s preferences

Where do utilities come from? In a game, may be simple (+1/-1) Utilities summarize the agent’s goals Theorem: any set of preferences between outcomes can be

summarized as a utility function (provided the preferences meet certain conditions)

In general, we hard-wire utilities and let actions emerge (why don’t we let agents decide their own utilities?)

CS 188: Artificial Intelligence Fall 2008 Lecture 7: Expectimax Search 9/18/2008 Dan Klein – UC...

Documents

Transcript of CS 188: Artificial Intelligence Fall 2008 Lecture 7: Expectimax Search 9/18/2008 Dan Klein – UC...

188-240 Transf0rmer Protection Quazvin.ppt [Schreibgeschützt] · 2008. 2. 18. · Author: G. Ziegler PTD SE PTI 525A 3 110kV 100MVA IN Trafo W1 = ... Microsoft PowerPoint - 188-240_Transf0rmer

CS 188: Artificial Intelligence Fall 2008 Lecture 18: Decision Diagrams 10/30/2008 Dan Klein – UC Berkeley 1.

Kitchens & workbenches · 2020. 3. 23. · K3 188 74" J1 J2 J3 188 74" H1 H2 H3 L1 L2 L3 188 74.1" M1 M2 M3 188 74.1" 11 14 188 74" 188 74" 10 188 74" 188 74" 13 Top with left sink

ISO9001:2008 - Liancheng › uploads › 5b15831c.pdf · 2019-10-16 · 400免费咨询电话：400-188-5009 400 Free consultation Tel：400-188-5009 ISO9001:2008 ISO9001:2008 印刷日期：2015.01.20

RECURSO DE APELACIÓN: SUP-RAP-188/2008 …sjf.scjn.gob.mx/IusElectoral/Documentos/Sentencias/SUP-RAP-188... · Manuel López Obrador. 4. ... para que en el plazo de ... informe circunstanciado

Kitchens & workbenches - Fantin · 2020. 5. 5. · K3 188 74" J1 J2 J3 188 74" H1 H2 H3 L1 L2 L3 188 74.1" M1 M2 M3 188 74.1" 11 14 188 74" 188 74" 10 188 74" 188 74" 13 Top with

#{188}. /7

13-188 CMR Ch. 25 · Web view13-188 Chapter 25 page 6 13-188 Chapter 25 page 9 13-188 Chapter 25 page 8 13-188 Chapter 25 page 62 13-188 Chapter 25 page 7 2 13-188 Chapter 25 page

Uncertainty and Utilities - GitHub Pagesaritter.github.io/courses/5522_slides/expectimax_utilities.pdf · outcomes, not worst-case (minimax) outcomes Expectimax search: compute the

Australia This quiz was created by Oxana Grigoryeva, an English teacher from school №188. Novosibirsk April, 2008.

WaveTek 188

Quiz 5: Expectimax/Prob review Expectimax assumes the worst case scenario.False Expectimax assumes the most likely scenario. False P(X,Y) is the.

CS 188: Artificial Intelligence Fall 2008

Expectimax Evaluation Evaluation functions quickly return an estimate for a node’s true value (which value, expectimax or minimax?) For minimax, evaluation.

188 Purdy Road High Yield Industrial Investment · Building Overview Built in 2008 specifically for the Tenant (previously HD Supply prior to being acquired by Anixter), 188 Purdy

Renograaf 188

MKGT 188 midterm

IN THE HIGH COURT OF DELHI AT NEW DELHI FAO (OS) 188 · FAO (O.S.) No. 188/2008 Page 1 of 57

NTP 188.doc

Studio 188 Brochure 11 5 - lekandulekandu.net/188/graphics/Studio 188 Brochure-medium.pdfMicrosoft Word - Studio 188 Brochure 11_5.docx Author AMB Created Date 11/12/2013 9:44:57 AM