Artificial Intelligence, Spring, 2010
Intelligent Agents
Instructor: Tsung-Che Chiang ([email protected])
Department of Computer Science and Information Engineering
National Taiwan Normal University
Outline
- Agents & Environments
- Rationality
- Task Environments
- Agent Structures
- Summary
What is AI
- Thinking humanly: the cognitive modeling approach
- Acting humanly: the Turing Test approach
  http://www.elbot.com/
  http://www.ieee-cig.org/cig-2009/competitions/#2k
- Thinking rationally: the "laws of thought" (logical) approach
- Acting rationally: the rational agent approach
  A rational agent is one that acts so as to achieve the best outcome or, when there is uncertainty, the best expected outcome.
  It is more general than the "laws of thought" approach.
  It is more amenable to scientific development than approaches based on human behavior.
Agents & Environments
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.
Examples:
- Robot agent: perceives through cameras, infrared sensors, etc.; acts through arms, legs, etc.
- Software agent: perceives through the keyboard, mouse, etc.; acts through monitors, speakers, etc.
Agents & Environments
The percept sequence is the complete history of everything the agent has ever perceived.
An agent's behavior is described by the agent function that maps any given percept sequence to an action.
The agent function is an abstract mathematical description; the agent program is a concrete implementation.
Agents & Environments
Vacuum-cleaner world
What is the right way to fill out the table?
Tabulation of a simple agent function for the vacuum-cleaner world.
Artificial Intelligence: A Modern Approach, 2nd ed., Figures 2.2 & 2.3
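For concreteness, the first few rows of that tabulation can be written down directly. Below is a minimal Python sketch; the tuple encoding of percepts is an assumption for illustration, not part of the slides:

```python
# Partial tabulation of the vacuum-world agent function (cf. Figure 2.3).
# A percept is (location, status); a percept sequence is a tuple of percepts.
AGENT_FUNCTION = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Clean"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("A", "Dirty")): "Suck",
    # ... one entry for every possible percept sequence
}

print(AGENT_FUNCTION[(("A", "Dirty"),)])  # -> Suck
```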
Rationality
A rational agent is one that does the right thing.
The right action is the one that will cause the agent to be the most successful.
We need some way to measure success.
Rationality
A performance measure embodies the criterion for success of an agent's behavior.
- The amount of dirt cleaned up in a single 8-hour shift?
- The number of clean squares at each time step?
The selection of a performance measure is not always easy.
- Maximum? Average? Multiobjective decision making/optimization.
Rationality
What is rational depends on four things:
- The performance measure
- The agent's prior knowledge of the environment
- The actions that the agent can perform
- The agent's percept sequence to date
Rationality
Definition of a rational agent:
"For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has."
Rationality
It depends!
Is this a rational agent? (Dirty then suck; otherwise, move to the other square.)
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.2 & 2.3
Rationality
Under the following assumptions, it is.
- The performance measure awards 1 point for each clean square at each time step, over a "lifetime" of 1000 time steps.
- The "geography" of the environment is known a priori. Clean squares stay clean. (knowledge)
- The only available actions are Left, Right, Suck, and NoOp.
- The agent correctly perceives its location and whether that location contains dirt.
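A sketch of this performance measure in Python; the world-state representation is an assumption for illustration:

```python
def performance(history, lifetime=1000):
    """Award 1 point for each clean square at each of the first
    `lifetime` time steps. `history` is a list of world states, one
    per time step; each state maps square name -> 'Clean'/'Dirty'."""
    return sum(
        sum(1 for status in state.values() if status == "Clean")
        for state in history[:lifetime]
    )

# e.g., two squares, both clean for 3 time steps -> 6 points
print(performance([{"A": "Clean", "B": "Clean"}] * 3))
```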
Rationality
In the following situations, it is not.
- The performance measure includes a penalty of 1 point for each movement.
- What if the clean squares become dirty again?
- The geography is unknown.
Omniscience, Learning, & Autonomy
We need to distinguish between rationality and omniscience. (e.g., the Final Destination series)
- The rational choice depends only on the percept sequence to date.
- Doing actions to modify future percepts (information gathering) is an important part of rationality.
Rationality maximizes expected performance, while perfection maximizes actual performance.
Omniscience, Learning, & Autonomy
The rational agent needs not only to gather information but also to learn from what it perceives.
Successful agents split the task of computing the agent function into three periods:
- Design phase
- Acting phase
- Learning phase
Omniscience, Learning, & Autonomy
A rational agent should be autonomous: it should learn what it can to compensate for partial or incorrect prior knowledge.
- Evolution provides animals with enough built-in reflexes so that they can survive long enough to learn for themselves.
- It would be reasonable to provide an agent with some initial knowledge and the ability to learn.
Task Environments
Task environments are essentially the "problems" to which rational agents are the "solutions."
In designing an agent, the first step must always be to specify the task environment as fully as possible.
- PEAS: Performance, Environment, Actuators, and Sensors.
Task Environments
Example: Automated taxi driver
Agent Type: Taxi driver
Performance Measure: safe, fast, legal, comfortable, profits, etc.
Environment: roads, pedestrians, animals, customers, etc.
Actuators: steering, accelerator, brake, signal, horn, display
Sensors: cameras, sonar, speedometer, GPS, odometer, accelerometer, engine sensors, keyboard
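A PEAS description is just structured data, so one way to record it (purely an illustrative sketch, not anything from the slides) is a small dataclass:

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """A task-environment specification: Performance, Environment,
    Actuators, Sensors."""
    performance_measure: list
    environment: list
    actuators: list
    sensors: list

taxi = PEAS(
    performance_measure=["safe", "fast", "legal", "comfortable", "profits"],
    environment=["roads", "pedestrians", "animals", "customers"],
    actuators=["steering", "accelerator", "brake", "signal", "horn", "display"],
    sensors=["cameras", "sonar", "speedometer", "GPS", "odometer",
             "accelerometer", "engine sensors", "keyboard"],
)
```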
Task Environments
Some other examples
Agent Type: Interactive English tutor
Performance Measure: maximize student's score on test
Environment: set of students, testing agency
Actuators: display exercises, suggestions, corrections
Sensors: keyboard entry

Agent Type: Part-picking robot
Performance Measure: percentage of parts in correct bins
Environment: conveyor belt with parts; bins
Actuators: jointed arm and hand
Sensors: camera, joint angle sensors
Properties of Environments
Fully observable vs. partially observable
- Fully observable: the sensors detect all aspects that are relevant to the choice of action.
- The agent doesn't need internal state in fully observable environments.
- An environment might be partially observable due to missing data, or noise & inaccuracy.
Properties of Environments
Deterministic vs. stochastic
- Deterministic: the next state of the environment is completely determined by the current state and the action executed.
- The aforementioned vacuum world is deterministic, whereas taxi driving is stochastic.
- Strategic: the environment is deterministic except for the actions of other agents.
Properties of Environments
Episodic vs. sequential
- Each episode consists of the agent perceiving and then performing an action.
- In episodic environments, the choice of action in each episode depends only on the episode itself.
  e.g., an agent spotting defective parts on an assembly line
- Episodic environments are much simpler because the agent does not need to look ahead.
Properties of Environments
Dynamic vs. static
- Dynamic: the environment changes while the agent is deliberating.
- Semidynamic: the environment itself does not change, but the performance score does.
- Static environments are easy to deal with because the agent need not keep looking at the world, nor need it worry about the passage of time.
Properties of Environments
Discrete vs. continuous
- The discrete/continuous distinction can be applied to the state of the environment, the way time is handled, and the percepts and actions of the agent.
- Chess has discrete states, percepts, and actions.
- Taxi driving has continuous state, time, and actions.
Properties of Environments
Single agent vs. multiagent
- When does an agent A treat an object B as an agent?
  "when B's behavior is to maximize a performance measure whose value depends on A's behavior"
- Chess is a competitive multiagent environment.
- Taxi driving is partially competitive and partially cooperative.
Properties of Environments
Task Environment     Observable  Deterministic  Episodic    Static   Discrete    Agents
Crossword puzzle     Fully       Deterministic  Sequential  Static   Discrete    Single
Chess with a clock   Fully       Strategic      Sequential  Semi     Discrete    Multi
Poker                Partially   Strategic      Sequential  Static   Discrete    Multi
Taxi driving         Partially   Stochastic     Sequential  Dynamic  Continuous  Multi
Part-picking robot   Partially   Stochastic     Episodic    Dynamic  Continuous  Single

Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.6
(Intuitively, the four dimensions are: visibility, controllability, relatedness, and changeability.)
Properties of Environments
In the table on the previous slide, the answers are not always fixed.
- e.g., exceptions to the observability of chess
Some other answers depend on how the task environment is defined.
- e.g., each game can be viewed as an episode, but decision making within a single game is certainly sequential.
Tea Time (Roomba)
Official: http://www.youtube.com/watch?v=HqhIMFQNGCg
Unofficial: http://www.youtube.com/watch?v=D0DEPpFL9OY
Agent Structure
Agent = architecture + program
The job of AI is to design the agent program that implements the agent function mapping percepts to actions.
The architecture:
- makes the percepts from the sensors available to the program,
- runs the program, and
- feeds the program's action choices to the actuators.
(The architecture is the hardware; the program is the software.)
Agent Structure
Agent program vs. agent function
- The agent program takes the current percept as input; the agent function takes the entire percept history.
- If the agent's actions depend on the entire percept sequence, the agent will have to remember the percepts.
Agent Structure
A trivial agent program that keeps track of the percept sequence, and then uses it to index into a table of actions
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.7
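A Python rendering of that trivial program might look as follows; this is a sketch of the Figure 2.7 pseudocode, and the tiny example table is hypothetical:

```python
percepts = []  # persistent: the percept sequence observed so far
table = {      # the (in principle fully specified) action table
    (("A", "Dirty"),): "Suck",
}

def table_driven_agent(percept):
    """Append the new percept, then use the whole sequence as a table index."""
    percepts.append(percept)
    return table[tuple(percepts)]

print(table_driven_agent(("A", "Dirty")))  # -> Suck
```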
Agent Structure
Why is the table-driven approach doomed to failure?
- There will be P + P^2 + P^3 + … + P^T entries, where P is the number of possible percepts and T is the lifetime of the agent.
- e.g., the table for chess has at least 10^150 entries.
- The daunting size of these tables means:
  - no space to store the table
  - no time to create the table
  - no agent could ever learn the table from experience
  - no guidance about how to fill in the table entries
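To make the growth concrete, the entry count is the geometric sum P + P^2 + … + P^T, which explodes even for small P and T. A quick illustrative check (the example values of P and T are arbitrary):

```python
def table_entries(P, T):
    # Sum of P^t for t = 1..T: the number of distinct percept sequences.
    return sum(P**t for t in range(1, T + 1))

print(table_entries(4, 10))   # 4 percepts, 10 steps -> 1,398,100 entries
print(table_entries(10, 50))  # already about 1.1e50 entries
```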
Agent Structure
The TABLE-DRIVEN-AGENT does implement the desired agent function.
However, the key challenge of AI is to find out
"how to write programs that produce rational behavior from a small amount of code rather than from a large number of table entries."
Agent Structure
Four basic kinds of agent programs:
- Simple reflex agents
- Model-based reflex agents
- Goal-based agents
- Utility-based agents
After introducing these four kinds of agent programs, we then explain how to convert all of them into learning agents.
Simple Reflex Agents
They select actions based only on the current percept, ignoring the rest of the percept history.
Example: see Artificial Intelligence: A Modern Approach, 2nd ed., Figures 2.2 & 2.8
Simple Reflex Agents
The simple reflex vacuum agent program is very small compared to the table-driven agent.
- Reduction 1: ignoring the percept history.
- Reduction 2: ignoring the location when it is dirty.
Artificial Intelligence: A Modern Approach, 2nd ed., Figures 2.3 & 2.8
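The whole reflex vacuum agent program (cf. Figure 2.8) fits in a few lines; here is a Python sketch:

```python
def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":   # reduction 2: location is irrelevant when dirty
        return "Suck"
    elif location == "A":
        return "Right"
    else:
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))  # -> Suck
```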
Simple Reflex Agents
Condition-action rules (if-then rules)
- if car-in-front-is-braking then initiate-braking
Humans have many such connections:
- some are learned responses (driving)
- some are innate reflexes (blinking when something approaches the eye)
Simple Reflex Agents
Schematic diagram
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.9
Simple Reflex Agents
Agent program
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.10
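A minimal Python version of that program structure follows. This is a sketch: Figure 2.10 leaves INTERPRET-INPUT and RULE-MATCH abstract, so the rule representation here (condition predicates paired with actions) is an assumption:

```python
def simple_reflex_agent(percept, rules, interpret_input):
    """Map the current percept to a state description, find the first
    rule whose condition matches, and return that rule's action."""
    state = interpret_input(percept)
    for condition, action in rules:
        if condition(state):
            return action

# Example rule set for the braking scenario (hypothetical encoding):
rules = [(lambda s: s.get("car_in_front_is_braking"), "initiate-braking")]
print(simple_reflex_agent({"car_in_front_is_braking": True},
                          rules, interpret_input=lambda p: p))
```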
Simple Reflex Agents
They are simple but of very limited intelligence.
- They will work only if the environment is fully observable.
- Infinite loops are often unavoidable for them in partially observable environments.
  e.g., the vacuum agent without a location sensor
(As the saying goes: "shoot as soon as you see a shadow.")
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.2
Simple Reflex Agents
Escape from infinite loops is possible if the agent can randomize its actions.
- A randomized simple reflex agent might outperform a deterministic simple reflex agent.
- The randomized behavior can be rational in some multi-agent environments.
- But in most single-agent cases we can do much better with more sophisticated deterministic agents.
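For instance, a location-blind vacuum agent that moves in a random direction whenever its square is clean will eventually reach the other square instead of bouncing forever. A sketch; the 50/50 choice is an assumption:

```python
import random

def randomized_reflex_vacuum_agent(percept):
    status = percept  # no location sensor: the percept is just the dirt status
    if status == "Dirty":
        return "Suck"
    return random.choice(["Left", "Right"])  # random walk escapes the loop
```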
Model-based Reflex Agents
The most effective way to handle partial observability is for the agent to
"keep track of the part of the world it can't see now – maintain some sort of internal state that depends on the percept history."
Model-based Reflex Agents
Updating the internal state requires knowledge about
- how the world evolves independently of the agent
- how the agent's own actions affect the world
This knowledge about "how the world works" is called a model of the world.
Model-based Reflex Agents
Schematic diagram
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.11
Model-based Reflex Agents
Agent program
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.12
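In Python the skeleton might read as follows. A sketch of Figure 2.12: UPDATE-STATE and the model itself are left abstract there, so the closure-based state handling below is an assumption:

```python
def make_model_based_reflex_agent(update_state, rules, initial_state):
    """Build an agent that keeps internal state across calls."""
    state = {"internal": initial_state, "last_action": None}

    def agent(percept):
        # Fold the new percept into the internal state using the model:
        # how the world evolves, and what the last action did.
        state["internal"] = update_state(
            state["internal"], state["last_action"], percept)
        for condition, action in rules:
            if condition(state["internal"]):
                state["last_action"] = action
                return action

    return agent
```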
Goal-based Agents
Knowing about the current state is not always enough to decide what to do.
The agent needs some sort of goal information that describes situations that are desirable.
Goal-based Agents
Schematic diagram
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.13
Goal-based Agents
Goal-based decision making differs from condition-action rules in that the future is considered.
- What will happen if I do that?
- Will that make me satisfied?
Goal-based Agents
Reflex vs. goal-based:
- The reflex agent brakes when it sees brake lights.
- A goal-based agent could reason that if the car in front has its brake lights on, it will slow down. Given the way the world usually evolves, the only action that will achieve the goal of not hitting other cars is to brake.
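That contrast can be made concrete: a goal-based agent simulates each candidate action with its model and picks one whose predicted outcome satisfies the goal. The helper names and the toy braking model below are assumptions for illustration:

```python
def goal_based_action(state, actions, model, goal_test):
    """Pick an action whose predicted successor state satisfies the goal.
    `model(state, action)` predicts the next state (how the world evolves)."""
    for action in actions:
        if goal_test(model(state, action)):
            return action

# Braking example: the goal is "not hitting the car in front".
model = lambda s, a: {"crash": s["closing"] and a != "brake"}
print(goal_based_action({"closing": True}, ["coast", "brake"],
                        model, goal_test=lambda s: not s["crash"]))  # -> brake
```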
Goal-based Agents
Although the goal-based agent appears less efficient, it is more flexible because
"the knowledge that supports its decisions is represented explicitly and can be modified."
- The goal-based agent's behavior can easily be changed to go to a different location, but the reflex agent's rules work only for a single destination.
Goal-based Agents
Goal-based action selection is not trivial when the agent has to find a long sequence of actions to achieve the goal.
- The fields of search and planning are devoted to this.
Utility-based Agents
Goals alone are not really enough to generate high-quality behavior in most environments.
- Goals just roughly distinguish between "happy" and "unhappy" states.
- A performance measure should reflect the degree of happiness.
Utility functions trade off between:
- conflicting objectives
- likelihood of success
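In code, such an agent replaces the binary goal test with a utility function and maximizes expected utility over the model's predicted outcomes. The distribution interface and the example numbers below are assumptions:

```python
def utility_based_action(state, actions, model, utility):
    """Choose the action maximizing expected utility.
    `model(state, action)` yields (probability, next_state) pairs."""
    def expected_utility(action):
        return sum(p * utility(s) for p, s in model(state, action))
    return max(actions, key=expected_utility)

# Hypothetical example: "fast" is better when it works but risks a crash.
model = lambda s, a: [(0.8, a), (0.2, "crash")] if a == "fast" else [(1.0, a)]
print(utility_based_action(None, ["fast", "slow"], model,
                           utility={"fast": 10, "slow": 4, "crash": -100}.get))
# EU(fast) = 0.8*10 + 0.2*(-100) = -12 < EU(slow) = 4, so -> slow
```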
Utility-based Agents
Schematic diagram
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.14
Learning Agents
Learning is now the preferred method for creating state-of-the-art systems.
- Manual programming is too slow and tiring.
- Learning allows the agent to operate in initially unknown environments and to become more competent.
Learning Agents
Four conceptual components (a code sketch follows the list):
- Learning element: making improvements
- Performance element: the aforementioned agent
- Critic: evaluating how the agent is doing
- Problem generator: suggesting actions that will lead to new and informative experiences
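Wired together, the four components might look like the skeleton below. A sketch only; each component's internals, and the simple explore-first policy, are assumptions:

```python
class LearningAgent:
    def __init__(self, performance_element, learning_element,
                 critic, problem_generator):
        self.performance_element = performance_element  # picks actions
        self.learning_element = learning_element        # improves it
        self.critic = critic                            # scores behavior
        self.problem_generator = problem_generator      # suggests experiments

    def step(self, percept):
        # The critic turns the percept into feedback against the
        # performance standard; the learning element uses the feedback
        # to modify the performance element.
        feedback = self.critic(percept)
        self.learning_element(feedback, self.performance_element)
        action = self.performance_element(percept)
        # Occasionally the problem generator overrides with an
        # exploratory action (returns None when it has no suggestion).
        return self.problem_generator(percept) or action
```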
Learning Agents
Schematic diagram
Artificial Intelligence: A Modern Approach, 2nd ed., Figure 2.15
Learning Agents
The design of the learning element depends very much on the design of the performance element.
Given an agent design, learning mechanisms can be constructed to improve every part of the agent:
- how the world evolves
- what my actions do
- if-then rules
- etc.
Learning Agents
The critic is necessary because the percepts themselves provide no indication of the agent's success.
- The performance standard distinguishes part of the incoming percept as a reward (or penalty).
The problem generator is responsible for suggesting actions that will lead to new and informative experiences.
Learning Agents
Example: the automated driver
- Performance element: makes a quick left turn across three lanes.
- Critic: observes the shocking language used by other drivers.
- Learning element: formulates a rule saying the turn was a bad action.
- Problem generator: suggests trying out the brakes on different road surfaces.
Learning Agents
Learning is a process of modifying each component to bring it into closer agreement with the available feedback information.
- The simplest case is learning directly from the percept sequence.
- Observation of pairs of successive states of the environment can allow the agent to learn "how the world evolves."
- …
Summary
An agent is something that perceives and acts in an environment.
The agent function specifies the action taken in response to any percept sequence.
A rational agent acts to maximize the expected value of the performance measure, given the percept sequence and candidate actions.
Summary
A task environment specifies the performance measure, environment, actuators, and sensors.
Task environments vary along several dimensions.
The agent program implements the agent function.
Summary
There exists a variety of agent-program designs:
- simple reflex agents
- model-based reflex agents
- goal-based agents
- utility-based agents
All agents can improve their performance through learning.