Laird ibm-small


A Cognitive Architecture Approach to Interactive Task Learning

John E. Laird

University of Michigan


Students: James Kirk, Shiwali Mohan (PARC), Aaron Mininger

Interactive Task Learning

An agent that

• learns new task specifications: objects, features, relations, goals and subgoals, possible actions (physical and conceptual), situational constraints on behavior, policy for behavior, and when the task is appropriate;

• using natural interaction: language, gestures, demonstrations;

• comprehends task description and uses its cognitive and physical capabilities to perform task;

• learns fast (small numbers of experiences);

• learns native representation (assimilate, fast execution).


Cognitive Architecture

• Fixed computational structures underlying intelligent behavior

– Representations of knowledge

– Memories that hold knowledge

– Processors that manipulate knowledge

• Supports end-to-end behavior

– Includes integration with perception and action

• General across tasks

– Architectural mechanisms are reused across every task and subtask

– Task-specific knowledge guides task behavior

• Complete

– No “escape” to additional specialized programming


Different Goals of Cognitive Architecture Research

• Biological modeling:

– Model what we know about the brain: neurons, neural circuits, …

– Predict neural activity and cognitive behavior

– Examples: LEABRA, SPAUN

• Psychological modeling:

– Model human performance in a wide range of cognitive tasks

– Predict human reaction time and error rates for psychological tasks

– Examples: ACT-R, EPIC, CLARION, LIDA, CHREST, 4CAPS

• AI Functionality:

– Toward human-level intelligence inspired by psychology and biology

– Emphasizes more complex cognitive processing

– Examples: Soar, Companions, Sigma, ICARUS, Polyscheme, CogPrime


Newell’s Time Scale of Human Action

Scale (sec)   Time Units   System           Band
10^7          months                        Social
10^6          weeks                         Social
10^5          days                          Social
10^4          hours        Task             Rational
10^3          10 min       Task             Rational
10^2          minutes      Task             Rational
10^1          10 sec       Unit task        Cognitive
10^0          1 sec        Operations       Cognitive
10^-1         100 ms       Deliberate act   Cognitive
10^-2         10 ms        Neural circuit   Biological
10^-3         1 ms         Neuron           Biological
10^-4         100 µs       Organelle        Biological

[Figure labels, architectures marked along the scale: Soar, ACT-R, LEABRA, SPAUN, Companions, EPIC, Sigma]
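As a rough illustration of how the time scale groups durations into bands, the sketch below maps a duration in seconds to the band whose range of exponents contains the nearest power of ten. The function name `band_for` and the choice to round to the nearest order of magnitude are assumptions made for this example; they are not part of the slide.

```python
import math

# Exponent ranges (powers of ten, in seconds) for each band in the table.
BANDS = [
    (-4, -2, "Biological"),
    (-1, 1, "Cognitive"),
    (2, 4, "Rational"),
    (5, 7, "Social"),
]


def band_for(seconds: float) -> str:
    """Return the band for a duration, using the nearest order of magnitude."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    exponent = round(math.log10(seconds))  # assumption: nearest power of ten
    for low, high, name in BANDS:
        if low <= exponent <= high:
            return name
    return "outside the table"


if __name__ == "__main__":
    for duration in (0.001, 0.05, 1, 600, 86400):
        print(duration, "->", band_for(duration))
```

Under this rounding rule a 50 ms step lands in the Cognitive band and a ten-minute activity in the Rational band.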

Standard Model: Commonalities Across Architectures

• Organization

– Modular architecture: WM, LTM, procedural, perceptual/motor…

• Representation of information

– Probabilistic/statistical representation of perceptual data

– Symbolic relational structures in short- and long-term memories

– Non-symbolic representations of meta-data, used for retrieval from long-term memory, decision making, learning

• Processing

– Complex behavior arises from simple decisions controlled by knowledge

– Significant internal asynchronous parallelism

– ~50 msec is the basic cycle time to achieve human real-time cognition

• Learning: multiple types, incremental and on-line

– Skill learning, reinforcement learning, activation adjustment, declarative learning
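The idea that complex behavior arises from a sequence of simple, knowledge-controlled decisions can be sketched as a propose/select/apply loop over a working memory. This is a minimal illustration, not the API of Soar, ACT-R, or any other architecture: the `Operator` class, the dictionary working memory, and the numeric preferences are all hypothetical. The 50 ms constant is the cycle time quoted on the slide.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CYCLE_MS = 50  # basic cycle time quoted on the slide


@dataclass
class Operator:
    """Hypothetical unit of task knowledge: when to propose, how good, what to do."""
    name: str
    proposes: Callable[[dict], bool]
    preference: Callable[[dict], float]
    apply: Callable[[dict], None]


def decision_cycle(wm: dict, operators: list) -> Optional[str]:
    """One simple decision: propose candidates, select the best, apply it."""
    candidates = [op for op in operators if op.proposes(wm)]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda op: op.preference(wm))
    chosen.apply(wm)
    return chosen.name


def run(wm: dict, operators: list, max_cycles: int = 100) -> list:
    """Repeat decision cycles until no operator is proposed."""
    trace = []
    for _ in range(max_cycles):
        name = decision_cycle(wm, operators)
        if name is None:
            break
        trace.append(name)
    return trace


if __name__ == "__main__":
    wm = {"count": 0, "goal": 3, "reported": False}
    ops = [
        Operator("increment", lambda m: m["count"] < m["goal"], lambda m: 1.0,
                 lambda m: m.update(count=m["count"] + 1)),
        Operator("report", lambda m: m["count"] >= m["goal"] and not m["reported"],
                 lambda m: 1.0, lambda m: m.update(reported=True)),
    ]
    trace = run(wm, ops)
    print(trace, len(trace) * CYCLE_MS, "ms of simulated time")
```

Each pass through `decision_cycle` stands for one ~50 ms step; the task-specific knowledge lives entirely in the operators, while the loop itself is reused unchanged across tasks.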


Common Structures of many Cognitive Architectures

[Figure: block diagram with components Perception, Short-term Memory, Procedural Long-term Memory, Declarative Long-term Memory, Action Selection, Procedure Learning, Declarative Learning, Goals, Action]

Soar Structure

[Figure: block diagram with components Symbolic Long-Term Memories (Procedural, Semantic, Episodic); Chunking; Reinforcement Learning; Semantic Learning; Episodic Learning; Symbolic Working Memory; Decision Procedure; Perception; Action; controller]

Spatial Visual System (SVS)

• Object-based continuous metric space

• Supports mental imagery

Interactive Task Learning Workshop

May 12-13, 2014: Ann Arbor, MI

John Anderson (CMU), Ken Forbus (Northwestern U), Kevin Gluck (AFRL), Chad Jenkins (Brown), John Laird (UM), Christian Lebiere (CMU), Dario Salvucci (Drexel), Matthias Scheutz (Tufts), Andrea Thomaz (Georgia Tech), Greg Trafton (NRL), Robert Wray (Soar Tech), Shiwali Mohan (UM), James Kirk (UM)

Report: http://soar.eecs.umich.edu/publications

What isn’t Interactive Task Learning?

• Not just Interactive Task Learning

– Not just interpret and execute commands

– Learns multiple tasks that it can perform in the future

• Not just Interactive Task Learning

– Not just policy learning

– Learns task specification/formulation

• Not just Interactive Task Acquisition

– Not offline learning from observation or compilation of a high-level language: TAQL, HERBAL, HLSR, GDL

– Learns through natural mixed-initiative interaction with a human.


Big Picture

• Acquire task description via language

• Construct internal task representation

• Extract internal representation of objects in the world

• Reason over objects, relationships to determine available actions

• Search for solution by internally simulating actions

• Manipulate environment based on discovered solution

[Figure: example internal task representation for Tic-Tac-Toe, with node labels Game, A1, C1, P1, C11, C12, block, location, place, move]
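The later steps above (determining available actions, then searching by internally simulating them) can be sketched for a Tic-Tac-Toe-like board. This is a simplified illustration under stated assumptions, not the agent's actual implementation: the board encoding, the function names, and the breadth-first search are all hypothetical, and the search considers only the agent's own "place" actions and ignores opponent replies.

```python
from collections import deque

# All eight three-in-a-row lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def legal_actions(board):
    """Available 'place' actions: every empty location."""
    return [i for i, cell in enumerate(board) if cell is None]


def simulate(board, location):
    """Internally apply 'place' without touching the real environment."""
    return board[:location] + ("X",) + board[location + 1:]


def goal_reached(board):
    """Goal 'three-in-a-row' for the agent's marks."""
    return any(all(board[i] == "X" for i in line) for line in LINES)


def search(board, max_depth=3):
    """Breadth-first search over simulated placements; returns a plan or None."""
    frontier = deque([(board, [])])
    while frontier:
        state, plan = frontier.popleft()
        if goal_reached(state):
            return plan
        if len(plan) < max_depth:
            for action in legal_actions(state):
                frontier.append((simulate(state, action), plan + [action]))
    return None


if __name__ == "__main__":
    board = ("X", "X", None,
             "O", "O", None,
             None, None, None)
    print(search(board))  # locations to place, in order
```

The returned plan is what the final step would execute in the environment; keeping `simulate` free of side effects is what makes the search internal.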

[Figure: agent components within Soar. Labels: Perception; Action; Working Memory; Spatial Visual System; Spatial Primitives; Procedural Memory; Parsing Knowledge; Interaction; Indexing; Verb Learning; Noun Learning; Prep Learning; Action Knowledge; Semantic Memory; Word – Category Mapping; Preposition – Spatial Relation Mapping; Verb – Operator Mapping; Noun/Adjective – Perceptual Symbol Mapping; Locations; Primitive Verb – Operator Mapping; Episodic Memory; Agent’s Experiences]

Desiderata

D1. Learning Competent

D2. Task Competent

D3. Task General

D4. Easy to Teach

D5. Efficient Execution

D6. Organically integrates with other agent activities


Transfer Results between Games

[Figure: bar chart. X-axis: game (Connect-3, Tic-Tac-Toe, 4-Queens). Y-axis: Number of Interactions (0 to 80). Series: No transfers; After Connect-3; After Connect-3 and Tic-Tac-Toe]

Tic-Tac-Toe after 4-Queens and Connect-3

If Tic-Tac-Toe is taught last, only 5 interactions are needed.

M: the game is tic-tac-toe

A: I do not know how to play tic-tac-toe. Tell me the name of a legal action in this game.

M: place (learned in 4 Queens)

A: Ok tell me the name of another legal action in this game, or finished if no more actions.

M: finished

A: Ok tell me the name of the goal in the game.

M: three-in-a-row (learned in Connect-3)

A: Ok tell me the name of a failure state in the game. (or none)

M: none

A: Ok I have now learned the basics of the game.
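The dialogue above suggests why transfer reduces teaching effort: a concept learned in an earlier game only has to be named, not defined again. The sketch below counts mentor utterances under that assumption. The function name, the `known` set, and the flat `definition_cost` per new concept are hypothetical; the slides do not state how many interactions defining a new concept takes.

```python
def count_interactions(actions, goal, failures, known, definition_cost):
    """Count mentor utterances needed to teach one game.

    `known` is the set of concept names learned in earlier games; it is
    updated in place. `definition_cost` is an assumed number of extra
    utterances needed to define a concept the agent has not seen before.
    """
    def name_concept(concept):
        extra = 0 if concept in known else definition_cost
        known.add(concept)
        return 1 + extra  # one utterance to name it, plus any definition

    total = 1  # "the game is ..."
    total += sum(name_concept(action) for action in actions)
    total += 1  # "finished" (no more actions)
    total += name_concept(goal)
    if failures:
        total += sum(name_concept(failure) for failure in failures)
    else:
        total += 1  # "none"
    return total


if __name__ == "__main__":
    # Tic-Tac-Toe taught last: both concepts were learned in earlier games.
    known = {"place", "three-in-a-row"}
    print(count_interactions(["place"], "three-in-a-row", [], known,
                             definition_cost=10))
```

With both concepts already known the count is 5, matching the five mentor turns in the dialogue; the value 10 for `definition_cost` is arbitrary and only affects the no-transfer case.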


Efficiency of Communication

[Figure: bar chart. X-axis: Method for Specifying Instructions (NL average, Rosie+, Rosie, Soar, GDL). Y-axis: Tokens (0 to 800). Series: ToH; Tic-Tac-Toe; 8-puzzle]

Future Work on Taskability

• More generality and complexity

– More complex games and concepts (hidden state, dynamic action, …)

– Beyond games to more real-world applications (mobile robots)

• More accessible communication

– More natural language, gestures, …

• Learn new language constructions

– Extend syntactic structures through instruction

• Informed by available background knowledge

– Take advantage of available knowledge bases


Etc.

• Workshop on Interactive Task Learning in April at the International Conference on Cognitive Modelling (ICCM-2015) in Groningen, Netherlands.

• Advances in Cognitive Systems (ACS-2015) Conference in May at Georgia Tech.

• My web site: http://ai.eecs.umich.edu/people/laird/

• Soar web site: http://soar.eecs.umich.edu/

• References:

– Kirk, J., Laird, J. E. (2014). Interactive task learning for simple games. Advances in Cognitive Systems 3, 11-28.

– Mohan, S., Laird, J. Learning Goal-Oriented Hierarchical Tasks from Situated Interactive Instruction. Proceedings of the 27th AAAI Conference on Artificial Intelligence (AAAI).

– Laird et al. (2014). Report on the NSF-funded Workshop on Interactive Task Learning.

– Mohan, S., Mininger, A., Laird, J. E. (2013). Towards an Indexical model of situated comprehension for real-world cognitive agents. Advances in Cognitive Systems 3, 163-182.

– Kirk, J., Laird, J. E. (2013). Learning Task Formulations through Situated Interactive Instruction. Proceedings of the 2nd Conference on Advances in Cognitive Systems. Baltimore, Maryland.

– Mohan, S., Mininger, A., Kirk, J., Laird, J. E. (2012). Acquiring Grounded Representations of Words with Situated Interactive Instruction. Advances in Cognitive Systems, Volume 2, December 2012, Palo Alto, California.
