DARPA ITO/MARS Project Update Vanderbilt University
![Page 1: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/1.jpg)
DARPA ITO/MARS Project Update
Vanderbilt University
A Software Architecture and Tools for Autonomous Robots that Learn on Mission
K. Kawamura, M. Wilkes, R. A. Peters II, D. Gaines
Vanderbilt University Center for Intelligent Systems
http://shogun.vuse.vanderbilt.edu/CIS/IRL/
12 January 2000
![Page 2: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/2.jpg)
Vanderbilt MARS Team
• Kaz Kawamura, Professor of Electrical & Computer Engineering. MARS responsibility - PI, Integration
• Dan Gaines, Asst. Professor of Computer Science. MARS responsibility - Reinforcement Learning
• Alan Peters, Assoc. Professor of Electrical Engineering. MARS responsibility - DataBase Associative Memory, Sensory EgoSphere
• Mitch Wilkes, Assoc. Professor of Electrical Engineering. MARS responsibility - System Status Evaluation
• Jim Baumann, Nichols Research. MARS responsibility - Technical Consultant
Sponsoring Agency: Army Strategic Defense Command
![Page 3: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/3.jpg)
A Software Architecture and Tools for Autonomous Mobile Robots That Learn on Mission
NEW IDEAS:
• Learning with a DataBase Associative Memory
• Sensory EgoSphere
• Attentional Network
• Robust System Status Evaluation
IMPACT:
• Mission-level interaction between the robot and a Human Commander.
• Enable automatic acquisition of skills and strategies.
• Simplify robot training via intuitive interfaces - program by example.
SCHEDULE (Year 1, Year 2):
• IMA agents and schema
• Learning algorithms
• Test Demo
• Final Demo
• Demo III
GRAPHIC:
[Diagram: IMA at the center, connected to the COMM, LEARNING, CMDR, SQUAD 1, SQUAD 2, ..., SQUAD N, SELF, and ENVIR agents]
![Page 4: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/4.jpg)
Project Goal
Develop a software control system for autonomous mobile robots that can:
1. accept mission-level plans from a human commander,
2. learn from experience to modify existing behaviors or to add new behaviors, and
3. share that knowledge with other robots.
![Page 5: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/5.jpg)
Project Approach
• Use IMA to map the problem to a set of agents.
• Develop System Status Evaluation (SSE) for self-diagnosis and to assess task outcomes for learning.
• Develop learning algorithms that use and adapt prior knowledge and behaviors and acquire new ones.
• Develop Sensory EgoSphere, behavior and task descriptions, and memory association algorithms that enable learning on mission.
![Page 6: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/6.jpg)
MARS Project: The Robots
ISAC HelpMate
ATRV-Jr.
![Page 7: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/7.jpg)
The IMA Software Agent Structure of a Single Robot
[Diagram: IMA at the center, connected to the Communications Agent, Act./Learning Agent, Commander Agent, Squad Agent 1, Squad Agent 2, ..., Squad Agent n, Self Agent, and Environment Agent]
![Page 8: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/8.jpg)
Robust System Status Analysis
• Timing information from communication between components and agents will be used.
• Timing patterns will be modeled.
• Deviations from normal indicate “discomfort.”
• Discomfort measures will be combined to provide system status information.
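The timing-based status idea above can be illustrated with a minimal sketch. The slides do not specify the timing model or how measures are combined, so everything here is an assumption for illustration: normal timing is modeled as a mean and standard deviation of update delays, "discomfort" is a bounded function of the deviation, and the overall status is a weighted average. The names `TimingModel`, `discomfort`, and `system_status` are hypothetical.

```python
import math
from statistics import mean, stdev


class TimingModel:
    """Hypothetical model of one component's normal update delays (ms)."""

    def __init__(self, baseline_delays_ms):
        # Assumption: normal behavior is summarized by mean and std dev.
        self.mu = mean(baseline_delays_ms)
        self.sigma = stdev(baseline_delays_ms) or 1e-9  # avoid divide-by-zero

    def discomfort(self, delay_ms):
        """Map the deviation from normal timing to a value in [0, 1]."""
        z = abs(delay_ms - self.mu) / self.sigma
        return 1.0 - math.exp(-0.5 * z * z)


def system_status(discomforts, weights=None):
    """Combine per-component discomfort values into one weighted average."""
    if weights is None:
        weights = {name: 1.0 for name in discomforts}
    total = sum(weights[name] for name in discomforts)
    return sum(weights[name] * d for name, d in discomforts.items()) / total


# Example with made-up delay samples for two agents.
arm = TimingModel([10, 12, 11, 9, 10, 11])
hand = TimingModel([20, 21, 19, 20, 22, 18])
status = system_status({
    "arm": arm.discomfort(60),    # far from normal: high discomfort
    "hand": hand.discomfort(20),  # at the baseline mean: no discomfort
})
```

A value of `status` near 0 would indicate normal timing across components, and a value near 1 would indicate widespread deviation. Other models, such as the delay histograms themselves, could replace the mean and standard deviation.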
![Page 9: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/9.jpg)
What Do We Measure?
• Visual Servoing Component
– error vs. time
• Arm Agent
– error vs. time, proximity to unstable points
• Camera Head Agent
– 3D gaze point vs. time
• Tracking Agent
– target location vs. time
• Vector Signals/Motion Links
– log when data is updated
![Page 10: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/10.jpg)
[Figures: four update delay histograms, three for the Arm Agent and one for the Hand Agent. x-axis: Delay (10 ms); y-axis: Frequency]
![Page 11: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/11.jpg)
Commander Interface
![Page 12: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/12.jpg)
Commander Interface
![Page 13: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/13.jpg)
Commander Interface
![Page 14: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/14.jpg)
Obstacle Avoidance
![Page 15: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/15.jpg)
Planning/Learning Objectives
• Integrated Learning and Planning
– learn skills, strategies and world dynamics
– handle large state spaces
– transfer learned knowledge to new tasks
– exploit a priori knowledge
• Combine Deliberative and Reactive Planning
– exploit predictive models and a priori knowledge
– adapt given actual experiences
– make cost-utility trade-offs
![Page 16: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/16.jpg)
Overview of Approach
![Page 17: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/17.jpg)
Example: Different Terrains
![Page 18: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/18.jpg)
Generate Abstract Map
• Nodes selected based on learned action models
• Each node represents a navigation skill
![Page 19: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/19.jpg)
Generate Plan in Abstract Network
• Plan makes cost-utility trade-offs
• Plans updated during execution
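As a rough illustration of planning in an abstract network, the sketch below runs a lowest-cost search over a small graph whose nodes stand for navigation skills and whose edge costs stand for predictions from learned action models. The slides do not name the planning algorithm, so Dijkstra's algorithm is swapped in here as an assumption. The graph, the skill names, the costs, and the helpers `plan` and `worth_pursuing` are all hypothetical.

```python
import heapq


def plan(graph, start, goal):
    """Dijkstra search. graph maps node -> {neighbor: predicted_cost}.

    Returns (total_cost, path), or (inf, []) if the goal is unreachable.
    """
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, step_cost in graph.get(node, {}).items():
            if nxt not in visited:
                heapq.heappush(frontier, (cost + step_cost, nxt, path + [nxt]))
    return float("inf"), []


def worth_pursuing(goal_utility, plan_cost):
    """Simple cost-utility trade-off: act only if utility exceeds cost."""
    return goal_utility - plan_cost > 0


# Hypothetical abstract map: each node is a navigation skill.
abstract_map = {
    "start": {"cross_grass": 2.0, "follow_road": 4.0},
    "cross_grass": {"goal": 5.0},
    "follow_road": {"goal": 1.0},
}
first_cost, first_path = plan(abstract_map, "start", "goal")

# During execution the road turns out to be slower than predicted,
# so the edge cost is revised and the plan is regenerated.
abstract_map["follow_road"]["goal"] = 6.0
new_cost, new_path = plan(abstract_map, "start", "goal")
```

With these made-up costs the first plan follows the road, and the revised plan crosses the grass instead, which is one simple way a plan could be updated during execution.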
![Page 20: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/20.jpg)
Planning/Learning Status
• Action Model Learning
– adapted MissionLab to allow experimentation (terrain conditions)
– using regression trees to build action models
• Plan Generation
– developed prototype Spreading Activation Network
– using it to evaluate the potential of SAN for plan generation
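To make the regression-tree idea concrete, here is a small pure-Python sketch of a regression tree that predicts an action outcome from terrain features. It is not the project's implementation: the feature (terrain roughness), the target (achieved speed), the training data, and the functions `fit_tree`, `sse`, and `predict` are all assumptions chosen for illustration.

```python
def sse(rows):
    """Sum of squared errors of the targets around their mean."""
    ys = [y for _, y in rows]
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_tree(rows, depth=2, min_rows=2):
    """Fit a regression tree. rows is a list of (features_tuple, target).

    Returns a float (leaf prediction) or a dict describing a split.
    """
    targets = [y for _, y in rows]
    leaf = sum(targets) / len(targets)
    if depth == 0 or len(rows) < 2 * min_rows:
        return leaf
    best = None
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        # Candidate thresholds are midpoints between adjacent values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for r in rows if r[0][f] <= thr]
            right = [r for r in rows if r[0][f] > thr]
            if len(left) < min_rows or len(right) < min_rows:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, thr, left, right)
    if best is None:
        return leaf
    _, f, thr, left, right = best
    return {
        "feature": f,
        "threshold": thr,
        "left": fit_tree(left, depth - 1, min_rows),
        "right": fit_tree(right, depth - 1, min_rows),
    }


def predict(tree, x):
    """Walk the tree until a leaf (a float) is reached."""
    while isinstance(tree, dict):
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Made-up experience: (terrain_roughness,) -> achieved speed in m/s.
experience = [((0.1,), 1.0), ((0.2,), 1.0), ((0.8,), 0.4), ((0.9,), 0.4)]
action_model = fit_tree(experience, depth=1)
```

A planner could then call `predict(action_model, (roughness,))` to estimate how a move action would perform on a given terrain before committing to it.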
![Page 21: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/21.jpg)
Role of ISAC in MARS
• Inspired by the structure of vertebrate brains
• a fundamental human-robot interaction model
• sensory attention and memory association
• learning sensory-motor coordination (SMC) patterns
• learning the attributes of objects through SMC
ISAC is a testbed for learning complex, autonomous behaviors by a robot under human tutelage.
![Page 22: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/22.jpg)
System Architecture
[Diagram labels: Human, Robot, Human Agent, Self Agent, Software System, IMA Primitive Agent (A), Hardware I/O]
![Page 23: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/23.jpg)
Next Up: Peer Agent
We are currently developing the peer agent.
The peer agent encapsulates the robot’s understanding of and interaction with other (peer) robots.
![Page 24: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/24.jpg)
System Architecture: High Level Agents
[Diagram: human agent, self agent, two peer agents, environment agent, and two object agents]
Because IMA primitives are connected in a flat topology, all high-level agents can communicate directly if desired.
![Page 25: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/25.jpg)
Robot Learning Procedure
• The human programs a task by sequencing component behaviors via speech and gesture commands.
• The robot records the behavior sequence as a finite state machine (FSM) and all sensory-motor time-series (SMTS).
• Repeated trials are run. The human provides reinforcement feedback.
• The robot uses Hebbian learning to find correlations in the SMTS and to delete spurious info.
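The Hebbian step in this procedure can be sketched as follows. This is only an illustration under stated assumptions: each sample is a vector of sensory-motor channel activations plus a reinforcement value from the human, the weight between two channels grows when both are active in a reinforced trial, and weak weights are pruned as "spurious". The functions `hebbian_weights` and `prune`, the learning rate, the decay, and the threshold are hypothetical choices, not values from the project.

```python
from itertools import combinations


def hebbian_weights(samples, rate=0.1, decay=0.01):
    """Accumulate pairwise Hebbian weights between channels.

    samples is a list of (activations, reinforcement), where activations
    is a list of channel values and reinforcement is a scalar.
    """
    n = len(samples[0][0])
    w = {pair: 0.0 for pair in combinations(range(n), 2)}
    for activations, reinforcement in samples:
        for i, j in w:
            # Hebbian term scaled by reinforcement, minus a small decay.
            w[(i, j)] += (rate * reinforcement * activations[i] * activations[j]
                          - decay * w[(i, j)])
    return w


def prune(w, threshold):
    """Keep only channel pairs whose weight reached the threshold."""
    return {pair: v for pair, v in w.items() if v >= threshold}


# Made-up trials: channels 0 and 1 fire together; channel 2 is unrelated.
trials = [
    ([1, 1, 0], 1.0),
    ([1, 1, 1], 1.0),
    ([1, 1, 0], 1.0),
    ([0, 0, 1], 1.0),
]
weights = hebbian_weights(trials)
kept = prune(weights, threshold=0.2)
```

In this toy data only the pair of channels 0 and 1 survives pruning, which mirrors the idea of keeping correlated sensory-motor signals and discarding the rest.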
![Page 26: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/26.jpg)
Robot Learning (cont’d)
• The robot extracts task dependent SMC info from the behavior sequence and the Hebbian-thinned data.
• SMC occurs by associating sensory-motor events with behavior nodes in the FSMs.
• The FSM is transformed into a spreading activation network (SAN).
• The SAN becomes a task record in the DataBase Associative Memory (DBAM) and is subject to further refinements.
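One way to picture the FSM-to-SAN transformation is sketched below. The slides do not give the conversion rule, so this is an assumption: each FSM transition becomes a weighted link between behavior nodes, and activation then spreads along those links with decay. The functions `fsm_to_san` and `spread`, the behavior and event names, and the weight and decay values are hypothetical.

```python
def fsm_to_san(transitions, weight=0.5):
    """Turn FSM transitions into a link table for a spreading activation network.

    transitions is a list of (from_behavior, event, to_behavior).
    Returns {source: {destination: link_weight}}.
    """
    san = {}
    for src, _event, dst in transitions:
        san.setdefault(src, {})[dst] = weight
        san.setdefault(dst, {})  # make sure every node appears
    return san


def spread(san, activation, steps=1, decay=0.9):
    """Spread activation along links for a number of steps."""
    for _ in range(steps):
        nxt = {node: level * decay for node, level in activation.items()}
        for src, links in san.items():
            for dst, w in links.items():
                nxt[dst] = nxt.get(dst, 0.0) + w * activation.get(src, 0.0)
        activation = nxt
    return activation


# Made-up recorded behavior sequence for a pick-up task.
recorded_fsm = [
    ("reach", "near_object", "grasp"),
    ("grasp", "holding", "lift"),
]
network = fsm_to_san(recorded_fsm)
levels = spread(network, {"reach": 1.0}, steps=2)
```

After two steps, activation injected at the first behavior has reached the last one at a lower level. A later refinement stage could adjust the link weights, for example from reinforcement feedback.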
![Page 27: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/27.jpg)
Human Agent: Human Detection
![Page 28: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/28.jpg)
Human Agent: Recognition
![Page 29: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/29.jpg)
Human Agent: Face Tracking
![Page 30: DARPA ITO/MARS Project Update Vanderbilt University](https://reader036.fdocuments.in/reader036/viewer/2022062500/56814f1c550346895dbcad09/html5/thumbnails/30.jpg)
Schedule
YEAR ONE (months 1-12)
Requirement Analysis/Concept Development
IMA (A/C) Deployment for HelpMate
IMA (A/C) Deployment for ATRV Jr.
Robust System Status Analysis
Reinforcement Learning
Develop Egosphere and DBAM
Demo Scenario – Simple HR interaction