Autonomous Mobile Robots CPE 470/670 Lecture 12 Instructor: Monica Nicolescu.
Human-Robot Interaction and Learning From Human Demonstration
Maja J Matarić, Chad Jenkins, Monica Nicolescu, Evan Drumwright, and Chi-Wei Chu
University of Southern California, Interaction Lab / Robotics Research Lab
Center for Robotics and Embedded Systems
http://robotics.usc.edu/~agents/Mars/mars.html
Motivation & Approach
Goals:
Natural human-robot interaction in various domains
Robot programming & learning by imitation (mobile & humanoid)
General Approach:
Use the intrinsic behavior repertoire to facilitate control, human-robot interaction, and learning
Use human interactive training methods
Use human data-driven training methods
Philosophy: Modularity & Interaction
Complex control is represented as a combination of lower-dimensionality, composable building blocks: behaviors
Behaviors are the abstraction for interaction
Representation is deictic & action-embedded
Perception is classification; learning is refinement, enhancement & composition of building blocks
Interaction is action-embedded:
Humans know the robot's behavior repertoire
Robots map human input onto the repertoire, for prediction & learning
Human-robot communication is action-based
Intervention is treated as high-priority perceptual input
Learning From People
Learn from humans in two “natural” modes:
Human teacher/trainer demonstrates a skill to a robot, which learns from one or a few (in single digits) trials, by mapping observations to existing behaviors.
Corpus of human data is provided off-line, statistical learning is used to derive new behaviors (humanoid).
Previous Developments
A Hierarchical Abstract Behavior Architecture
Representation & execution of complex, sequential, hierarchically structured tasks
An algorithm for on-line learning of task representations from experienced demonstrations
Validated the architecture and learning algorithm:
Execution of tasks with hierarchical structure and long behavioral sequences
Learning of complex tasks from both human and robot teachers
Hierarchical Abstract Behavior Architecture
[Figure: architecture diagram; behaviors receive sensory input from the environment]
Extended behavior-based architecture
Flexible activation conditions (dependency links between behaviors) allow for behavior reusability
Representation of tasks as (hierarchical) behavior networks
Sequential & opportunistic execution
Support for automated generation (task learning)
M. N. Nicolescu, M. J Matarić, "A hierarchical architecture for behavior-based robots", International Joint Conference on Autonomous Agents and Multiagent Systems, July 15-19, 2002.
Learning from Experienced Demonstrations
Goal: learn a high-level task representation in terms of the robot's own skills
Approach: a teacher-following strategy, with active participation in the demonstration
The robot is equipped with a set of basic skills
The teacher is aware of these skills and of what observations the robot can gather
Mapping between what the robot sees and what it can perform
The status (met/not met) of all behaviors' goals is continuously monitored; a goal being met indicates which behavior fired, yielding the observation-behavior mapping
The teacher may signal moments in time relevant to the task
M. N. Nicolescu, M. J Matarić, "Experience-based representation construction: learning from human and robot teachers", IEEE/RSJ International Conference on Intelligent Robots and Systems, October 29 - November 3, 2001.
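As a toy illustration of this monitoring loop, here is a minimal sketch in Python; the Behavior class, the goal predicates, and the observation format are illustrative assumptions, not the architecture's actual API:

```python
# Sketch: map a demonstration onto the robot's own behaviors by watching
# which behaviors' goals become met as the teacher acts.
# Behavior names and goal predicates below are illustrative assumptions.

class Behavior:
    def __init__(self, name, goal_met):
        self.name = name          # behavior identifier
        self.goal_met = goal_met  # predicate over an observation

def observe_demonstration(behaviors, observations):
    """Return the sequence of behaviors whose goals became met,
    i.e., the task representation inferred from the demonstration."""
    task = []
    previously_met = {b.name: False for b in behaviors}
    for obs in observations:
        for b in behaviors:
            met = b.goal_met(obs)
            if met and not previously_met[b.name]:
                task.append(b.name)      # goal just became met: record step
            previously_met[b.name] = met
    return task

# Toy run: "reach_target" fires once the distance drops below a
# threshold, then "grasp" fires once the gripper closes.
behaviors = [
    Behavior("reach_target", lambda o: o["dist"] < 0.1),
    Behavior("grasp", lambda o: o["gripper_closed"]),
]
trace = [
    {"dist": 1.0, "gripper_closed": False},
    {"dist": 0.05, "gripper_closed": False},
    {"dist": 0.05, "gripper_closed": True},
]
print(observe_demonstration(behaviors, trace))  # ['reach_target', 'grasp']
```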
Recent Developments
Motivation:
The current approach leads to a correct but possibly overspecialized task representation
This is a problem in changing environments
Approach: refine the learned task representations through:
Generalizing over multiple (but few) demonstrations; new demonstrations are "incorporated" into the existing task representation
Providing feedback during task execution, on unnecessary/missing parts of the task
Generalization Problem
Hard to learn a task from only one trial: limited sensing capabilities, quality of the teacher's demonstration, particularities of the environment
Similar to inferring a regular expression (FSA equivalent) from examples
A small number of demonstrations is desired, so statistical techniques are not applicable
Main learning inaccuracies:
Learning irrelevant steps (false positives)
Omitting steps that are relevant (false negatives)
[Figure: three training example sequences (A-C-B-F-A, A-B-F-C-A, B-F-A) and the generalization to be inferred from them]
Generalization Approach
Demonstrate the same task in different/similar environments
Construct a task representation that:
Encodes the specifics of each given example
Captures the parts common to all demonstrations
Compute a measure of similarity (common steps) between different examples: the longest common subsequence (LCS) between the topological representations, computable in O(nm) time
Merge the common nodes
M. N. Nicolescu, M. J Matarić, "Natural Methods for Robot Task Learning: Instructive Demonstrations, Generalization and Practice", Second International Joint Conference on Autonomous Agents and Multi-Agent Systems, July 14-18, 2003.
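The similarity computation above can be sketched with the standard O(nm) LCS dynamic program; the behavior labels follow the slides' example, and this is a generic textbook LCS, not the system's exact implementation:

```python
# Sketch: longest common subsequence (LCS) between two demonstrated
# behavior sequences, the O(nm) dynamic program mentioned above.

def lcs(x, y):
    """Longest common subsequence of two sequences of behavior labels."""
    n, m = len(x), len(y)
    # table[i][j] = LCS length of x[:i] and y[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # backtrack to recover the common steps themselves
    out, i, j = [], n, m
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1]); i -= 1; j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]

# Two demonstrations of the same task (labels from the slides' example):
print(lcs(list("ACBFA"), list("ABFCA")))  # ['A', 'B', 'F', 'A']
```

The common steps become the merge points between the two demonstrations' networks; the differing parts are kept as alternative paths.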
Illustration of Approach
[Figure: example demonstrations (A-C-B-F-A and A-B-F-C-A, among others) merged into a generalized network via their longest common subsequence, with the LCS dynamic-programming table shown for two sequences X and Y over the labels A, B, C, F]
Merging Additional Demonstrations
For subsequent examples, compute the LCS between the new example and all possible paths in the graph
The dynamic programming method computes the LCS at each level (depth) in the graph
The LCS is computed only for the differing parts of the paths
The LCS table is kept as a linked list of arrays
The longest of the paths is selected and its common nodes are merged
[Figure: an existing graph and a new example (A-B-F-A) being merged]
Behavior Network Execution
[Figure: precondition expressions accumulated over the example network: A, AC + A, (AC + A)B, (AC + A)BF]
Computing preconditions for each behavior is similar to computing the regular expression from an FSA representation
Added capability for disjunctive & conjunctive activation conditions
The types of dependencies between behaviors (ordering, enabling, permanent) are computed from the two merged behavior networks
Generalization
Task: Go to either the Green or Light Green targets, pick up the Orange box, go to the Yellow and Red targets, go to the Pink target, drop the box there, come back to the Light Green target
None of the demonstrations corresponds exactly to the desired task; they contain incorrect steps and inconsistencies
Generalization Experiments
[Figure: first demonstration, environment, learned topology, and robot performance]
All observations are relevant
No trajectory learning; not a reactive policy
Generalization Experiments (II)
[Figure: first, second, and third human demonstrations and the resulting robot performance]
Refining Task Representation Through Feedback
Feedback is given through speech:
Unnecessary task steps ("bad"): remove steps from the network
Missing task steps ("new", "continue"): add new steps to the network
[Figure: "bad" cues delete unnecessary steps from the network; "new" and "continue" cues bracket newly demonstrated steps (M, N) that are inserted into the network]
M. N. Nicolescu, M. J Matarić, "Natural Methods for Robot Task Learning: Instructive Demonstrations, Generalization and Practice", Second International Joint Conference on Autonomous Agents and Multi-Agent Systems, July 14-18, 2003.
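A minimal sketch of how such spoken cues could edit a task, assuming a flat list-of-steps representation; the actual system refines a hierarchical behavior network, so this is only illustrative:

```python
# Sketch of feedback-driven refinement: "bad" removes an unnecessary
# step; "new" ... "continue" brackets newly demonstrated steps that are
# spliced into the task. The flat list representation is an assumption.

def refine(task, feedback):
    """Apply a sequence of (cue, arg) feedback events to a task.

    cue is one of:
      ("bad", step)     - remove an unnecessary step
      ("new", None)     - start recording demonstrated steps
      ("step", step)    - a step demonstrated while recording
      ("continue", at)  - stop recording; splice steps before `at`
    """
    task = list(task)
    pending = None  # steps demonstrated between "new" and "continue"
    for cue, arg in feedback:
        if cue == "bad":
            task.remove(arg)
        elif cue == "new":
            pending = []
        elif cue == "step":
            pending.append(arg)
        elif cue == "continue":
            i = task.index(arg)
            task[i:i] = pending
            pending = None
    return task

task = ["A", "B", "F", "C"]
feedback = [("bad", "F"), ("new", None), ("step", "M"),
            ("step", "N"), ("continue", "C")]
print(refine(task, feedback))  # ['A', 'B', 'M', 'N', 'C']
```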
Practice and Feedback Experiments
[Figure: third demonstration, practice run & feedback, topology refinement, and robot performance]
Practice and Feedback Experiments (II)
[Figure: first demonstration, practice run & feedback, topology refinement, and robot performance]
Summary
The generalization method incorporates multiple demonstrations into a unique behavior network representation, which helps detect relevant/irrelevant observations
Simple feedback cues can be used for:
Providing instructive demonstrations
Refining the task representations learned from direct demonstration or generalization
Learning from Motion Data
Goal:
Automatically derive both primitive and high-level behaviors from human motion data
Use behaviors as a substrate for generating robot motion and predicting/classifying human activity
Method:
Corpus of human motion data (motion capture)
Dimension reduction to extract behaviors
Motion Segmentation
Extract short motion sequences
Previous methods:
Manual (slow, tedious)
Z-function (discrete motion only); modified by Peters for Robonaut data
New method, the kinematic centroid:
Assume limbs are pendulums
A greedy method determines the "end" of a pendulum swing
Appropriate for highly dynamic motion exhibiting large swings
O. C. Jenkins, M. J Matarić, "Automated Modularization of Human Motion into Actions and Behaviors", USC Center for Robotics and Embedded Systems Technical Report No. CRES-02-002.
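The kinematic-centroid idea can be sketched as follows; the speed threshold, the greedy cut rule, and the synthetic swing data are illustrative assumptions rather than the paper's exact algorithm:

```python
import numpy as np

# Hedged sketch of kinematic-centroid segmentation: treat the limb as a
# pendulum, track the centroid of its markers, and greedily cut a
# segment wherever the centroid's speed drops near zero (the "end" of a
# swing). Threshold and data below are illustrative assumptions.

def segment_boundaries(markers, speed_eps=1e-2):
    """markers: (frames, points, 3) limb positions over time.
    Returns frame indices where one swing ends and the next begins."""
    centroid = markers.mean(axis=1)                        # (frames, 3)
    speed = np.linalg.norm(np.diff(centroid, axis=0), axis=1)
    cuts, in_swing = [], False
    for frame, s in enumerate(speed):
        if s > speed_eps:
            in_swing = True            # the limb is mid-swing
        elif in_swing:
            cuts.append(frame)         # swing just ended: cut here
            in_swing = False
    return cuts

# Synthetic limb: a single marker swinging sinusoidally, slowing to a
# stop at each extreme of the swing.
t = np.linspace(0, 2 * np.pi, 200)
x = np.sin(t)
markers = np.stack([x, np.zeros_like(x), np.zeros_like(x)], axis=1)[:, None, :]
print(segment_boundaries(markers))  # two cuts, one near each extreme
```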
Previous Work
Earlier efforts involved:
Application of PCA dimension reduction to arm motion data
K-means clustering to uncover primitive behaviors
Limitations:
Linear PCA applied to nonlinear motion data
PCA does not capture temporal dependencies
Clustering decomposes the PCA space, but:
Primitives have no intuitive meaning or theme
It is difficult to compose these primitives into higher-level behaviors
Current Work
Use Isomap for non-linear dimension reduction [Tenenbaum et al. 2000]
Extend Isomap to handle temporal dependencies
Cluster separable motion groups with bounding-box clustering
Interpolate within a cluster to represent new motions
Use further dimension-reduction iterations to derive high-level behaviors
O. C. Jenkins, M. J Matarić, "Deriving Action and Behavior Primitives from Human Motion Data", IEEE/RSJ International Conference on Intelligent Robots and Systems, September 30 - October 4, 2002.
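For reference, plain (spatial) Isomap can be sketched in three steps: a k-nearest-neighbor graph, geodesic distances via shortest paths, and classical MDS. The spatio-temporal extension's common-temporal-neighbor edges are omitted here, and the helix data is an illustrative assumption:

```python
import numpy as np

# Hedged sketch of plain Isomap [Tenenbaum et al. 2000]: k-NN graph,
# geodesic distances by Floyd-Warshall, then classical MDS. Not the
# spatio-temporal variant, and only suitable for small toy data.

def isomap(X, k=5, d=2):
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # k-nearest-neighbor graph (symmetrized); inf = no direct edge
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        nbrs = np.argsort(D[i])[1:k + 1]
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[i, nbrs]
    # geodesic distances via Floyd-Warshall shortest paths
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    # classical MDS on the squared geodesic distances
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:d]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Toy data: points along a noisy 3D helix; the embedding should unroll
# the 1D ordering along the curve.
rng = np.random.default_rng(0)
t = np.linspace(0, 3 * np.pi, 60)
X = np.stack([np.cos(t), np.sin(t), t], axis=1) + 0.01 * rng.standard_normal((60, 3))
Y = isomap(X, k=6, d=2)
print(Y.shape)  # (60, 2)
```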
An Example
[Figure: a motion sequence built from two high-level behaviors X and Y over primitives A-F; PCA and spatial Isomap conflate the spatially similar primitives C and F, spatio-temporal Isomap separates them, and a second spatio-temporal Isomap iteration recovers the high-level behaviors X and Y]
Spatio-temporal Dimension Reduction
Isomap extracts the underlying (non-linear) structure of data, e.g., a 2D spherical manifold from 3D position data
Isomap is extended to temporal data using common temporal neighbors (CTN): observing that sequence B is preceded by sequence A and followed by sequence C resolves spatially similar sequences
High-level Behaviors
Extract high-level behaviors by applying spatio-temporal Isomap again to the sequence of primitives
Derived High-level Behaviors
[Figure: derived high-level behaviors, including Arm Waving and Punching, composed of numbered primitives]
Primitive Motion Synthesis
Use interpolation between motion sequences to generate new variations
Interpolation provides a form of parameterization for a primitive
[Figure: trajectories of hand positions produced by interpolation; blue/red are motions grouped into a primitive, black/magenta are new motion variations]
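The interpolation-as-parameterization idea can be sketched minimally; equal-length, time-aligned exemplars are assumed here (in practice sequences would first be aligned):

```python
import numpy as np

# Sketch: blending two time-aligned exemplar trajectories of one
# primitive yields a new variation, indexed by the mixing weight w.
# The toy 1-DOF "reach" trajectories are illustrative assumptions.

def interpolate(traj_a, traj_b, w):
    """Blend two (frames, dofs) joint trajectories; w in [0, 1]."""
    return (1.0 - w) * traj_a + w * traj_b

t = np.linspace(0.0, 1.0, 50)
low = np.stack([0.5 * t], axis=1)    # small reach
high = np.stack([1.0 * t], axis=1)   # large reach
mid = interpolate(low, high, 0.5)    # a new in-between variation
print(mid[-1, 0])  # 0.75
```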
High-level Motion Synthesis
A behavior can be used to synthesize a variation on the input motion
Synthesis uses segment concatenation
[Figure: Arm Waving, Punching, and Dancing ("Cabbage Patch") syntheses]
Primitives as Forward Models
Through eager evaluation, the span of motion variations can be realized for each primitive
Consequently, a nonlinear forward model can be produced for each primitive
Used for motion synthesis, given an initial posture
Experimenting with motion classification via Kalman gains
[Figure: PCA view of a primitive's flow field in joint angle space]
Summary: Learning from Motion Data
Strengths of the current approach:
Derives suitable behaviors for nonlinear motion data with temporal dependencies
Segmentation techniques allow for full automation
New variations on derived behaviors can be synthesized
Flow-field forward models can be produced for each primitive
Primitive forward models allow for smooth motion synthesis and motion classification
Future work:
Validation on better motion data (always)
Derivation of primitives from NASA Robonaut motion
Integration with task-directed control mechanisms
Posture-atomic primitive derivation
Humanoid control via parameterized trajectories
● Free-space control of humanoid robots
● Set of exemplar trajectories
● represent Cartesian extrema of a single behavior
● trajectories are in joint space
● New movements produced via interpolation
● representative of the original behavior
● selected by a mixing parameter
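A sketch of the compact RBF representation mentioned on the next slide, assuming Gaussian bases fit by ordinary least squares; the basis count and width are illustrative choices, not the system's actual parameters:

```python
import numpy as np

# Hedged sketch: approximate a sampled joint trajectory with a few
# Gaussian radial basis functions, fit by one linear least-squares
# solve, then reconstruct it from those weights.

def rbf_features(t, centers, width):
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

def fit_rbf(t, q, n_basis=12):
    centers = np.linspace(t.min(), t.max(), n_basis)
    width = (t.max() - t.min()) / n_basis
    Phi = rbf_features(t, centers, width)          # (samples, n_basis)
    w, *_ = np.linalg.lstsq(Phi, q, rcond=None)    # one weight per basis
    return centers, width, w

# A nonlinear 1-DOF trajectory: 200 samples compressed to 12 weights.
t = np.linspace(0.0, 1.0, 200)
q = np.sin(2 * np.pi * t) + 0.3 * np.cos(4 * np.pi * t)
centers, width, w = fit_rbf(t, q)
q_hat = rbf_features(t, centers, width) @ w
print(np.max(np.abs(q - q_hat)))
```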
What's good about this?
● Very few exemplars of a behavior may be needed to model that behavior
● For dextrous robotic control, easier than explicit programming or optimal control methods
● Trajectories can be represented compactly
● RBF approximation can represent complex (i.e., very non-linear) trajectories with high fidelity using little storage
Robotic control via parametric primitives
● Precondition(s) for a primitive must first be met
● Time duration for the primitive is then selected
● The primitive is then executed open-loop
● closed-loop controllers to be investigated in the future
● Control operates at the kinematic level only
● position and/or velocity commands are sent to a low-level controller
Activity recognition via primitives
● Primitives serve to model a behavior
● This model can be used to recognize the behavior
● We built a Bayesian classifier to recognize a set of five primitives from mocap & simulator data
● rate of false negatives: 3.39%
● rate of false positives: 0.06%
● more data needed for validation
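As a stand-in for the classifier described above (whose exact form is not given on the slide), here is a Gaussian naive Bayes sketch over a toy two-primitive feature space; the features, class names, and data are illustrative assumptions:

```python
import numpy as np

# Sketch of a Bayesian classifier over motion primitives: model each
# primitive's feature vector (e.g., joint-angle statistics of a motion
# window) as a Gaussian and pick the class with the highest posterior.

def fit(X, y):
    classes = np.unique(y)
    stats = {}
    for c in classes:
        Xc = X[y == c]
        stats[c] = (Xc.mean(axis=0),            # per-feature mean
                    Xc.var(axis=0) + 1e-6,      # per-feature variance
                    np.log(len(Xc) / len(X)))   # log class prior
    return stats

def predict(stats, x):
    def log_post(c):
        mu, var, log_prior = stats[c]
        return log_prior - 0.5 * np.sum(np.log(2 * np.pi * var)
                                        + (x - mu) ** 2 / var)
    return max(stats, key=log_post)

# Toy data: two well-separated "primitives" in a 2-D feature space.
rng = np.random.default_rng(1)
wave = rng.normal([0.0, 0.0], 0.1, size=(50, 2))
punch = rng.normal([1.0, 1.0], 0.1, size=(50, 2))
X = np.vstack([wave, punch])
y = np.array(["wave"] * 50 + ["punch"] * 50)
stats = fit(X, y)
print(predict(stats, np.array([0.9, 1.1])))  # punch
```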
Markerless Kinematic Model and Motion Capture
In addition, a kinematic model is estimated for the subject
We leverage recent voxel carving techniques for constructing 3D point volumes of moving subjects from multiple calibrated cameras
[Figure: voxel carving]
Nonlinear Spherical Shells
NSS is a simple means for volume skeletonization: a pose-independent principal curve is extracted as the captured volume's skeleton curve
[Figure: pipeline from the original volume to a pose-independent "Da Vinci" zero posture: spherical shells, dimension reduction, partitioning, clustering and linking, and projection onto the original volume]
Model and Pose Estimation
Using each volume and its skeleton curve, a kinematic model is estimated for each frame in a motion
A single model is estimated for the sequence by aligning the frame-specific models and identifying common joints using the density of aligned joints
[Figure: alignment of frame-specific models]
Result for Human Waving
Result for Synthetic Volumes
[Figure: original kinematics and motion; derived kinematics and motion; snapshot of the synthetic volume for a single frame]
Conclusions
Goal: use the behavior substrate to facilitate action-embedded human-robot interaction, control, and learning
Recent successes:
Generalization of multiple (but few) demonstrations into a unique behavior network representation
Use of simple feedback cues for refining learned tasks and for faster learning
Automatic derivation of behaviors from human motion data
Work in progress:
Validation of the generalization and human-robot interaction methods in more elaborate experimental setups
Validation of the derivation method on Robonaut data
Info, videos, papers: http://robotics.usc.edu/projects/mars/