Deep Robotic Learning
Sergey Levine
UC Berkeley, Google Brain
robotic control pipeline:
observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls
standard computer vision:
features (e.g. HOG) → mid-level features (e.g. DPM) → classifier (e.g. SVM)
Felzenszwalb ‘08
deep learning
deep robotic learning:
observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls
end-to-end training
no direct supervision
actions have consequences
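The contrast between the staged pipeline and end-to-end training can be sketched in a few lines. This is only an illustrative toy, not the architecture from the talk: every function, shape, and gain below is a hypothetical stand-in, and the modeling & prediction stage is folded into the planner for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Modular pipeline: each stage is designed (or trained) on its own.
def estimate_state(observation):
    # Hypothetical state estimator: compress the observation to a 2-D state.
    return observation.reshape(2, -1).mean(axis=1)

def plan(state, goal):
    # Hypothetical planner: desired displacement toward the goal.
    return goal - state

def low_level_control(displacement, gain=0.5):
    # Hypothetical proportional controller producing the controls.
    return gain * displacement

def pipeline_policy(observation, goal):
    return low_level_control(plan(estimate_state(observation), goal))

# End-to-end alternative: a single parametric map from observations to
# controls, whose parameters would all be trained against one task objective.
def end_to_end_policy(params, observation):
    w1, w2 = params
    hidden = np.tanh(w1 @ observation)
    return w2 @ hidden

observation = rng.normal(size=8)          # toy 8-D observation
goal = np.zeros(2)
params = (0.1 * rng.normal(size=(4, 8)),  # untrained toy weights
          0.1 * rng.normal(size=(2, 4)))

u_pipeline = pipeline_policy(observation, goal)
u_end_to_end = end_to_end_policy(params, observation)
```

In the staged version an error in `estimate_state` is never corrected by the task objective, whereas in the single map every parameter can be adjusted by how well the final controls perform.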
1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
Chelsea Finn
end-to-end training: 96.3% success rate
pose prediction (trained on pose only): 0% success rate
L.*, Finn*, Darrell, Abbeel, ‘16
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
Deep Robotic Learning Applications
manipulation (with N. Wagener, P. Abbeel)
locomotion (with V. Koltun)
dexterous hands (with V. Kumar, A. Gupta, E. Todorov)
soft hands (with C. Eppner, A. Gupta, P. Abbeel)
aerial vehicles (with G. Kahn, T. Zhang, P. Abbeel)
tensegrity robot (with X. Geng, M. Zhang, J. Bruce, K. Caluwaerts, M. Vespignani, V. SunSpiral, P. Abbeel)
3. Can we scale up deep robotic learning and produce skills that generalize?
ingredients for success in learning:
supervised learning: computation, algorithms, data
learning robotic skills: computation, algorithms (~), data (?)
Grasping with Learned Hand-Eye Coordination
setup: monocular RGB camera, 7 DoF arm, 2-finger gripper, object bin
• monocular camera (no depth)
• no camera calibration either
• 2-5 Hz update
• continuous arm control
• servo the gripper to target
• fix mistakes
• no prior knowledge
L., Pastor, Krizhevsky, Quillen ‘16
Peter Pastor, Alex Krizhevsky, Deirdre Quillen
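One way to picture "servo the gripper to target" with continuous arm control is to score candidate motor commands with a learned success predictor and execute the best one, repeating at every update. The sketch below assumes that structure and uses the cross-entropy method for the search; `predicted_success` is a hypothetical stand-in for a trained network, and the toy "image" simply encodes a target offset so the example runs on its own.

```python
import numpy as np

def predicted_success(image, commands):
    # Hypothetical stand-in for a learned predictor g(image, command).
    # Scores lie in (0, 1] and peak at a target offset hidden in the "image".
    target = image[:3]
    return np.exp(-np.sum((commands - target) ** 2, axis=1))

def choose_command(image, rng, iters=3, samples=64, elites=6):
    # Cross-entropy method: sample commands, keep the best-scoring ones,
    # refit a Gaussian to them, and repeat.
    mean, std = np.zeros(3), np.ones(3)
    for _ in range(iters):
        commands = mean + std * rng.normal(size=(samples, 3))
        scores = predicted_success(image, commands)
        best = commands[np.argsort(scores)[-elites:]]
        mean, std = best.mean(axis=0), best.std(axis=0) + 1e-6
    return mean

rng = np.random.default_rng(0)
image = np.array([0.3, -0.2, 0.1, 0.0])  # toy observation, not a real image
command = choose_command(image, rng)
```

Calling `choose_command` again on a fresh image at each update step is what makes the behaviour closed-loop: a command that turned out badly is corrected at the next step rather than planned around in advance.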
Grasping Experiments
Policy Learning with Multiple Robots
Local policy optimization
Global policy optimization
Rollout execution
Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya
Yahya, Li, Kalakrishnan, Chebotar, L., ‘16
Policy Learning with Multiple Robots: Deep RL with NAF
Gu*, Holly*, Lillicrap, L., ‘16
Shane Gu, Ethan Holly, Tim Lillicrap
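For readers unfamiliar with NAF (normalized advantage functions), the central idea is a Q-function that is quadratic in the action, Q(s, a) = V(s) - 1/2 (a - μ(s))ᵀ P(s) (a - μ(s)), so the greedy action is simply μ(s). The sketch below evaluates that form with NumPy. The inputs `value`, `mu`, and `l_entries` stand in for network outputs and are made-up numbers; the function name is hypothetical.

```python
import numpy as np

def naf_q_value(action, value, mu, l_entries):
    # Fill a lower-triangular matrix L from a flat vector of entries.
    dim = mu.shape[0]
    L = np.zeros((dim, dim))
    L[np.tril_indices(dim)] = l_entries
    # Exponentiate the diagonal so that P = L L^T is positive definite.
    L[np.diag_indices(dim)] = np.exp(np.diag(L))
    P = L @ L.T
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff  # always <= 0, zero at action == mu
    return value + advantage

value = 2.0                           # stand-in for V(s)
mu = np.array([0.1, -0.3])            # stand-in for mu(s)
l_entries = np.array([0.0, 0.5, 0.0]) # stand-in for the entries of L(s)

q_at_mu = naf_q_value(mu, value, mu, l_entries)
q_elsewhere = naf_q_value(np.array([0.5, 0.5]), value, mu, l_entries)
```

Because the advantage term is never positive, the maximum over actions is available in closed form, which is what lets Q-learning be applied to continuous arm commands without a separate action optimizer.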
Learning a Predictive Model of Natural Images
original video
predictions
Chelsea Finn
4. How can we learn safely and efficiently in safety-critical domains?
Safe Uncertainty-Aware Learning
unknown environment
1. Learn a collision prediction model: raw image + command velocities → neural network ensemble
2. Speed-dependent, uncertainty-aware collision cost
3. Iteratively train with on-policy samples
Key idea: to learn about collisions, the robot must experience collisions (but safely!)
Kahn, Pong, Abbeel, L. ‘16
Greg Kahn
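A speed-dependent, uncertainty-aware collision cost can be sketched as follows. The exact form is an assumption made for illustration: the cost is taken to be the speed multiplied by the ensemble's mean collision probability plus a weighted standard deviation, so that both fast motion and disagreement among ensemble members are penalized. The function name, `risk_weight`, and the numbers are hypothetical.

```python
import numpy as np

def collision_cost(ensemble_probs, speed, risk_weight=1.0):
    # ensemble_probs has shape (n_models, n_candidates): each row is one
    # ensemble member's predicted collision probability per candidate.
    mean = ensemble_probs.mean(axis=0)
    std = ensemble_probs.std(axis=0)   # disagreement used as uncertainty
    return speed * (mean + risk_weight * std)

# Three ensemble members, two candidate velocity commands.
probs = np.array([[0.1, 0.1],
                  [0.1, 0.5],
                  [0.1, 0.3]])
speeds = np.array([1.0, 1.0])

costs = collision_cost(probs, speeds)
slow_costs = collision_cost(probs, 0.5 * speeds)
```

The first candidate, on which the ensemble agrees, costs only its mean probability; the second is penalized for both its higher mean and the spread between members. Halving the speed halves the cost, which pushes the robot to move slowly where the model is unsure.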
5. Can we transfer skills from simulation to the real world, and from one robot to another?
Training in Simulation: CAD2RL
Sadeghi, L. ‘16
Fereshteh Sadeghi
Learning with Transfer in Mind: Ensemble Policy Optimization (EPOpt)
train, test, adapt
training on single torso mass vs. training on model ensemble
unmodeled effects
ensemble adaptation
Aravind Rajeswaran
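Training on a model ensemble rather than a single torso mass can be sketched as sampling simulator parameters from a distribution and concentrating each policy update on the hardest sampled models. The snippet below assumes that worst-percentile selection scheme; `toy_return` is a hypothetical stand-in for a simulated rollout, and the mass distribution and `epsilon` are made-up values.

```python
import numpy as np

def toy_return(policy_gain, torso_mass):
    # Hypothetical stand-in for a rollout: the return is highest when the
    # policy gain matches the sampled torso mass.
    return -(policy_gain - torso_mass) ** 2

def worst_case_subset(policy_gain, masses, epsilon=0.2):
    # Evaluate the policy on every sampled model and keep only the worst
    # epsilon fraction, which a policy update would then focus on.
    returns = toy_return(policy_gain, masses)
    cutoff = np.quantile(returns, epsilon)
    keep = returns <= cutoff
    return masses[keep], returns[keep]

rng = np.random.default_rng(0)
masses = rng.normal(loc=6.0, scale=1.5, size=100)  # assumed ensemble
hard_masses, hard_returns = worst_case_subset(6.0, masses)
```

A policy tuned to a single mass does well only near that mass; selecting the low-return tail exposes the models where it fails, which is the motivation for training on the ensemble before adapting to the test system.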
6. How can we get sufficient supervision to learn in unstructured real-world environments?
Learning what Success Means
can we learn the goal with visual features?
Finn, Abbeel, L. ‘16
Learning what Success Means
Sermanet, Xu, L. ‘16
ingredients for success in learning:
supervised learning: computation, algorithms, data
learning robotic skills: computation, algorithms (~), data (?)
Announcement: New Conference
Conference on Robotic Learning (CoRL)
www.robot-learning.org
Goal: bring together robotics & machine learning in a focused conference format
Conference: November 2017
Papers deadline: late June 2017
Steering committee: Ken Goldberg (UC Berkeley), Sergey Levine (UC Berkeley), Vincent Vanhoucke (Google), Abhinav Gupta (CMU), Stefan Schaal (USC, MPI), Michael I. Jordan (UC Berkeley), Raia Hadsell (DeepMind), Dieter Fox (UW), Joelle Pineau (McGill), J. Andrew Bagnell (CMU), Aude Billard (EPFL), Stefanie Tellex (Brown), Minoru Asada (Osaka), Wolfram Burgard (Freiburg), Pieter Abbeel (UC Berkeley)
Chelsea Finn, Peter Pastor, Alex Krizhevsky, Deirdre Quillen
Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya, Shane Gu, Ethan Holly, Tim Lillicrap
Greg Kahn, Fereshteh Sadeghi, Aravind Rajeswaran
Pieter Abbeel, Trevor Darrell