Probabilistic Robotics Tutorial, AAAI-2000
Sebastian Thrun
Computer Science and Robotics
Carnegie Mellon University
© Sebastian Thrun, CMU, 2000
Recommended Readings
Probabilistic Algorithms in Robotics
(basic survey, 95 references)
AI Magazine (to appear Dec 2000)
Also: Tech Report: CMU-CS-00-126
http://www.cs.cmu.edu/~thrun/papers/thrun.probrob.html
Collaborators and Funding
Anita Arendra, Michael Beetz, Maren Bennewitz, Eric Bauer, Joachim Buhmann, Wolfram Burgard, Armin B. Cremers, Frank Dellaert, Dieter Fox, Dirk Hähnel, John Langford, Gerhard Lakemeyer, Dimitris Margaritis, Michael Montemerlo, Sungwoo Park, Frank Pfenning, Joelle Pineau, Martha Pollack, Charles Rosenberg, Nicholas Roy, Jamieson Schulte, Reid Simmons, Dirk Schulz, Wolli Steiner
Special thanks: Kurt Konolige, John Leonard, Andrew Moore, Reid Simmons
Sponsored by: DARPA (TMR, CoABS, MARS), NSF (CAREER, IIS, LIS),
EC, Daimler Benz, Microsoft + others
Tutorial Goal
To familiarize you with the probabilistic paradigm in robotics:
• Basic techniques
• Advantages
• Pitfalls and limitations
• Successful applications
• Open research issues
Tutorial Outline
Introduction
Probabilistic State Estimation
• Localization
• Mapping
Probabilistic Decision Making
• Planning
• Exploration
Conclusion
Robotics Yesterday
Robotics Today
Robotics Tomorrow?
Current Trends in Robotics
Robots are moving away from factory floors, to
• Entertainment, toys
• Personal services
• Medical applications, surgery
• Industrial automation (mining, harvesting, …)
• Hazardous environments (space, underwater)
Robots are Inherently Uncertain
Uncertainty arises from four major factors:
• Environment: stochastic, unpredictable
• Robot: stochastic
• Sensors: limited, noisy
• Models: inaccurate
Probabilistic Robotics
p(a | b) = p(b | a) p(a) / p(b)
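The rule above is easy to sanity-check numerically. A minimal Python sketch with hypothetical numbers (a = "door is open", b = "sensor says open"; all probabilities invented for illustration):

```python
# Numeric sanity check of Bayes rule p(a|b) = p(b|a) p(a) / p(b), using
# hypothetical numbers: a = "door is open", b = "sensor says open".
p_open = 0.5                  # prior p(a)
p_sense_open = 0.6            # p(b|a): sensor fires when the door is open
p_sense_closed = 0.3          # p(b|not a): false-positive rate

# total probability: p(b) = p(b|a) p(a) + p(b|not a) p(not a)
p_sense = p_sense_open * p_open + p_sense_closed * (1 - p_open)

posterior = p_sense_open * p_open / p_sense
print(round(posterior, 3))    # 0.667
```

A single noisy reading raises the belief from 0.5 to about 0.667; iterating this update is exactly what the Bayes filters below do.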
Probabilistic Robotics
Key idea: Explicit representation of uncertainty (using the calculus of probability theory)
Perception = state estimation
Action = utility optimization
Advantages of Probabilistic Paradigm
Can accommodate inaccurate models
Can accommodate imperfect sensors
Robust in real-world applications
Best known approach to many hard robotics problems
Pitfalls
Computationally demanding
False assumptions
Approximate
Trends in Robotics
Classical Robotics (mid-70’s)
• exact models
• no sensing necessary
Reactive Paradigm (mid-80’s)
• no models
• relies heavily on good sensing
Hybrids (since 90’s)
• model-based at higher levels
• reactive at lower levels
Probabilistic Robotics (since mid-90’s)
• seamless integration of models and sensing
• inaccurate models, inaccurate sensors
Example: Museum Tour-Guide Robots
Rhino, 1997 Minerva, 1998
Rhino (Univ. Bonn + CMU, 1997)
W. Burgard, A.B. Cremers, D. Fox, D. Hähnel, G. Lakemeyer, D. Schulz, W. Steiner, S. Thrun
Minerva (CMU + Univ. Bonn, 1998)
Minerva
S. Thrun, M. Beetz, M. Bennewitz, W. Burgard, A.B. Cremers, F. Dellaert, D. Fox, D. Hähnel, C. Rosenberg, N. Roy, J. Schulte, D. Schulz
“How Intelligent Is Minerva?”
amoeba: 2.5% · fish: 5.7% · dog: 29.5% · monkey: 25.4% · human: 36.9%
“Is Minerva Alive?"
yes: 69.8% · no: 27.0% · undecided: 3.2%
“Is Minerva Alive?"
yes: 69.8% · no: 27.0% · undecided: 3.2%
“Are You Under 10 Years of Age?”
Nature of Sensor Data
Odometry data · Range data
Technical Challenges
Navigation
• Environment crowded, unpredictable
• Environment unmodified
• “Invisible” hazards
• Walking speed or faster
• High cost of failure
Interaction
• Individuals and crowds
• Museum visitors’ first encounter with a robot
• Ages 2 through 99
• Visitors spend less than 15 minutes
Tutorial Outline
Introduction
Probabilistic State Estimation
• Localization
• Mapping
Probabilistic Decision Making
• Planning
• Exploration
Conclusion
The Localization Problem
Estimate the robot’s coordinates s = (x, y, θ) from sensor data
• Position tracking (bounded error)
• Global localization (unbounded error)
• Kidnapping (recovery from failure)
Ingemar Cox (1991): “Using sensory information to locate the robot in its environment is the most fundamental problem to provide a mobile robot with autonomous capabilities.”
see also [Borenstein et al, 96]
Probabilistic Localization
[Figure: belief distribution p(s) over the pose s]
[Simmons/Koenig 95] [Kaelbling et al 96] [Burgard et al 96]
Bayes Filters
b(s_t) = p(s_t | d_0, …, d_t)
= p(s_t | o_t, a_{t-1}, o_{t-1}, …, o_0)
= η p(o_t | s_t, a_{t-1}, …, o_0) p(s_t | a_{t-1}, …, o_0)      [Bayes]
= η p(o_t | s_t) p(s_t | a_{t-1}, …, o_0)      [Markov]
= η p(o_t | s_t) ∫ p(s_t | s_{t-1}, a_{t-1}) p(s_{t-1} | a_{t-2}, …, o_0) ds_{t-1}      [total probability + Markov]
= η p(o_t | s_t) ∫ p(s_t | s_{t-1}, a_{t-1}) b(s_{t-1}) ds_{t-1}
(d = data, o = observation, a = action, s = state, t = time; η is a normalizer)
[Kalman 60, Rabiner 85]
Bayes Filters are Familiar to AI!
• Kalman filters
• Hidden Markov models
• Dynamic Bayes networks
• Partially observable Markov decision processes (POMDPs)
b(s_t) = η p(o_t | s_t) ∫ p(s_t | s_{t-1}, a_{t-1}) b(s_{t-1}) ds_{t-1}
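For a discrete state space, the recursion is straightforward to implement. A minimal Python sketch of one predict/correct step, using a hypothetical 1-D corridor with door landmarks (all models and numbers invented for illustration, not from the tutorial):

```python
# One predict/correct step of the Bayes filter above, for a discrete toy
# 1-D corridor (hypothetical motion and sensor models).

def bayes_filter_step(belief, action, observation, motion_model, sensor_model):
    n = len(belief)
    # prediction: b-(s) = sum_{s'} p(s | s', a) b(s')
    predicted = [sum(motion_model(s, sp, action) * belief[sp] for sp in range(n))
                 for s in range(n)]
    # correction: b(s) = eta * p(o | s) * b-(s)
    posterior = [sensor_model(observation, s) * predicted[s] for s in range(n)]
    eta = 1.0 / sum(posterior)
    return [eta * p for p in posterior]

DOORS = {1, 3}   # cells with a detectable door

def motion_model(s, s_prev, a):
    # move right one cell with prob 0.8, stay in place with prob 0.2
    if a == "right":
        return 0.8 if s == s_prev + 1 else (0.2 if s == s_prev else 0.0)
    return 1.0 if s == s_prev else 0.0

def sensor_model(o, s):
    p_door = 0.9 if s in DOORS else 0.1
    return p_door if o == "door" else 1.0 - p_door

belief = [0.2] * 5   # uniform prior over 5 cells
belief = bayes_filter_step(belief, "right", "door", motion_model, sensor_model)
# probability mass concentrates on the door cells (1 and 3)
```

The same two-step structure (convolve with the motion model, reweight by the observation likelihood, normalize) underlies every filter in this tutorial.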
Markov Assumption
p(o_t | s_t, a_{t-1}, o_{t-1}, …, o_0) = p(o_t | s_t)
p(s_t | s_{t-1}, a_{t-1}, o_{t-1}, …, o_0) = p(s_t | s_{t-1}, a_{t-1})      } used above
Knowledge of the current state renders past and future independent:
p(o_T, …, o_{t+1}, a_t, o_{t-1}, …, o_0 | s_t) = p(o_T, …, o_{t+1} | s_t) p(a_t, o_{t-1}, …, o_0 | s_t)
Implicit here:
• “static world” assumption
• “independent noise” assumption
Localization With Bayes Filters
1111 )(),|()|()( tttttttt dssbasspsopsb
1111 )|(),,|(),|()|( tttttttt dsmsbmasspmsopmsb
map m
s’a
p(s|a,s’,m)
a
s’
laser data p(o|s,m)p(o|s,m)observation o
Xavier (R. Simmons, S. Koenig, CMU 1996)
Markov localization in a topological map
Markov Localization in Grid Maps
[Burgard et al 96] [Fox 99]
What is the Right Representation?
Kalman filter
[Schiele et al. 94], [Weiß et al. 94], [Borenstein 96], [Gutmann et al. 96, 98], [Arras 98]
Piecewise constant (metric, topological)
[Nourbakhsh et al. 95], [Simmons et al. 95], [Kaelbling et al. 96], [Burgard et al. 96], [Konolige et al. 99]
Variable resolution (e.g., trees)
[Burgard et al. 98]
Multi-hypothesis
[Weckesser et al. 98], [Jensfelt et al. 99]
Idea: Represent Belief Through Samples
• Particle filters [Doucet 98, deFreitas 98]
• Condensation algorithm [Isard/Blake 98]
• Monte Carlo localization [Fox/Dellaert/Burgard/Thrun 99]
Monte Carlo Localization (MCL)
b(s_t | m) = η p(o_t | s_t, m) ∫ p(s_t | s_{t-1}, a_{t-1}, m) b(s_{t-1} | m) ds_{t-1}
MCL, robot motion (prediction): b(s_t) = ∫ p(s_t | a_{t-1}, s_{t-1}, m) b(s_{t-1}) ds_{t-1}
MCL, importance sampling (correction): b(s_t) = η p(o_t | s_t, m) b(s_t)
b(s_t | m) = η p(o_t | s_t, m) ∫ p(s_t | s_{t-1}, a_{t-1}, m) b(s_{t-1} | m) ds_{t-1}
Particle Filters
Represent b(s_t) by a set of weighted particles {s_t^(i), w_t^(i)}:
• draw s_{t-1}^(i) from b(s_{t-1})
• draw s_t^(i) from p(s_t | s_{t-1}^(i), a_{t-1}, m)
• importance factor for s_t^(i):
w_t^(i) = target distribution / proposal distribution
= η p(o_t | s_t^(i), m) p(s_t^(i) | s_{t-1}^(i), a_{t-1}, m) Bel(s_{t-1}^(i)) / [ p(s_t^(i) | s_{t-1}^(i), a_{t-1}, m) Bel(s_{t-1}^(i)) ]
∝ p(o_t | s_t^(i), m)
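The sample/weight/resample loop above can be sketched in a few lines of Python. The toy 1-D world, motion noise, and landmark likelihood below are all hypothetical, chosen only so the particle cloud has a known place to converge:

```python
# Monte Carlo localization sketch in a toy 1-D world (models hypothetical):
# sample motion, weight each particle by p(o|s,m), then resample.
import math
import random

random.seed(0)

def mcl_step(particles, action, observation, sample_motion, likelihood):
    # 1. draw s_t^(i) from p(s_t | s_{t-1}^(i), a_{t-1}, m)
    moved = [sample_motion(s, action) for s in particles]
    # 2. importance weights w^(i) proportional to p(o_t | s_t^(i), m)
    weights = [likelihood(observation, s) for s in moved]
    # 3. resample with replacement in proportion to the weights
    return random.choices(moved, weights=weights, k=len(particles))

def sample_motion(s, a):
    # noisy forward motion
    return s + a + random.gauss(0.0, 0.1)

def likelihood(o, s):
    # Gaussian likelihood of measuring distance o to a landmark at x = 5.0
    return math.exp(-0.5 * ((o - (5.0 - s)) / 0.5) ** 2)

particles = [random.uniform(0.0, 10.0) for _ in range(1000)]
particles = mcl_step(particles, 1.0, 3.0, sample_motion, likelihood)
# the cloud now concentrates near x = 2.0 (where the landmark reads 3.0)
```

Because the proposal is the motion model itself, the weight reduces to the observation likelihood, exactly as in the importance-factor derivation above.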
Monte Carlo Localization
Monte Carlo Localization, cont’d
Performance Comparison
Monte Carlo localization vs. Markov localization (grids)
Monte Carlo Localization
Approximate Bayes estimation/filtering
• Full posterior estimation
• Converges in O(1/#samples) [Tanner’93]
• Robust: multiple hypotheses with degrees of belief
• Efficient: focuses computation where needed
• Any-time: vary the number of samples
• Easy to implement
Pitfall: The World is not Markov!
Distance filters: discard a sensor reading o_t if, under the current belief, it is with probability greater than 0.99 shorter than the expected range (caused, e.g., by people blocking the sensor):
p(o_t is “short”) = ∫ p(“short” | o_t, s_t, m) b(s_t) ds_t > 0.99
[Fox et al 1998]
Avoiding Collisions with Invisible Hazards
Raw sensors vs. virtual sensors added.
Integrate a ray-traced hazard indicator over the belief, and admit only actions that are safe with high probability:
p(o | a) = ∫ I_raytrace(a, m)(s_t) b(s_t) ds_t,   admit a only if p(o | a) > 0.99
Multi-Robot Localization
Robots can detect each other (using cameras)
[Fox et al, 1999]
Probabilistic Localization: Lessons Learned
Probabilistic localization = Bayes filters
Particle filters: approximate the posterior by random samples
Extensions:
• Filters for dynamic environments
• Safe avoidance of invisible hazards
• People tracking
• Multi-robot localization
• Recovery from total failures [e.g., Lenser et al 00, Thrun et al 00]
Tutorial Outline
Introduction
Probabilistic State Estimation
• Localization
• Mapping
Probabilistic Decision Making
• Planning
• Exploration
Conclusion
The Problem: Concurrent Mapping and Localization
70 m
The Problem: Concurrent Mapping and Localization
On-Line Mapping with Rhino
Concurrent Mapping and Localization
A chicken-and-egg problem:
• Mapping with known poses is “simple”
• Localization with a known map is “simple”
• In combination, the problem is hard!
Today’s best solutions are all probabilistic!
Mapping: Outline
Posterior estimation with known poses: Occupancy grids
Maximum likelihood: ML*
Maximum likelihood: EM
Posterior estimation: EKF (SLAM)
Mapping as Posterior Estimation
Localization: b(s_t) = η p(o_t | s_t) ∫ p(s_t | s_{t-1}, a_{t-1}) b(s_{t-1}) ds_{t-1}
Mapping: b(s_t, m_t) = η p(o_t | s_t, m_t) ∫∫ p(s_t, m_t | s_{t-1}, m_{t-1}, a_{t-1}) b(s_{t-1}, m_{t-1}) ds_{t-1} dm_{t-1}
Assume a static map, m_t = m_{t-1} = m:
b(s_t, m) = η p(o_t | s_t, m) ∫ p(s_t | s_{t-1}, a_{t-1}, m) b(s_{t-1}, m) ds_{t-1}
= η p(o_t | s_t, m) ∫ p(s_t | s_{t-1}, a_{t-1}) b(s_{t-1}, m) ds_{t-1}
[Smith, Self, Cheeseman 90, Chatila et al 91, Durrant-Whyte et al 92-00, Leonard et al. 92-00]
Kalman Filters
N-dimensional Gaussian:
b(s_t, m) = N(μ, Σ),   μ = (l_1, l_2, …, l_N, x, y)ᵀ
with the mean stacking the N landmark positions l_i and the robot pose, and Σ the full covariance matrix, including landmark–landmark, landmark–robot, and robot-pose correlations.
Can handle hundreds of dimensions.
Underwater Mapping
By: Louis L. Whitcomb, Johns Hopkins University
Underwater Mapping - Example
“Autonomous Underwater Vehicle Navigation,” John Leonard et al, 1998
Mapping with Extended Kalman Filters
Courtesy of [Leonard et al 1998]
The Key Assumption
The inverse sensor model p(s_t | o_t, m) must be Gaussian:
• distinguishable features → posterior uni-modal
• indistinguishable features → posterior multi-modal
Main problem: data association. In practice:
• Extract a small set of highly distinguishable features from the sensor data
• Discard all other data
• If ambiguous, take the best guess for the landmark identity
Mapping Algorithms - Comparison
               SLAM (Kalman)
Output         Posterior
Convergence    Strong
Local minima   No
Real time      Yes
Odom. error    Unbounded
Sensor noise   Gaussian
# Features     ~10^3
Feature uniq.  Yes
Raw data       No
Mapping: Outline
Posterior estimation with known poses: Occupancy grids
Maximum likelihood: ML*
Maximum likelihood: EM
Posterior estimation: EKF (SLAM)
Mapping with Expectation Maximization
Idea: maximum likelihood (with unknown data association).
Likelihood of the map:
b(m) = p(m) ∫ … ∫ ∏_t p(o_t | s_t, m) p(s_t | s_{t-1}, a_{t-1}, m) ds_1 ds_2 … ds_t
EM: maximize the log-likelihood by iterating
E-step (Markov localization, run bi-directionally):
Q(m | m^[k]) = E_{m^[k]}[ log p(s_0…t, d_0…t | m) | m^[k], d_0…t ]
M-step (mapping with known poses):
m^[k+1] = argmax_m Q(m | m^[k])
[Dempster et al. 77] [Thrun et al. 98]
[Figure: EM iteration 1, map(1)]
[Figure: EM iteration 2, forward/backward localization, map(1) → map(2)]
[Figure: EM iteration 10, forward/backward localization, map(10)]
CMU’s Wean Hall (80 x 25 meters)
15 landmarks 16 landmarks
17 landmarks 27 landmarks
EM Mapping, Example (width 45 m)
Mapping Algorithms - Comparison
               SLAM (Kalman)  EM
Output         Posterior      ML/MAP
Convergence    Strong         Weak?
Local minima   No             Yes
Real time      Yes            No
Odom. error    Unbounded      Unbounded
Sensor noise   Gaussian       Any
# Features     ~10^3          ?
Feature uniq.  Yes            No
Raw data       No             Yes
Mapping: Outline
Posterior estimation with known poses: Occupancy grids
Maximum likelihood: ML*
Maximum likelihood: EM
Posterior estimation: EKF (SLAM)
Incremental ML Mapping, Online
Idea: step-wise maximum likelihood.
Instead of the full posterior
b(s_t, m_t) = η p(o_t | s_t, m_t) ∫∫ p(s_t, m_t | s_{t-1}, m_{t-1}, a_{t-1}) b(s_{t-1}, m_{t-1}) ds_{t-1} dm_{t-1}
keep only the incremental ML estimate
⟨s_t, m_t⟩ = argmax_{s_t, m_t} p(o_t | s_t, m_t) p(s_t, m_t | s_{t-1}, m_{t-1}, a_{t-1})
Incremental ML: Not A Good Idea
[Figure: incremental ML, the robot’s path produces a map mismatch when closing a loop]
ML* Mapping, Online
Idea: step-wise maximum likelihood, plus a posterior over poses.
1. Incremental ML estimate:
⟨s_t, m_t⟩ = argmax_{s_t, m_t} p(o_t | s_t, m_t) p(s_t, m_t | s_{t-1}, m_{t-1}, a_{t-1})
2. Posterior over poses:
b(s_t) = η p(o_t | s_t, m_{t-1}) ∫ p(s_t | s_{t-1}, a_{t-1}, m_{t-1}) b(s_{t-1}) ds_{t-1}
[Gutmann/Konolige 00, Thrun et al. 00]
ML* Mapping, Online
Courtesy of Kurt Konolige, SRI
[Gutmann & Konolige, 00]
ML* Mapping, Online
Yellow flashes:
artificially distorted map (30 deg, 50 cm)
[Thrun et al. 00]
Mapping with Poor Odometry
map and exploration path · raw data
DARPA Urban Robot
Mapping Without(!) Odometry
map · raw data (no odometry)
Localization in Multi-Robot Mapping
Localization in Multi-Robot Mapping
Courtesy of Kurt Konolige, SRI
[Gutmann & Konolige, 00]
3D Mapping
two laser range finders
3D Structure Mapping (Real-Time)
3D Texture Mapping
raw image sequence, panoramic camera
3D Texture Mapping
Mapping Algorithms - Comparison
               SLAM (Kalman)  EM         ML*
Output         Posterior      ML/MAP     ML/MAP
Convergence    Strong         Weak?      No
Local minima   No             Yes        Yes
Real time      Yes            No         Yes
Odom. error    Unbounded      Unbounded  Unbounded
Sensor noise   Gaussian       Any        Any
# Features     ~10^3          ?          ?
Feature uniq.  Yes            No         No
Raw data       No             Yes        Yes
Mapping: Outline
Posterior estimation with known poses: Occupancy grids
Maximum likelihood: ML*
Maximum likelihood: EM
Posterior estimation: EKF (SLAM)
Occupancy Grids: From scans to maps
Occupancy Grid Maps
A binary Bayes filter per grid cell m^⟨xy⟩ (cf. the localization filter b(s_t) = η p(o_t | s_t) ∫ p(s_t | s_{t-1}, a_{t-1}) b(s_{t-1}) ds_{t-1}).
Assumptions: poses known, occupancy binary, cells independent.
b_t(m^⟨xy⟩) = η p(o_t | m^⟨xy⟩) ∫ p(m_t^⟨xy⟩ | a_{t-1}, m_{t-1}^⟨xy⟩) b_{t-1}(m^⟨xy⟩) dm_{t-1}^⟨xy⟩
Assume a static map, m_t^⟨xy⟩ = m_{t-1}^⟨xy⟩:
b_t(m^⟨xy⟩) = η p(o_t | m^⟨xy⟩) b_{t-1}(m^⟨xy⟩)
= η p(m^⟨xy⟩ | o_t) p(m^⟨xy⟩)⁻¹ b_{t-1}(m^⟨xy⟩)      (inverse sensor model, via Bayes rule)
[Elfes/Moravec 88]
Example
CAD map · occupancy grid map
The Tech Museum, San Jose
Mapping Algorithms - Comparison
               SLAM (Kalman)  EM         ML*        Occupancy grids
Output         Posterior      ML/MAP     ML/MAP     Posterior
Convergence    Strong         Weak?      No         Strong
Local minima   No             Yes        Yes        No
Real time      Yes            No         Yes        Yes
Odom. error    Unbounded      Unbounded  Unbounded  None
Sensor noise   Gaussian       Any        Any        Any
# Features     ~10^3          ?          ?          ?
Feature uniq.  Yes            No         No         No
Raw data       No             Yes        Yes        Yes
Mapping: Lessons Learned
Concurrent mapping and localization: hard robotics problem
Best known algorithms are probabilistic1. EKF/SLAM: Full posterior estimation, but restrictive
assumptions (data association)
2. EM: Maximum Likelihood, solves data association
3. ML*: less robust but online
4. Occupancy grids: Binary Bayes filter, assumes known poses (= much easier)
Tutorial Outline
Introduction
Probabilistic State Estimation
• Localization
• Mapping
Probabilistic Decision Making
• Planning
• Exploration
Conclusion
The Decision Making Problem
Central question: What should a robot do next?
Embraces
• control (short term, tight feedback)
• planning (longer term, looser feedback)
Probabilistic paradigm: considers uncertainty
• current
• future
Planning under Uncertainty
                       Environment    State                 Model
Classical planning     deterministic  observable            deterministic, accurate
MDPs, universal plans  stochastic     observable            stochastic, accurate
POMDPs                 stochastic     partially observable  stochastic, inaccurate
Classical Situation
[Figure: two doors, heaven and hell]
• World deterministic
• State observable
MDP-Style Planning
[Figure: two doors, heaven and hell]
• World stochastic
• State observable
• Policy / universal plan / navigation function
[Koditschek 87, Barto et al. 89]
Stochastic, Partially Observable
[Figure: a sign in front of two doors, heaven? hell?]
[Sondik 72] [Littman/Cassandra/Kaelbling 97]
Stochastic, Partially Observable
[Figure: two possible worlds, a sign with heaven/hell and a sign with hell/heaven]
Stochastic, Partially Observable
[Figure: start at the sign; with 50%/50% probability the world is heaven/hell or hell/heaven, and reading the sign disambiguates]
Outline
Deterministic, fully observable
Stochastic, fully observable, discrete states/actions (MDPs)
Stochastic, partially observable, discrete (POMDPs, Augmented MDPs)
Stochastic, partially observable, continuous (Monte Carlo POMDPs)
Robot Planning Frameworks
                          Classical AI/robot planning
State/actions             discrete & continuous
State                     observable
Environment               deterministic
Plans                     sequences of actions
Completeness              Yes
Optimality                Rarely
State space size          huge, often continuous, 6 dimensions
Computational complexity  varies
MDP-Style Planning
[Figure: two doors, heaven and hell]
• World stochastic
• State observable
• Policy / universal plan / navigation function
[Koditschek 87, Barto et al. 89]
Markov Decision Process (discrete)
[Figure: a five-state MDP (s1, …, s5) with stochastic transitions (e.g., 0.7/0.3, 0.9/0.1, 0.8/0.2) and rewards r = 10, r = 1, and r = 0]
[Bellman 57] [Howard 60] [Sutton/Barto 98]
Value Iteration
Value function of a policy π:
V^π(s) = E[ Σ_i γ^i r_{t+i} | s_t = s, π ]
Bellman equation for the optimal value function:
V(s) = r(s) + γ max_a ∫ p(s′ | a, s) V(s′) ds′
Value iteration: recursively estimate the value function,
V(s) ← r(s) + γ max_a ∫ p(s′ | a, s) V(s′) ds′
Greedy policy:
π(s) = argmax_a ∫ p(s′ | a, s) V(s′) ds′
[Bellman 57] [Howard 60] [Sutton/Barto 98]
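The backup above can be sketched concretely. A minimal Python example on a hypothetical five-state chain (deterministic transitions, reward only in the rightmost state; not an MDP from the tutorial):

```python
# Value iteration sketch on a tiny deterministic 1-D chain (a hypothetical
# MDP): V(s) <- r(s) + gamma * max_a V(next(s, a)).
N, GAMMA = 5, 0.9
rewards = [0.0, 0.0, 0.0, 0.0, 1.0]   # reward only in the rightmost state

def step(s, a):
    # actions: -1 = left, +1 = right; walls clamp the state
    return max(0, min(N - 1, s + a))

V = [0.0] * N
for _ in range(100):                  # iterate to near-convergence
    V = [rewards[s] + GAMMA * max(V[step(s, a)] for a in (-1, 1))
         for s in range(N)]

# greedy policy: head toward the rewarding state
policy = [max((-1, 1), key=lambda a: V[step(s, a)]) for s in range(N)]
print([round(v, 2) for v in V], policy)
```

The converged V decays geometrically with distance to the reward, so the greedy policy is a navigation function: every state simply points "right", toward higher value.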
Value Iteration for Motion Planning
(assumes knowledge of the robot’s location)
Continuous Environments
From: A. Moore & C. G. Atkeson, “The Parti-Game Algorithm for Variable Resolution Reinforcement Learning in Continuous State Spaces,” Machine Learning, 1995
Approximate Cell Decomposition [Latombe 91]
Parti-Game [Moore 96]
Robot Planning Frameworks
                          Classical AI/robot planning           Value iteration in MDPs  Parti-Game
State/actions             discrete & continuous                 discrete                 continuous
State                     observable                            observable               observable
Environment               deterministic                         stochastic               stochastic
Plans                     sequences of actions                  policy                   policy
Completeness              Yes                                   Yes                      Yes
Optimality                Rarely                                Yes                      No
State space size          huge, often continuous, 6 dimensions  millions                 n/a
Computational complexity  varies                                quadratic                n/a
Stochastic, Partially Observable
[Figure: start at the sign; with 50%/50% probability the world is heaven/hell or hell/heaven]
A Quiz
Size of the belief space for different sensors, actions, and state spaces:
sensors          actions        # states           size of belief space
perfect          deterministic  3: s1, s2, s3      3
perfect          stochastic     3: s1, s2, s3      3
abstract states  deterministic  3: s1, s2, s3      2^3−1: s1, s2, s3, s12, s13, s23, s123
stochastic       deterministic  3: s1, s2, s3      2-dim continuous*: p(S=s1), p(S=s2)
none             stochastic     3: s1, s2, s3      2-dim continuous*: p(S=s1), p(S=s2)
stochastic       deterministic  1-dim continuous   ∞-dim continuous*
stochastic       stochastic     1-dim continuous   ∞-dim continuous*
stochastic       stochastic     ∞-dim continuous   aargh!
*) countable, but for all practical purposes
Introduction to POMDPs
[Figure: two-state example (s1, s2); actions a and b have different payoffs in each state (values 0–100 shown), and the value is plotted as a function of the belief p(s1)]
Value function (finite horizon): piecewise linear and convex
Most efficient algorithm today: the Witness algorithm
[Sondik 72, Littman, Kaelbling, Cassandra ’97]
Value Iteration in POMDPs
Substitute the belief b for the state s:
Value function of a policy π: V^π(b) = E[ Σ_i γ^i r_{t+i} | b_t = b, π ]
Bellman equation for the optimal value function: V(b) = r(b) + γ max_a ∫ p(b′ | a, b) V(b′) db′
Value iteration: recursively estimate V(b) ← r(b) + γ max_a ∫ p(b′ | a, b) V(b′) db′
Greedy policy: π(b) = argmax_a ∫ p(b′ | a, b) V(b′) db′
Missing Terms: Belief Space
Expected reward: r(b) = ∫ r(s) b(s) ds
Observation probability: p(o′ | a, b) = ∫∫ p(o′ | s′) p(s′ | s, a) b(s) ds ds′
Next-belief density: p(b′ | a, b) = ∫ p(b′ | o′, a, b) p(o′ | a, b) do′
where p(b′ | o′, a, b) is a Dirac distribution: the Bayes filter update!
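For a discrete toy POMDP, the two non-Dirac terms above are short sums. A minimal Python sketch (all rewards, transition, and observation numbers hypothetical):

```python
# Belief-space quantities sketch for a discrete toy POMDP (all numbers
# hypothetical): r(b) and p(o|a,b) as defined above.

def expected_reward(belief, reward):
    # r(b) = sum_s r(s) b(s)
    return sum(b * r for b, r in zip(belief, reward))

def observation_prob(o, a, belief, trans, obs):
    # p(o|a,b) = sum_{s'} p(o|s') sum_s p(s'|s,a) b(s)
    n = len(belief)
    return sum(obs[o][sp] * sum(trans[a][s][sp] * belief[s] for s in range(n))
               for sp in range(n))

belief = [0.5, 0.5]
reward = [100.0, -40.0]                      # hypothetical per-state rewards
print(expected_reward(belief, reward))       # 30.0

trans = {"a": [[1.0, 0.0], [0.0, 1.0]]}      # p(s'|s,a): identity for "a"
obs = [[0.9, 0.2], [0.1, 0.8]]               # obs[o][s'] = p(o|s')
print(observation_prob(0, "a", belief, trans, obs))  # 0.55
```

These two quantities are all a POMDP backup needs: r(b) scores the current belief, and p(o | a, b) weights the Bayes-filtered successor beliefs.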
Value Iteration in Belief Space
[Figure: a value-iteration backup in belief space, from belief state b via action a and observation o to the next belief b′; Q(b, a) is computed from max Q(b′, a), mirroring the state-space backup from s to s′ with reward r′]
Why is This So Complex?
State-space planning (no state uncertainty) vs. belief-space planning (full state uncertainty)
Augmented MDPs:
Idea: summarize the belief by its most likely state and its uncertainty (entropy),
b ↦ ⟨ argmax_s b(s), H[b] ⟩
i.e., the conventional state space augmented by one uncertainty dimension.
[Roy et al, 98/99]
Path Planning with Augmented MDPs
[Figure: conventional planner vs. probabilistic planner exploiting information gain]
[Roy et al, 98/99]
Robot Planning Frameworks
                          Classical AI/robot planning           Value iteration in MDPs  Parti-Game  POMDP                 Augmented MDP
State/actions             discrete & continuous                 discrete                 continuous  discrete              discrete
State                     observable                            observable               observable  partially observable  partially observable
Environment               deterministic                         stochastic               stochastic  stochastic            stochastic
Plans                     sequences of actions                  policy                   policy      policy                policy
Completeness              Yes                                   Yes                      Yes         Yes                   No
Optimality                Rarely                                Yes                      No          Yes                   No
State space size          huge, often continuous, 6 dimensions  millions                 n/a         dozens                thousands
Computational complexity  varies                                quadratic                n/a         exponential           O(N^4)
Decision Making: Lessons Learned
Four sources of uncertainty:
• Environment unpredictable
• Robot wear and tear
• Sensor limitations
• Models inaccurate
Two implications:
• Need a policy instead of a simple (open-loop) plan
• The policy must be conditioned on the belief state
Approaches:
• MDPs: only work with perfect sensors and models
• POMDPs: general framework, but scaling limitations
• Augmented MDPs: lower computation, but approximate
Tutorial Outline
Introduction
Probabilistic State Estimation
• Localization
• Mapping
Probabilistic Decision Making
• Planning
• Exploration
Conclusion
Exploration: Maximize Knowledge Gain
Pick the action a that maximizes the expected knowledge gain.
Constant-time actions: maximize H[m] − E[H[m | a]]
Variable-time actions: maximize (H[m] − E[H[m | a]]) / expected time(a)
where the expected map entropy after an action integrates over poses and observations:
E[H[m | a]] = ∫∫ H[m | o] p(o | s′) p(s′ | s, a) b(s) ds do
[Thrun 93] [Yamauchi 96] [Burgard et al 00] + many others
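The variable-time criterion can be sketched on a toy map of independent Bernoulli cells. A minimal Python example (all cell probabilities, action sets, and travel times hypothetical):

```python
# Knowledge-gain exploration sketch: score each candidate action by expected
# entropy reduction per unit travel time (toy Bernoulli-cell map; all numbers
# hypothetical).
import math

def entropy(p):
    # binary entropy of one map cell, in bits
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def map_entropy(cells):
    return sum(entropy(p) for p in cells)

def utility(cells, action):
    # action = (indices of cells it would sense, expected travel time);
    # optimistically assume sensed cells become fully known
    sensed, cost = action
    gain = sum(entropy(cells[i]) for i in sensed)
    return gain / cost

cells = [0.5, 0.5, 0.9, 0.1, 0.5]               # occupancy probabilities
actions = {"near": ([0, 1], 2.0), "far": ([2, 3, 4], 5.0)}
best = max(actions, key=lambda a: utility(cells, actions[a]))
print(best)  # "near": 2.0 bits / 2.0 s beats ~1.94 bits / 5.0 s
```

Cells near 0.5 are the most informative (1 bit each), while nearly-decided cells contribute little, which is why greedy frontier-style targets work so well in practice.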
Practical Implementation
For each location ⟨x, y⟩:
• estimate the number of cells the robot could sense there
• estimate the cost of getting there (value iteration)
[Simmons et al 00]
Real-Time Exploration
Coordinated Multi-Robot Exploration
Robots place “bids” for target areas
Greedy assignment of robots to areas
Exploration strategies and assignments are continuously re-evaluated while the robots are in motion
[Burgard et al 00] [Simmons et al 00]
Collaborative Exploration and Mapping
San Antonio Results
Benefit of Cooperation
[Burgard et al 00]
Exploration: Lessons Learned
Exploration = greedily maximizing knowledge gain
Greedy methods can be very effective
Facilitates multi-robot coordination
Tutorial Outline
Introduction
Probabilistic State Estimation
• Localization
• Mapping
Probabilistic Decision Making
• Planning
• Exploration
Conclusion
Problem Summary
In robotics, there is no such thing as
• a perfect sensor
• a deterministic environment
• a deterministic robot
• an accurate model
Therefore: uncertainty is inherent in robotics.
Key Idea
Probabilistic robotics represents uncertainty explicitly and reasons with it:
• Perception = posterior estimation
• Action = optimization of expected utility
Examples Covered Today
Localization
Mapping
Planning
Exploration
Multi-robot
Successful Applications of Probabilistic Robotics
Industrial outdoor navigation [Durrant-Whyte, 95]
Underwater vehicles [Leonard et al, 98]
Coal mining [Singh 98]
Missile guidance
Indoor navigation [Simmons et al, 97]
Robo-Soccer [Lenser et al, 00]
Museum tour-guides [Burgard et al, 98, Thrun 99]
+ many others
Relation to AI
Probabilistic methods highly successful in a range of sub-fields of AI
• Speech recognition
• Language processing
• Expert systems
• Computer vision
• Data mining
• (and many others)
Open Research Issues
Better representations, faster algorithms
Learning with domain knowledge (e.g., models, behaviors)
High-level reasoning and robot programming using the probabilistic paradigm
Theory: e.g., surpassing the Markov assumption
Frameworks for probabilistic programming
Innovative applications