Quadruped Robots and Legged Locomotion - Peoplepabbeel/cs287-fa09... · 2009. 11. 24. · •...

Quadruped Robots and Legged Locomotion

J. Zico Kolter

Computer Science Department

Stanford University

Joint work with Pieter Abbeel, Andrew Ng

Why Legged Robots?

“There is a need for vehicles that can travel in difficult terrain, where existing vehicles cannot go … Only about half of the earth’s landmass is accessible to existing wheeled and tracked vehicles, whereas a much larger fraction can be reached by animals on foot.”

– Marc Raibert, Legged Robots that Balance, 1986

… but we aren’t quite there yet with legged robots.

The Potential Versus the Reality

“… Although we take motivation from the need to travel on rough terrain, the running experiments reported here have not yet ventured beyond our very flat laboratory floor.”

– Marc Raibert, Legged Robots that Balance, 1986

Hardware Versus Software

• Although inferior to biological animals, current legged robot hardware is very capable

• The challenge is designing software to realize this potential

[Figure: The LittleDog robot, designed and built by Boston Dynamics, Inc.]

The Quadruped Locomotion Task

• Our goal is to design a software system that enables a quadruped robot to climb over a wide variety of challenging, previously unseen terrain

• Two distinct subtasks of the overall problem:

– Perception: Using vision systems, build a model of the terrain in front of the robot and determine the position of the robot in this model

– Control: Generate a sequence of control inputs (i.e., commands to the robot’s joints) that move the robot over the terrain

• For perception: use a motion capture system and scanned models of the terrain

Control Task

Control: Generate a sequence of control inputs (i.e., commands to the robot’s joints) that move the robot over the terrain

• 18-dimensional state space (3-D position, 3-D orientation, 12-D joint angles)

Control Task

• How do we apply dynamic programming to large, continuous state spaces?

• Simple method: discretize the state space

[Figure: grid discretization of a 2-D state space with axes x and y]

“Curse of Dimensionality”: the number of states grows exponentially in the number of dimensions
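To make the exponential growth concrete, here is a minimal sketch that counts the cells of a uniform grid. The function name `grid_states` and the choice of 10 bins per dimension are illustrative assumptions, not values from the talk.

```python
def grid_states(bins_per_dim: int, num_dims: int) -> int:
    """Number of cells in a uniform grid over a num_dims-dimensional space."""
    return bins_per_dim ** num_dims


# Hypothetical resolution: 10 bins per dimension (an assumption for illustration).
for dims in (2, 8, 18):
    print(f"{dims:2d} dimensions -> {grid_states(10, dims):,} states")
```

Even at this coarse resolution, the 18-dimensional robot state would need far more cells than can be stored or swept over, which is why the problem is decomposed before dynamic programming is applied.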

Control Task

Control: Generate a sequence of control inputs (i.e., commands to the robot’s joints) that move the robot over the terrain

– Footstep Planning: Plan a sequence of footsteps across the terrain

– Low-Level Control: Move joints to achieve these footsteps

Footstep Planning via Value Iteration

The Footstep Planning Problem

• Given an initial position, a goal position, and a model of the terrain, plan footsteps that move the robot to the goal

[Figure: terrain with the initial position and the goal marked]

Outline of approach: frame the footstep planning problem as a Markov Decision Process, and use Value Iteration to plan footsteps

MDP Review

• Markov Decision Process (MDP): $M = (S, A, T, \gamma, D, R)$

– $S$: set of states

– $A$: set of actions

– $T$: system dynamics

– $\gamma$: discount factor

– $D$: initial state distribution

– $R$: reward function

State Space

• For footstep planning, the state is the X-Y location of the feet on the terrain

$\text{State} \in \mathbb{R}^8$ = (front-left-x, front-left-y, front-right-x, front-right-y, back-left-x, back-left-y, back-right-x, back-right-y)

State Space

• Discretize terrain (e.g., 3 cm grid squares)

• For 60 cm x 60 cm terrain: $|S| = 20^8 \approx 2.56 \times 10^{10}$
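The state count follows directly from the grid resolution. The arithmetic can be checked with a short sketch; the constant and variable names are illustrative, while the 60 cm terrain and 3 cm cells come from the slide.

```python
TERRAIN_SIDE_CM = 60   # terrain is 60 cm x 60 cm (from the slide)
CELL_SIDE_CM = 3       # 3 cm grid squares (from the slide)
NUM_FEET = 4

cells_per_side = TERRAIN_SIDE_CM // CELL_SIDE_CM   # grid cells along one axis
cells_per_foot = cells_per_side ** 2               # possible (x, y) cells for one foot
num_states = cells_per_foot ** NUM_FEET            # joint placement of all four feet

print(cells_per_side, cells_per_foot, f"{num_states:,}")
```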

State Space

• But not all footstep combinations are possible

How do we find the “legal” foot positions?

Robot Kinematics

• Problem: the “natural” robot foot state is joint positions, but we want Cartesian coordinates

• Forward Kinematics: convert from joint angles to 3-D coordinates of the foot

• Inverse Kinematics: convert from 3-D coordinates of the foot to joint angles (or indicate that the foot location is infeasible)
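As a rough illustration of the two conversions, the sketch below uses a planar two-link leg. This is a deliberate simplification and an assumption on my part: the talk refers to 3-D foot coordinates, and the link lengths and function names here are hypothetical.

```python
import math
from typing import Optional, Tuple


def forward_kinematics(t1: float, t2: float, l1: float, l2: float) -> Tuple[float, float]:
    """Joint angles (radians) -> foot position for a planar two-link leg."""
    x = l1 * math.cos(t1) + l2 * math.cos(t1 + t2)
    y = l1 * math.sin(t1) + l2 * math.sin(t1 + t2)
    return x, y


def inverse_kinematics(x: float, y: float, l1: float, l2: float) -> Optional[Tuple[float, float]]:
    """Foot position -> joint angles, or None if the position is out of reach."""
    cos_t2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0:
        return None  # infeasible: too far from, or too close to, the hip
    t2 = math.acos(cos_t2)  # picks one of the two possible knee configurations
    t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2), l1 + l2 * math.cos(t2))
    return t1, t2


# Hypothetical link lengths in metres.
angles = inverse_kinematics(0.10, -0.05, 0.08, 0.08)
print(angles, inverse_kinematics(1.0, 0.0, 0.08, 0.08))
```

Returning `None` for unreachable targets is what lets inverse kinematics double as a feasibility test for candidate foot locations.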

State Space

• To determine if footsteps are feasible:

– Pick a location for the body (e.g., center of the feet)

– Use inverse kinematics to see if all feet are feasible

With a few additional modifications, this reduces the state space to ~1 million states, suitable for Value Iteration
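The two-step feasibility check can be sketched as follows. The hip offsets, the reach limit, the fixed body heading along the x axis, and the use of a simple distance test in place of a real inverse-kinematics query are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical geometry in metres: hip positions relative to the body centre,
# ordered front-left, front-right, back-left, back-right.
HIP_OFFSETS: Sequence[Point] = ((0.10, 0.05), (0.10, -0.05), (-0.10, 0.05), (-0.10, -0.05))
MAX_REACH = 0.12  # hypothetical maximum horizontal hip-to-foot distance


def footsteps_feasible(feet: Sequence[Point]) -> bool:
    """Place the body at the centroid of the feet and check every leg's reach."""
    body_x = sum(x for x, _ in feet) / len(feet)
    body_y = sum(y for _, y in feet) / len(feet)
    for (foot_x, foot_y), (hip_dx, hip_dy) in zip(feet, HIP_OFFSETS):
        hip_x, hip_y = body_x + hip_dx, body_y + hip_dy
        # Stand-in for a real inverse-kinematics query on this leg.
        if math.hypot(foot_x - hip_x, foot_y - hip_y) > MAX_REACH:
            return False
    return True


nominal = [(0.10, 0.05), (0.10, -0.05), (-0.10, 0.05), (-0.10, -0.05)]
stretched = [(0.60, 0.05), (0.10, -0.05), (-0.10, 0.05), (-0.10, -0.05)]
print(footsteps_feasible(nominal), footsteps_feasible(stretched))
```

Filtering the discretized states through a check of this kind is one way the raw grid could be pruned to the configurations the robot can actually adopt.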

Action Space

• Move one foot at a time: Action = (foot, new-x, new-y)

• For 60 cm x 60 cm terrain: $|A| = 4(20^2) = 1600$
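A minimal sketch of enumerating this action set, assuming the same 3 cm grid as the state space and using grid indices for new-x and new-y. The foot names and variable names are illustrative.

```python
from itertools import product

FEET = ("front-left", "front-right", "back-left", "back-right")
CELLS_PER_SIDE = 20  # 60 cm terrain / 3 cm grid squares

# Each action is a (foot, new-x, new-y) triple, with x and y as grid indices.
actions = list(product(FEET, range(CELLS_PER_SIDE), range(CELLS_PER_SIDE)))

print(len(actions))  # 4 feet times 20 x 20 target cells
print(actions[0])
```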

System Dynamics

• If the initial and next states are both feasible, then the action succeeds; otherwise it fails

[Figures: an example of a valid action and an example of an invalid action]
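The deterministic transition rule can be sketched as a single function. The type aliases, the `step` name, and the `is_feasible` callback are hypothetical; the talk does not say what state results from a failed action, so leaving the feet unchanged is an assumption.

```python
from typing import Callable, Tuple

Cell = Tuple[int, int]
State = Tuple[Cell, ...]        # grid cell of each of the four feet
Action = Tuple[int, int, int]   # (foot index, new-x, new-y)


def step(state: State, action: Action, is_feasible: Callable[[State], bool]) -> Tuple[State, bool]:
    """Apply one footstep; return (next_state, succeeded)."""
    foot, new_x, new_y = action
    candidate = tuple((new_x, new_y) if i == foot else cell for i, cell in enumerate(state))
    if is_feasible(state) and is_feasible(candidate):
        return candidate, True
    # Assumption: a failed action leaves the feet where they were.
    return state, False


# Toy feasibility rule for the demo: feet may span at most 5 cells along x.
def toy_feasible(s: State) -> bool:
    xs = [cell[0] for cell in s]
    return max(xs) - min(xs) <= 5


start: State = ((5, 1), (5, 0), (1, 1), (1, 0))
print(step(start, (0, 6, 1), toy_feasible))
print(step(start, (0, 9, 1), toy_feasible))
```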

Discount Factor

$\gamma = 1$

• No discounting; this corresponds to a shortest path problem

• Value iteration converges given non-positive reward in all states and zero reward in goal states

Initial State Distribution

• The initial state distribution contains only the initial pose of the robot (no stochasticity)

[Figure: terrain with the initial position marked]

Reward Function

• Footsteps must trade off different features

– Slope of terrain, proximity to drop-offs, stability of robot’s pose, etc.

• (Negative) reward function specifies relative weights for these features

[Figure: terrain with the initial position and the goal marked]

• Example (cost for a single footstep):

[Figure: example cost for a single footstep]
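One way to read "relative weights for these features" is as a negative weighted sum. The sketch below assumes that linear form; the feature names, weights, and feature values are invented for illustration and are not taken from the talk.

```python
from typing import Dict

# Hypothetical feature weights; a larger weight penalises that feature more.
WEIGHTS: Dict[str, float] = {"slope": 2.0, "dropoff_proximity": 5.0, "pose_instability": 1.0}


def footstep_reward(features: Dict[str, float]) -> float:
    """Negative weighted sum of footstep features (higher cost means lower reward)."""
    return -sum(WEIGHTS[name] * value for name, value in features.items())


# Two made-up footsteps: one on flat ground, one next to a drop-off.
flat = {"slope": 0.1, "dropoff_proximity": 0.0, "pose_instability": 0.2}
edge = {"slope": 0.1, "dropoff_proximity": 0.8, "pose_instability": 0.2}
print(footstep_reward(flat), footstep_reward(edge))
```

Because every term is non-positive, a reward of this shape is compatible with the convergence condition stated for the undiscounted case.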

Value Iteration

• Fully defined MDP

• Run value iteration to plan footsteps

$V(s) \leftarrow R(s) + \gamma \max_a \sum_{s'} P(s' \mid s, a) V(s')$
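The update can be tried out on a toy problem. The sketch below is not the footstep planner: it assumes a five-cell corridor with deterministic moves, a reward of -1 in every non-goal cell, an absorbing goal with zero reward, and $\gamma = 1$.

```python
from typing import Dict, List

# Toy deterministic MDP (an assumption for illustration): a 1-D corridor of
# five cells, goal at the right end, actions "step left" and "step right".
STATES: List[int] = [0, 1, 2, 3, 4]
ACTIONS = (-1, +1)
GOAL = 4
GAMMA = 1.0  # no discounting, as in the talk


def reward(s: int) -> float:
    return 0.0 if s == GOAL else -1.0


def next_state(s: int, a: int) -> int:
    if s == GOAL:
        return s  # the goal is absorbing
    return min(max(s + a, 0), GOAL)


def value_iteration(tol: float = 1e-9) -> Dict[int, float]:
    V = {s: 0.0 for s in STATES}
    while True:
        # Deterministic dynamics: the sum over s' has a single term with P = 1.
        new_V = {s: reward(s) + GAMMA * max(V[next_state(s, a)] for a in ACTIONS)
                 for s in STATES}
        if max(abs(new_V[s] - V[s]) for s in STATES) < tol:
            return new_V
        V = new_V


V = value_iteration()
policy = {s: max(ACTIONS, key=lambda a: V[next_state(s, a)]) for s in STATES if s != GOAL}
print(V)
print(policy)
```

In this setting the converged value of a cell is minus its distance to the goal, which is the shortest-path reading of the undiscounted problem.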

Performance

System without planned footsteps

System after planning footsteps

Another Terrain

System without planned footsteps

System after planning footsteps

Extensions and Related Topics

Extensions

• Problem: Number of states grows too large with more terrain, finer resolution

• Solution: Plan a general path for the body, then plan footsteps along path

Extensions

• Problem: Reward function needs to trade off many features, hard to hand-specify

• Solution: Learn reward by demonstrating good footsteps (“Apprenticeship Learning”)

[Figure: demonstrated foot positions]

Low-Level Control

[Figure: initial setup of the robot, with the back-left, front-left, front-right, and back-right feet labeled, the direction of travel, and the desired footstep]

Low-Level Control

• Supporting triangle (formed by the three feet that stay on the ground): if the robot’s center of gravity (COG) is in this triangle, the robot will not fall

Low-Level Control

• First move COG into supporting triangle

• Then move foot
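The check that gates each step, whether the COG lies inside the supporting triangle, can be sketched with a standard point-in-triangle test on the ground plane. The function names and the foot coordinates are hypothetical, and the COG is treated as its 2-D ground projection.

```python
from typing import Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); its sign tells which side of o->a point b is on."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def cog_in_support_triangle(cog: Point, f1: Point, f2: Point, f3: Point) -> bool:
    """True if the ground projection of the COG lies inside (or on) the triangle."""
    d1, d2, d3 = _cross(f1, f2, cog), _cross(f2, f3, cog), _cross(f3, f1, cog)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)  # all on one side of every edge -> inside


# Hypothetical stance-foot positions in metres.
stance = ((0.0, 0.0), (0.2, 0.0), (0.0, 0.2))
print(cog_in_support_triangle((0.05, 0.05), *stance))
print(cog_in_support_triangle((0.20, 0.20), *stance))
```

A controller following the two bullets above would shift the body until this test passes for the three stance feet, and only then lift the fourth foot.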

Fast Movement on Flat Ground

• Switching gears: previously focused on slow motion over challenging terrain, now looking at fast motion on flat ground

• To achieve faster speed, want to move two feet at once (trot gait)

– Primary challenge is balance: when only two feet are on the ground, robot is always falling

Learning to Balance

• Want to move robot’s center of gravity to keep it as stable as possible

• But, very hard to hand-specify, a priori, a good location for the center of gravity

• Learning: find a good location for the center of gravity by adjusting it in response to robot performance
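The idea of adjusting the center of gravity in response to performance can be illustrated with a generic hill-climbing loop. This is not the learning method from the referenced papers; the score function is a simulated stand-in for a robot trial, and every name and number below is an assumption.

```python
import random
from typing import Callable, Tuple

Offset = Tuple[float, float]


def learn_cog_offset(score: Callable[[Offset], float], start: Offset = (0.0, 0.0),
                     step: float = 0.01, trials: int = 200, seed: int = 0) -> Offset:
    """Keep random perturbations of the COG offset only when the score improves."""
    rng = random.Random(seed)
    best, best_score = start, score(start)
    for _ in range(trials):
        candidate = (best[0] + rng.gauss(0.0, step), best[1] + rng.gauss(0.0, step))
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


# Stand-in for a robot trial: the score peaks at an offset unknown to the learner.
def simulated_score(offset: Offset) -> float:
    return -((offset[0] - 0.03) ** 2 + (offset[1] + 0.01) ** 2)


learned = learn_cog_offset(simulated_score)
print(learned, simulated_score(learned))
```

On a real robot each call to the score function would be a physical trial, so the number of trials would matter far more than it does in this simulation.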

References

• Kolter, Rodgers and Ng, A Control Architecture for Quadruped Locomotion over Rough Terrain, ICRA 2008

• Kolter, Abbeel and Ng, Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion, NIPS 2008

• Kolter and Ng, Learning Omnidirectional Path Following Using Dimensionality Reduction, RSS 2007

Thank you

Papers and videos available at:

http://cs.stanford.edu/groups/littledog