Near-optimal Character Animation with Continuous Control
Adrien Treuille, Yongjoon Lee, Zoran Popović
2008.10.14 HA SE HOON
Prerequisite
Introduction
Related Works
Motion Model
Control
Policies
Results
Conclusion
Outline
Motion Graph
◦ A directed graph used to synthesize a new motion
◦ Node = Pose
◦ Edge = Motion Clip
We already discussed this in the 4th presentation
◦ “Group Motion Graph”, Yu-Chi Lai, Stephen Chenney, ShaoHua Fan
◦ Presented by Heo Jae Pil
We will use a similar but different method!
Prerequisite #1: Motion Graph
[Figure: example motion graph with labels S, A, B, F, N; Jumping, Walking, Running; Short Motion. Drawn by Heo Jae Pil]
A sub-area of machine learning
How should an agent take actions?
◦ Goal: maximize reward
Model of reinforcement learning
◦ A set of environment states S
◦ A set of actions A
◦ A set of scalar rewards in ℝ
We will use reinforcement learning to find a near-optimal action
Prerequisite #2: Reinforcement Learning
Finding a “near-optimal” character animation
◦ With a real-time continuous controller
◦ Tasks
   Navigation (Walking)
   Spinning Navigation
   Fixed Obstacle Avoidance (FOA)
   Moving Obstacle Avoidance (MOA)
Introduction: The goal of the paper
Real-time continuous controller (the arrows under the character)
You should know…
◦ How to represent motions, states, and policies
◦ How to blend motion clips
◦ How to define the cost function
◦ How to find the near-optimal policy
Introduction
Motion Graph
◦ KOVAR, L., GLEICHER, M., AND PIGHIN, F. 2002
Motion synthesis from annotations
◦ ARIKAN, O., FORSYTH, D. A., AND O’BRIEN, J. F. 2003
Precomputing avatar behavior from human motion data
◦ LEE, J., AND LEE, K. H. 2004
Related Works
A set of motion clips C
Each clip C = (p1, p2, …, pm) is a sequence of m poses
Each pose p ∈ ℝ^n
◦ A vector specifying all joint positions
Motion Model: Definition of terms
Divided into two subsequences
◦ Cin and Cout
◦ Each subsequence covers one foot-plant frame
◦ This frame becomes a ‘constraint’ frame
Motion Model: Motion Clips
Assumption: a clip C represents one walk cycle
[Figure: a clip split into Cin and Cout. Constraint frames: (b) for Cin, (d) for Cout]
Allow any transition between two clips
◦ Unlike a “motion graph”!
Algorithm (blending from clip C to clip C’)
◦ Step 1: C is mirrored if necessary
◦ Step 2: Overlap the constraint frames of Cout and C’in
◦ Step 3: Rotate C’ to match its foot (C’ is reoriented so that its ground-contact foot coincides with that of C)
◦ Step 4: Blend Cout into C’in
◦ (With the ground-contact foot as the root of the kinematic skeleton)
Motion Model: Blending motion clips
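Step 4 can be illustrated with a minimal sketch. It assumes both subsequences are already aligned (steps 1 to 3 done), have the same number of frames, and are blended with a simple linear cross-fade per pose component. The helper name `blend_clips` and the linear weight ramp are assumptions for illustration, not the blending scheme from the paper.

```python
def blend_clips(c_out, c_in_next):
    """Cross-fade two equally long pose sequences (hypothetical helper).

    c_out     -- poses of the outgoing subsequence Cout
    c_in_next -- poses of the incoming subsequence C'in
    Each pose is a list of floats. The blend weight ramps linearly
    from 0 (pure Cout) on the first frame to 1 (pure C'in) on the last.
    """
    if len(c_out) != len(c_in_next):
        raise ValueError("subsequences must have the same number of frames")
    n = len(c_out)
    blended = []
    for i, (p, q) in enumerate(zip(c_out, c_in_next)):
        w = i / (n - 1) if n > 1 else 1.0
        blended.append([(1.0 - w) * a + w * b for a, b in zip(p, q)])
    return blended
```

For example, blending three frames of a one-dimensional pose 0.0 into 2.0 passes through 1.0 on the middle frame.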
Motion Model: Example of motion blending
Constraint frames (foot-planted frames)
Prevent foot-skating!!! Why?
Control: Goal tasks
Navigation
◦ User controls gait, path, and torso orientation (examples of gait: walking, running, …)
Spinning Navigation
◦ User controls the motion direction as the character spins
Fixed Obstacle Avoidance
◦ The character follows a line, avoiding a fixed planar object
Moving Obstacle Avoidance
◦ The character follows a line, avoiding a planar object that moves with linear motion
State X
◦ C: current clip
◦ x, z, θ: position and orientation
◦ u, v, u’, v’: relative position and speed of the obstacle
◦ G, T, W: desired gait, torso orientation, and spin
Control: Definition of the state
G, T, W: the intention of the user
Transition function f: S × C → S
◦ X = (C, …) → X’ = (C’, …)
Control: Transition function
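A minimal sketch of the state and the transition function, under simplifying assumptions: only the clip, position, orientation, and desired gait are kept (the obstacle terms u, v, u’, v’ and the controls T, W are left out), and each clip is reduced to a forward distance and a turn angle. The `CLIPS` table and its values are hypothetical.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    clip: str     # C: current clip
    x: float      # position
    z: float
    theta: float  # orientation in radians
    gait: str     # G: desired gait


# Hypothetical clip table: name -> (forward distance, turn angle, gait).
CLIPS = {
    "walk_straight": (1.0, 0.0, "walk"),
    "walk_left": (1.0, math.pi / 2, "walk"),
    "run_straight": (2.0, 0.0, "run"),
}


def f(state, clip):
    """Toy transition f(X, C) -> X': turn, then advance along the new heading."""
    dist, turn, _ = CLIPS[clip]
    theta = state.theta + turn
    return replace(
        state,
        clip=clip,
        x=state.x + dist * math.cos(theta),
        z=state.z + dist * math.sin(theta),
        theta=theta,
    )
```

Keeping `State` immutable means f always returns a new state X’ and never modifies X, which matches the functional form f: S × C → S.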
State cost Cs: S → ℝ
◦ If the current clip is not in the desired gait, then Cs(X) = ∞
◦ (If X = (C, …) but C ∉ G, then Cs(X) = ∞)
Transition cost Ct: S × S → ℝ
Control: Costs
Weights of each term.
Policy Π: S → C
◦ Decides the next clip C from a given state X
◦ Then we can move to the next state X’ with the transition function: f(X, C) → X’ (X/X’ = current/next state)
Policies: Definition
Definition of state and transition function
Greedy policy Πgreedy
◦ Find the clip that minimizes the cost!
◦ Looks only one state ahead
But we should consider the following case:
Policies: Greedy Policy
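A minimal sketch of a greedy policy, assuming the cost of a step can be written as `cost(X, X')` and the transition as `f(X, C)`. Both callables and the one-dimensional toy problem below are hypothetical stand-ins for the costs and state defined earlier.

```python
def greedy_policy(state, clips, f, cost):
    """Return the clip whose immediate cost cost(X, f(X, C)) is lowest."""
    return min(clips, key=lambda c: cost(state, f(state, c)))


# Toy problem: the state is a position on a line, a clip is a step size,
# and the cost is the remaining distance to a goal at position 10.
def step(x, size):
    return x + size


def distance_to_goal(x, x_next):
    return abs(10 - x_next)


best = greedy_policy(7, [1, 2, 5], step, distance_to_goal)
```

Here the candidate next positions are 8, 9, and 12, so the greedy choice is the step of size 2. Nothing beyond the next state is considered.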
We will minimize the entire cost
◦ Taking the future into account
Redefine the cost function
◦ For a given policy Π that produces the states (X1, X2, …, Xn)
◦ α ∈ [0, 1): future uncertainty factor
Policies: Optimal Policy
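Assuming the redefined cost is the usual discounted sum over the state sequence, i.e. the step cost between Xt and Xt+1 weighted by α to the power t, it could be computed as in the sketch below. The function name and the `cost(X, X')` signature are assumptions for illustration.

```python
def discounted_cost(states, cost, alpha):
    """Sum alpha**t * cost(X_t, X_{t+1}) over consecutive states."""
    total = 0.0
    for t in range(len(states) - 1):
        total += (alpha ** t) * cost(states[t], states[t + 1])
    return total


# Four states, unit cost per step, alpha = 0.5: 1 + 0.5 + 0.25.
example = discounted_cost([0, 1, 2, 3], lambda a, b: 1.0, 0.5)
```

Because α < 1, steps further in the future contribute less, which is how the uncertainty about the future enters the cost.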
Value function V: S → ℝ
◦ Long-term cost under the optimal policy Π*
◦ V(X) = CΠ*(X)
Def: Optimal policy is
Thus, if we can calculate the value function V(X), we can find the long-term optimal next state!
Policies: Value function
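A minimal sketch of choosing the next clip from a value function, assuming the standard form in which each candidate is scored by its immediate cost plus α times the value of the resulting state. The names and the two-clip toy problem are hypothetical; the point is that the choice can differ from the greedy one.

```python
def value_policy(state, clips, f, cost, value, alpha):
    """Pick the clip minimizing cost(X, X') + alpha * V(X'), with X' = f(X, C)."""
    def long_term(c):
        nxt = f(state, c)
        return cost(state, nxt) + alpha * value(nxt)
    return min(clips, key=long_term)


# Toy problem: clip "a" is cheap now but leads to an expensive state,
# clip "b" costs more now but leads to a state with zero future cost.
NEXT = {"a": "A", "b": "B"}
STEP_COST = {"A": 1.0, "B": 2.0}
VALUE = {"A": 10.0, "B": 0.0}

choice = value_policy(
    "start", ["a", "b"],
    lambda x, c: NEXT[c],
    lambda x, nxt: STEP_COST[nxt],
    lambda x: VALUE[x],
    0.9,
)
```

A greedy policy would pick "a" here (cost 1.0 versus 2.0), while the value-based choice is "b" (2.0 + 0.9 × 0 versus 1.0 + 0.9 × 10).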
Now the issue is the approximation of V(X)
Linear combination of basis functions
◦ Basis functions Φ = (Φ1, Φ2, …, Φn)
◦ Each basis function Φi: S → ℝ
◦ Polynomials or Gaussians
◦ Then the approximation of V is V(X) ≈ r1Φ1(X) + r2Φ2(X) + … + rnΦn(X)
◦ Basis functions are pre-selected by a human!
So, how can we calculate the weights r1, r2, …, rn?
Policies: Near optimal policy
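A minimal sketch of a value function written as a weighted sum of basis functions. For illustration the state is a single number, and the basis mixes two polynomial terms with one Gaussian; the helper names, centers, widths, and weights are all hypothetical.

```python
import math


def gaussian_basis(center, width):
    """Return a Gaussian bump centered at `center` (hypothetical helper)."""
    return lambda x: math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def approx_value(x, basis, r):
    """Approximate V(x) as the sum of r_i * phi_i(x)."""
    return sum(r_i * phi(x) for r_i, phi in zip(r, basis))


basis = [lambda x: 1.0, lambda x: x, gaussian_basis(0.0, 1.0)]
r = [1.0, 2.0, 3.0]
v0 = approx_value(0.0, basis, r)
```

At x = 0 the three basis functions evaluate to 1, 0, and 1, so the approximation is 1.0 + 0.0 + 3.0. Only the weights r change during learning; the basis functions stay fixed.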
Draw a set of sample clips
Initialize the state-transition pairs and r
◦ Def: r = a vector of weights (r1, r2, …, rn)
Step 1
◦ X’ is the optimal next state from X under the current V
Step 2
◦ Recalculate r by solving the linear program
Policies: Near optimal policy (Algorithm)
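The alternation of the two steps can be sketched on a tiny problem. To stay short, this sketch makes a strong simplification: it uses one indicator basis function per sampled state, so the weight r[i] is simply the value estimate of the i-th state, and step 2 becomes a direct update and not a linear program. The function name and the toy problem are hypothetical.

```python
def fit_value(states, clips, f, cost, alpha, iterations=50):
    """Alternate step 1 and a simplified step 2 on a finite set of states."""
    index = {s: i for i, s in enumerate(states)}
    r = [0.0] * len(states)
    for _ in range(iterations):
        new_r = []
        for s in states:
            # Step 1: score every next state X' = f(X, C) under the current V.
            candidates = []
            for c in clips:
                nxt = f(s, c)
                candidates.append(cost(s, nxt) + alpha * r[index[nxt]])
            # Step 2 (simplified): refit the weight belonging to this state.
            new_r.append(min(candidates))
        r = new_r
    return r


# Toy problem: positions 0..4, clips step 1 or 2 towards the goal at 0,
# and every step taken away from the goal state costs 1.
weights = fit_value(
    [0, 1, 2, 3, 4], [-1, -2],
    lambda s, c: max(0, s + c),
    lambda s, nxt: 0.0 if s == 0 else 1.0,
    0.5,
)
```

With α = 0.5 the weights settle at 0, 1, 1, 1.5, 1.5: positions 1 and 2 reach the goal in one step, positions 3 and 4 in two.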
Policies: Near optimal policy
Switchability
◦ Switching between tabulated value functions
◦ When transitioning from walking to running, the algorithm still picks near-optimal clips
Separability
◦ Learn a separate value function for each task
◦ Ex) MOA and FOA
Policies: Dimensionality
Let’s see the video
Results
Presents a new control model
◦ High dimensional
◦ Continuous
◦ Real time
◦ Near optimal
But…
◦ Needs a large database
Conclusion
Any questions?