Interactive Control of Avatars Animated with Human Motion Data
By: Jehee Lee, Jinxiang Chai, Paul S. A. Reitsma, Jessica K. Hodgins, Nancy S. Pollard
Presented by: Nathan Hoobler
Why do we use motion capture?
Get realistic behavior “for free”
An easy interface for generating control for high DOF models
Can capture behavior far too complicated to model by hand
Kung Fu, acrobatics, other stylized motion
What is the problem with motion capture?
Motion capture data is inherently complicated
Usually far more degrees of freedom than can be easily controlled by hand
Not trivial to synthesize new behaviors
Transitions between different types of motion are hard
Often there are redundant behaviors
What does this paper do?
Identify distinct behaviors in the motion capture data
Allow intuitive control of high DOF data with a small DOF interface
Allow seamless transitions between different behaviors
System Overview
Loosely-patterned data comes in
A probabilistic transition matrix is built
Simplified transition graph is used to determine motion
System Overview
Various datasets come in
What kind of data can we use?
Long, consistent motion recordings are required for good transition generation
Does not handle sensor noise well
System Overview
Various datasets come in
Low-Level transitions are generated
Low-Level Representation
At this level, the system is very similar to the Video Textures technique
For each frame, find any other frames in the dataset that are similar
Calculate the probability of a transition from frame j to frame k based on how closely the two frames match
Low-Level: Building the Matrix
The probability of transitioning from frame i to frame j is computed as
Where D(i, j) is the weighted “distance” from frame i to frame j
And d(pi, pj) is
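The formulas themselves are not reproduced in this transcript. As a rough sketch only, assuming an exponential mapping from distance to probability in the style of Video Textures, a distance matrix could be turned into transition probabilities as below. The function name and the `sigma` parameter are hypothetical, and this should not be read as the paper's exact formula.

```python
import numpy as np

def transition_probabilities(D, sigma=1.0):
    """Map a frame-distance matrix D to row-normalized transition probabilities.

    Assumption: probability falls off exponentially with distance, and
    sigma controls how sharply (smaller sigma favors only the best matches).
    """
    D = np.asarray(D, dtype=float)
    P = np.exp(-D / sigma)                    # small distance -> large weight
    return P / P.sum(axis=1, keepdims=True)   # each row sums to 1

# Toy distances between three frames (made-up numbers).
D = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 1.0],
              [4.0, 1.0, 0.0]])
P = transition_probabilities(D, sigma=1.0)
```

Each row of `P` then describes where a given frame is likely to jump, with closer frames receiving more weight.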
So, how efficient is this?
Since the matrix is just a 2D mapping from any one frame to any other, the number of transitions is O(n^2)...
... for 4000-12000 frames per dataset (!)
We need to reduce the number of transitions
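To put numbers on the quadratic growth, the short calculation below counts the entries of a dense frame-to-frame matrix for the dataset sizes quoted above. The function name is made up for illustration.

```python
def dense_transition_count(n_frames):
    """Number of entries in a dense n x n transition matrix."""
    return n_frames * n_frames

for n in (4000, 12000):
    print(n, "frames ->", dense_transition_count(n), "candidate transitions")
```

That is 16 million candidate transitions at 4000 frames and 144 million at 12000 frames, which is why pruning matters.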
Low-Level: Pruning
We can take advantage of a few useful features of the motion capture data
Contact with the world should be similar between transitioning frames
Any interesting data is going to have mostly low-probability transitions
There are many frames that are very similar to others
We want to avoid going down dead-end routes
Low-Level: Pruning (Contact)
Criterion 1: Contact
Even if frames are very similar, do not transition if the contact states are different
(Strict interpretation) Only allow transitions during contact states
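A minimal sketch of the contact criterion, assuming each frame carries a tuple of boolean contact flags (for example left foot down, right foot down). The data layout, the function name, and the `strict` flag are assumptions made for illustration.

```python
import numpy as np

def prune_by_contact(P, contact, strict=False):
    """Zero out transitions between frames whose contact states differ.

    contact[i] is a tuple of booleans for frame i. With strict=True,
    transitions are also dropped when the frame has no contact at all.
    """
    P = np.array(P, dtype=float)   # work on a copy
    n = len(contact)
    for i in range(n):
        for j in range(n):
            if contact[i] != contact[j]:
                P[i, j] = 0.0
            elif strict and not any(contact[i]):
                P[i, j] = 0.0
    return P

contact = [(True, False), (True, False), (False, False)]
loose = prune_by_contact(np.ones((3, 3)), contact)
tight = prune_by_contact(np.ones((3, 3)), contact, strict=True)
```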
Low-Level: Pruning (Likelihood)
Criterion 2: Likelihood
Throw away transitions whose probability is less than some threshold value
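The likelihood criterion amounts to a threshold on the matrix. A sketch, with a made-up threshold value:

```python
import numpy as np

def prune_by_likelihood(P, threshold):
    """Set every transition probability below `threshold` to zero."""
    P = np.array(P, dtype=float)   # work on a copy
    P[P < threshold] = 0.0
    return P

P = np.array([[0.70, 0.25, 0.05],
              [0.10, 0.80, 0.10]])
pruned = prune_by_likelihood(P, threshold=0.2)
```

The surviving rows no longer sum to one, so they would need renormalizing before being sampled from.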
Low-Level: Pruning (Similarity)
Criterion 3: Similarity
If a frame has many transitions to states that are all very similar to each other as well, throw away all but the best fitting transition
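One way to sketch the similarity criterion is to treat runs of consecutive target frames as "similar to each other" (neighboring frames in a recording usually are) and keep only the most probable target in each run. Equating similarity with adjacency in time is an assumption made here for illustration, not something the slides state.

```python
def keep_best_per_run(row):
    """Within each run of consecutive nonzero targets, keep only the best one.

    row[j] is the probability of transitioning to frame j.
    """
    out = [0.0] * len(row)
    j = 0
    while j < len(row):
        if row[j] <= 0.0:
            j += 1
            continue
        start = j
        while j < len(row) and row[j] > 0.0:
            j += 1                                   # extend the run
        best = max(range(start, j), key=lambda k: row[k])
        out[best] = row[best]                        # keep the local best
    return out

row = [0.0, 0.2, 0.5, 0.3, 0.0, 0.1, 0.4]
thinned = keep_best_per_run(row)
```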
Low-Level: Pruning (SCC)
Criterion 4: Connectedness
In theory, we want to avoid transitions that don’t lead to well-connected nodes
Only add transitions that remain within the largest Strongly Connected Component of the graph
“A maximal subgraph of a directed graph such that for every pair of vertices u, v in the subgraph, there is a directed path from u to v and a directed path from v to u.” (Mathworld)
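The connectedness criterion can be sketched with a standard two-pass (Kosaraju-style) search for strongly connected components. The adjacency-dictionary layout and function names are assumptions made for illustration; the slides do not say which SCC algorithm is used.

```python
def largest_scc(adj):
    """adj: dict node -> list of successors. Returns the largest SCC as a set."""
    # Pass 1: record nodes in order of DFS completion (iterative DFS).
    order, seen = [], set()
    for root in adj:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(adj[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(adj.get(nxt, ()))))
                    break
            else:
                order.append(node)   # all successors explored
                stack.pop()
    # Pass 2: explore the reversed graph in reverse completion order.
    rev = {}
    for u, vs in adj.items():
        for v in vs:
            rev.setdefault(v, []).append(u)
    best, assigned = set(), set()
    for root in reversed(order):
        if root in assigned:
            continue
        comp, todo = {root}, [root]
        assigned.add(root)
        while todo:
            u = todo.pop()
            for v in rev.get(u, ()):
                if v not in assigned:
                    assigned.add(v)
                    comp.add(v)
                    todo.append(v)
        if len(comp) > len(best):
            best = comp
    return best

def prune_to_scc(adj):
    """Drop every node and transition outside the largest SCC."""
    keep = largest_scc(adj)
    return {u: [v for v in vs if v in keep] for u, vs in adj.items() if u in keep}

# Frames 0-2 form a cycle; frames 3 and 4 are a dead end.
graph = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: []}
pruned = prune_to_scc(graph)
```

In the toy graph, the transition from frame 2 to frame 3 is removed because nothing leads back from frame 3.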
Low-Level: Blending
Need interpolation to avoid discontinuities
Problem: sharp changes are allowed at contact points
Solution: use a non-linear blend function centered on the contact point and a moving average
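As an illustration of a non-linear blend, the sketch below cross-fades two short clips with a smoothstep weight, which changes slowly near the ends of the window and fastest in the middle. The choice of smoothstep, the function names, and the use of scalar joint values are all assumptions; real joint rotations would need proper rotation interpolation, and the moving average mentioned above is not shown.

```python
def ease_weight(t):
    """Smoothstep: 0 at t=0, 1 at t=1, with zero slope at both ends."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)

def blend_window(from_motion, to_motion):
    """Cross-fade two equal-length lists of scalar values over the window."""
    n = len(from_motion)
    result = []
    for k in range(n):
        w = ease_weight(k / (n - 1)) if n > 1 else 1.0
        result.append((1.0 - w) * from_motion[k] + w * to_motion[k])
    return result

blended = blend_window([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
```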
Low-Level: Blending
Case 1: Follow the incoming frame
Case 2: Follow the outgoing frame
Case 3: Choose the side closest to the contact point
Case 4: Just let the foot slide; it’ll look bad no matter what
Low-Level: Coordinate System
Fixed/Global versus Relative
Each has an advantage, depending on the situation
The paper uses both, depending on the example
Fixed/Global Coordinates
Advantages
Good for spatial data (the recording environment corresponds strongly with the simulated environment)
Disadvantages
Not good for synthesizing motion in new environments
Relative Coordinates
Advantages
Much easier to synthesize motions from anywhere in the environment into new behaviors
Disadvantages
Ignores orientation and position in three-space, which may be important for some actions
High-Level Representation
Low-level representation is far too complicated to interact with
Simplify the data by grouping like frames into clusters
For each frame, find the possible clusters that can be transitioned to in the near term
High-Level Representation
Various datasets come in
Low-Level transitions are generated
Frames are grouped into clusters
Building Clusters
We want a simplified data set
Weight important joints (arms, legs, pelvis, etc.) high
Weight less important joints (neck, etc.) low
Using weighted values, find similar frames and group them into clusters
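The slides do not name the clustering algorithm, so the sketch below substitutes a simple greedy "leader" clustering to show how per-joint weights change which frames count as similar. The function names, the `radius` parameter, and the toy two-value frames are assumptions made for illustration.

```python
import numpy as np

def weighted_distance(a, b, weights):
    """Euclidean distance with a per-joint weight on each squared difference."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum(weights * diff * diff)))

def cluster_frames(frames, weights, radius):
    """Greedy leader clustering (a stand-in, not the paper's method).

    A frame joins the first cluster whose leader is within `radius`;
    otherwise it becomes the leader of a new cluster.
    """
    weights = np.asarray(weights, dtype=float)
    leaders, labels = [], []
    for f in frames:
        for c, leader in enumerate(leaders):
            if weighted_distance(f, leader, weights) <= radius:
                labels.append(c)
                break
        else:
            leaders.append(f)
            labels.append(len(leaders) - 1)
    return labels

# Each toy frame is [leg value, neck value]; the neck gets a tiny weight.
frames = [[0.0, 0.0], [0.1, 5.0], [3.0, 0.0], [3.1, 9.0]]
labels = cluster_frames(frames, weights=[1.0, 0.0001], radius=0.5)
```

With the neck weighted low, frames that differ mostly in neck value still land in the same cluster.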
High-Level Representation
Various datasets come in
Low-Level transitions are generated
Frames are grouped into clusters
A transition tree is built for each frame
Building the Cluster Forest
Each frame has a tree of clusters representing its valid transitions
Find the most probable transition from the current frame to another cluster
If the number of frames required to reach that cluster is within a time threshold, add it to the forest
Repeat
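A simplified sketch of the time-threshold idea: starting from one frame, a breadth-first search over the pruned transitions collects every other cluster reachable within a fixed number of frames. It ignores transition probabilities and yields only one level of the tree, so it illustrates the threshold test rather than the full construction. All names and the data layout are assumptions.

```python
from collections import deque

def reachable_clusters(start, next_frames, cluster_of, max_frames):
    """Return {cluster: fewest frames needed} for clusters other than the
    start frame's own that can be reached within max_frames transitions."""
    found = {}
    dist = {start: 0}
    queue = deque([start])
    while queue:
        f = queue.popleft()
        if dist[f] == max_frames:
            continue                      # time threshold reached
        for g in next_frames.get(f, ()):
            if g in dist:
                continue
            dist[g] = dist[f] + 1
            c = cluster_of[g]
            if c != cluster_of[start] and c not in found:
                found[c] = dist[g]
            queue.append(g)
    return found

next_frames = {0: [1], 1: [2, 4], 2: [3], 3: [], 4: [5], 5: []}
cluster_of = {0: "A", 1: "A", 2: "B", 3: "B", 4: "A", 5: "C"}
near = reachable_clusters(0, next_frames, cluster_of, max_frames=2)
far = reachable_clusters(0, next_frames, cluster_of, max_frames=3)
```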
Caveats about Clustering
Clustering is not always extremely useful
Mostly a user interface issue
Useful for directly selecting the next motion (Direct Choice)
Not as useful for procedurally determining behavior (Path Sketching, Mimic)
Control Methods
Several interface methods were used, depending on how well they suited the example
Direct Choice
Sketching
Video-Capture
Direct Choice
Display valid states for the avatar, and let the user choose
Path Sketching
Allow the user to specify a path to follow
Find motions that will put the avatar in the right place
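One simple reading of "put the avatar in the right place" is to pick, among the motions currently available, the one whose end position lands closest to the next point on the sketched path. The sketch below assumes that reading; the candidate names and 2D ground-plane positions are made up.

```python
import math

def pick_motion(candidates, target):
    """candidates: dict motion name -> (x, z) end position on the ground plane.

    Returns the name of the motion ending closest to `target`.
    """
    return min(candidates, key=lambda name: math.dist(candidates[name], target))

candidates = {"walk_forward": (1.0, 0.0), "turn_left": (0.0, 1.0)}
choice = pick_motion(candidates, target=(0.9, 0.1))
```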
Video Mimic
Determine limb and body orientation from video input
Find closest matching frame(s), and imitate the user
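Finding the closest matching frames can be sketched as a nearest-neighbor lookup, assuming the video input and every database frame have already been reduced to comparable feature vectors. The feature extraction itself is not shown, and the names are hypothetical.

```python
import numpy as np

def closest_frames(query, frame_features, k=1):
    """Return the indices of the k database frames nearest to `query`."""
    frame_features = np.asarray(frame_features, dtype=float)
    d = np.linalg.norm(frame_features - np.asarray(query, dtype=float), axis=1)
    return [int(i) for i in np.argsort(d)[:k]]

features = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
matches = closest_frames([0.9, 1.2], features, k=2)
```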
Results
Terrain: Path Sketching
Step Stool: Path Sketching, Direct Choice
Playground: Direct Choice
Any Questions?