Umbra Ignite 2015: Rulon Raymond – The State of Skinning – a dive into modern approaches to...

The State of Skinning

… Or How To Maintain Your Physique

Welcome! Tervetuloa!

Rulon Raymond
Sr. Engine Programmer

Introduction

1) Review
2) Evolution of techniques on console HW
3) The new hotness (hint: it’s a Clifford Algebra)
4) Extensions

DISCLAIMER: All screenshots and techniques presented are not associated with any specific title, project, or organization, unless otherwise stated.

Outline

What is Skinning?

I Was Skinning Long Before 3D Animated Models Were All The Rage

Step 1: Generate a cool animated pose.
Step 2: ???
Step 3: Use fancy lighting and shaders to draw an animated model on-screen (i.e. profit)

What is Skinning?

Step 2: Skinning!

What is Skinning?

Skinned Model, ready for drawing

Model Vertices

Bone Weights

Bone Transform

What is Skinning?

$v' = \sum_{i} w_i M_{j_i} v$

• $v$: The initial vertex transform
• $w_i$: Array of bone weighting values
• $M_{j_i}$: Array of bone transforms
• $v'$: The final vertex transform
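As a rough illustration of the weighted sum above, here is a minimal scalar C sketch. The `Vec3` and `Mat3x4` types, the function names, and the per-vertex `boneIndex` array are assumptions made for this example and are not taken from the talk.

```c
#include <stddef.h>

/* Hypothetical types for illustration; not taken from the talk. */
typedef struct { float x, y, z; } Vec3;
typedef struct { float m[3][4]; } Mat3x4;   /* rows of [rotation | translation] */

/* Transform a point by a 3x4 affine bone matrix. */
static Vec3 TransformPoint( const Mat3x4 *M, Vec3 v )
{
    Vec3 r;
    r.x = M->m[0][0]*v.x + M->m[0][1]*v.y + M->m[0][2]*v.z + M->m[0][3];
    r.y = M->m[1][0]*v.x + M->m[1][1]*v.y + M->m[1][2]*v.z + M->m[1][3];
    r.z = M->m[2][0]*v.x + M->m[2][1]*v.y + M->m[2][2]*v.z + M->m[2][3];
    return r;
}

/* v' = sum over i of w[i] * ( bones[boneIndex[i]] * v ).
   The weights are assumed to total 1.0. */
static Vec3 SkinVertexLBS( Vec3 v, const Mat3x4 *bones, const int *boneIndex,
                           const float *w, size_t numInfluences )
{
    Vec3 out = { 0.0f, 0.0f, 0.0f };
    for ( size_t i = 0; i < numInfluences; ++i )
    {
        Vec3 p = TransformPoint( &bones[boneIndex[i]], v );
        out.x += w[i]*p.x;
        out.y += w[i]*p.y;
        out.z += w[i]*p.z;
    }
    return out;
}
```

A vertex weighted 50/50 between an identity bone and a bone translated by 2 units in X ends up moved by 1 unit in X.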

Skinning on Consoles

• Sony PlayStation (1995): Geometry Transform Engine (GTE)
• Sony PlayStation 2 (2000): Vector Unit 0 (VU0)
• Microsoft Xbox (2001): NVIDIA GPU (DirectX 8.x)
• Microsoft Xbox 360 (2005): PowerPC CPU

Skinning Implementation

• Sony PS3 (2006): Synergistic Processing Units (SPUs)

Why not use the GPU for skinning on Xbox 360 and PS3?

The CPUs/SPUs are actually quite fast.

Skinning Implementation

Xbox 360: 3 cores @ 3.2 GHz
PS3: 6 SPUs @ 3.2 GHz (with many restrictions…)

Split Vertex Streams

Stream 0: Vertex position, tangent space
• Skinned

Stream 1: Colors, UVs, etc.
• Constant: sent straight to GPU
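One way to picture the split is as two separate vertex structs, one per stream. The layout below is only an assumption for illustration: it pads the skinned stream to three 16-byte vectors, and the field names are hypothetical.

```c
#include <stdint.h>

/* Stream 0: rewritten by the skinning pass every frame.
   Assumed layout: three 16-byte vectors per vertex. */
typedef struct {
    float position[4];   /* xyz + padding */
    float tangent[4];    /* xyz + handedness sign */
    float normal[4];     /* xyz + padding */
} SkinnedStreamVertex;

/* Stream 1: written once at load time and never touched by skinning. */
typedef struct {
    uint8_t color[4];    /* RGBA */
    float   uv[2];
} StaticStreamVertex;
```

Keeping the constant attributes out of stream 0 means the skinning pass writes only the data that actually changes per frame.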

Unified Memory Architecture

// Just skinned a vertex. Now write it out as
// three 16-byte vectors
__stvx( skinnedVertexData0, vertsOutBuffer, 0 );
__stvx( skinnedVertexData1, vertsOutBuffer, 16 );
__stvx( skinnedVertexData2, vertsOutBuffer, 32 );
// Gah – why’d that take so long?

// ~20% faster!
// (F*&^% write-combine memory)
__stvx( skinnedVertexData0, vertsOutBuffer, 0 );
_WriteBarrier();
__stvx( skinnedVertexData1, vertsOutBuffer, 16 );
_WriteBarrier();
__stvx( skinnedVertexData2, vertsOutBuffer, 32 );

So you can use the GPU for other things.

• Microsoft Xbox One (2013): AMD GCN GPU
• Sony PS4 (2013): AMD GCN GPU

[Figure: GPU frame timeline with Draw Calls, Post FX, and IDLE gaps on the GCN Compute Units.]

Async Compute Skinning

[Figure: GPU frame timeline with async compute. Skinning runs on a GPU Compute Unit alongside Draw Calls and Post FX.]

• Generate Draw List (frame N): Visible Models
• Async Compute Dispatch Thread: Model Skinning Workloads
• GPU rendering (frame N-1)

Result: Skinned Model (frame N)

Async Compute Skinning

MATH WARNING!

The standard approach to real-time skinning, used in almost every modern 3D game.

Linear Matrix Blend Skinning

Suffers from some well-documented problems...

The “candy wrapper” effect

Mesh Volume Preservation

Example: “flat ass syndrome”

Q: Why do these problems exist?
A: Let’s take a closer look at the underlying math…

Apply the property of distributivity:

$v' = \sum_{i} w_i (M_{j_i} v) = \left( \sum_{i} w_i M_{j_i} \right) v$

To keep it simple: Let $M_{j_i}$ represent a rigid transform.
• No scale, shear, …
• Most common scenario for skinning in games.

A linear combination of rigid transforms DOES NOT yield a rigid transform!
• Orthonormal matrices aren’t closed under addition.
• Scaling values can creep into the final vertex transforms.
• Extreme cases can result in rank-deficient matrices.

[Figure: the “candy wrapper” artifact. The blended vertex $v'$ falls between $M_{j_1} v$ and $M_{j_2} v$.]

The most common workaround to these issues is the addition of new bones.
• Hand-animated or procedural.
• Split the rotation of a joint, relative to its parent, into even increments, for a single axis only.

Example: Arm Twist Bone
• Parented to the shoulder and consistently represents exactly half its twist (roll) motion.

Adding these bones is not free!

Memory and processing overhead.

Exact amount depends on actual implementation.

Dual Quaternions to the rescue!
• But what exactly are they?
• Let’s start with a quick review of the vanilla variety of quaternions…

Quaternions

Hamilton - 1843

A 4D extension of complex numbers: $q = w + xi + yj + zk$

For our purposes all we care about is unit quaternions.
• Conveniently represent rotations.
• Conjugate: $q^* = w - xi - yj - zk$

Quaternions

$q^* = q^{-1}, \quad \|q\| = 1$

One important quaternion equation to note: $v' = q v q^*$ (with $v$ written as a quaternion with a 0 scalar part)

Applies a rotation to a 3D point
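A minimal sketch of that rotation in C, written as the literal product q v q* rather than an optimized form. The `Quat` type (scalar stored in `w`) and the function names are assumptions made for this example.

```c
/* Hypothetical quaternion type; (x, y, z) is the vector part, w the scalar part. */
typedef struct { float x, y, z, w; } Quat;

/* Hamilton product a*b. */
static Quat QuatMul( Quat a, Quat b )
{
    Quat r;
    r.w = a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z;
    r.x = a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y;
    r.y = a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x;
    r.z = a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w;
    return r;
}

/* Conjugate: negate the vector part. Equals the inverse for unit quaternions. */
static Quat QuatConj( Quat q )
{
    Quat r = { -q.x, -q.y, -q.z, q.w };
    return r;
}

/* Rotate the point (px, py, pz) by the unit quaternion q: v' = q v q*. */
static void QuatRotatePoint( Quat q, const float in[3], float out[3] )
{
    Quat v = { in[0], in[1], in[2], 0.0f };   /* point as a pure quaternion */
    Quat r = QuatMul( QuatMul( q, v ), QuatConj( q ) );
    out[0] = r.x;
    out[1] = r.y;
    out[2] = r.z;
}
```

A quaternion for a 90 degree rotation about Z, (0, 0, sin 45°, cos 45°), takes the point (1, 0, 0) to (0, 1, 0).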

Quaternions

Similar in form to complex numbers
Stored as: $\hat{a} = a_0 + \varepsilon a_\varepsilon$, where $\varepsilon^2 = 0$

Dual Numbers

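Conjugate: $\overline{\hat{a}} = a_0 - \varepsilon a_\varepsilon$

Multiplication: $(a_0 + \varepsilon a_\varepsilon)(b_0 + \varepsilon b_\varepsilon) = a_0 b_0 + \varepsilon (a_0 b_\varepsilon + a_\varepsilon b_0)$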
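Dual number arithmetic is small enough to sketch directly. The only rule beyond ordinary algebra is that the dual unit squares to zero, so the product of the two dual parts drops out. The `DualNum` type and function names below are assumptions made for this example.

```c
/* Hypothetical dual number type: real + eps*dual, with eps*eps = 0. */
typedef struct { float real, dual; } DualNum;

/* Conjugate: negate the dual part. */
static DualNum DualConj( DualNum a )
{
    DualNum r = { a.real, -a.dual };
    return r;
}

/* Product: the (a.dual * b.dual) term vanishes because eps*eps = 0. */
static DualNum DualMul( DualNum a, DualNum b )
{
    DualNum r;
    r.real = a.real*b.real;
    r.dual = a.real*b.dual + a.dual*b.real;
    return r;
}
```

For example, (2 + 3ε)(4 + 5ε) = 8 + 22ε.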

Dual Numbers

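Basically a quaternion whose elements are dual numbers

Quaternion form: $\hat{q} = \hat{w} + \hat{x} i + \hat{y} j + \hat{z} k$
• $\hat{w}$ is the scalar part (dual number)
• $(\hat{x}, \hat{y}, \hat{z})$ is the vector part (dual vector)

Dual number form: $\hat{q} = q_a + \varepsilon q_b$
• $q_a$: “non-dual part”
• $q_b$: “dual part”
• Most useful for skinning.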

Dual Quaternions

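For $\hat{p} = p_a + \varepsilon p_b$ and $\hat{q} = q_a + \varepsilon q_b$:

Multiplication: $\hat{p} \hat{q} = p_a q_a + \varepsilon (p_a q_b + p_b q_a)$

Quaternion Conjugate: $\hat{q}^* = q_a^* + \varepsilon q_b^*$

Dual Conjugate: $\overline{\hat{q}} = q_a - \varepsilon q_b$

Quaternion & Dual Conjugate: $\overline{\hat{q}}^* = q_a^* - \varepsilon q_b^*$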
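These operations follow from treating a dual quaternion as a pair of ordinary quaternions, the non-dual part and the dual part, combined under the rule that the dual unit squares to zero. A minimal sketch, where the `Quat` and `DualQuat` types and all function names are assumptions made for this example:

```c
/* Hypothetical types: a dual quaternion is a + eps*b, with a and b quaternions. */
typedef struct { float x, y, z, w; } Quat;
typedef struct { Quat a; Quat b; } DualQuat;

/* Hamilton product of two quaternions. */
static Quat QuatMul( Quat p, Quat q )
{
    Quat r;
    r.w = p.w*q.w - p.x*q.x - p.y*q.y - p.z*q.z;
    r.x = p.w*q.x + p.x*q.w + p.y*q.z - p.z*q.y;
    r.y = p.w*q.y - p.x*q.z + p.y*q.w + p.z*q.x;
    r.z = p.w*q.z + p.x*q.y - p.y*q.x + p.z*q.w;
    return r;
}

static Quat QuatAdd( Quat p, Quat q )
{
    Quat r = { p.x + q.x, p.y + q.y, p.z + q.z, p.w + q.w };
    return r;
}

static Quat QuatConj( Quat q )
{
    Quat r = { -q.x, -q.y, -q.z, q.w };
    return r;
}

static Quat QuatNeg( Quat q )
{
    Quat r = { -q.x, -q.y, -q.z, -q.w };
    return r;
}

/* (pa + eps*pb)(qa + eps*qb) = pa*qa + eps*(pa*qb + pb*qa) */
static DualQuat DualQuatMul( DualQuat p, DualQuat q )
{
    DualQuat r;
    r.a = QuatMul( p.a, q.a );
    r.b = QuatAdd( QuatMul( p.a, q.b ), QuatMul( p.b, q.a ) );
    return r;
}

/* Quaternion and dual conjugate combined: conj(a) - eps*conj(b). */
static DualQuat DualQuatConjBoth( DualQuat q )
{
    DualQuat r;
    r.a = QuatConj( q.a );
    r.b = QuatNeg( QuatConj( q.b ) );
    return r;
}
```

As a sanity check, multiplying two pure translations (identity non-dual part, half the translation in the dual part) gives a dual part that is the sum of the two, i.e. the translations add.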

Dual Quaternions

$\mathrm{Norm}(\hat{q}) = \|q_a\| + \varepsilon \frac{\langle q_a, q_b \rangle}{\|q_a\|}$

Dual Quaternions

$\hat{q}^* = \hat{q}^{-1}, \quad \|\hat{q}\| = 1$

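Rigid Transforms: $\hat{q} = q_0 + \frac{\varepsilon}{2} t q_0$, with unit rotation quaternion $q_0$ and translation $t = (0, t_x, t_y, t_z)$ in quaternion form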

Dual Quaternions

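Transforming a 3D point: $\hat{v}' = \hat{q} \hat{v} \overline{\hat{q}}^*$, with $\hat{v} = 1 + \varepsilon (v_x i + v_y j + v_z k)$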

Dual Quaternions

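Geometric Interpretation

Recall: $\hat{q} = q_0 + \frac{\varepsilon}{2} t q_0$
• $q_0$: dual quaternion representing only a rotation
• $t$: translation vector, in quaternion form

$\hat{q} = \cos(\hat{\theta}/2) + \hat{s} \sin(\hat{\theta}/2)$

$\hat{\theta} = \theta_0 + \varepsilon \theta_\varepsilon$
• $\theta_0$: angle of rotation
• $\theta_\varepsilon$: translation along $s_0$

$\hat{s} = s_0 + \varepsilon s_\varepsilon$: unit dual quaternion with a 0 scalar part
• $s_0$: direction of axis of rotation
• $s_\varepsilon$: moment of rotation axis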

Screw Transform!
• Rotation about an axis followed by translation along that axis.
• All rigid transforms can be described this way.

Dual Quaternions

Simple Case:

Dual Quaternion Blend Skinning

[Figure: blending $q_0$ and $q_1$ yields $q_{DQB}$.]

Unlike with matrix blending, the result is always a rigid transform!

Very accurate, but not perfect.
• Can introduce accelerations when input dual quaternions differ greatly.
• 8.15 degrees: Maximum rotational deviation
• 15.1%: Maximum translational deviation

Modified SLERP can be used if absolute accuracy is required.
• Efficiency tradeoff usually not worth it.

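Must handle antipodality!
• Polarity rule: $\hat{q}$ and $-\hat{q}$ represent the same rigid transform.
• We want: $\langle \hat{q}_i, \hat{q}_{parent(i)} \rangle \geq 0$
• Fix up all dual quaternions prior to skinning.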

for ( all bones’ unit dual quaternions, dq[i] )
    if ( InnerProduct( dq[i], dq[parent[i]] ) < 0.0 )
        Negate( dq[i] );
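The fix-up loop above could look roughly like this in plain C. Several details are assumptions made for this sketch rather than statements from the talk: the `DualQuat` layout, taking the inner product over the non-dual (rotation) parts only, a root bone marked by `parent[i] < 0`, and bones stored so that every parent precedes its children.

```c
/* Hypothetical layout: nd = non-dual part, d = dual part, each (x, y, z, w). */
typedef struct { float nd[4]; float d[4]; } DualQuat;

/* Flip each bone's dual quaternion into the same hemisphere as its parent.
   Assumes parent[i] < i for every non-root bone, so a parent has already been
   fixed by the time its children are visited. */
static void FixAntipodality( DualQuat *dq, const int *parent, int numBones )
{
    for ( int i = 0; i < numBones; ++i )
    {
        int p = parent[i];
        if ( p < 0 )
            continue;   /* root bone: nothing to compare against */

        /* Inner product of the non-dual parts (assumption, see lead-in). */
        float dot = 0.0f;
        for ( int k = 0; k < 4; ++k )
            dot += dq[i].nd[k] * dq[p].nd[k];

        if ( dot < 0.0f )
        {
            /* Negate BOTH parts; -dq is the same rigid transform as dq. */
            for ( int k = 0; k < 4; ++k )
            {
                dq[i].nd[k] = -dq[i].nd[k];
                dq[i].d[k]  = -dq[i].d[k];
            }
        }
    }
}
```

Because each child is compared against its already-corrected parent, one pass in hierarchy order is enough for a chain of bones.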

// Input: unit quaternion 'q0', translation vector 't'
// Output: unit dual quaternion 'dq'
static void QuatTrans2UDQ( const float q0[4], const float t[3], float dq[2][4] )
{
    // Non-Dual Part: dq[0] = q0
    for ( int i=0; i<4; i++ )
        dq[0][i] = q0[i];

    // Dual Part: dq[1] = ((0,t[0],t[1],t[2])/2)*q0
    dq[1][3] = -0.5f*(t[0]*q0[0] + t[1]*q0[1] + t[2]*q0[2]); // Scalar Component
    dq[1][0] = 0.5f*( t[0]*q0[3] + t[1]*q0[2] - t[2]*q0[1]); // Vector Component 0
    dq[1][1] = 0.5f*(-t[0]*q0[2] + t[1]*q0[3] + t[2]*q0[0]); // Vector Component 1
    dq[1][2] = 0.5f*( t[0]*q0[1] - t[1]*q0[0] + t[2]*q0[3]); // Vector Component 2
}

Generating a Dual Quaternion

Dual Quaternion Blending

// Input: array of dual quaternions 'dqIn'
// Input: array of weights 'w', totaling 1.0
// Input: size of the above two arrays (> 1)
// Output: the blended dual quaternion 'dqOut'
static void DQB( const float dqIn[][2][4], float w[], int numDQ, float dqOut[2][4] )
{
    // dqOut = w[0]*dqIn[0]
    Vec4Scale( dqIn[0][0], w[0], dqOut[0] );
    Vec4Scale( dqIn[0][1], w[0], dqOut[1] );
    for( int i = 1; i < numDQ; ++i )
    {
        // dqOut += w[i]*dqIn[i]
        Vec4Mad( dqOut[0], w[i], dqIn[i][0], dqOut[0] );
        Vec4Mad( dqOut[1], w[i], dqIn[i][1], dqOut[1] );
    }
}

Transformation Using a Dual Quaternion

// Input: unit dual quaternion 'dq'
// Input: input position 'vecIn'
// Output: rigidly transformed position 'vecOut'
static void DQTransform( const float dq[2][4], const vec3_t vecIn, vec3_t vecOut )
{
    vec4_t q0, q1;
    float a0, ae, recipDeLen;
    vec3_t d0, de, temp1, temp2, temp3, temp4, temp5;
    vec3_t temp6, temp7, temp8, temp9, temp10, temp11;

    recipDeLen = 1.0f / I_sqrt( dq[0][3]*dq[0][3] + dq[0][0]*dq[0][0] + dq[0][1]*dq[0][1] + dq[0][2]*dq[0][2] );

    // Normalize both parts of the dual quaternion, based
    // on the length of the non-dual part.
    Vec4Scale( dq[0], recipDeLen, q0 );
    Vec4Scale( dq[1], recipDeLen, q1 );

    // Isolate the scalar and vector parts of both
    // quaternions. This is just for code clarity and can
    // be omitted for SIMD optimization.
    a0 = q0[3];
    ae = q1[3];
    memcpy( d0, &q0[0], sizeof( d0 ));
    memcpy( de, &q1[0], sizeof( de ));

    // Transform 'vecIn' by the dual quaternion
    // to produce 'vecOut'. vecOut = dq*v*dq^-1
    Vec3Cross( d0, vecIn, temp1 );
    Vec3Mad( temp1, a0, vecIn, temp2 );
    Vec3Scale( de, a0, temp3 );
    Vec3Scale( d0, ae, temp4 );
    Vec3Cross( d0, de, temp5 );
    Vec3Sub( temp3, temp4, temp6 );
    Vec3Add( temp6, temp5, temp7 );
    Vec3Scale( temp7, 2.0f, temp8 );
    Vec3Scale( d0, 2.0f, temp9 );
    Vec3Cross( temp9, temp2, temp10 );
    Vec3Add( vecIn, temp10, temp11 );
    Vec3Add( temp11, temp8, vecOut );
}

[Chart: Instruction Counts (XB360 VMX), Matrix Skinning (column-major) vs. DQB Skinning; axis ticks 10 to 35.]

[Chart: Instruction Counts (XB360 GPU), Matrix Skinning (row-major) vs. DQB Skinning; axis ticks 10 to 35.]

On GCN GPU

[Figure: DQ Skinning vs. Matrix Skinning, compared on Aggregate $ Efficiency, VGPR Count, Memory Stalls, and DRAM Footprint.]

DQ vs. Matrix Skinning

DQ Skinning is ~24% faster***

***: Depends heavily on vertex layout, tangent space quality, number of bones, and weighting distributions.

Optional Optimizations:
• Compress quaternions: 10:10:10:2 format for non-dual component
• Tune max waves/SIMD
• Generate skinning transforms on the GPU

Procedural Motions

Especially when animations are played on characters with different or custom proportions.

Ragdolls: Can you spot all the artifacts DQB would resolve?

Pros
• GPU/SIMD friendly
• No asset changes required
• Cheaper transform blending
• More cache friendly
• Requires less memory/constants
• Conducive to procedural motions
• (Mostly) replaces the need for the rotational split bones mentioned earlier.
• Can be enabled selectively (per-LOD, per-submesh, high end machines only)

Cons
• Less intuitive than matrices
• Local scaling must be handled separately
• Actual vertex transform is more ALU
• Still not 100% accurate
• Potential bulge artifacts
• Not widely adopted in games (yet)

No more flat asses!

Skinning

Blend Shapes

Skinning

Geometry Caching

Skinning

“Bulging-free dual quaternion skinning” (Kim, 2014)

Skinning

1. Solve for: Bone weights on to minimize for all t.
2. Re-weight artist-selected vertices in Maya/Max.

Skinning

The optimal model skinning approach can vary per platform.

Give dual quaternion skinning a look.

Don’t assume skinning is a “solved problem”. (Unless you’re Leatherface)

Conclusion

Rulon@InfinityWard.com

Questions?
