Statistical models of shape and appearance
description
Transcript of Statistical models of shape and appearance
1
Tim Cootes 2014
Statistical Models of Shape and
Appearance
Tim Cootes
Imaging Sciences,
The University of Manchester
http://personalpages.manchester.ac.uk/staff/timothy.f.cootes
(or search for “Tim Cootes”)
Aim
Interpret images using models of appearance
– ‘explain’ the image
Fit Model Model
Parameters
Why is it difficult?
• Wide variation in shape and appearance
Tim Cootes 2013
Overview
• Statistical Shape Models
• Combined Appearance Models
• Matching Algorithms:
– Active Shape Models
– Active Appearance Models
– Constrained Local Models
• Groupwise registration
– Obtain correspondences to build models
Model Building
Example Images
Shape Models
Appearance Models
Statistical
Analysis
Statistical Shape Models
• Statistical model of shape variation
2
Building Shape Models
• Require labelled training images
– landmarks represent correspondences
T
nn ,y,,y,x,(x )11 x
Shape
• Need to model the variability in shape
• What is shape?
– Geometric information that remains when
location, scale and rotational effects removed
(Kendall)
Same Shape Different Shape
Shape
• More generally
– Shape is the geometric information invariant
to a particular class of transformations
• Transformations:
– Euclidean (translation + rotation)
– Similarity (translation+rotation+scaling)
– Affine
Statistical Shape Models
Given a set of shapes:
• Align shapes into common frame
– Procrustes analysis
• Estimate shape distribution p(x)
– Single Gaussian often sufficient
• Approximate with parameterised model
Pbxx
Aligning Two Shapes
• Procrustes analysis:
– Find transformation which minimises
– Resulting shapes have
• identical CoG
• approximately the same scale and orientation
2
21 |)(| xx T
Aligning a Set of Shapes
• Generalised Procrustes Analysis
– Find the transformations Ti which minimise
– Where
– Under the constraint that
2|)(| iiT xm
)(1
iiTn
xm
1|| m
3
Aligning Shapes
• Generalised Procrustes Analysis
– Align each shape to the mean
Unaligned Align with translation Align with similarity
Aligned Shapes
• Need to model the aligned shapes
x
space shape
Dimensionality Reduction
• Co-ordinates often correlated
• Nearby points move together
11bpxx
1b
xx
1p
Principal Component Analysis
• Compute eigenvectors of covariance, S
• Eigenvectors : main directions
• Eigenvalues : variance along eigenvector
11p22 p
x
1x
2x
Dimensionality Reduction
• Data lies in subspace of reduced dim.
• However, for some t,
i
i
nnbb ppxx 11
tjb j if 0
t
) is of (Variance jjb
Building Shape Models
• Given aligned shapes, { }
• Apply PCA
– Compute mean and eigenvectors of covar.
• P – First t eigenvectors of covar. matrix
• b – Shape model parameters
ix
332211 pppx
Pbxx
bbb
4
Face Model Modes
• 100 people, 4 images each
1 Varying b 2 Varying b 3 Varying b 4 Varying b
332211 pppxx bbb
Hand Shape Model
1 Varying b2 Varying b
3 Varying b
Spine Shape Model
Knee shape model
1b 2b 3b
Xenios Milidonis & ARCOGEN
Prostate Model Mode 1
Danny Allen
Brain structure modes
5
Appearance Models
• Model both shape and texture
• Use all image information
+ =
Building Texture Models
• For each example, extract texture vector
• Normalise vectors
• Build eigen-model
Texture, g
Warp to
mean
shape
ggbPgg
gg /)( 1gg
Face Texture Model
1b12 12
2b22 22
3b32 32
Combined Models
• Shape and texture often correlated
– When smile, shadows change (texture) and
shape changes
• Learning this correlation leads to more
compact (and specific) model
Learning Correlations
sb
gbModel assuming
shape and texture
independent
Model accounting for
correlations between shape
and texture
Learning Correlations
• For each image in set we have best fitting
shape and texture param.s
• Construct new vector,
• Apply PCA (mean + eigenvec.s of covar.)
gs bb ,
g
s
c b
Wbb
cQgg
cQxx
g
s
Varying c changes both
shape and texture
6
Global Face Model
• 400 images from 100 different people
Note: Mixes ID, expression, head pose etc
Face Variation Gender Change: Ethnic Group:
Face Variation
Age Change:
Model Matching
Active Shape Models
• Match shape model to new image
• Require:
– Statistical shape model
– Model of image structure at each point
Model Point
Model of Profile
Model of Region
Placing model in image
• The model points are defined in a model
co-ordinate frame
• Must apply global transformation,T, to
place in image
Model Frame Image
Pbxx
)( PbxX T),,,;( sYXT ccx
7
ASM Search Overview
• Local optimisation
• Initialise near target
– Search nearby for best match, X’
– Update parameters to match to X’.
)','( ii YX
Local Structure Models
• Need to search for local match for each point
• Often sufficient to search along profile
• Model
– Strongest edge
– Correlation
– Statistical model of profile/patch
– Discriminative model
Profile Models
• Sometimes true point not on strongest
edge
• Model local structure to help locate the
point
Strongest edge True position
Profile Models
• For each point in model
– For each training image
• Sample values along profile
• Normalise
– Build statistical model
• eg Gaussian PDF using eigen-model approach
Or:
• Train classifier to distinguish true/false positions
Searching Along Profiles
• During search we look along a normal for
the best match for each profile
)(xg
x
))(( xp g
)(xg
Form vector from samples
about x
Search algorithm
• Search along profile
• Update global transformation, T, and
parameters, b, to minimise
2|)(| PbxX T
),( ii YX
8
Update step
• Hard constraints
• Soft constraints
• Can also weight by quality of local match
tp)p(T bPbxX subject to |)(| Minimise 2
)(log/|)()(| Minimise 2
r
21 )p( T bPbxX
Multi-Resolution Search
• Train models at each level of pyramid
– Gaussian pyramid with step size 2
– Use same points but different local models
• Start search at coarse resolution
– Refine at finer resolution
Example : Hip Radiograph
11bpxx
1 1 33 1 b
ASM Example: Spine
3D Face Model
Mean
1 2
3
Angela Caunce
Trained on approx. 1000 people Matching 3D ASMs
• Search for each point independently
– Local model depends on current pose
• Estimate shape param.s + projection
Angela Caunce
9
Face Tracking with a 3D Model
Angela Caunce
Active Shape Models
• Advantages
– Fast, simple, accurate
– Efficient to extend to 3D
• Disadvantages
– Only sparse use of image information
– Treat local models as independent
Active Appearance Models
• We have a statistical appearance model
– Trained from sets of examples
• How do we match to new images?
• Use an “Active Appearance Model”
– Efficient iterative matching method
Model Parameters
• Combined model parameters, c
– Shape parameters
– Texture model parameters
• Pose parameters
• Texture transformation 1gg mim
cQ
Q
b
b
2
1
g
s
),,,( sYX cc
),,,,,,( sYX cccp
Key Step in AAM
• Estimate parameter update from current
image sample
Diff. in Ref Frame
p d
p d
Estimating the update step
Linear:
R estimated
– Using Jacobian
– Using Linear Regression
– Using Canonical Correlation Analysis
Non-linear:
– Boosted classifiers
– Boosted regression
)(sp Fd
Rsp d
10
Justification for Jacobian
• Residual difference:
• Measurements in reference frame:
– Model:
– Sample warped to ref:
• Simplest form : Minimise
– Can add robust kernels
)()()( pIpIpr imm
cQgpI gm ˆ)(
)(pI im
2|)(|)( prp E
Prediction using the Jacobian
pp
rprppr δ
)()( d
)()()E( pprpprpp ddd T minimize To
TT
p
r
p
r
p
rJR
1
)(δ pRrp
p
rJ
j
iij
dp
drJ
Taylor expansion:
Estimating Jacobian (gradient)
• Need estimate of the Jacobian:
• Estimate each term by small
displacements on the training set
p
rJ
j
iij
dp
drJ
d
dd
2
)()(
jiji
ij
pdrpdrJ
Average over all examples
How Good are the Predictions?
• Predicted dx vs. actual dx over test set
Multi-Resolution Predictions
• Predicted dx vs. actual dx over test set
AAM Algorithm
• Initial estimate Im(p)
• Start at coarse resolution
• At each resolution
– Measure residual error, r(p)
– predict correction dp = Rr
– p p - dp
– repeat to convergence
11
Face Search Face Search
Sub-cortical Structures
Initial Position Converged
Brain search
3D Model-Based Segmentation
Model of femur Using the model to segment the femur
Regression Approach
• Assume functional relationship
s = normalised intensities, or residuals
• Learn function from displaced training
examples
)(sp Fd
}{ ipd )}({ itruei ppss d
Random perturbations Image samples
12
Linear Regression
Rsp d
Scale Rotation Y-trans Shape 1 X-trans Shape 2
Phil Tresadern
frame ref.in intensity Normaliseds
Rows of R from PC Regression
R from Linear Regression better than that from Jacobian
Non-linear Regression
• Boosted Haar features:
n
i iij Hfp1
))(( sd
featureHaar :)(siH
function 1DLearnt : )(xf i
Phil Tresadern
Linear vs Non-linear
– Non-linear methods perform better for poor
initializations but no better close to solution
Phil Tresadern
Jacobian
Linear Regression
Non-linear Regression
AAM Tracking
• Sequence of AAMs with increasing
resolution
• Generic – trained on range of people
• Search each frame in a few ms
Mircea Ionita,
Phil Tresadern
AAM Developments
• Improved optimisation [Matthews &Baker, IJCV2004]
• Improved features [Cootes:CVPR01]
• 2D+3D AAMs [Xiao:CVPR04]
• Canonical Correlation Analysis [Donner:PAMI06]
• Update matrix computation [Saragih:ICPR06]
• Boosted Appearance Models [Liu:CVPR07]
• Non-linear updates [Saragih:ICCV07]
• Fast boosted AAMs [Tresadern:BMVC10]
• Active Orientation Models [Tzimiropoulos:2013]
• Supervised Descent Method [Xiong:CVPR2013]
• Explicit Shape Regression [Cao:CVPR12]
(And lots of others)
Local Model Matching
Find shape and pose parameters to optimise
Shape param.s Pose params
13
Classification vs Regression
Classification:
• Is this the point?
Classification vs Regression
Classification:
• Is this the point?
Regression:
• Where is the point?
Finding Points using Regression Voting
• Train “Random Forest” to predict offsets
Each patch fed into each tree to predict offset and weight
Finding Points using Regression Voting
• Scan regressor across region
• Each patch produces 1 vote per tree
• Accumulate votes in an array
Fully automated femur detection • Tested on 839 radiographs (ARCOGEN)
Error <0.9mm for 99% of cases
Claudia Lindner
Knee Radiographs
Median 95%ile 99%ile
Mean point-to-curve error:
< 1mm for 99% of 500 images
Fully automatic search results
using AP knee radiographs.
Claudia Lindner, Xenios Milidonis
14
Groupwise Registration
• Accurate correspondences required to
build shape/appearance models
– Slow/hard to manually annotate data
…
…
Tim Cootes 2011
Goal: Fully Automatic System
• Build models from unlabelled images
… Model
Approach
…
Find the correspondences across the set of images
Construct model in an ‘average’ reference frame
Representing of Deformation
• Use piece-wise linear warp field
– Triangular/Tetrahedral mesh
– Simple, compact, efficient
– Trivially invertible
Groupwise Algorithm
• Initialisation: Affine registration
• Generate control points
• Repeat
– Build model from current data
– For each image
• Optimise positions of control points
• Until happy/dead/conference deadline etc
Image Features
Linear Normalisation
),max(
ˆ
min i
iii
ggg
Local z-normalisation
Raw Image
ggg i
i
ˆ
1) z-norm
2) Smoothed |Gx|
3) Smoothed |Gy|
15
Registering Faces
• 293 individuals from XM2VTS
• Evolution of robust estimate of mean:
Example Modes
Varying appearance model parameters (Note shadowing caused by glasses)
Modes of model built from images without glasses:
Hand Radiographs
Registered Mean
Zhang Pei
3D Registration
• Registering 270 3D MR Brain Images
• Evolution of the model mean:
Summary
• Statistical shape models
– Powerful representations of objects
– Usually only need small number of params
• Model matching methods – ASM/CLM: Local search + Shape Regularisation
– AAMs: Direct prediction of update steps
• Groupwise Registration
– Automatically build models from sets of images