MSRI Program: Mathematical, Computational and Statistical Aspects of Vision Introductory workshop, Jan 24-28, 2005 Modeling Shape David Mumford

Page 1:

MSRI Program: Mathematical, Computational and Statistical Aspects of Vision

Introductory workshop, Jan 24-28, 2005

Modeling Shape

David Mumford

Page 2:

The problem of shape

• 1D vs. 2D signals: in 1D, boundaries decompose the signal domain into intervals; in 2D, the boundaries and the parts are less trivial. One can often recognize an object in an image by its shape alone.

• By a ‘shape’ in n dimensions, let’s mean an open subset $S \subset \mathbb{R}^n$ with not too convoluted a boundary, usually meaning topologically a ball.

• Remarkably, people find it natural to answer the question: are two shapes S and T similar?

• The first key problem in a computational theory is to define similarity, i.e. to put a metric on the space of all shapes.

Page 3:

What is the space of shapes?

$\mathcal{S}_2$ ≈ set of all “smooth” connected plane curves with no self-intersections (“simple closed curves”)

• Infinite dimensional!

• Not a vector space

• BUT, locally linear, i.e. a manifold

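• Tangent vector to $\mathcal{S}_n$ at shape R = normal vector field along the boundary of R:

$$\psi_a(s) = \phi(s) + a(s)\,n(s), \qquad n(s) = \text{unit normal}$$

$$U_\phi = \{\psi_a : a \text{ smooth}\} \subset \mathcal{S}_2, \qquad U_\phi \cong \text{vector space of functions } a$$

(Figure: a curve displaced along its normals by $a(s)\,n(s)$.)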
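The local chart $\psi_a(s) = \phi(s) + a(s)\,n(s)$ can be tried out numerically. The following is a minimal sketch, assuming the curve is a closed polygon listed counterclockwise and that the normal is taken outward; the function name `normal_perturbation` and the sample values are illustrative and do not come from the talk.

```python
import math

def normal_perturbation(points, a):
    """Move each vertex of a closed polygon by a[k] along its unit normal.

    points: list of (x, y) vertices, ordered counterclockwise (assumption).
    a:      list of scalar normal displacements, one per vertex.
    """
    n = len(points)
    moved = []
    for k in range(n):
        (x0, y0), (x1, y1) = points[k - 1], points[(k + 1) % n]
        # Central-difference tangent, rotated by -90 degrees to point outward.
        tx, ty = x1 - x0, y1 - y0
        norm = math.hypot(tx, ty)
        nx, ny = ty / norm, -tx / norm
        x, y = points[k]
        moved.append((x + a[k] * nx, y + a[k] * ny))
    return moved

# A unit circle sampled at 100 points, pushed outward by a(s) = 0.1.
circle = [(math.cos(2 * math.pi * k / 100), math.sin(2 * math.pi * k / 100))
          for k in range(100)]
bigger = normal_perturbation(circle, [0.1] * 100)
```

For a constant function a, the unit circle maps to a circle of radius 1 + a, so nearby shapes are indexed by functions a on the curve, as the chart suggests.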

Page 4:

Think of $\mathcal{S}_n$ geometrically

• A curve on $\mathcal{S}_n$ is a warping of one shape to another.

• On $\mathcal{S}_2$, the set of ellipses forms a surface:

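• The geometric heat equation:

$$\frac{\partial R_t}{\partial t} = \kappa_{\partial R_t}\, n_{\partial R_t}$$

is a vector field on $\mathcal{S}_n$ (where $R_t$, $\kappa_{\partial R_t}$ and $n_{\partial R_t}$ are a path in shape space, the mean curvature and the unit normal to the shapes, resp.)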
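For plane curves the geometric heat equation moves each boundary point by its curvature vector $\partial^2 C / \partial s^2$, which avoids any choice of sign for the normal. Below is a minimal sketch with explicit Euler steps on a closed polygon; the discretization, the step size and the names `curvature_vectors`, `heat_flow` and `length` are assumptions made for illustration, not the scheme used in the talk.

```python
import math

def curvature_vectors(points):
    """Discrete d^2C/ds^2 (curvature times unit normal) at each vertex."""
    n = len(points)
    out = []
    for k in range(n):
        xp, yp = points[k - 1]
        x, y = points[k]
        xn, yn = points[(k + 1) % n]
        lp = math.hypot(x - xp, y - yp)   # length of incoming edge
        ln = math.hypot(xn - x, yn - y)   # length of outgoing edge
        # Difference of unit tangents divided by the mean edge length.
        cx = ((xn - x) / ln - (x - xp) / lp) / (0.5 * (lp + ln))
        cy = ((yn - y) / ln - (y - yp) / lp) / (0.5 * (lp + ln))
        out.append((cx, cy))
    return out

def heat_flow(points, dt, steps):
    """Explicit Euler steps of the flow dC/dt = d^2C/ds^2."""
    for _ in range(steps):
        kappa_n = curvature_vectors(points)
        points = [(x + dt * cx, y + dt * cy)
                  for (x, y), (cx, cy) in zip(points, kappa_n)]
    return points

def length(points):
    """Perimeter of the closed polygon."""
    return sum(math.hypot(points[k][0] - points[k - 1][0],
                          points[k][1] - points[k - 1][1])
               for k in range(len(points)))

ellipse = [(2 * math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
           for k in range(64)]
flowed = heat_flow(ellipse, dt=0.001, steps=100)
```

The flow shortens the ellipse, and a circle of radius 1 shrinks to radius close to $\sqrt{1 - 2t}$. The explicit scheme is only stable when dt is small compared with the squared edge length.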

Page 5:

2D shapes often come in categories

• Typical shapes and examples of desired clustering in computer vision experiments. Top right: samples from the NIST handwritten zip code database often used in statistical learning theory; the ‘hat’ is Saint-Exupéry’s pattern recognition challenge.

• Such categories should be subsets of $\mathcal{S}_2$, and datasets give point clouds in $\mathcal{S}_2$.

• One seeks probability measures on $\mathcal{S}_2$ to model these clouds, to do Bayesian inference.

Page 6:

Why more than one metric is needed

The central shape is similar in various respects to all 5 of the shapes around it, but in different metrics!

Distance between shapes can be measured via averages (L1), worst cases (L∞) or by mean squares (L2) and using points, 1-jets, 2-jets, etc.

• In $L^1$, distances are: A < B,C < D,E

• In $L^\infty$, distances are: B < C,D < A,E

• In $L^\infty$ with 1-jets: D < B,C < A,E

• In $L^1$ with 2-jets: D < A,B < C,E

• To make E close, need ‘robust’ non-convex metrics that discard outliers (e.g. $L^{1/2}$).
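To see how averages and worst cases can rank the same pair of deformations differently, here is a minimal sketch. It assumes two hypothetical displacement fields sampled at 1000 points along a boundary of length 1; the data and the function names are illustrative and are not taken from the figure.

```python
def l1(a, ds):
    """Average-type size: integral of |a| ds."""
    return sum(abs(v) for v in a) * ds

def l2(a, ds):
    """Mean-square size: square root of the integral of a^2 ds."""
    return (sum(v * v for v in a) * ds) ** 0.5

def linf(a):
    """Worst-case size: sup of |a|."""
    return max(abs(v) for v in a)

n = 1000
ds = 1.0 / n
spread = [0.1] * n                  # small displacement everywhere
spike = [1.0] * 20 + [0.0] * 980    # large displacement on 2% of the boundary
```

In $L^1$ the spike is the smaller deformation, while in $L^\infty$ and $L^2$ the spread one is, so the choice of norm changes which shape counts as nearer.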

Page 7:

Two simple metrics

• $L^1$-metric, leading to the set of measurable subsets $S \subset \mathbb{R}^2$:

infinitesimally: $\|a\| = \int_C |a(s)|\,ds$

leads to path length: $\ell(\{C_t\}) = \int_0^1 \left( \int_{C_t} |a(s,t)|\,ds \right) dt = \text{area swept out}$

leads to global metric: $d(S_1, S_2) = \mathrm{area}(S_1 \,\triangle\, S_2)$

• Fréchet metric (like Hausdorff metric) on continuous maps $f: S^1 \to \mathbb{R}^2$:

infinitesimally: $\|a\| = \sup_{s \in C} |a(s)|$

leads to path length: $\ell(\{C_t\}) = \int_0^1 \sup_{s \in C_t} |a(s,t)|\,dt \ge \text{max. dist. moved}$

leads to global metric: $d(f_1, f_2) = \inf_{\text{diffeo } h}\, \sup_{x \in S^1} |f_1(x) - f_2(h(x))|$

Neither metric has good geodesics: balls are like boxes. But they stack well, so one can measure ‘volume’ (K. Leonard, using $\epsilon$-entropy).
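The global $L^1$ distance, $d(S_1, S_2) = \mathrm{area}(S_1 \triangle S_2)$, is easy to approximate once shapes are given as membership tests. The sketch below assumes a uniform grid of sample points; the helper names `symmetric_difference_area` and `square` are hypothetical.

```python
def symmetric_difference_area(in_s1, in_s2, lo, hi, n):
    """Approximate area(S1 symmetric-difference S2) by sampling the cell
    centers of an n-by-n grid covering the square [lo, hi] x [lo, hi]."""
    h = (hi - lo) / n
    count = 0
    for i in range(n):
        x = lo + (i + 0.5) * h
        for j in range(n):
            y = lo + (j + 0.5) * h
            # A cell counts when exactly one of the two shapes contains it.
            if in_s1(x, y) != in_s2(x, y):
                count += 1
    return count * h * h

def square(x0, y0):
    """Membership test for the unit square with lower-left corner (x0, y0)."""
    return lambda x, y: x0 <= x < x0 + 1 and y0 <= y < y0 + 1

# Two unit squares offset by 0.5: the symmetric difference is two 0.5 x 1 strips.
area = symmetric_difference_area(square(0.0, 0.0), square(0.5, 0.0), -1.0, 3.0, 400)
```

Sliding one square over the other sweeps out exactly this area, which is the path-length interpretation stated on the slide.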

Page 8:

Advantages of L2 metrics

• Can define gradient flows of a function.

• Have a beautiful theory of locally unique geodesics, thus a warping of one shape to another.

• Can define the Riemannian curvature tensor. If non-positive, have a good theory of means.

• Can expect a theory of diffusion, of Brownian motion, hence Gaussian-type measures and their mixtures.

WHERE DO THEY COME FROM?

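1. Local: $\|a \cdot n_{\partial R}\|_R^2 = \int_{\partial R} a(s)^2\,ds$

2. Global: use $\mathcal{G}_n$ = group of diffeomorphisms of $\mathbb{R}^n$ and $\mathcal{S}_n \approx \mathcal{G}_n$ / (subgroup fixing the unit ball); take the quotient of a metric on $\mathcal{G}_n$

3. Conformal (n=2): use $\mathcal{S}_2$ ~ diffeos of $S^1$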

Page 9:

The simplest local metric

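$$\|a \cdot n_{\partial R}\|_R^2 = \int_{\partial R} a(s)^2\,ds$$

• Define the function L(R) = (n−1)-volume of the boundary; then $\nabla L$ = heat eqn vector field $a = \kappa_{\partial R}$, i.e.

$$\frac{d}{dt} L(R_t) = \int_{\partial R_t} \kappa(s)\,a(s)\,ds = \langle \kappa, a \rangle_{R_t}$$

BUT

These are inf. dim. Riemannian manifolds for which forward geodesics, curvature, etc. work fine; but the 2-point boundary value problem fails, and the geodesic equation is hyperbolic.

$$\inf_{\{C_t\}} \big(\text{length of path from } C_0 \text{ to } C_1\big) = \inf_{\{C_t\}} \int_0^1 \left( \int_{C_t} a^2\,ds \right)^{1/2} dt = 0 \;!!$$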

Page 10:

The Michor metric – the simplest Riemannian metric

• Infinitesimally:

$$\|a\|_C^2 = \int_C a(s)^2 \left(1 + A\,\kappa(s)^2\right) ds$$

• Globally:

$$\ell(\{C_t\}) = \int_0^1 \left( \int_{C_t} a(s,t)^2 \left(1 + A\,\kappa(s,t)^2\right) ds \right)^{1/2} dt$$

• If A=0, get ‘geodesic spray’, positive curvature, but infimum of path lengths is zero

• If A>0, the metric controls the change in length(C) and gives interesting geodesics, not always unique.
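The infinitesimal norm $\|a\|_C^2 = \int_C a(s)^2 (1 + A\,\kappa(s)^2)\,ds$ can be discretized on a closed polygon. This is a minimal sketch under that assumption; the vertex-based curvature estimate and the name `michor_norm_squared` are illustrative choices, not the numerical method behind the figures in the talk.

```python
import math

def michor_norm_squared(points, a, A):
    """Discretize the integral of a(s)^2 (1 + A kappa(s)^2) ds on a closed polygon.

    points: list of (x, y) vertices; a: normal speed per vertex; A: weight >= 0.
    """
    n = len(points)
    total = 0.0
    for k in range(n):
        xp, yp = points[k - 1]
        x, y = points[k]
        xn, yn = points[(k + 1) % n]
        lp = math.hypot(x - xp, y - yp)
        ln = math.hypot(xn - x, yn - y)
        ds = 0.5 * (lp + ln)              # arc length assigned to this vertex
        # Curvature = |change of unit tangent| / arc length.
        kx = ((xn - x) / ln - (x - xp) / lp) / ds
        ky = ((yn - y) / ln - (y - yp) / lp) / ds
        kappa_sq = kx * kx + ky * ky
        total += a[k] ** 2 * (1.0 + A * kappa_sq) * ds
    return total

# Circle of radius 2 moving outward at unit speed, with A = 1.
N = 720
circle = [(2 * math.cos(2 * math.pi * k / N), 2 * math.sin(2 * math.pi * k / N))
          for k in range(N)]
value = michor_norm_squared(circle, [1.0] * N, A=1.0)
```

For a circle of radius r and a = 1 the exact value is $2\pi r\,(1 + A/r^2)$, so the curvature term makes motion of small, highly curved pieces expensive, which is what prevents path lengths from collapsing to zero when A > 0.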

Page 11:

A geodesic triangle

Consider an ellipse rotated through 0, 60 and 120 degrees. These 3 ellipses form a triangle in $\mathcal{S}_2$. Using the metrics with A=1.0, 0.1 and 0.01, we join them with 3 geodesics. The path in $\mathcal{S}_2$ forming one of these edges is shown in the first row for the 3 metrics. The second row shows the whole triangle of shapes. When A=1, we have negative curvature, the angle sum is 102° and the shapes on the edges fall back towards the unit circle. When A=.01, we have positive curvature, the angle sum is 207° and the edge shapes tend towards parallelograms, away from the unit circle.

Page 12:

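2nd Riemannian metric: diffeomorphisms of $\mathbb{R}^n$

Write the space of shapes as a homogeneous space w.r.t. $\mathrm{Diff}(\mathbb{R}^n)$:

$$\mathcal{S}_n \cong \mathcal{G}_n / (\text{subgp fixing unit sphere})$$

Put an inner product on vector fields:

$$\|v\|_L^2 = \int_{\mathbb{R}^n} Lv \cdot v \; dx_1 \cdots dx_n, \qquad L = \text{pos. def. diff. op.}$$

typically $L = (I - \Delta)^m$, for which $\|v\|_L^2$ is a weighted sum of the terms $\int |D^\alpha v|^2\,dx$ with $|\alpha| \le m$,

and define the length of a path $\{\theta_t\}$ in $\mathcal{G}_n$ by:

$$\int_0^1 \|v_t\|_L\,dt, \qquad v_t(x) = \frac{\partial \theta_t}{\partial t}\big(\theta_t^{-1}(x)\big)$$

If $\{\theta_t\}$ is thought of as a fluid flow, then $v_t(x)$ is its velocity.

This is right invariant,

$$d(\theta, \eta) = d(\theta \circ \eta^{-1}, I),$$

so $\mathcal{G}_n$ acts on the right by isometries, so $\mathcal{G}_n \to \mathcal{S}_n$ is a ‘Riemannian submersion’.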

Page 13:

A fluid flow equation gives geodesics in these metrics

Geodesics on $\mathcal{S}_n$ are geodesics on $\mathcal{G}_n$ starting, and hence continuing, perpendicular to the cosets of the subgroup fixing the unit sphere.

If $\{\theta_t\}$ is thought of as a fluid flow, then $v_t$ = velocity, $u_t = Lv_t$ = ‘momentum’ in this metric.

Geodesics now are solutions to a regularized compressible form of Euler’s equation (Arnold, Vishik):

$$\frac{\partial u_t}{\partial t} + (v_t \cdot \nabla)\,u_t + \mathrm{div}(v_t)\,u_t + \sum_i (u_t)_i\,\nabla\big((v_t)_i\big) = 0$$

Treating u as a section of $\Omega^1 \otimes \Omega^n$ (so <u,v> makes intrinsic sense), the equation says the momentum u is constant along the flow given by v.

The equation is linear in u, so u can be a generalized function!

To get geodesics for shapes, u should be supported on their boundary.

To get geodesics for finite sets of points, u should be a sum of delta functions.

Page 14:

A geodesic in the diffeomorphism group whose momentum is concentrated at four points (Younes, Miceli)

This is a fast and efficient tool for dealing with shapes via finite sets of landmark points (Kendall).
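When the momentum is a sum of delta functions at landmarks $q_i$ with weights $p_i$, the geodesic equation reduces to a finite system: $\dot q_i = \sum_j K(q_i - q_j)\,p_j$ and $\dot p_i = -\sum_j (p_i \cdot p_j)\,\nabla K(q_i - q_j)$, where K is the Green's function of L. The sketch below swaps in a Gaussian kernel for K and uses explicit Euler steps; both are assumptions made for brevity, as are the name `shoot`, the value of `SIGMA` and the four sample landmarks.

```python
import math

SIGMA = 1.0  # kernel width: an assumption, not a value from the talk

def kernel(dx, dy):
    """Gaussian stand-in for the Green's function of L."""
    return math.exp(-(dx * dx + dy * dy) / (2 * SIGMA ** 2))

def shoot(q, p, dt=0.001, steps=1000):
    """Integrate the landmark geodesic equations with explicit Euler steps.

    q: list of landmark positions (x, y); p: list of momenta (px, py).
    """
    q = [tuple(v) for v in q]
    p = [tuple(v) for v in p]
    n = len(q)
    for _ in range(steps):
        dq, dp = [], []
        for i in range(n):
            vx = vy = fx = fy = 0.0
            for j in range(n):
                dx, dy = q[i][0] - q[j][0], q[i][1] - q[j][1]
                k = kernel(dx, dy)
                # Velocity at q_i: sum_j K(q_i - q_j) p_j.
                vx += k * p[j][0]
                vy += k * p[j][1]
                # Force: -(p_i . p_j) grad K, with grad K(x) = -(x / sigma^2) K(x).
                dot = p[i][0] * p[j][0] + p[i][1] * p[j][1]
                fx += dot * k * dx / SIGMA ** 2
                fy += dot * k * dy / SIGMA ** 2
            dq.append((vx, vy))
            dp.append((fx, fy))
        q = [(x + dt * vx, y + dt * vy) for (x, y), (vx, vy) in zip(q, dq)]
        p = [(px + dt * fx, py + dt * fy) for (px, py), (fx, fy) in zip(p, dp)]
    return q, p

# Four landmarks on a square with a rotating pattern of initial momenta.
q0 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
p0 = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
q_end, p_end = shoot(q0, p0)
```

A single landmark travels in a straight line at constant velocity, and the pairwise forces cancel, so the total momentum $\sum_i p_i$ is preserved. A symplectic or higher-order integrator would be a better choice for long geodesics.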

Page 15:

Elasticity – the solid approach

• In a liquid, particles have no memory of where they were initially. In a solid, they do.

• On the group $\mathcal{G}_n$, consider the strain matrix:

$$S = (D\theta)^t \circ (D\theta), \qquad D\theta = \text{jacobian of } \theta, \quad \theta \in \mathcal{G}_n$$

• Depending on the material, there will be a strain energy density $e(S(x), x)$ and a total strain energy

$$E(\theta) = \int_R e(S(x), x)\,dx,$$

a function on $\mathcal{G}_n$, measuring distance from I.

• e may be inhomogeneous and anisotropic or it may have a generic form, a function of $s_1, s_2$, the singular values of $D\theta$ (the explicit formula is not recoverable from this transcript).

• Minimizing the strain energy plus an image mismatch term was used for the face warping.
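For a planar warp, the strain matrix and the singular values $s_1, s_2$ of the Jacobian have a closed form. The following is a minimal sketch for a single 2×2 Jacobian; the function names are illustrative, and no particular energy density is assumed.

```python
import math

def strain_matrix(D):
    """S = D^T D for a 2x2 Jacobian D given as ((a, b), (c, d))."""
    (a, b), (c, d) = D
    return ((a * a + c * c, a * b + c * d),
            (a * b + c * d, b * b + d * d))

def singular_values(D):
    """Singular values of D: square roots of the eigenvalues of S."""
    (s11, s12), (_, s22) = strain_matrix(D)
    mean = 0.5 * (s11 + s22)
    radius = math.hypot(0.5 * (s11 - s22), s12)
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))

stretch = ((2.0, 0.0), (0.0, 0.5))              # stretch x, squeeze y
angle = math.radians(30)
rotation = ((math.cos(angle), -math.sin(angle)),
            (math.sin(angle), math.cos(angle)))
s_stretch = singular_values(stretch)
s_rotation = singular_values(rotation)
```

A rotation has $s_1 = s_2 = 1$, so any energy depending only on the singular values ignores rigid motion and penalizes only true deformation.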

Page 16:

2 examples of image warping via diffeomorphisms

Faces warped using strain energy as prior (P.Hallinan, “2 and 3D Patterns of the Face”)

Heart warped using geodesic length as prior (Miller et al)

Page 17:

Shape via complex analysis

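• In dimension 2 only, can replace the real coordinates x,y by a single complex coordinate z=x+iy. A basic construction from complex analysis puts nearly unique global coordinates on any shape:

$$\forall R \subset \mathbb{C},\ \exists\, \phi : \Delta \xrightarrow{\ \approx\ } R \text{ conformal, and unique up to } \phi \mapsto \phi \circ A, \quad A(z) = \frac{az + b}{\bar{b}z + \bar{a}} \text{ a Möbius map of } \Delta$$

• Apply this twice, to the inside and outside of a shape:

$$\phi_0 : \Delta \xrightarrow{\ \approx\ } R, \qquad \phi_\infty : (\mathbb{C} - \Delta) \cup \{\infty\} \xrightarrow{\ \approx\ } (\mathbb{C} - R) \cup \{\infty\}, \quad \text{with } \phi_\infty(\infty) = \infty,\ \phi_\infty'(\infty) \text{ pos. real}$$

• The fingerprint of the shape is:

$$\psi(z) = \phi_\infty^{(-1)}\big(\phi_0(z)\big), \quad z \in \text{the circle } S^1$$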
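The ambiguity in the interior map is a Möbius map $A(z) = (az + b)/(\bar{b}z + \bar{a})$ with $|a| > |b|$, which sends the unit disk to itself and the unit circle to itself. A minimal sketch that checks this numerically follows; the coefficients and the name `mobius` are arbitrary illustrative choices.

```python
import cmath

def mobius(a, b):
    """Return A(z) = (a z + b) / (conj(b) z + conj(a)); requires |a| > |b|."""
    if abs(a) <= abs(b):
        raise ValueError("need |a| > |b| for a map of the unit disk onto itself")
    return lambda z: (a * z + b) / (b.conjugate() * z + a.conjugate())

A = mobius(2 + 1j, 0.5 - 1j)
# Images of 12 equally spaced points of the unit circle.
circle_images = [A(cmath.exp(2j * cmath.pi * k / 12)) for k in range(12)]
center_image = A(0)
```

Every image point has modulus 1, while the center 0 moves to another interior point, which is why the interior map may carry 0 to any point of the shape.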

Page 18:

Two examples

An ellipse and a kidney-shaped object, with the conformal parametrization of their interiors and exteriors marked. The interior map has been chosen to carry 0 to 0 (but it may take 0 to any other interior point).

Page 19:

Good things about the conformal approach

• The fingerprint determines the shape up to translation and scaling, i.e. there is a bijection:

$$\mathrm{Diff}(S^1) / \text{Möbius maps} \;\cong\; \mathcal{S}_2 / \text{transl.+scalings}$$

• We get an action of the group $\mathrm{Diff}(S^1)$ on the space of shapes, hence can approximate shapes via words in elementary diffeomorphisms.

• There is a unique Riemannian metric on $\mathcal{S}_2$ for which the group action is made of isometries. (Note analogy with ordinary distances on $\mathbb{R}^n$.)

• The curvature of this metric is non-positive, so we have unique geodesics, means, etc.

• This representation leads to a simple construction of an axis, hence to a decomposition of $\mathcal{S}_2$ into cells.

Page 20:

2 geodesics in the W-P metric (E.Sharon)

Page 21:

Axes: the royal road to shape description

Humans perceive shapes as having ‘parts’, linked in a combinatorial pattern. The axis gives this (and even bit length compression, Leonard 2004).

Page 22:

Axes in three dimensions

Axes in 3D are trickier. Yan Cao’s definition:

Given a shape S, or even an arbitrary measure $\mu$ with support S, consider the functional on potential axes $\Gamma$:

$$E(\Gamma) = \int_S \mathrm{dist}(x, \Gamma)^p\,d\mu(x) + \alpha \cdot \mathrm{length}(\Gamma)$$
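The functional $E(\Gamma) = \int_S \mathrm{dist}(x,\Gamma)^p\,d\mu(x) + \alpha \cdot \mathrm{length}(\Gamma)$ can be evaluated directly when the measure is a weighted point sample and the candidate axis is a polyline. Both are assumptions of this sketch, and the names `dist_to_segment` and `axis_energy` are hypothetical.

```python
import math

def dist_to_segment(x, a, b):
    """Euclidean distance from point x to the segment from a to b (any dimension)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ax = [xi - ai for ai, xi in zip(a, x)]
    denom = sum(c * c for c in ab)
    # Parameter of the closest point, clamped to the segment.
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(u * v for u, v in zip(ax, ab)) / denom))
    return math.sqrt(sum((xi - (ai + t * c)) ** 2 for xi, ai, c in zip(x, a, ab)))

def axis_energy(samples, weights, gamma, p, alpha):
    """E(Gamma) = sum_i w_i dist(x_i, Gamma)^p + alpha * length(Gamma)."""
    segments = list(zip(gamma[:-1], gamma[1:]))
    data = sum(w * min(dist_to_segment(x, a, b) for a, b in segments) ** p
               for x, w in zip(samples, weights))
    length = sum(math.dist(a, b) for a, b in segments)
    return data + alpha * length

# Two sample points at height 1 above a unit-length axis along the x-direction.
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
weights = [0.5, 0.5]
gamma = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
energy = axis_energy(samples, weights, gamma, p=2, alpha=0.1)
```

The parameter $\alpha$ trades fidelity against axis length: a longer axis lowers the distance term but pays for its own length.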

An anatomical example:

Page 23:

Axes via the fingerprint

• Minima of $\psi'$ correspond (roughly) to points on C nearest to $\phi_0(0)$.

$$M_A = \arg\min\,(\psi \circ A)', \qquad \text{complex axis}(C) \overset{\text{def}}{=} \{\phi_0(A(0)) : \#M_A > 1\}$$

Combinatorial structure of the axis leads to a natural cell decomposition of $\mathcal{S}_2$.