Representation and Detection of Shapes in Images

Representation and Detection

of Shapes in Images

Pedro F. Felzenszwalb

Department of Computer Science

University of Chicago

Introduction

Study of shape is a recurring theme in computer vision.

• It is important for object recognition.

• Useful for model-based segmentation.

Talk outline

1. Representation of objects using triangulated polygons.

2. Finding a non-rigid object in an image.

3. Learning a non-rigid shape model from examples.

4. Shape grammar for modeling generic objects.

Triangulated polygon representation

• Consider two-dimensional objects with piecewise-smooth

boundaries and no holes.

• Approximate object using a simple polygon P .

• A triangulation is a decomposition of P into triangles defined

by non-crossing line segments connecting vertices of P .

object polygon triangulation

Constrained Delauney triangulation

Natural decomposition of object into parts, closely related to the

medial axis transform.

Definition.The constrained Delauney triangulation contains the

edge ab if a is visible to b and there is a circle through a and b

that contains no vertex c visible to ab.

Structural properties

There are two graphs associated with a triangulated polygon.

Dual graph of triangulated simple polygon is a tree.

Graphical structure of triangulation is a 2-tree.

2-trees

A 2-tree is a graph defined by a set of “triangles” (3-cliques)

connected along edges in a tree structure.

Every 2-tree admits a perfect elimination order :

After eliminating the first i

vertices, the next one is in a

single triangle.

Two-dimensional shape

How does a triangulation help describe the shape of a polygon?

We say to objects have the same shape if they are related by a

similarity transformation (translation, rotation, scale change).

Different objects with the same shape.

Shape of triangulated polygons

Say we have an object defined by the location of n vertices V ,

and G = (V, E) is a 2-tree.

We can pick any shape for each “triangle” in G and obtain a

unique shape for the object.

⇒ Shape of object is a point in M1 × · · · × Mn−2, where each M

is a space of triangle shapes.

Deforming triangulated polygons

(x1, . . . , xi, . . . , xn−2) → (x1, . . . , x′i, . . . , xn−2)

The rabbit ear can be bent by changing

the shape of a single triangle.

Finding non-rigid objects in images

Deformable template matching

Find “optimal” map from a template to the image.

f : →

Quality of f depends on

• how much the template is deformed.

• correlation between the deformed template and image data.

Prior work

Most methods are based on local search techniques and depend

on initialization near the right answer.

[Grenander et al.] Deformable boundary models.

[Widrow] Rubber masks.

[Cootes et al.] Active shape and appearance models.

Few methods are based on global optimization.

[Amit, Kong] Sparse landmarks.

[Coughlan et al.] Open curves.

Major challenges

• Represent both the boundary and the interior of objects.

• Capture natural shape deformations.

• Efficient matching algorithms:

– Search for the optimal deformation - global minimum of

cost function.

– Initialization-free, invariant to rigid motions and scale.

Matching triangulated polygons

• Let T be a triangulation of a simple polygon P .

• Consider continuous maps f :P →R2

that are affine when restricted to each triangle.

– f takes triangles in the model to triangles in the image.

– f is defined by where it sends the vertices of P .

• Quality of f is given by a sum of costs per triangle,

C(f, I) =∑

Ct(ft, I)

Example cost function

• Deformation cost for each triangle.

• Shape boundary is attracted to high gradient areas.

C(f, I) =∑

def(ft) − λ∫∂P

‖(∇I ◦ f)(s) × f ′(s)‖

‖f ′(s)‖ds

def(ft) measures how far ft is from a similarity transformation.

Combinatorial optimization

• Restrict f(vi) to be a location li in a grid G.

• Dynamic programming algorithm using elimination order.

• Running time is O(n|G|3), where n is number of vertices.

At step i, find optimal location for

vi as a function of locations of two

other vertices.

Matching results

Nosy images.

Multiple instances.

Contrast with local search method

Initialization:

Result:

Learning models

Given multiple examples of an object,

• Pick a common triangulation.

• Learn shape model for each triangle (mean and variance).

Procrustes analysis

Given sample object configurations {X1, . . . , Xm}.

Assume each configuration comes from a mean by perturbation

ǫ and similarity transformation g,

Xj = gj(µ + ǫj)

Procrustes mean shape:

µ̂ = argmin||µ||=1

m∑j=1

||µ − gjXj||2

Learning triangulated models

• Use Procrustes analysis for each possible triangle.

– Local instead of global rigidity assumption.

• Select triangulation that can best represent examples.

– There is a cost associated with each triangle.

(corresponding to how rigid it is)

– Can use dynamic programming to select optimal one.

Local versus global rigidity assumption

Procrustes mean Triangulated model

A few of a total of 40 samples of hands from multiple people.

Typical deformations of learned model

Random samples from the prior model for hands.

Shape grammar

Finding objects in images without using specific models.

• Build a generic shape model to capture

properties of “typical” objects.

• Gestalt laws: continuity, smoothness,

closure, symmetry, etc.

Shape grammar

• Define a stochastic growth process that generates triangu-

lated polygons using a context free grammar.

• The grammar can generate any triangulated polygon, but it

tends to generate shapes with certain properties.

– Gives a generic model for objects.

– Captures which are good interpretations of a scene.

Shape tokens

t0 t1 t2

• t0 corresponds to ends of branches.

• sequences of t1 correspond to branches.

• t2 connects multiple branches together.

Growing a shape

• A root triangle of type i is selected with probability pi.

• Each dotted edge “grows” into a new triangle.

• Repeat until there are no more dotted edges.

• For each triangle type there is a distribution over its shape.

Structure

• Growth process always terminates when p2 < p0.

• Expected number of triangles is E[n] = 2/(p0 − p2).

• Expected number of branching points is E[j] = 2p2/(p0−p2).

• Together E[n] and E[j] define p0, p1 and p2.

• Typically p1 ≫ p0, p2.

Geometry

If t1 is skinny and isosceles,

• shapes have smooth boundaries almost everywhere.

• each branch tends to have axial symmetry.

Random shapes

Finding objects in images

• Grammar generates random objects, without taking into ac-

count image data.

• Look for triangulated polygons that align with image features

and would likely be generated by the grammar.

– These are good hypotheses for objects in the scene.

• This process generates possible interpretations of a scene

and a separate process can verify each one.

Example results

Summary

• Representation of objects using triangulated polygons.

– Detecting deformable objects.

– Learning deformable shape models from examples.

– Detecting generic objects.

• Future work:

– Richer grammars, with intermediate structure.

– Shape classification.

Representation and Detection of Shapes in Images

Documents

Transcript of Representation and Detection of Shapes in Images

Survey on Representation Techniques for Malware Detection ...

Deep Sliding Shapes for Amodal 3D Object Detection in RGB ... · [10]). To make good use of the depth information, Sliding Shapes [25] was proposed to slide a 3D detection window

Representation and Detection of Deformable Shapescs.brown.edu/people/pfelzens/papers/shapesJ.pdf · Deformable Shapes Pedro F. Felzenszwalb Department of Computer Science The University

Face detection using representation learningfcv2011.ulsan.ac.kr/files/announcement/656/Face detection... · 2018. 11. 24. · Face detection using representation learning Shu Zhann,

The detection and representation of foreground vs. background objects

Detection of Linear and Circular Shapes in Image Analysis · 2009-10-07 · Detection of Linear and Circular Shapes in Image Analysis ... with an application in biology. The problem

Representation and Detection of Shapes in Images Pedro F ...people.cs.uchicago.edu/~pff/papers/pff.pdf · Characterizing the shape of a speciﬁc rigid object is not a particularly

Road signs shapes detection based on Sobel phase analysis · only circular shapes are detected; the algorithm presented in [15] allows triangles, rectangles and octagons detection,

Representation and Detection of Shapes in Images Pedro F. …cs.brown.edu/people/pfelzens/papers/pff.pdf · meant to be a ﬁnal answer to an application, but the ﬁrst step in solving

TextSnake: A Flexible Representation for Detecting …TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes Shangbang Long 1,2, Jiaqiang Ruan , Wenjie Zhang ,

RepPoints: Point Set Representation for Object Detection · Figure 1. RepPoints is a new representation for object detection that consists of a set of points which indicate the spatial

Shape-Representation and Shape Similarity Dr. Rolf Lakaemper Part 1: Shapes.

Representation, Segmentation and Matching of 3D Visual Shapes ...

Neuromimetic Sound Representation for Percept Detection and …dz/pbpslist/eurasip03_vp.pdf · 2005. 12. 19. · Neuromimetic Sound Representation for Percept Detection and Manipulation

Learning Object-Wise Semantic Representation for Detection in …€¦ · Learning Object-Wise Semantic Representation for Detection in Remote Sensing Imagery Chengzheng Li1, Chunyan

Overlapping Community Detection via Edge-Space Representation · Overlapping Community Detection via Edge-Space Representation Fengjie Chen Yikun Zhang ... community detection algorithms

COMPUTATIONAL REPRESENTATION OF PROTEIN SEQUENCES …etd.lib.metu.edu.tr/upload/12606997/index.pdf · COMPUTATIONAL REPRESENTATION OF PROTEIN SEQUENCES FOR HOMOLOGY DETECTION AND

Deep Sliding Shapes for Amodal 3D Object Detection in …dss.cs.princeton.edu/paper.pdf · · 2016-06-19Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images ... Object

3D ShapeNets: A Deep Representation for Volumetric Shapes · 3D ShapeNets: A Deep Representation for Volumetric Shapes ... To train our 3D deep learning model, ... surface correspondence

Functional maps: A Flexible Representation of Maps Between Shapes