Computer Vision Segmentation Marc Pollefeys COMP 256 Some slides and illustrations from D. Forsyth,...
-
date post
22-Dec-2015 -
Category
Documents
-
view
217 -
download
1
Transcript of Computer Vision Segmentation Marc Pollefeys COMP 256 Some slides and illustrations from D. Forsyth,...
ComputerVision
Segmentation
Marc PollefeysCOMP 256
Some slides and illustrations from D. Forsyth, T. Darrel,...
ComputerVision
Aug 26/28 - Introduction
Sep 2/4 Cameras Radiometry
Sep 9/11 Sources & Shadows Color
Sep 16/18 Linear filters & edges
(hurricane Isabel)
Sep 23/25 Pyramids & Texture Multi-View Geometry
Sep30/Oct2 Stereo Project proposals
Oct 7/9 Tracking (Welch) Optical flow
Oct 14/16 - -
Oct 21/23 Silhouettes/carving (Fall break)
Oct 28/30 - Structure from motion
Nov 4/6 Project update Proj. SfM
Nov 11/13 Camera calibration Segmentation
Nov 18/20 Fitting Prob. segm.&fit.
Nov 25/27 Matching templates (Thanksgiving)
Dec 2/4 Matching relations Range data
Dec ? Final project
Tentative class schedule
ComputerVision Segmentation and Grouping
• Motivation: not information is evidence
• Obtain a compact representation from an image/motion sequence/set of tokens
• Should support application
• Broad theory is absent at present
• Grouping (or clustering)– collect together
tokens that “belong together”
• Fitting– associate a model
with tokens– issues
• which model?• which token goes
to which element?• how many
elements in the model?
ComputerVision General ideas
• tokens– whatever we need to
group (pixels, points, surface elements, etc., etc.)
• top down segmentation– tokens belong
together because they lie on the same object
• bottom up segmentation– tokens belong
together because they are locally coherent
• These two are not mutually exclusive
ComputerVision
Why do these tokens belong together?
ComputerVision
ComputerVision
Basic ideas of grouping in humans
• Figure-ground discrimination– grouping can be
seen in terms of allocating some elements to a figure, some to ground
– impoverished theory
• Gestalt properties– elements in a
collection of elements can have properties that result from relationships (Muller-Lyer effect)
• gestaltqualitat
– A series of factors affect whether elements should be grouped together
• Gestalt factors
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
Technique: Shot Boundary Detection
• Find the shots in a sequence of video– shot boundaries
usually result in big differences between succeeding frames
• Strategy:– compute interframe
distances– declare a boundary
where these are big
• Possible distances– frame differences– histogram differences– block comparisons– edge differences
• Applications:– representation for
movies, or video sequences
• find shot boundaries• obtain “most
representative” frame
– supports search
ComputerVision
Technique: Background Subtraction
• If we know what the background looks like, it is easy to identify “interesting bits”
• Applications– Person in an office– Tracking cars on a
road– surveillance
• Approach:– use a moving
average to estimate background image
– subtract from current frame
– large absolute values are interesting pixels
• trick: use morphological operations to clean up pixels
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision Segmentation as clustering
• Cluster together (pixels, tokens, etc.) that belong together
• Agglomerative clustering– attach closest to
cluster it is closest to– repeat
• Divisive clustering– split cluster along best
boundary– repeat
• Point-Cluster distance– single-link clustering– complete-link
clustering– group-average
clustering
• Dendrograms– yield a picture of
output as clustering process continues
ComputerVision Simple clustering algorithms
ComputerVision
ComputerVision K-Means
• Choose a fixed number of clusters• Choose cluster centers and point-cluster allocations
to minimize error
• can’t do this by search, because there are too many possible allocations.
x j i2
jelements of i'th cluster
iclusters
ComputerVision
K-means clustering using intensity alone and color alone
Image Clusters on intensity Clusters on color
ComputerVision
K-means using color alone, 11 segments
Image Clusters on color
ComputerVision
K-means usingcolor alone,11 segments.
ComputerVision
K-means using colour andposition, 20 segments
ComputerVision Graph theoretic clustering
• Represent tokens using a weighted graph.– affinity matrix
• Cut up this graph to get subgraphs with strong interior links
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision
ComputerVision Measuring Affinity
Intensity
Texture
Distance
aff x, y exp 12 i
2
I x I y 2
aff x, y exp 12 d
2
x y
2
aff x, y exp 12 t
2
c x c y 2
ComputerVision Scale affects affinity
ComputerVision Eigenvectors and cuts
• Simplest idea: we want a vector a giving the association between each element and a cluster
• We want elements within this cluster to, on the whole, have strong affinity with one another
• We could maximize
• But need the constraint
• This is an eigenvalue problem - choose the eigenvector of A with largest eigenvalue
aTAa
aTa1
ComputerVision Example eigenvector
points
matrix
eigenvector
ComputerVision
ComputerVision More than two segments
• Two options– Recursively split each side to get a tree,
continuing till the eigenvalues are too small
– Use the other eigenvectors
ComputerVision More than two segments
ComputerVision Normalized cuts
• Current criterion evaluates within cluster similarity, but not across cluster difference
• Instead, we’d like to maximize the within cluster similarity compared to the across cluster difference
• Write graph as V, one cluster as A and the other as B
• Maximize
• i.e. construct A, B such that their within cluster similarity is high compared to their association with the rest of the graph
assoc(A,A)assoc(A,V )
assoc(B,B)assoc(B,V )
ComputerVision
ComputerVision
ComputerVision Normalized cuts
• Write a vector y whose elements are 1 if item is in A, -b if it’s in B
• Write the matrix of the graph as W, and the matrix which has the row sums of W on its diagonal as D, 1 is the vector with all ones.
• Criterion becomes
• and we have a constraint
• This is hard to do, because y’s values are quantized
miny
yT D W yyTDy
yTD1 0
ComputerVision
ComputerVision Normalized cuts
• Instead, solve the generalized eigenvalue problem
• which gives
• Now look for a quantization threshold that maximises the criterion --- i.e all components of y above that threshold go to one, all below go to -b
maxy yT D W y subject to yTDy 1
D W yDy
ComputerVision
Figure from “Image and video segmentation: the normalised cut framework”, by Shi and Malik, copyright IEEE, 1998
ComputerVision
Figure from “Normalized cuts and image segmentation,” Shi and Malik, copyright IEEE, 2000
ComputerVision Image parsing:
Unifying Segmentation, Detection and RecognitionTu, Chen, Yuille & Zhu ICCV’03 (Marr prize!)
ComputerVision Image parsing
• Generative models for each classsome random samples for 5 and H
some random face samples (PCA model)
ComputerVision DDMCMC
Data Driven Markov Chain Monte Carlo
ComputerVision Image parsing
ComputerVision Image parsing
ComputerVision Next class:
Fitting
Reading: Chapter 15