Segmentation. (template) matching - UVic.caaalbu/computer vision 2010/L20... · 2010-02-26 · 19...
Segmentation. (template) matching
Announcements
- Midterm: March 3, 2010, in class
- Duration: 45 min
- Weight: 20% of final course mark
- Closed book, closed notes; calculator allowed
- Practice midterm posted. Solution will be posted on Sunday.
- Midterm review: March 2
- Office hours: March 2, 3-5 pm, or by appointment
Reading
6.4 Adelson and Bergen, Image Pyramids
Template matching
Assumes you know what you are looking for (supervised process)
Copyright ©2008, Thomson Engineering, a division of Thomson Learning Ltd.
Comparing neighborhoods to templates
By linear filtering
Correlation can be viewed as a dot product between two vectors: the pattern and the image region under consideration.
With both vectors normalized, the dot product is maximal (maximum correlation) when the pattern is very similar to the corresponding image region.
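The dot-product view of correlation can be sketched in a few lines of NumPy. This is a minimal illustration, not the method from the slides: the function name `normalized_correlation`, the zero-mean normalization, and the random test image are all assumptions made for the example.

```python
import numpy as np

def normalized_correlation(image, template):
    """Return a map of normalized correlation scores in [-1, 1]."""
    th, tw = template.shape
    t = template - template.mean()            # zero-mean pattern vector
    t_norm = np.linalg.norm(t)
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    scores = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()          # zero-mean region vector
            denom = np.linalg.norm(p) * t_norm
            if denom > 0:
                # Dot product of the two unit vectors (cosine of their angle).
                scores[r, c] = np.sum(p * t) / denom
    return scores

rng = np.random.default_rng(0)
image = rng.random((20, 20))
template = image[5:9, 7:12].copy()   # cut the pattern out of the image itself
scores = normalized_correlation(image, template)
best = np.unravel_index(np.argmax(scores), scores.shape)
print(best, scores[best])
```

Because the template was cut from the image, the highest score should sit at the position it was taken from. The explicit double loop is kept for readability; it is slow on large images.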
Optimality matching criterion evaluation
Challenge
We need scaled representations because the details of interest can occur at various scales
A bar in the big images is a hair on the zebra’s nose; in smaller images, a stripe; in the smallest, the animal’s nose
Aliasing
- We can't shrink an image by taking every second pixel
- If we do, characteristic errors appear
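A one-row example may make the aliasing error concrete. It is a toy sketch: the stripe pattern and the simple two-pixel average used as a low-pass filter are assumptions chosen for brevity, not the filter used in the lecture.

```python
# One image row of vertical stripes, each one pixel wide: 0, 1, 0, 1, ...
row = [x % 2 for x in range(16)]

# Naive shrinking: keep every second pixel.
naive = row[::2]

# Low-pass first (average each pixel pair), then keep one value per pair.
smoothed = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]

print(naive)     # the stripes vanish: every kept pixel is 0
print(smoothed)  # uniform 0.5, the average brightness of the stripes
```

The naive result reports a black row even though half the original pixels are white, while filtering before subsampling preserves the average brightness.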
Detecting a target pattern
- The target pattern may appear at any scale
- We want to use only convolutions
- Construct copies of the target at several expanded scales, and convolve them with the original image
Detecting a target pattern (cont’d)
Or maintain a fixed scale of the target and change the scale of the image
Detecting a target pattern
Both approaches should give equivalent results
The difference is in the computational complexity
A convolution with the target pattern expanded in scale by a factor s requires $s^4$ times as many operations as the convolution with the image reduced in scale by s (s = 2..32).
A series of images at iteratively reduced scales will form a pyramid.
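The cost difference between the two approaches can be checked with a rough operation count. The sketch below assumes direct (non-FFT) convolution costing one multiply-add per kernel sample per image pixel, and ignores border effects. The side lengths `N` and `K` are hypothetical. Under this model, expanding the template multiplies the kernel area by s^2, and reducing the image divides the pixel count by s^2, so the ratio is s^4.

```python
def direct_convolution_ops(image_side, kernel_side):
    """Multiply-adds for direct 2-D convolution: one kernel's worth per pixel."""
    return image_side ** 2 * kernel_side ** 2

N, K = 512, 8   # hypothetical image and template side lengths
for s in (2, 4, 8):
    expand_template = direct_convolution_ops(N, K * s)   # larger template, full image
    reduce_image = direct_convolution_ops(N // s, K)     # original template, smaller image
    print(s, expand_template // reduce_image)
```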
A Gaussian Pyramid
Levels of the Gaussian pyramid expanded to the size of the original image
How to construct a Gaussian pyramid
At each iteration:
- Filter with a low-pass filter (e.g., a Gaussian with constant σ, or another low-pass kernel)
- Subsample
The filter weights form the correlation kernel. The same kernel is used to produce all levels in the pyramid. The kernel should be small and separable.
$G_l = \mathrm{Reduce}(G_{l-1})$
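The filter-then-subsample iteration can be sketched as follows. The 5-tap kernel [1, 4, 6, 4, 1]/16 and the reflected border handling are assumptions made for the example; the slides only require a small, separable low-pass kernel.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1]) / 16.0   # assumed 5-tap low-pass kernel

def blur(image):
    """Separable low-pass filter: filter every row, then every column."""
    padded = np.pad(image, 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")

def reduce(image):
    """One pyramid step: low-pass filter, then keep every second pixel."""
    return blur(image)[::2, ::2]

def gaussian_pyramid(image, levels):
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(reduce(pyramid[-1]))   # G_l = Reduce(G_{l-1})
    return pyramid

image = np.random.default_rng(0).random((64, 64))
pyramid = gaussian_pyramid(image, 4)
print([level.shape for level in pyramid])
```

Separability is what keeps the filter cheap: two 1-D passes with 5 taps each replace a single 2-D pass with 25 taps.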
The Laplacian Pyramid
A series of band-pass images obtained by subtracting each Gaussian (low-pass) pyramid level, expanded to matching size, from the next-lower (finer) level in the pyramid.
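A minimal sketch of this subtraction is given below. It assumes a 5-tap kernel [1, 4, 6, 4, 1]/16, and an `expand` step that inserts zeros and interpolates with the same kernel. These helper names and choices are illustrative, not taken from the slides.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1]) / 16.0   # assumed 5-tap low-pass kernel

def blur(image):
    padded = np.pad(image, 2, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")

def reduce(image):
    return blur(image)[::2, ::2]

def expand(image, shape):
    """Upsample to `shape`: insert zeros, then interpolate with the kernel."""
    up = np.zeros(shape)
    up[::2, ::2] = image
    return 4.0 * blur(up)   # factor 4 compensates for the inserted zeros

def laplacian_pyramid(image, levels):
    gaussian = [image.astype(float)]
    for _ in range(levels - 1):
        gaussian.append(reduce(gaussian[-1]))
    # Each band is a finer level minus the expanded coarser level.
    bands = [fine - expand(coarse, fine.shape)
             for fine, coarse in zip(gaussian, gaussian[1:])]
    return bands + [gaussian[-1]]   # keep the coarsest low-pass level

def reconstruct(pyramid):
    image = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        image = band + expand(image, band.shape)
    return image

image = np.random.default_rng(0).random((64, 64))
pyramid = laplacian_pyramid(image, 4)
print(np.allclose(reconstruct(pyramid), image))
```

Keeping the coarsest Gaussian level alongside the band-pass images is what makes the original image recoverable by adding the levels back together.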
Flexible templates
- The target might not be exactly the same in every image
- Idea: break the template into pieces and try to match each piece
- Position the entire template over the neighborhood, then search around the position of each subtemplate for the best match
- The overall match is the best combined match for all subtemplates
From B. Morse, http://morse.cs.byu.edu/650/
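The subtemplate search might be sketched as follows. The matching score (sum of squared differences), the search radius, and the way pieces are stored as `(row_offset, col_offset, subtemplate)` tuples are all assumptions made for this illustration.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for a perfect match."""
    return float(np.sum((a - b) ** 2))

def flexible_match(image, pieces, top, left, radius=2):
    """Score the template at (top, left), letting each piece shift by up to `radius`.

    Returns the summed best SSD over all pieces (lower is better).
    """
    total = 0.0
    for dr, dc, sub in pieces:
        h, w = sub.shape
        best = np.inf
        for sr in range(-radius, radius + 1):
            for sc in range(-radius, radius + 1):
                r, c = top + dr + sr, left + dc + sc
                if 0 <= r <= image.shape[0] - h and 0 <= c <= image.shape[1] - w:
                    best = min(best, ssd(image[r:r + h, c:c + w], sub))
        total += best
    return total

rng = np.random.default_rng(1)
image = rng.random((30, 30))
# Two hypothetical subtemplates cut from the image. The second is recorded
# one pixel away from its true position, mimicking a deformed target.
pieces = [(0, 0, image[10:14, 10:14].copy()),
          (0, 5, image[10:14, 14:18].copy())]
rigid = flexible_match(image, pieces, 10, 10, radius=0)
flexible = flexible_match(image, pieces, 10, 10, radius=2)
print(rigid, flexible)
```

With `radius=0` the template behaves rigidly and the displaced piece matches poorly. Allowing a small search around each piece recovers a perfect combined match.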
Evaluation issues in segmentation
Reading 6.5
Evaluating segmentation techniques
As in other areas of vision, evaluation is a problem
We need to know what the correct result is
We need some way to compare the result of each algorithm to the ideal situation
From Tony Pridmore’s Lecture Notes on Image Processing and Interpretation, University of Nottingham
Evaluating segmentation
Possible approaches:
- Ground truth: get a ‘correct’ segmentation and compare the results of the algorithm to it
- Evaluations based on region properties: we want the regions to be uniform, and adjacent regions to be different
- Evaluating robustness: if we deliberately introduce noise or partially mask the object of interest, how will the segmentation result be affected?
Adapted from Tony Pridmore’s Lecture Notes on Image Processing and Interpretation, University of Nottingham
Ground truth segmentation
Typically used in medical imaging applications
Issue: human segmentations can vary significantly
How do we build a ground truth segmentation from several human segmentations?
Statistical ground truth
Ground truth in other applications
Experiment: segmenting an image by hand
Ground truth in other applications
Human segmentation of complex scenes is subjective; it depends on visual representation among many other things
Are human segmentations consistent?
Comparing image segmentations
- Suppose we have an agreed ground truth
- We need to compare two sets of regions
- What does it mean for two sets of regions to be similar?
- Is the number of regions important?
- Does it matter if two regions are merged or if one is split in two?
[Figure: ground truth partition. Which result is better?]
Segmentation of complex scenes
Current measures of similarity: region-based
- Applicable when there is only one region of interest in the image
- Region-based: mutual overlap
Limits:
- Does not give any information about boundaries
- Conceals quality differences between segmentations
- Assumes a closed contour
- Large errors for small objects
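The slide's exact formula for mutual overlap is not preserved in this transcript. The sketch below assumes one common definition, 2|A∩B| / (|A| + |B|), computed on binary masks. The helper `square` and the test shapes are hypothetical. The example also illustrates the last limit in the list: the same one-pixel shift costs a small object far more than a large one.

```python
import numpy as np

def mutual_overlap(result, truth):
    """Overlap of two binary masks, 2|A∩B| / (|A| + |B|); 1.0 means identical."""
    result, truth = result.astype(bool), truth.astype(bool)
    total = result.sum() + truth.sum()
    if total == 0:
        return 1.0   # both masks empty: treated here as perfect agreement
    return float(2.0 * np.logical_and(result, truth).sum() / total)

def square(size, top, left, side):
    """Binary mask of a filled square inside a size x size image."""
    mask = np.zeros((size, size), dtype=bool)
    mask[top:top + side, left:left + side] = True
    return mask

# The same one-pixel shift applied to a large and to a small object.
large = mutual_overlap(square(40, 10, 11, 20), square(40, 10, 10, 20))
small = mutual_overlap(square(40, 10, 11, 2), square(40, 10, 10, 2))
print(large, small)
```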
Current measures of similarity: border-based
Hausdorff distance
Idea: consider the two contours as two finite sets of points

$$h(A,B) = \max_{a \in A} \left( \min_{b \in B} d(a,b) \right)$$

$$H(A,B) = \max\left( h(A,B),\, h(B,A) \right)$$
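Both definitions translate almost directly into code. The sketch below assumes Euclidean distance for d(a, b) and represents each contour as a list of (x, y) tuples. The two point sets are made up for illustration.

```python
from math import dist

def directed_hausdorff(A, B):
    """h(A, B): the largest distance from a point of A to its nearest point in B."""
    return max(min(dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

A = [(0, 0), (1, 0), (2, 0)]
B = [(0, 0), (1, 0), (2, 0), (2, 3)]
print(directed_hausdorff(A, B))  # 0.0: every point of A also lies in B
print(directed_hausdorff(B, A))  # 3.0: (2, 3) is 3 away from (2, 0)
print(hausdorff(A, B))           # 3.0
```

The example shows why the symmetric form takes the maximum of both directions: h(A, B) alone would report a perfect match even though B contains an outlying point.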
Unsupervised evaluation
Haralick and Shapiro:
- Regions should be uniform and homogeneous with respect to some characteristic(s)
- Adjacent regions should have significant differences with respect to the characteristic on which they are uniform
- Region interiors should be simple and without holes
- Boundaries should be simple, not ragged, and be spatially accurate
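The first two criteria can be turned into simple numbers without any ground truth. The sketch below is one possible reading, not a measure given in the slides: it assumes intensity is the characteristic, uses within-region variance for uniformity and the smallest difference of region means for adjacent-region contrast, and expects at least two regions. The other two criteria (holes, boundary shape) are not covered.

```python
import numpy as np

def region_uniformity(image, labels):
    """Mean within-region intensity variance (lower = more uniform)."""
    return float(np.mean([image[labels == k].var() for k in np.unique(labels)]))

def adjacent_contrast(image, labels):
    """Smallest difference of mean intensity between 4-adjacent regions."""
    means = {int(k): image[labels == k].mean() for k in np.unique(labels)}
    pairs = set()
    # Compare each pixel with its right neighbour, then with the one below.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        touching = a != b
        pairs.update(zip(a[touching].tolist(), b[touching].tolist()))
    return float(min(abs(means[p] - means[q]) for p, q in pairs))

# Toy image: dark left half, bright right half.
image = np.full((8, 8), 0.2)
image[:, 4:] = 0.8
good = np.zeros((8, 8), dtype=int)
good[:, 4:] = 1          # split along the true edge
bad = np.zeros((8, 8), dtype=int)
bad[4:, :] = 1           # split across the true edge

print(region_uniformity(image, good), adjacent_contrast(image, good))
print(region_uniformity(image, bad), adjacent_contrast(image, bad))
```

The segmentation that follows the true edge scores low variance and high contrast. The one that cuts across it scores the opposite on both measures.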