Transcript of Lecture 4: Feature Extraction II
Searching for Patches
• sliding window
• image pyramid
Local feature points
• Advantages of this approach
• Harris point detector
Scale invariant region selection
• Searching over scale.
• Characteristic scale.
Story so far
• We have introduced methods to describe the appearance of an image or image patch via a feature vector (SIFT, HOG, etc.).
• For patches of similar appearance, the computed feature vectors should be similar, and dissimilar if the patches differ in appearance.
• Feature vectors are designed to be invariant to common transformations that superficially change the pixel appearance of the patch.
Next Problem
We have a reference image patch which is described by a feature vector fr.
Face Finder: Training
• Positive examples:
  – Preprocess ~1,000 example face images into 20 × 20 inputs
  – Generate 15 “clones” of each with small random rotations, scalings, translations, reflections
• Negative examples:
  – Test net on 120 known “no-face” images
Given a novel image, identify the patches in this image that correspond to the reference patch.
We have already explored one part of the problem: a patch from the novel image generates a feature vector fn. If ‖fr − fn‖ is small, then this patch can be considered an instance of the texture pattern represented by the reference patch.

However, which and how many different image patches do we extract from the novel image?
Remember..
The sought-after image patch can appear at:

• any spatial location in the image
• any size (the size of an imaged object depends on its distance from the camera)
• multiple locations

Variation in position and size ⇒ multiple detection windows.
Sliding Window Technique
Must examine patches centred at many different pixel locations and sizes.

Naive Option: Exhaustive search using the original image

for j = 1:n_s
    n = n_min + j*n_step
    for x = 0:x_max
        for y = 0:y_max
            Extract image patch centred on pixel (x, y) of size n×n.
            Rescale it to the size of the reference patch.
            Compute feature vector f.
This is computationally intensive, especially if it is expensive to compute f, as f could be calculated up to n_s × x_max × y_max times.

Also, if n is large then it is probably very costly to compute f.
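The nested search above can be sketched in code (a minimal Python/NumPy illustration, not from the lecture; the toy grey-level histogram stands in for a real descriptor such as SIFT or HOG, and the threshold `tau` is an assumed value):

```python
import numpy as np

def compute_feature(patch, bins=16):
    """Toy descriptor: normalized grey-level histogram (stand-in for SIFT/HOG)."""
    h, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)

def sliding_window_search(image, f_ref, n_min=8, n_step=4, n_s=3, stride=4, tau=0.5):
    """Exhaustive search over positions and window sizes (the naive option)."""
    H, W = image.shape
    hits = []
    for j in range(n_s):                       # loop over window sizes
        n = n_min + j * n_step
        half = n // 2
        for y in range(half, H - half, stride):
            for x in range(half, W - half, stride):
                patch = image[y - half:y + half, x - half:x + half]
                f = compute_feature(patch)     # in practice: rescale patch first
                if np.linalg.norm(f_ref - f) < tau:   # small distance => match
                    hits.append((x, y, n))
    return hits

# Usage: find a bright square in a dark image.
img = np.zeros((40, 40)); img[10:20, 10:20] = 1.0
f_ref = compute_feature(img[10:20, 10:20])
matches = sliding_window_search(img, f_ref)
```

The triple loop makes the cost of the descriptor computation the dominant factor, exactly as noted above.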
What about a multi-scale search?
Scale Pyramid Option
Construct an image pyramid that represents an image at several resolutions. Then either:

• Use the coarse scale to highlight promising image patches and then explore only these areas in more detail at the finer resolutions. (Quick, but may miss the best image patches.)
• Visit every pixel in the fine resolution image as a potential centre pixel, but simulate changing the window size by applying the same window size to the different images in the pyramid.
We will now review the construction of the image pyramid.
Image Scale Pyramids
Naive Subsampling
Pick every other pixel in both directions.

(Figure: smoothed image vs. naive subsampling.)
Pick every other pixel in both directions
Subsampling Artifacts
Particularly noticeable in high-frequency areas, such as on the hair. The lowest resolution level represents the highest one very poorly.
Synthetic Example
1-D aliasing: a high-frequency signal sampled at a much lower frequency.

2-D aliasing: a sampling frequency lower than that of the signal yields a poor representation.

⇒ Must remove high frequencies before sub-sampling.
Under-sampling
• An under-sampled signal can look just like a lower-frequency signal!
• It can equally look like a higher-frequency signal!

Aliasing: higher-frequency information can appear as lower-frequency information.

Input signal (Matlab):

x = 0:.05:5; imagesc(sin((2.^x).*x))

The Matlab output shows aliasing patterns where there are not enough samples. The same effect appears in video.

(Slide credit: S. Seitz)
2-D Aliasing
A high-frequency signal sampled at a rate lower than its own frequency yields a poor representation. Therefore we must remove high frequencies before sub-sampling.
Aliasing Summary
• We can’t shrink an image by taking every second pixel, due to sampling below the Nyquist rate.
• If we do, characteristic errors appear, such as:
  – jaggedness in line features
  – spurious highlights
  – appearance of frequency patterns not present in the original image
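The 1-D case is easy to verify numerically (a small NumPy sketch, not from the lecture: a 9 Hz sine sampled at 8 Hz, i.e. well below the Nyquist rate of 18 Hz, produces exactly the same samples as a 1 Hz sine):

```python
import numpy as np

fs = 8.0                           # sampling rate: 8 samples per second
n = np.arange(32)                  # sample indices
t = n / fs                         # sample times

high = np.sin(2 * np.pi * 9 * t)   # 9 Hz signal, above the Nyquist limit fs/2 = 4 Hz
low  = np.sin(2 * np.pi * 1 * t)   # 1 Hz signal

# The two are indistinguishable at these samples: 9 Hz aliases to 9 - fs = 1 Hz.
print(np.allclose(high, low))      # True
```

This is exactly the "higher frequency appearing as lower frequency" effect listed above.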
Gaussian Pyramid
• Gaussian smooth the image.
• Pick every other pixel in both directions.

Loss of details but no artifacts: no aliasing, but details are lost as high frequencies are progressively removed.

Laplacian Pyramid

Each level of the Laplacian pyramid is the difference between the corresponding level and the next higher level of the Gaussian pyramid.

Laplacian reconstruction:
• Upsample by interpolation.
• Add the upsampled image and the difference image.

P. Burt and E. Adelson, The Laplacian Pyramid as a Compact Image Code, IEEE Transactions on Communications, 1983.

• Pixels in the difference images are relatively uncorrelated.
• Their values are concentrated around zero.

Entropy coding and quantization ⇒ effective compression through shortened and variable-length code words.
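Both constructions can be sketched in a few lines (Python/SciPy; `sigma=1` and nearest-neighbour upsampling are illustrative choices, not the lecture's exact parameters, which would use interpolation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(img, levels=3, sigma=1.0):
    """Smooth, then pick every other pixel in both directions."""
    pyr = [img]
    for _ in range(levels - 1):
        smoothed = gaussian_filter(pyr[-1], sigma)   # remove high frequencies first
        pyr.append(smoothed[::2, ::2])               # subsample by 2
    return pyr

def upsample(img):
    """Nearest-neighbour upsampling (interpolation would be used in practice)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(gauss_pyr):
    """Each level: difference between a Gaussian level and the upsampled next level."""
    return [g - upsample(g_next) for g, g_next in zip(gauss_pyr, gauss_pyr[1:])] \
           + [gauss_pyr[-1]]                         # keep the coarsest level as-is

def reconstruct(lap_pyr):
    """Invert the pyramid: upsample and add back the difference images."""
    img = lap_pyr[-1]
    for diff in reversed(lap_pyr[:-1]):
        img = diff + upsample(img)
    return img

img = np.random.default_rng(0).random((64, 64))
g = gaussian_pyramid(img)
l = laplacian_pyramid(g)
print(np.allclose(reconstruct(l), img))   # True: reconstruction is exact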
Images in the Pyramid
No aliasing, but details are lost as high frequencies are progressively removed.
Scaled representation advantages
• Find template matches at all scales
– Template size is constant, but image size changes
• Efficient search for correspondence
  – look at coarse scales, then refine with finer scales
  – much less cost, but may miss the best match
• Examine all levels of detail
  – Find edges with different amounts of blur
  – Find textures with different spatial frequencies
Back to Sliding Windows
Summary: Sliding Windows
Pros
• Simple to implement
• Good feature choices critical
• Past successes for certain classes
• Good detectors available
Cons/Limitations
• High computational complexity
  – 250,000 locations × 30 orientations × 4 scales = 30,000,000 evaluations!
  – Puts constraints on the type of classifiers we can use
  – If training binary detectors independently, cost increases linearly with the number of classes
• With so many windows, the false positive rate had better be low
Limitations of sliding windows
• Not all objects are box-shaped.
Limitations of sliding windows
• Non-rigid, deformable objects are not captured well with representations assuming a fixed 2D structure, or one must assume a fixed viewpoint.
• Objects with less-regular textures are not captured well with holistic appearance-based descriptions.
Limitations of sliding windows
• If considering windows in isolation, context is lost.
Sliding Window Detector’s View
Limitations of sliding windows
• In practice, often entails a large, cropped training set: lots of training data is needed.
• Using a global appearance description can lead to sensitivity to partial occlusions.
Another Approach
Local invariant features: The Motivation
• Global representations have major limitations
• Instead, describe and match only local regions
• Increased robustness to:
  – occlusions
  – intra-category variation
  – articulation
Applications: Image Matching
Applications: Image Stitching

Procedure
• Detect feature points in both images.
• Find corresponding pairs.
• Use these pairs to align the images.
Applications: Object Recognition
• Image content is transformed into local features that are invariant to translation, rotation, and scale.
• Goal: verify whether they belong to a consistent configuration.
General Approach
1. Find a set of distinctive keypoints.
2. Define a region around each keypoint.
3. Extract and normalize the region content.
4. Compute a local descriptor from the normalized regions.
5. Match local descriptors.
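Step 5, matching the local descriptors, can be sketched as nearest-neighbour search over feature vectors (a minimal NumPy illustration, not the lecture's code; the distance threshold `tau` is an assumed value):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, tau=0.5):
    """For each descriptor in A, find the closest descriptor in B and keep
    the pair if the distance ||f_a - f_b|| is small enough."""
    matches = []
    for i, fa in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - fa, axis=1)   # distances to all of B
        j = int(np.argmin(dists))
        if dists[j] < tau:
            matches.append((i, j))
    return matches

# Usage: three descriptors in A; B contains noisy copies of two of them.
rng = np.random.default_rng(1)
A = rng.random((3, 8))
B = np.vstack([A[1] + 0.01 * rng.standard_normal(8),
               A[0] + 0.01 * rng.standard_normal(8)])
print(match_descriptors(A, B))   # expect (0, 1) and (1, 0) among the matches
```

Real systems replace the linear scan with a k-d tree or approximate nearest-neighbour index, but the matching criterion is the same.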
(Figure: regions A1–A3 and B1–B3 in two images, each described by a feature vector f_A, f_B — e.g. colour — compared via a similarity measure.)
Necessary conditions for this approach
Need a repeatable detector: detect the same point independently in both images. Otherwise there is no chance to match!
Necessary conditions for this approach
Need a reliable and distinctive descriptor: for each point, correctly recognize the corresponding one.
Descriptor Invariances

Geometric transformations.

An aside: Levels of Geometric Invariance
Photometric Transformations
Often modelled as a linear transformation: Scaling + offset.
Requirements
Repeatable and Accurate Region Extraction
• Invariant to translation, rotation, scale changes
• Robust or covariant to out-of-plane (≈affine) transformations
• Robust to lighting variations, noise, blur, quantization
Locality: features are local, therefore robust to occlusion and clutter.
Quantity: we need a sufficient number of regions to cover the object.
Distinctiveness: the regions should contain interesting structure.
Efficiency: close to real-time performance.
Keypoint Localization
Goals:
• Repeatable detection
• Precise localization
• Interesting content
Look for two-dimensional signal changes.
Find Corners
Key property: in the region around a corner, the image gradient has two or more dominant directions.
Corners are repeatable and distinctive
Corners as distinctive interest points
Design criteria:
• We should easily recognize the point by looking through a small window (locality).
• Shifting the window in any direction should give a large change in intensity (good localization).
Harris Detector Formulation
Change in the intensity for a shift (u, v):
E(u, v) = ∑_{(x,y)∈W} w(x, y) [I(x+u, y+v) − I(x, y)]²
where the window/weight mask w(x, y) is either:
• uniform: 1 inside the window, 0 outside, or
• a Gaussian, weighting pixels by their distance from the window centre.
Harris Detector Formulation
This measure of change can be approximated by:

E(u, v) ≈ (u, v) M (u, v)ᵀ

where M is a 2×2 matrix computed from image derivatives:

M = ∑_{(x,y)∈W} w(x, y) ( I_x²     I_x I_y )
                        ( I_x I_y  I_y²    )

  = ∑_{(x,y)∈W} w(x, y) (I_x, I_y)ᵀ (I_x, I_y)

where I_x is the gradient with respect to x.
Harris Detector Formulation
M is computed from these image derivatives:
(Figure: image I and its derivative images I_x, I_y, I_x I_y.)
What does this matrix reveal?
Consider an axis-aligned corner:
What does this matrix reveal?
For an axis-aligned corner:
M = ( ∑ I_x²     ∑ I_x I_y )  =  ( λ1  0  )
    ( ∑ I_x I_y  ∑ I_y²    )     ( 0   λ2 )

• The dominant gradient directions align with the x and y axes.
• If either λ1 or λ2 is close to 0, then this is not a corner, so look for locations where both are large.
What if we have a corner that is not aligned with the image axes?
General case: since M is symmetric, we can write

M = R⁻¹ ( λ1  0  ) R
        ( 0   λ2 )

We can visualize M as an ellipse with axis lengths determined by its eigenvalues and orientation determined by R.
(Figure: the ellipse E(u, v) = const, with axis length λ_max^(−1/2) along the direction of fastest intensity change and λ_min^(−1/2) along the direction of slowest change.)
Classification of image points using the eigenvalues of M:
• “Corner”: λ1 and λ2 are both large, λ1 ~ λ2; E increases in all directions.
• “Edge”: λ1 >> λ2, or λ2 >> λ1.
• “Flat” region: λ1 and λ2 are small; E is almost constant in all directions.
Corner response function
Measure of corner response:
R = det(M)− κ (trace(M))2
where
det(M) = λ1λ2
trace(M) = λ1 + λ2
κ is a constant whose value is determined empirically; values in the range [0.04, 0.06] are typically used.
• R depends only on the eigenvalues of M.
• R is large for a corner (R > 0).
• R is negative with large magnitude for an edge (R < 0).
• |R| is small for a flat region.
Window function
M = ∑_{(x,y)∈W} w(x, y) ( I_x²     I_x I_y )
                        ( I_x I_y  I_y²    )

Option 1: Uniform window. Sum over a square window:

M = ∑_{(x,y)∈W} ( I_x²     I_x I_y )
                ( I_x I_y  I_y²    )

Problem: not rotationally invariant.

Option 2: Smooth with a Gaussian. The Gaussian already performs a weighted sum:

M = G(σ) ∗ ( I_x²     I_x I_y )
           ( I_x I_y  I_y²    )

The result is rotation invariant.
Summary: Harris Detector
Compute second moment matrix
M(σ_I, σ_D) = G(σ_I) ∗ ( I_x²(σ_D)     I_x I_y(σ_D) )
                       ( I_x I_y(σ_D)  I_y²(σ_D)    )

Compute cornerness function

R = det(M(σ_I, σ_D)) − κ (trace(M(σ_I, σ_D)))²
  = (G ∗ I_x²)(G ∗ I_y²) − (G ∗ (I_x I_y))² − κ ((G ∗ I_x²) + (G ∗ I_y²))²
Perform non-maximum suppression
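The full recipe above can be sketched as follows (Python/SciPy; the σ values, κ = 0.05, the threshold and the 5×5 suppression neighbourhood are illustrative choices, not values from the lecture):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def harris_response(img, sigma_d=1.0, sigma_i=2.0, kappa=0.05):
    """Second-moment matrix entries, Gaussian-weighted, then
    R = det(M) - kappa * trace(M)^2."""
    smoothed = gaussian_filter(img, sigma_d)     # differentiation scale
    Iy, Ix = np.gradient(smoothed)               # image derivatives
    # Integration scale: Gaussian-weighted sums of the derivative products.
    Sxx = gaussian_filter(Ix * Ix, sigma_i)
    Syy = gaussian_filter(Iy * Iy, sigma_i)
    Sxy = gaussian_filter(Ix * Iy, sigma_i)
    det_m = Sxx * Syy - Sxy ** 2
    trace_m = Sxx + Syy
    return det_m - kappa * trace_m ** 2

def harris_points(img, threshold=1e-6):
    """Threshold R and keep only local maxima (non-maximum suppression)."""
    R = harris_response(img)
    local_max = (R == maximum_filter(R, size=5))
    ys, xs = np.nonzero((R > threshold) & local_max)
    return list(zip(xs, ys))

# Usage: the four corners of a bright square should be detected.
img = np.zeros((40, 40)); img[10:30, 10:30] = 1.0
pts = harris_points(img)
```

Edges give R < 0 and flat regions give R ≈ 0, so the threshold plus non-maximum suppression leaves only the corner responses.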
Original Image
Harris response
Harris: In action
Harris: Corner Response
Compute the corner response R.
Harris: Threshold
Find points with large corner response: R > threshold.
Harris: Local Maxima
Take only the points of local maxima of R.
Harris: Final Points
Harris Detector: Responses
Effect: A very precise corner detector.
Harris Detector: Properties
Corner response R is invariant to image rotation
Ellipse rotates but its shape (i.e. eigenvalues) remains the same.
Harris Detector: Properties
Not invariant to changes in scale: what is a corner at one scale can, when magnified, have all of its points classified as edges.
Scale Invariant Region Selection
From points to regions
• The Harris operator defines interest points
  – precise localization
  – high repeatability
• In order to compare those points, we need to compute a descriptor over a region.

How can we define such a region in a scale invariant manner?
• How can we detect scale invariant interest regions?
Scale Invariant Detection
Consider regions of different sizes around a point. Regions of corresponding sizes will look the same in both images.
Harris Detector: Some Properties
Partial invariance to affine intensity change:
• Only derivatives are used ⇒ invariance to an intensity shift I → I + b.
• Under an intensity scaling I → a·I, the response R is scaled too, so which points pass a fixed threshold can change — hence only partial invariance.
Naive Approach: Exhaustive Search

Compare descriptors while varying the patch size.
• Computationally inefficient, but possible for matching a pair of images
• Prohibitive for retrieval in large databases
• Prohibitive for recognition.
Scale Invariant Detection
Problem: How do we choose corresponding circles independently in each image?
Scale Invariant Detection
The problem: how do we choose corresponding circles
independently in each image?
Choose the scale of the “best” corner
Feature selection
Distribute points evenly over the image
Adaptive Non-maximal Suppression
• Desired: a fixed number of features per image
• Want them evenly distributed spatially
• Search over the non-maximal suppression radius
[Brown, Szeliski, Winder, CVPR’05]
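The idea can be sketched as follows (a NumPy illustration of adaptive non-maximal suppression; the simple "radius to the nearest stronger point" rule follows the spirit of Brown et al., but all values here are illustrative):

```python
import numpy as np

def anms(points, strengths, n_keep):
    """Keep the n_keep points with the largest suppression radius, i.e. the
    distance to the nearest point with a strictly greater strength."""
    points = np.asarray(points, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    radii = np.full(len(points), np.inf)          # the global maximum keeps r = inf
    for i in range(len(points)):
        stronger = strengths > strengths[i]
        if stronger.any():
            d = np.linalg.norm(points[stronger] - points[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii)[:n_keep]            # largest radii first
    return sorted(keep.tolist())

# Usage: three clustered points plus one weak but isolated point.
pts = [(0, 0), (1, 0), (0, 1), (50, 50)]
strength = [10.0, 9.0, 8.0, 1.0]
print(anms(pts, strength, n_keep=2))   # [0, 3]: strongest point + isolated point
```

Unlike a plain top-n by strength (which would keep two clustered points), ranking by suppression radius spreads the selected features over the image.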
Feature descriptors
We know how to detect points
Next question: how do we match them?

A point descriptor should be: 1. invariant, and 2. distinctive.
Automatic Scale Detection
Design a function on the region which is scale invariant (the same for corresponding regions, even if they are at different scales).
• Take a local maximum of this function.
• The region size at the maximum should be invariant to image scale.
(Figure: function response f vs. region size for Image 1 and Image 2, where Image 2 is at scale 0.5; the maxima occur at region sizes s1 and s2 with s1 = 0.5·s2.)
Important: this scale invariant region size is found in each image independently.
Automatic Scale Detection

Function responses for increasing scale (the scale signature).
Automatic Scale Detection
Normalize: Rescale to fixed size
What is a useful signature function? The Laplacian-of-Gaussian (= blob detection).
Characteristic Scale
We define the characteristic scale as the scale that produces the peak of the Laplacian response.
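The characteristic scale of a simple blob can be found numerically (a SciPy sketch, not from the lecture; blob-detection theory predicts that the scale-normalized Laplacian of a disc of radius r peaks at σ ≈ r/√2):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

# A bright disc of radius 8 on a dark background.
r = 8
yy, xx = np.mgrid[0:64, 0:64]
img = ((xx - 32) ** 2 + (yy - 32) ** 2 <= r ** 2).astype(float)

# Scale-normalized LoG response at the blob centre, over a range of scales.
sigmas = np.arange(2.0, 10.5, 0.5)
responses = [s ** 2 * gaussian_laplace(img, s)[32, 32] for s in sigmas]

# A bright blob gives a negative LoG response; the characteristic scale is
# where its magnitude peaks.
char_scale = sigmas[int(np.argmin(responses))]
print(char_scale)   # close to r / sqrt(2), roughly 5.7
```

The σ² factor is the scale normalization: without it the Laplacian response decays with scale and no meaningful maximum exists.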
Summary: Scale invariant detector
Given: two images of the same scene with a large scale difference between them.

Goal: find the same interest points independently in each image.

Solution: search for maxima of suitable functions in scale and in space (over the image).
What do we do now?
We know how to detect points.
Next question: How to match them?
A point descriptor should be: 1. invariant, and 2. distinctive. You know this; what are our options?
Today’s Programming Assignment
• Details available on the course website.
• You will write Matlab functions that compute Harris interest points at different scales and also estimate the dominant direction of the gradient around each interest point.
• Important: if you have a lecture clash, read the course website for a solution.
• Mail me about any errors you spot in the Exercise notes.
• I will notify the class about errors spotted and corrections via the course website and mailing list.