Lecture 4: Feature Extraction II


Transcript of Lecture 4: Feature Extraction II

Page 1: Lecture 4: Feature Extraction II

Lecture 4: Feature Extraction II

Searching for Patches

• sliding window
• image pyramid

Local feature points

• Advantages of this approach
• Harris point detector

Scale invariant region selection

• Searching over scale
• Characteristic scale

Page 2: Lecture 4: Feature Extraction II

Story so far

• We have introduced some methods to describe the appearance of an image or image patch via a feature vector (SIFT, HOG, etc.).

• For patches of similar appearance, the computed feature vectors should be similar; they should be dissimilar if the patches differ in appearance.

• Feature vectors are designed to be invariant to common transformations that superficially change the pixel appearance of the patch.

Page 3: Lecture 4: Feature Extraction II

Next Problem

We have a reference image patch which is described by a feature vector $f_r$.

[Figure: example reference patches, taken from a "Face Finder: Training" slide (positive examples: ~1,000 example face images preprocessed into 20 x 20 inputs, each with 15 "clones" under small random rotations, scalings, translations, and reflections; negative examples: 120 known "no-face" images). A reference patch is mapped to its feature vector: patch ⇒ $f_r$.]

Given a novel image, identify the patches in this image that correspond to the reference patch.

We have already explored one part of the problem.

A patch from the novel image generates a feature vector $f_n$. If $\|f_r - f_n\|$ is small, then this patch can be considered an instance of the texture pattern represented by the reference patch.

Page 4: Lecture 4: Feature Extraction II
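The matching test described here can be sketched in a few lines. This is only an illustration under stated assumptions: the Euclidean norm is used for $\|f_r - f_n\|$, and the descriptor values and the threshold `0.2` are made up for the example, not taken from the lecture.

```python
import math


def l2_distance(f_r, f_n):
    """Euclidean distance ||f_r - f_n|| between two equal-length feature vectors."""
    if len(f_r) != len(f_n):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_r, f_n)))


def is_match(f_r, f_n, threshold):
    """Declare a match when the distance is below a (hypothetical) threshold."""
    return l2_distance(f_r, f_n) < threshold


f_r = [0.2, 0.5, 0.1, 0.9]            # reference descriptor (made-up values)
f_similar = [0.25, 0.45, 0.1, 0.85]   # descriptor of a similar-looking patch
f_different = [0.9, 0.1, 0.8, 0.0]    # descriptor of a dissimilar patch

print(is_match(f_r, f_similar, 0.2), is_match(f_r, f_different, 0.2))
```

In practice the threshold would have to be tuned for the descriptor in use.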

However, which image patches, and how many of them, do we extract from the novel image?

Page 5: Lecture 4: Feature Extraction II

Remember..

The sought after image patch can appear at:

• any spatial location in the image

• any size (the size of an imaged object depends on its distance from the camera)

• multiple locations

[Figure: variation in position and size requires multiple detection windows.]

Page 6: Lecture 4: Feature Extraction II

Sliding Window Technique

Must examine patches centered at many different pixel locations and sizes.

Naive option: exhaustive search using the original image

    for j = 1:n_s
        n = n_min + j*n_step
        for x = 0:x_max
            for y = 0:y_max
                Extract the image patch centred on pixel (x, y) of size n × n.
                Rescale it to the size of the reference patch.
                Compute the feature vector f.

This is computationally intensive, especially if f is expensive to compute, since f may have to be computed up to n_s × x_max × y_max times.

Also, if n is large then it is probably very costly to compute f.
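To make the cost of the exhaustive search concrete, the loop can be written as a generator that enumerates every candidate window. This is a minimal sketch: `width` and `height` stand for the number of centre positions along each axis, and the patch extraction, rescaling, and feature computation are deliberately left out.

```python
def sliding_windows(width, height, n_min, n_step, n_s):
    """Yield (x, y, n) for every window centre (x, y) and window size n.

    Mirrors the exhaustive loop: n_s window sizes, each tried at every
    pixel location of a width x height image.
    """
    for j in range(1, n_s + 1):
        n = n_min + j * n_step
        for x in range(width):
            for y in range(height):
                yield x, y, n


def count_windows(width, height, n_s):
    """Number of feature vectors the exhaustive search has to compute."""
    return n_s * width * height


windows = list(sliding_windows(width=4, height=3, n_min=8, n_step=4, n_s=2))
print(len(windows), count_windows(4, 3, 2))
```

Even for this toy 4 x 3 image with two window sizes, 24 feature vectors are needed; the count grows linearly in each of the three factors.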

Page 7: Lecture 4: Feature Extraction II

What about a multi-scale search

Page 8: Lecture 4: Feature Extraction II

Scale Pyramid Option

Construct an image pyramid that represents an image at several resolutions. Then either

• Use the coarse scale to highlight promising image patches and then explore just these areas in more detail at the finer resolutions. (Quick, but may miss the best image patches.)

• Visit every pixel in the fine resolution image as a potential centre pixel, but simulate changing the window size by applying the same window size to the different images in the pyramid.

We will now review the construction of the image pyramid.

Page 9: Lecture 4: Feature Extraction II

Image Scale Pyramids

Page 10: Lecture 4: Feature Extraction II

Naive Subsampling


Pick every other pixel in both directions

Page 11: Lecture 4: Feature Extraction II

Subsampling Artifacts

Particularly noticeable in high-frequency areas, such as on the hair. The lowest resolution level represents the highest one very poorly.

Page 12: Lecture 4: Feature Extraction II

Synthetic Example

[Figure: synthetic example of naive subsampling artifacts.]

Page 13: Lecture 4: Feature Extraction II

Under-sampling

• Looks just like a lower frequency signal!

[Figure: good sampling versus bad sampling of a sinusoid. A further panel shows aliasing for the Matlab input signal x = 0:.05:5; imagesc(sin((2.^x).*x)), whose output has not enough samples at the high-frequency end; aliasing also occurs in video.]

Slide credit: S. Seitz

Page 14: Lecture 4: Feature Extraction II

Under-sampling

• Looks like a higher frequency signal!

Aliasing: higher frequency information can appear as lower frequency information.

Slide credit: S. Seitz
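A small numerical experiment shows the effect. The frequencies here (a 9 Hz sine sampled at 10 Hz) are chosen for illustration and are not from the slides; at those sample instants the 9 Hz signal coincides with a sign-flipped 1 Hz sine, so the two cannot be told apart from the samples alone.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, num_samples):
    """Sample sin(2*pi*f*t) at t = k / sample_rate for k = 0 .. num_samples - 1."""
    return [math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(num_samples)]


high = sample_sine(9.0, 10.0, 20)   # 9 Hz signal, sampled below its Nyquist rate (18 Hz)
low = sample_sine(1.0, 10.0, 20)    # 1 Hz signal, sampled adequately

# The undersampled 9 Hz sine equals the negated 1 Hz sine at every sample.
max_diff = max(abs(h + l) for h, l in zip(high, low))
print(max_diff)
```

The printed difference is at the level of floating-point rounding, which is the sense in which high-frequency content masquerades as low-frequency content.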

Page 15: Lecture 4: Feature Extraction II

2-D Aliasing

A high-frequency signal sampled at a frequency lower than that of the signal yields a poor representation.

Therefore we must remove high frequencies before sub-sampling.

Page 16: Lecture 4: Feature Extraction II

Aliasing Summary

• We can’t shrink an image by taking every second pixel, because this samples below the Nyquist rate.

• If we do, characteristic errors appear, such as

– jaggedness in line features
– spurious highlights
– appearance of frequency patterns not present in the original image

Page 17: Lecture 4: Feature Extraction II

Gaussian Pyramid

[Figure residue: slide panels on the Gaussian pyramid, the Laplacian pyramid (each level is the difference between the corresponding and the next higher level of the Gaussian pyramid; reconstruction upsamples by interpolation and adds the difference image), and entropy and quantization. Reference: P. Burt and E. Adelson, The Laplacian Pyramid as a Compact Image Code, IEEE Transactions on Communications, 1983.]

• Gaussian smooth the image

• Pick every other pixel in both directions
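The two steps can be sketched as follows. The 5-tap binomial kernel `[1, 4, 6, 4, 1] / 16` is an assumed stand-in for the Gaussian, and replicated borders are an arbitrary choice for this sketch; the checkerboard test image is the worst case for naive subsampling.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # binomial approximation to a Gaussian


def smooth_1d(values):
    """Convolve one row or column with KERNEL, replicating the border values."""
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)
            acc += weight * values[j]
        out.append(acc)
    return out


def smooth(image):
    """Separable smoothing: rows first, then columns."""
    rows = [smooth_1d(row) for row in image]
    cols = [smooth_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def downsample(image):
    """Pick every other pixel in both directions."""
    return [row[::2] for row in image[::2]]


def gaussian_pyramid(image, levels):
    """Level 0 is the input; each further level is smoothed, then subsampled."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(smooth(pyramid[-1])))
    return pyramid


# Checkerboard: the highest spatial frequency a pixel grid can hold.
checker = [[float((x + y) % 2) for x in range(16)] for y in range(16)]
naive = downsample(checker)           # every kept pixel is 0.0: the pattern vanishes
pyramid = gaussian_pyramid(checker, 3)
print(naive[3][3], pyramid[1][3][3])
```

Naive subsampling turns the checkerboard into a uniformly black image, whereas the smoothed level keeps the correct average grey value of 0.5 away from the border.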

Page 18: Lecture 4: Feature Extraction II

Images in the Pyramid

No aliasing, but details are lost as high frequencies are progressively removed.

Page 19: Lecture 4: Feature Extraction II

Scaled representation advantages

• Find template matches at all scales

– Template size is constant, but image size changes

• Efficient search for correspondence

– look at coarse scales, then refine with finer scales
– much less cost, but may miss the best match

• Examination of all levels of detail

– Find edges with different amounts of blur
– Find textures with different spatial frequencies

Page 20: Lecture 4: Feature Extraction II

Back to Sliding Windows

Page 21: Lecture 4: Feature Extraction II

Summary: Sliding Windows

Pros

• Simple to implement
• Good feature choices critical
• Past successes for certain classes
• Good detectors available

Cons/Limitations

• High computational complexity
– 250,000 locations x 30 orientations x 4 scales = 30,000,000 evaluations!
– Puts constraints on the type of classifiers we can use
– If training binary detectors independently, cost increases linearly with the number of classes
• With so many windows, the false positive rate had better be low
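The arithmetic behind the last two points can be checked directly. The per-window false positive rate of one in a million used below is a hypothetical figure chosen for illustration, not a number from the lecture.

```python
locations = 250_000
orientations = 30
scales = 4
evaluations = locations * orientations * scales  # classifier evaluations per image


def expected_false_positives(num_windows, false_positive_rate):
    """Expected number of false detections if every window is tested once."""
    return num_windows * false_positive_rate


fp = expected_false_positives(evaluations, 1e-6)  # hypothetical per-window rate
print(evaluations, fp)
```

Under that assumed rate, a single image would still produce about 30 false detections, which is why the per-window false positive rate has to be very low.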

Page 22: Lecture 4: Feature Extraction II

Limitations of sliding windows

• Not all objects are box-shaped.

Page 23: Lecture 4: Feature Extraction II

Limitations of sliding windows

• Non-rigid, deformable objects are not captured well by representations assuming a fixed 2D structure; otherwise a fixed viewpoint must be assumed.

• Objects with less regular textures are not captured well by holistic appearance-based descriptions.

Page 24: Lecture 4: Feature Extraction II

Limitations of sliding windows

• If considering windows in isolation, context is lost.

Sliding Window Detector’s View

Page 25: Lecture 4: Feature Extraction II

Limitations of sliding windows

• In practice, often entails large, cropped training set.

• Using a global appearance description can lead to sensitivity to partial occlusions.

[Figure captions: need lots of training data; partial occlusion is a problem.]

Page 26: Lecture 4: Feature Extraction II

Another Approach

Page 27: Lecture 4: Feature Extraction II

Local invariant features: The Motivation

• Global representations have major limitations

• Instead, describe and match only local regions

• Increased robustness to:

– occlusions
– intra-category variation
– articulation

Page 28: Lecture 4: Feature Extraction II

Applications: Image Matching

Page 29: Lecture 4: Feature Extraction II

Applications: Image Stitching

Page 30: Lecture 4: Feature Extraction II

Applications: Image Stitching

Procedure

• Detect feature points in both images.

Page 31: Lecture 4: Feature Extraction II

Applications: Image Stitching

Procedure

• Detect feature points in both images.

• Find corresponding pairs.

Page 32: Lecture 4: Feature Extraction II

Applications: Image Stitching

Procedure

• Detect feature points in both images.

• Find corresponding pairs.

• Use these pairs to align the images.

Page 33: Lecture 4: Feature Extraction II

Applications: Object Recognition

• Image content is transformed into local features that are invariant to translation, rotation, and scale.
• Goal: Verify if they belong to a consistent configuration.

[Figure: reference image, local features, test image.]

Page 34: Lecture 4: Feature Extraction II

General Approach

1. Find a set of distinctive keypoints.

2. Define a region around each keypoint.

3. Extract and normalize the region content.

4. Compute a local descriptor from the normalized regions.

5. Match local descriptors.

Page 35: Lecture 4: Feature Extraction II

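[Figure: regions $A_1, A_2, A_3$ in one image and $B_1, B_2, B_3$ in another; descriptors $f_A$ and $f_B$ (e.g. color) are computed from the regions and compared with a similarity measure.]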

Page 36: Lecture 4: Feature Extraction II

Necessary conditions for this approach

Need a repeatable detector

Detect the same point independently in both images.

In this case: no chance to match!

Page 37: Lecture 4: Feature Extraction II

Necessary conditions for this approach

Need a reliable and distinctive descriptor

For each point, correctly recognize the corresponding one.

Page 38: Lecture 4: Feature Extraction II

Descriptor Invariances

Geometric Transformations

Page 39: Lecture 4: Feature Extraction II

An aside: Levels of Geometric Invariance

Page 40: Lecture 4: Feature Extraction II

Photometric Transformations

Often modelled as a linear transformation: Scaling + offset.
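One common way to obtain invariance to such a change, $I \to aI + b$ with $a > 0$, is to normalize a patch to zero mean and unit variance before describing it. The sketch below illustrates that general idea; it is not a step prescribed by the slides, and the patch values, gain `a`, and offset `b` are made up.

```python
import math


def normalize(patch):
    """Zero-mean, unit-variance normalization of a flattened patch."""
    mean = sum(patch) / len(patch)
    centred = [p - mean for p in patch]
    std = math.sqrt(sum(c * c for c in centred) / len(patch))
    if std == 0.0:
        return centred  # constant patch: nothing to rescale
    return [c / std for c in centred]


patch = [10.0, 20.0, 30.0, 80.0]
a, b = 1.5, 12.0                        # hypothetical gain and offset
brighter = [a * p + b for p in patch]   # photometric change I -> a*I + b

difference = max(abs(u - v) for u, v in zip(normalize(patch), normalize(brighter)))
print(difference)
```

Subtracting the mean cancels the offset and dividing by the standard deviation cancels the gain, so both patches normalize to the same values up to rounding.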

Page 41: Lecture 4: Feature Extraction II

Requirements

Repeatable and Accurate Region Extraction

• Invariant to translation, rotation, scale changes

• Robust or covariant to out-of-plane (≈affine) transformations

• Robust to lighting variations, noise, blur, quantization

Locality: Features are local, therefore robust to occlusion and clutter.

Quantity: We need a sufficient number of regions to cover the object.

Distinctiveness: The regions should contain interesting structure.

Efficiency: Close to real-time performance.

Page 42: Lecture 4: Feature Extraction II

Keypoint Localization

Goals

• Repeatable detection
• Precise localization
• Interesting content

Look for two-dimensional signal changes.

Page 43: Lecture 4: Feature Extraction II

Find Corners

Key property: In the region around a corner, the image gradient has two or more dominant directions.

Corners are repeatable and distinctive

Page 44: Lecture 4: Feature Extraction II

Corners as distinctive interest points

Design criteria:

• We should easily recognize the point by looking through a small window (locality).
• Shifting a window in any direction should give a large change in intensity (good localization).

Page 45: Lecture 4: Feature Extraction II

Harris Detector Formulation

Change in the intensity for a shift (u, v):

$$E(u, v) = \sum_{(x,y) \in W} w(x, y)\, [I(x+u, y+v) - I(x, y)]^2$$

where the window/weight mask $w(x, y)$ is either:

• 1 inside the window and 0 outside, or
• a Gaussian.

Page 46: Lecture 4: Feature Extraction II

Harris Detector Formulation

This measure of change can be approximated by:

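$$E(u, v) \approx \begin{pmatrix} u & v \end{pmatrix} M \begin{pmatrix} u \\ v \end{pmatrix}$$

where $M$ is a $2 \times 2$ matrix computed from image derivatives:

$$M = \sum_{(x,y) \in W} w(x, y) \begin{pmatrix} I_x(x,y)^2 & I_x(x,y)\, I_y(x,y) \\ I_x(x,y)\, I_y(x,y) & I_y(x,y)^2 \end{pmatrix} = \sum_{(x,y) \in W} w(x, y) \begin{pmatrix} I_x(x,y) \\ I_y(x,y) \end{pmatrix} \begin{pmatrix} I_x(x,y) & I_y(x,y) \end{pmatrix}$$

where $I_x$ is the image gradient with respect to $x$, and $I_y$ the gradient with respect to $y$.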
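One way to see where this approximation comes from is a first-order Taylor expansion of the shifted intensity. The following is a sketch of that standard argument rather than a derivation taken from the slides; the derivatives are evaluated at $(x, y)$ throughout.

```latex
\begin{aligned}
I(x+u, y+v) &\approx I(x,y) + I_x(x,y)\,u + I_y(x,y)\,v \\
E(u,v) &\approx \sum_{(x,y)\in W} w(x,y)\,\bigl[I_x u + I_y v\bigr]^2 \\
&= \sum_{(x,y)\in W} w(x,y)\,\bigl(u^2 I_x^2 + 2uv\,I_x I_y + v^2 I_y^2\bigr) \\
&= \begin{pmatrix} u & v \end{pmatrix} M \begin{pmatrix} u \\ v \end{pmatrix}
\end{aligned}
```

The approximation is therefore only expected to hold for small shifts $(u, v)$.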

Page 47: Lecture 4: Feature Extraction II

Harris Detector Formulation

M is computed from these image derivatives:

[Figure: image $I$ and the derivative images $I_x$, $I_y$ and $I_x I_y$.]

Page 48: Lecture 4: Feature Extraction II

What does this matrix reveal?

Consider an axis-aligned corner:

Page 49: Lecture 4: Feature Extraction II

What does this matrix reveal?

For an axis-aligned corner:

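$$M = \begin{pmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$

• The dominant gradient directions align with the x and y axes.

• If either $\lambda_1$ or $\lambda_2$ is close to 0, then this is not a corner, so look for locations where both are large.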

What if we have a corner that is not aligned with the image axes?

Page 50: Lecture 4: Feature Extraction II

General case

Since $M$ is symmetric,

$$M = R^{-1} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} R$$

where $R$ is a rotation matrix. We can visualize $M$ as an ellipse with axis lengths determined by its eigenvalues and orientation determined by $R$.

[Figure: intensity change in a shifting window, eigenvalue analysis. The ellipse $E(u, v) = \text{const}$ has axes of length $(\lambda_{\max})^{-1/2}$ along the direction of the fastest change and $(\lambda_{\min})^{-1/2}$ along the direction of the slowest change, where $\lambda_1, \lambda_2$ are the eigenvalues of $M$.]

Page 51: Lecture 4: Feature Extraction II

Classification of image points using eigenvalues of $M$:

• “Corner”: $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$; $E$ increases in all directions.
• “Edge”: $\lambda_1 \gg \lambda_2$.
• “Edge”: $\lambda_2 \gg \lambda_1$.
• “Flat” region: $\lambda_1$ and $\lambda_2$ are small; $E$ is almost constant in all directions.

Page 52: Lecture 4: Feature Extraction II

Corner response function

Measure of corner response:

$$R = \det(M) - \kappa\, (\operatorname{trace}(M))^2$$

where

$$\det(M) = \lambda_1 \lambda_2, \qquad \operatorname{trace}(M) = \lambda_1 + \lambda_2$$

$\kappa$ is an empirically determined constant, typically in the range [0.04, 0.06].
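As a quick check of the sign of the corner response, the worked cases below substitute idealised eigenvalues into $R = \det(M) - \kappa\,(\operatorname{trace}(M))^2$. The specific eigenvalue patterns are illustrative assumptions.

```latex
\begin{aligned}
\text{corner } (\lambda_1 = \lambda_2 = \lambda):\quad
  & R = \lambda^2 - \kappa\,(2\lambda)^2 = \lambda^2\,(1 - 4\kappa) > 0
    \quad \text{for } \kappa < 0.25 \\
\text{edge } (\lambda_1 = \lambda,\ \lambda_2 = 0):\quad
  & R = 0 - \kappa\,\lambda^2 = -\kappa\,\lambda^2 < 0 \\
\text{flat } (\lambda_1 = \lambda_2 = 0):\quad
  & R = 0
\end{aligned}
```

With $\kappa = 0.05$, for example, the corner case gives $R = 0.8\,\lambda^2$.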

Page 53: Lecture 4: Feature Extraction II

[Figure: regions of the $(\lambda_1, \lambda_2)$ plane labelled “Corner” ($R > 0$), “Edge” ($R < 0$, twice) and “Flat” ($|R|$ small).]

• $R$ depends only on the eigenvalues of $M$
• $R$ is large for a corner
• $R$ is negative with large magnitude for an edge
• $|R|$ is small for a flat region

Page 54: Lecture 4: Feature Extraction II

Window function

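$$M = \sum_{(x,y) \in W} w(x, y) \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$

Option 1: Uniform window. Sum over a square window:

$$M = \sum_{(x,y) \in W} \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$

Problem: Not rotationally invariant.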

Page 55: Lecture 4: Feature Extraction II

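Option 2: Smooth with a Gaussian. The Gaussian already performs the weighted sum, so no separate summation over $W$ is needed:

$$M = G(\sigma) * \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix}$$

Result is rotation invariant.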

Page 56: Lecture 4: Feature Extraction II

Summary: Harris Detector

Compute second moment matrix

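$$M(\sigma_I, \sigma_D) = G(\sigma_I) * \begin{pmatrix} I_x^2(\sigma_D) & I_x I_y(\sigma_D) \\ I_x I_y(\sigma_D) & I_y^2(\sigma_D) \end{pmatrix}$$

Compute cornerness function

$$R = \det(M(\sigma_I, \sigma_D)) - \kappa\, (\operatorname{trace}(M(\sigma_I, \sigma_D)))^2 = (G * I_x^2)(G * I_y^2) - (G * (I_x I_y))^2 - \kappa\, (G * I_x^2 + G * I_y^2)^2$$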

Perform non-maximum suppression
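The three steps can be sketched end to end on a small synthetic image. This is a simplified illustration, not the detector as specified above: derivatives are central differences rather than Gaussian derivatives, the window is the uniform 3 x 3 window rather than a Gaussian, and `kappa=0.05` and the threshold `0.5` are assumed values.

```python
def harris_response(image, kappa=0.05, radius=1):
    """Return R = det(M) - kappa * trace(M)^2 at every pixel.

    Assumptions of this sketch: central-difference derivatives and a
    uniform (2*radius+1) x (2*radius+1) window instead of Gaussians.
    """
    h, w = len(image), len(image[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            iy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0

    response = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            sxx = sxy = syy = 0.0  # entries of the second moment matrix M
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            response[y][x] = det - kappa * trace * trace
    return response


def local_maxima(response, threshold):
    """Non-maximum suppression: keep pixels above the threshold that are
    not smaller than any of their 8 neighbours. Returns (x, y) tuples."""
    h, w = len(response), len(response[0])
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            value = response[y][x]
            if value <= threshold:
                continue
            neighbours = [response[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if all(value >= n for n in neighbours):
                points.append((x, y))
    return points


# Synthetic test image: a bright square occupying the lower-right quadrant.
image = [[1.0 if (x >= 8 and y >= 8) else 0.0 for x in range(16)]
         for y in range(16)]
response = harris_response(image)
corners = local_maxima(response, threshold=0.5)
print(corners)
```

On this image the only surviving point is the corner of the square, the response along the straight edges is negative, and it is zero in the flat regions, matching the classification by eigenvalues.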

Page 57: Lecture 4: Feature Extraction II

Original Image

Harris response

Page 58: Lecture 4: Feature Extraction II

Harris: In action

Page 59: Lecture 4: Feature Extraction II

Harris: Corner Response

Compute corner response $R$.

Page 60: Lecture 4: Feature Extraction II

Harris: Threshold

Find points with large corner response: $R >$ threshold.

Page 61: Lecture 4: Feature Extraction II

Harris: Local Maxima

Take only the points of local maxima of $R$.

Page 62: Lecture 4: Feature Extraction II

Harris: Final Points

Page 63: Lecture 4: Feature Extraction II

Harris Detector: Responses

Effect: A very precise corner detector.

Page 64: Lecture 4: Feature Extraction II

Harris Detector: Properties

Corner response R is invariant to image rotation

Ellipse rotates but its shape (i.e. eigenvalues) remains the same.

Page 65: Lecture 4: Feature Extraction II

Harris Detector: Properties

Not invariant to changes in scale

[Figure: the same structure at two scales. At one scale it is detected as a corner; at the other, all points are classified as edges.]

Page 66: Lecture 4: Feature Extraction II

Scale Invariant Region Selection

Page 67: Lecture 4: Feature Extraction II

From points to regions

• The Harris operator defines interest points

– Precise localization
– High repeatability

• In order to compare those points, we need to compute a descriptor over a region.

How can we define such a region in a scale invariant manner?

• How can we detect scale invariant interest regions?

Page 68: Lecture 4: Feature Extraction II

Scale Invariant Detection

Consider regions (e.g. circles) of different sizes around a point. Regions of corresponding sizes will look the same in both images.

[Figure residue: slide panels on Harris detector properties: rotation invariance; partial invariance to affine intensity change (only derivatives are used, giving invariance to an intensity shift $I \to I + b$; an intensity scaling $I \to aI$ is shown against a threshold on $R$); and non-invariance to image scale.]

Page 69: Lecture 4: Feature Extraction II

Naive Approach: Exhaustive Search

Compare descriptors while varying the patch size

Page 70: Lecture 4: Feature Extraction II

Naive Approach: Exhaustive Search

Compare descriptors while varying the patch size

Page 71: Lecture 4: Feature Extraction II

Naive Approach: Exhaustive Search

Compare descriptors while varying the patch size

Page 72: Lecture 4: Feature Extraction II

Naive Approach: Exhaustive Search

Compare descriptors while varying the patch size

Page 73: Lecture 4: Feature Extraction II

Naive Approach: Exhaustive Search

Compare descriptors while varying the patch size

• Computationally inefficient

• Inefficient but possible for matching

• Prohibitive for retrieval in large databases

• Prohibitive for recognition.

Page 74: Lecture 4: Feature Extraction II

Scale Invariant Detection

Problem: How do we choose corresponding circles independently in each image?

[Figure residue: slide panels on scale invariant detection (choose the scale of the “best” corner), on feature selection by adaptive non-maximal suppression (desired: a fixed number of features per image, evenly distributed spatially; search over the non-maximal suppression radius) [Brown, Szeliski, Winder, CVPR’05], and on feature descriptors.]

Page 75: Lecture 4: Feature Extraction II

Automatic Scale Detection

Design a function on the region which is scale invariant (the same for corresponding regions, even if they are at different scales).

• Take a local maximum of this function.
• The region size at the maximum should be invariant to image scale.

[Figure: $f$ plotted against region size for Image 1 and for Image 2 (scale = 0.5), with maxima at region sizes $s_1$ and $s_2$; $s_1 = 0.5\, s_2$.]

Important: this scale invariant region size is found in each image independently.

Page 76: Lecture 4: Feature Extraction II

Automatic Scale Detection

Function responses for increasing scale (scale signature)

Page 77: Lecture 4: Feature Extraction II

Automatic Scale Detection

Normalize: Rescale to fixed size

Page 78: Lecture 4: Feature Extraction II

What is a useful signature function?

Laplacian-of-Gaussian = blob detection

Page 79: Lecture 4: Feature Extraction II

Characteristic Scale

We define the characteristic scale as the scale that produces the peak of the Laplacian response.
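The idea can be tried on an idealised blob. The sketch below assumes a bright disc of value 1 on a background of 0, uses the scale-normalized Laplacian of Gaussian $\sigma^2 \nabla^2 G$ as the signature function, and searches a hypothetical grid of integer scales; all three choices are assumptions for illustration.

```python
import math


def normalized_log(rho2, sigma):
    """Scale-normalized Laplacian of Gaussian, sigma^2 * LoG, at squared radius rho2."""
    s2 = sigma * sigma
    gauss = math.exp(-rho2 / (2.0 * s2)) / (2.0 * math.pi * s2)
    return (rho2 - 2.0 * s2) / s2 * gauss


def blob_response(radius, sigma):
    """Filter response at the centre of a disc of the given radius.

    The image is 1 inside the disc and 0 outside, so the response is the
    sum of the kernel values over the pixels inside the disc.
    """
    r = int(radius)
    total = 0.0
    for y in range(-r, r + 1):
        for x in range(-r, r + 1):
            if x * x + y * y <= radius * radius:
                total += normalized_log(x * x + y * y, sigma)
    return total


def characteristic_scale(radius, sigmas):
    """Scale at which the magnitude of the response peaks."""
    return max(sigmas, key=lambda s: abs(blob_response(radius, s)))


sigmas = [float(s) for s in range(2, 25)]   # candidate scales (assumed grid)
print(characteristic_scale(10, sigmas), characteristic_scale(20, sigmas))
```

Doubling the blob radius doubles the selected scale, which is the property that lets the region size be found independently in each image. For a disc the peak lies near radius divided by the square root of 2.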

Page 80: Lecture 4: Feature Extraction II

Summary: Scale invariant detector

Given: Two images of the same scene with a large scale difference between them.

Goal: Find the same interest points independently in each image.

Solution: Search for maxima of suitable functions in scale and in space (over the image).

Page 81: Lecture 4: Feature Extraction II

What do we do now?

We know how to detect points.

Next question: How to match them?

Point descriptor should be:

1. Invariant
2. Distinctive

You know this; what are our options?

Page 82: Lecture 4: Feature Extraction II

Today’s programming assignment

Page 83: Lecture 4: Feature Extraction II

Programming Assignment

• Details available on the course website.

• You will write Matlab functions to compute Harris interest points at different scales and also to estimate the dominant direction of the gradient around the interest point.

• Important: If you have a lecture clash, read the course website for a solution.

• Mail me about any errors you spot in the Exercise notes.

• I will notify the class about errors spotted and corrections via the course website and mailing list.