Keypoint -based Recognition and Object Search
description
Transcript of Keypoint -based Recognition and Object Search
![Page 1: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/1.jpg)
Keypoint-based Recognition and Object Search
Computer VisionCS 543 / ECE 549
University of Illinois
Derek Hoiem
03/08/11
![Page 2: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/2.jpg)
Notices
• I’m having trouble connecting to the web server, so can’t post lecture slides right now
• HW 2 due Thurs
• Guest lecture Thurs by Ali Farhadi
![Page 3: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/3.jpg)
General Process of Object Recognition
Specify Object Model
Generate Hypotheses
Score Hypotheses
Resolution
![Page 4: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/4.jpg)
General Process of Object RecognitionExample: Keypoint-based
Instance Recognition
Specify Object Model
Generate Hypotheses
Score Hypotheses
Resolution
B1B2
B3A1
A2
A3
Last Class
![Page 5: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/5.jpg)
Overview of Keypoint Matching
K. Grauman, B. Leibe
N p
ixel
s
N pixels
Af
e.g. color
Bf
e.g. color
B1
B2
B3A1
A2 A3
Tffd BA ),(
1. Find a set of distinctive key- points
3. Extract and normalize the region content
2. Define a region around each keypoint
4. Compute a local descriptor from the normalized region
5. Match local descriptors
![Page 6: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/6.jpg)
General Process of Object RecognitionExample: Keypoint-based
Instance Recognition
Specify Object Model
Generate Hypotheses
Score Hypotheses
Resolution
B1B2
B3A1
A2
A3
Affine Parameters
Choose hypothesis with max score above threshold
# Inliers
Affine-variant point locations
This Class
![Page 7: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/7.jpg)
Finding the objects (overview)
1. Match interest points from input image to database image2. Matched points vote for rough position/orientation/scale of
object3. Find triplets of position/orientation/scale that have at least
three votes4. Compute affine registration and matches using iterative least
squares with outlier check5. Report object if there are at least T matched points
Input Image Stored
Image
![Page 8: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/8.jpg)
Matching Keypoints
• Want to match keypoints between:1. Query image2. Stored image containing the object
• Given descriptor x0, find two nearest neighbors x1, x2 with distances d1, d2
• x1 matches x0 if d1/d2 < 0.8– This gets rid of 90% false matches, 5% of true
matches in Lowe’s study
![Page 9: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/9.jpg)
Affine Object Model• Accounts for 3D rotation of a surface under
orthographic projection
![Page 10: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/10.jpg)
Affine Object Model• Accounts for 3D rotation of a surface under
orthographic projection
Scaling/skew Translation
What is the minimum number of matched points that we need?
![Page 11: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/11.jpg)
Finding the objects (in detail)1. Match interest points from input image to database image2. Get location/scale/orientation using Hough voting
– In training, each point has known position/scale/orientation wrt whole object
– Matched points vote for the position, scale, and orientation of the entire object
– Bins for x, y, scale, orientation• Wide bins (0.25 object length in position, 2x scale, 30 degrees orientation)• Vote for two closest bin centers in each direction (16 votes total)
3. Geometric verification– For each bin with at least 3 keypoints– Iterate between least squares fit and checking for inliers and
outliers4. Report object if > T inliers (T is typically 3, can be computed to
match some probabilistic threshold)
![Page 12: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/12.jpg)
Examples of recognized objects
![Page 13: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/13.jpg)
View interpolation• Training
– Given images of different viewpoints
– Cluster similar viewpoints using feature matches
– Link features in adjacent views
• Recognition– Feature matches may be
spread over several training viewpoints
Use the known links to “transfer votes” to other viewpoints
Slide credit: David Lowe
[Lowe01]
![Page 14: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/14.jpg)
Applications• Sony Aibo
(Evolution Robotics)
• SIFT usage– Recognize
docking station– Communicate
with visual cards
• Other uses– Place recognition– Loop closure in SLAM
K. Grauman, B. Leibe 15Slide credit: David Lowe
![Page 15: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/15.jpg)
Location Recognition
Slide credit: David Lowe
Training
[Lowe04]
![Page 16: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/16.jpg)
How to quickly find images in a large database that match a given image region?
![Page 17: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/17.jpg)
Simple ideaSee how many keypoints are close to keypoints in each other image
Lots of Matches
Few or No Matches
But this will be really, really slow!
![Page 18: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/18.jpg)
Fast visual search
“Scalable Recognition with a Vocabulary Tree”, Nister and Stewenius, CVPR 2006.
“Video Google”, Sivic and Zisserman, ICCV 2003
![Page 19: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/19.jpg)
Slide
110,000,000Images in
5.8 Seconds
Slide Credit: Nister
![Page 20: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/20.jpg)
Slide Slide Credit: Nister
![Page 21: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/21.jpg)
Slide Slide Credit: Nister
![Page 22: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/22.jpg)
Slide Credit: NisterSlide
![Page 23: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/23.jpg)
Key Ideas
• Visual Words– Cluster descriptors (e.g., K-means)
• Inverse document file– Quick lookup of files given keypoints
tf-idf: Term Frequency – Inverse Document Frequency
# words in document
# times word appears in document
# documents
# documents that contain the word
![Page 24: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/24.jpg)
Recognition with K-tree
Following slides by David Nister (CVPR 2006)
![Page 25: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/25.jpg)
![Page 26: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/26.jpg)
![Page 27: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/27.jpg)
![Page 28: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/28.jpg)
![Page 29: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/29.jpg)
![Page 30: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/30.jpg)
![Page 31: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/31.jpg)
![Page 32: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/32.jpg)
![Page 33: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/33.jpg)
![Page 34: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/34.jpg)
![Page 35: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/35.jpg)
![Page 36: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/36.jpg)
![Page 37: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/37.jpg)
![Page 38: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/38.jpg)
![Page 39: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/39.jpg)
![Page 40: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/40.jpg)
![Page 41: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/41.jpg)
![Page 42: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/42.jpg)
![Page 43: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/43.jpg)
![Page 44: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/44.jpg)
![Page 45: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/45.jpg)
![Page 46: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/46.jpg)
![Page 47: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/47.jpg)
![Page 48: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/48.jpg)
Performance
![Page 49: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/49.jpg)
More words is better ImprovesRetrieval
ImprovesSpeed
Branch factor
![Page 50: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/50.jpg)
Higher branch factor works better (but slower)
![Page 51: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/51.jpg)
Can we be more accurate?
So far, we treat each image as containing a “bag of words”, with no spatial information
af
z
e
e
afee
af e
e
h
hWhich matches better?
![Page 52: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/52.jpg)
Can we be more accurate?
So far, we treat each image as containing a “bag of words”, with no spatial information
Real objects have consistent geometry
![Page 53: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/53.jpg)
Final key idea: geometric verificationRANSAC for affine transform
af
z
e
ea
f e
ez
af
z
e
ea
f e
ez
Affine Transform
Randomly choose 3 matching pairs
Estimate transformation
Predict remaining points and count “inliers”
Repeat N times:
![Page 54: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/54.jpg)
Application: Large-Scale Retrieval
K. Grauman, B. Leibe 57[Philbin CVPR’07]
Query Results on 5K (demo available for 100K)
![Page 55: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/55.jpg)
Example Applications
B. Leibe 58
Mobile tourist guide• Self-localization• Object/building recognition• Photo/video augmentation
Aachen Cathedral
[Quack, Leibe, Van Gool, CIVR’08]
![Page 56: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/56.jpg)
Application: Image Auto-Annotation
K. Grauman, B. Leibe 59
Left: Wikipedia imageRight: closest match from Flickr
[Quack CIVR’08]
Moulin Rouge
Tour Montparnasse Colosseum
ViktualienmarktMaypole
Old Town Square (Prague)
![Page 57: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/57.jpg)
Video Google System
1. Collect all words within query region
2. Inverted file index to find relevant frames
3. Compare word counts4. Spatial verification
Sivic & Zisserman, ICCV 2003
• Demo online at : http://www.robots.ox.ac.uk/~vgg/research/vgoogle/index.html
K. Grauman, B. Leibe
Query region
Retrieved fram
es
![Page 58: Keypoint -based Recognition and Object Search](https://reader036.fdocuments.in/reader036/viewer/2022062521/56816688550346895dda40e5/html5/thumbnails/58.jpg)
Things to remember• Object instance recognition
– Find keypoints, compute descriptors
– Match descriptors– Vote for / fit affine parameters– Return object if # inliers > T
• Keys to efficiency– Visual words
• Used for many applications– Inverse document file
• Used for web-scale search