Crafting and learning for image matching
presented by Dmytro Mishkin
joint work with
Anastasia Mishchuk, Milan Pultar, Filip Radenovic, Daniel Barath, Michal Perdoch, Jiri Matas
2019.07.06, Odesa, EECVC
What is image matching?
• The task is to find correspondences between pixels in two images and/or to recover the geometric relation between the camera poses
• Its special case is also known as “wide baseline stereo”:
• large change in viewpoint, illumination, time & occlusions, modality
Where is image matching useful? 3D reconstruction & SLAM
R. Mur-Artal and J. D. Tardós. ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras, arXiv 2016
L. Schonberger and J.-M. Frahm, Structure-from-Motion Revisited, 2016 (COLMAP)
Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich: MagicLeap SLAM
Where is image matching useful? Image retrieval
Image retrieval 2.0: F. Radenovic, series of works
Google Landmark Retrieval challenge 2019 winner
What is NOT the topic of my talk
• Semantic correspondences (the object/scene is not the same)
Images from Aberman et al. “Neural Best-Buddies: Sparse Cross-Domain Correspondence”, SIGGRAPH 2018
What is NOT the topic of my talk
• Short baseline stereo (wait for Anastasiia Mishchuk's talk at 17:20)
• Optimization methods for stereo
• see talks from previous EECVCs: Alexander Shekhovtsov (2017) and Tolga Birdal (2018)
Wide baseline stereo pipeline
[Pipeline diagram, for each of the two images: Detector → Measurement region selector → Descriptor; then Matching → Geometrical verification (RANSAC)]
Image credit: Andrea Vedaldi, ICCVW 2017
Single feature visualization
Toy example for illustration: matching with OpenCV SIFT
Try yourself: https://github.com/ducha-aiki/matching-strategies-comparison
Recovered 1st to 2nd image projection, ground truth 1st to 2nd image projection, inlier correspondences
Geometric verification (RANSAC)
Homography: planar surface/static camera
Image credit: forums.fast.ai
• Planar surface or static camera → use homography
• Image with dominant plane → use homography
• Not sure what to use? → try homography first.
Fundamental matrix: general two-view case
• General two-view geometry in a static scene. A corresponding point lies somewhere on a line in the other image. Where exactly on the line depends on the (unknown) depth
• Weaker constraint than homography
• Still rigid (no motion in the scene is assumed)
Image credit: https://en.wikipedia.org/wiki/Epipolar_geometry
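
The point-to-line constraint above can be illustrated with a small sketch. It assumes a fundamental matrix `F` for a hypothetical pair with a purely horizontal camera shift, so that every epipolar line is an image row; the helper names `epipolar_line` and `point_line_distance` are made up for this illustration and are not part of any library.

```python
import math

def epipolar_line(F, x):
    """Epipolar line l' = F x in image 2, as (a, b, c) with a*u + b*v + c = 0."""
    p = (x[0], x[1], 1.0)
    return tuple(sum(F[r][c] * p[c] for c in range(3)) for r in range(3))

def point_line_distance(line, x):
    """Distance in pixels from point x = (u, v) to the line (a, b, c)."""
    a, b, c = line
    return abs(a * x[0] + b * x[1] + c) / math.hypot(a, b)

# Assumed toy F: cross-product matrix of the translation (1, 0, 0),
# i.e. a sideways camera shift where matching points share the same row.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

line = epipolar_line(F, (10.0, 20.0))                 # the row v = 20 in image 2
on_line = point_line_distance(line, (55.0, 20.0))     # any u on that row is consistent
off_line = point_line_distance(line, (55.0, 26.0))    # 6 px away from the row
```

Any point on row 20 of the second image satisfies the constraint, whatever its column, which is why F alone is a weaker check than a homography.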
RANSAC: fitting the data with gross outliers
• What is it
Image credit: https://scipy-cookbook.readthedocs.io/items/RANSAC.html
OpenCV functions: cv2.findHomography(), cv2.findFundamentalMat()
We will soon publish a Python package that is 2 to 5 times faster and has additional tricks inside
https://github.com/ducha-aiki/pyransac (save this link, the repo is private now, will clean up and open next week)
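
For readers who have not seen RANSAC before, here is a minimal sketch of the loop on a toy line-fitting problem with gross outliers, similar in spirit to the cookbook picture. It is not the OpenCV or the CMP implementation; the function names, the 0.5 threshold and the 200 iterations are assumptions chosen for the toy data.

```python
import random

def fit_line(p, q):
    """Line through two points as (a, b, c), a*x + b*y + c = 0, with a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0:
        return None  # degenerate sample: identical points
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, threshold=0.5, iterations=200, seed=0):
    """Repeatedly fit a line to a random minimal sample, keep the one with most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        a, b, c = model
        inliers = [pt for pt in points if abs(a * pt[0] + b * pt[1] + c) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# 20 points on y = 2x + 1 plus 5 gross outliers
line_points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outliers = [(3.0, 40.0), (15.0, -7.0), (8.0, 60.0), (1.0, -20.0), (18.0, 2.0)]
model, inliers = ransac_line(line_points + outliers)
```

For image matching the minimal sample is 4 correspondences for a homography instead of 2 points for a line, but the sample, score, keep-the-best structure is the same.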
Pitfalls and solutions: homography
• Wrong geometry found because of dirty correspondences
OpenCV finds 31 wrong inliers in 0.018s.
CMP RANSAC finds 6 wrong inliers in 0.004s.
See the same pattern in img1: 3 correspondences on a line + a group. RANSAC for H is prone to such cases
Pitfalls and solutions: homography
• Solution: 2-way error metric
CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s.
$H_{2 \to 1} = H_{1 \to 2}^{-1}$
Check that the inliers are also consistent in the opposite direction
Python CMP RANSAC package will be available soon
D. Mishkin, J. Matas and M. Perdoch. MODS: Fast and Robust Method for Two-View Matching, CVIU 2015
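
One possible reading of the 2-way check, as a hedged sketch: a correspondence counts as an inlier only if the transfer error is small both when mapping image 1 to image 2 with H and when mapping back with the inverse of H. This is an illustration, not the CMP RANSAC code; `invert_3x3`, `project`, `two_way_inliers` and the 3 px threshold are assumptions.

```python
import math

def invert_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("singular homography")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def project(H, pt):
    """Apply homography H to a 2D point (homogeneous divide included)."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def two_way_inliers(H12, matches, threshold=3.0):
    """Keep matches whose transfer error is below threshold in BOTH directions."""
    H21 = invert_3x3(H12)
    kept = []
    for p1, p2 in matches:
        forward = math.dist(project(H12, p1), p2)
        backward = math.dist(project(H21, p2), p1)
        if forward < threshold and backward < threshold:
            kept.append((p1, p2))
    return kept

# Toy H that shrinks image 1 by a factor of 10
H12 = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
matches = [((100.0, 100.0), (10.0, 10.0)),   # correct
           ((100.0, 100.0), (12.0, 10.0))]   # 2 px off forward, 20 px off backward
kept = two_way_inliers(H12, matches)
```

The second match passes a one-way 3 px test but fails the backward test, which is the kind of asymmetric case a strong scale change makes common.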
Pitfalls and solutions: fundamental matrix
• F is too permissive (point to line)
• LAF-check: remember that a local feature is an oriented circle or ellipse, not just a point.
• Check whether additional points on the circle are consistent with the geometry
D. Mishkin, J. Matas and M. Perdoch. MODS: Fast and Robust Method for Two-View Matching, CVIU 2015
Matching strategies
Nearest neighbor (1NN) strategy
Features from img1 are matched to features from img2
Note that this is asymmetric and allows “many-to-one” matches
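
A minimal sketch of the 1NN strategy on toy 2D "descriptors", to make the asymmetry and the many-to-one behaviour visible. The helper names and the data are made up for illustration; real descriptors would be 128-dimensional.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nn_match(desc1, desc2):
    """For each descriptor i in desc1, return (i, j) with j its nearest neighbour in desc2."""
    return [(i, min(range(len(desc2)), key=lambda j: sq_dist(d, desc2[j])))
            for i, d in enumerate(desc1)]

desc1 = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0)]
desc2 = [(0.05, 0.0), (5.0, 4.0)]

matches = nn_match(desc1, desc2)   # img1 -> img2: features 0 and 1 both pick feature 0
reverse = nn_match(desc2, desc1)   # img2 -> img1 gives a different set of pairs
```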
Nearest neighbor (NN) strategy
OpenCV RANSAC failed to find a good model with NN matching.
Found 1st image projection: blue, ground truth: green, inlier correspondences: yellow
Mutual nearest neighbor (MNN) strategy
Features from img1 are matched to features from img2.
Only cross-consistent (mutual NNs) matches are retained.
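
The mutual check can be sketched as follows, again on made-up toy descriptors: a pair (i, j) survives only if j is the nearest neighbour of i and i is the nearest neighbour of j. The function names are illustrative assumptions.

```python
def nearest(d, candidates):
    """Index of the candidate descriptor closest to d (squared Euclidean distance)."""
    return min(range(len(candidates)),
               key=lambda j: sum((x - y) ** 2 for x, y in zip(d, candidates[j])))

def mutual_nn_match(desc1, desc2):
    """Keep (i, j) only when i -> j and j -> i agree."""
    nn12 = [nearest(d, desc2) for d in desc1]
    nn21 = [nearest(d, desc1) for d in desc2]
    return [(i, j) for i, j in enumerate(nn12) if nn21[j] == i]

desc1 = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0)]
desc2 = [(0.05, 0.0), (5.0, 4.0)]
mutual = mutual_nn_match(desc1, desc2)  # the many-to-one match (1, 0) is dropped
```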
Mutual nearest neighbor (MNN) strategy
OpenCV RANSAC failed to find a good model with MNN matching.
No one-to-many connections, but still bad.
Found 1st image projection: blue, ground truth: green, inlier correspondences: yellow
Second nearest neighbor ratio (SNN) strategy
[Figure: three query features with their 1st NN and 2nd NN; 1stNN/2ndNN > 0.8: drop, 1stNN/2ndNN ≤ 0.8: keep]
Features from img1 are matched to features from img2:
- We look for the 2 nearest neighbors
- If both are too similar (1stNN/2ndNN ratio > 0.8) → discard
- If the 1st NN is much closer (1stNN/2ndNN ratio ≤ 0.8) → keep
D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, IJCV 2004
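
The ratio test above, as a small sketch on toy descriptors. The 0.8 threshold is the one from the slide; the function name and the data are assumptions for illustration (requires Python 3.8+ for `math.dist`).

```python
import math

def snn_match(desc1, desc2, max_ratio=0.8):
    """Keep (i, j) if dist(1st NN) / dist(2nd NN) <= max_ratio. Needs len(desc2) >= 2."""
    matches = []
    for i, d in enumerate(desc1):
        ranked = sorted((math.dist(d, e), j) for j, e in enumerate(desc2))
        (d1, j1), (d2, _) = ranked[0], ranked[1]
        if d1 <= max_ratio * d2:
            matches.append((i, j1))
    return matches

desc1 = [(0.0, 0.0), (10.0, 10.0)]
desc2 = [(0.0, 1.0), (0.0, 9.0), (10.0, 11.0), (10.0, 9.0)]

# Feature 0: distances 1 and 9, ratio 0.11 -> kept.
# Feature 1: two equally good candidates, ratio 1.0 -> dropped as ambiguous.
matches = snn_match(desc1, desc2)
```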
Second nearest neighbor ratio (SNN) strategy
OpenCV RANSAC found a roughly correct model.
No one-to-many connections, but still bad.
Found 1st image projection: blue, ground truth: green, inlier correspondences: yellow
1st geometrically inconsistent nearest neighbor ratio (FGINN) strategy
SNN ratio is cool, but what about symmetrical or too closely detected features? The ratio test will kill them.
Solution: look for the 2nd nearest neighbor that is spatially far enough from the 1st nearest.
Mishkin et al., “MODS: Fast and Robust Method for Two-View Matching”, CVIU 2015
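
A hedged sketch of the FGINN idea: the ratio is computed against the first neighbour in descriptor space whose keypoint lies farther than some pixel radius from the keypoint of the 1st NN. The 10 px default radius, the choice to keep a match when no such neighbour exists, and all names are assumptions of this illustration, not a copy of the MODS code.

```python
import math

def fginn_match(desc1, desc2, kpts2, max_ratio=0.8, min_pixel_dist=10.0):
    """Ratio test against the 1st geometrically inconsistent nearest neighbour."""
    matches = []
    for i, d in enumerate(desc1):
        ranked = sorted((math.dist(d, e), j) for j, e in enumerate(desc2))
        d1, j1 = ranked[0]
        # First neighbour whose keypoint is spatially far from the 1st NN keypoint
        d2 = next((dist for dist, j in ranked[1:]
                   if math.dist(kpts2[j], kpts2[j1]) > min_pixel_dist), None)
        if d2 is None or d1 <= max_ratio * d2:  # assumption: keep if nothing to compare to
            matches.append((i, j1))
    return matches

desc1 = [(10.0, 10.0)]
desc2 = [(10.0, 11.0), (10.0, 9.0), (30.0, 30.0)]
# Keypoints 0 and 1 in img2 are ~2 px apart: the same structure detected twice
kpts2 = [(100.0, 100.0), (102.0, 101.0), (400.0, 50.0)]

fginn = fginn_match(desc1, desc2, kpts2)                       # match survives
plain = fginn_match(desc1, desc2, kpts2, min_pixel_dist=0.0)   # behaves like plain SNN here
```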
SNN vs FGINN
SNN: roughly correct
FGINN: more correspondences, better geometry found
Mishkin et al., “MODS: Fast and Robust Method for Two-View Matching”, CVIU 2015
Symmetrical FGINN
Recall that FGINN is still asymmetric: Matching (Img1 → Img2) ≠ (Img2 → Img1)
We can do both (Img1 → Img2) and (Img2 → Img1)
and keep all FGINNs (union)
or only cross-consistent FGINNs (intersection)
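
Combining the two matching directions is a set operation once both match lists use the same index order. A small sketch, with hypothetical match lists standing in for the output of any asymmetric matcher such as FGINN:

```python
def symmetric_matches(matches12, matches21, mode="union"):
    """Combine img1->img2 matches (i1, i2) with img2->img1 matches (i2, i1).

    mode="union" keeps every match found in either direction;
    any other value keeps only cross-consistent matches (intersection).
    """
    forward = set(matches12)
    backward = {(i1, i2) for i2, i1 in matches21}  # flip to (img1 index, img2 index)
    combined = forward | backward if mode == "union" else forward & backward
    return sorted(combined)

matches12 = [(0, 0), (1, 2)]   # hypothetical img1 -> img2 result
matches21 = [(0, 0), (3, 4)]   # hypothetical img2 -> img1 result

union = symmetric_matches(matches12, matches21, mode="union")
cross = symmetric_matches(matches12, matches21, mode="intersection")
```

The union gives RANSAC more tentative correspondences to work with, while the intersection trades quantity for a cleaner set.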
Learned filtering strategy (CVPR 2018)
Yi et al. Learning to Find Good Correspondences https://arxiv.org/abs/1711.05971
Input: matches (x1, y1, x2, y2) [ N x 4 ]
Output: scores [ N x 1 ]
Evaluation on IMW2019 data
CVPR 2019 competition: https://image-matching-workshop.github.io/
Evaluation
Stereo: features ⇨ matching ⇨ OpenCV RANSAC ⇨ pose estimation
Participants
organizers
Metric: number of camera poses recovered precisely enough (mAP @ 15°)
• NN: are you kidding? Never use it alone
• SNN is simple and good
• FGINN is always a bit better
• Symmetrical FGINN rocks
• Learning is not that powerful (yet?)
Descriptor: HardNet (NIPS, 2017)
Mishchuk et al. Working hard to know your neighbor’s margins: Local descriptor learning loss. NIPS 2017
Classical way to select good correspondences
• Q1: How to find correct correspondences?
• Q2: How to filter out features that do not have a correspondence?
• A1: Nearest neighbor by descriptor distance
• A2: Threshold the second-to-first nearest ratio (SNN)
HardNet: let's use it for training CNNs!
Architecture: deep, VGGNet style
• Adopted from the previous state-of-the-art L2Net descriptor, Tian et al. (CVPR 2017).
• Vanilla CNN: Convolution + BatchNorm + ReLU
Sampling: positives: random, negatives: hard-in-batch
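
The hard-in-batch sampling can be sketched in plain Python, following the description in the HardNet paper as I understand it: for each matching pair (a_i, p_i) in the batch, the negative is the closest non-matching descriptor among the other anchors and positives, and the loss is a triplet margin on top of that. The function name, the margin of 1.0 and the toy 2D descriptors are assumptions; real training would use a deep learning framework and unit-length 128-D descriptors.

```python
import math

def hardnet_loss(anchors, positives, margin=1.0):
    """Triplet margin loss with hardest-in-batch negatives. Needs a batch of >= 2 pairs."""
    n = len(anchors)
    # D[i][j] = distance between anchor i and positive j; the diagonal holds matching pairs
    D = [[math.dist(a, p) for p in positives] for a in anchors]
    total = 0.0
    for i in range(n):
        d_pos = D[i][i]
        hardest_in_row = min(D[i][j] for j in range(n) if j != i)  # closest wrong positive
        hardest_in_col = min(D[k][i] for k in range(n) if k != i)  # closest wrong anchor
        d_neg = min(hardest_in_row, hardest_in_col)
        total += max(0.0, margin + d_pos - d_neg)
    return total / n

# Well separated pairs: negatives are farther than the margin, loss is zero
easy = hardnet_loss([(0.0, 0.0), (3.0, 0.0)], [(0.0, 0.0), (3.0, 0.0)])
# Pairs that sit too close to each other: each triplet contributes 1 + 0 - 0.5
hard = hardnet_loss([(0.0, 0.0), (0.5, 0.0)], [(0.0, 0.0), (0.5, 0.0)])
```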
HardNet vs SIFT descriptor
SIFT: 71 inliers
HardNet: 121 inliers
Mishkin et al., “MODS: Fast and Robust Method for Two-View Matching”, CVIU 2015
Results: HPatches
1.5 to 2 times better than RootSIFT
HardNet training scales well with bigger datasets
GeoDesc: same architecture, special loss and sampling utilizing 3D reconstruction data.
Luo et al., ECCV 2018
Descriptor: creating the dataset (CVWW, 2019)
Leveraging Outdoor Webcams for Local Descriptor Learning
Milan Pultar, Dmytro Mishkin, Jiří Matas
Existing datasets for local descriptor learning
● Brown dataset
● HPatches
● PhotoSynth
● GL3D
Archive of Many Outdoor Scenes (AMOS)
● 1 128 million images
● Almost 30K webcams
● Continuously growing
● Cameras placed all across the world
● ~10 TB of data
● Each camera in one directory
○ Split further into folders by year
○ Image timestamp in GMT
○ GPS info not always available
[Directory layout figure: Camera 1001, Camera 1002, each with per-year image folders (2010, 2011, 2013)]
AMOS views
[Images: examples of good cameras and bad cameras]
Pipeline of AMOS Patches
Camera selection
● Randomly choose 20 images from each camera
● Test each image using the criteria
● Keep the camera if 14/20 images pass
-> 474 cameras
Sky segmentation in the wild: An empirical study. R. P. Mihail et al, 2012 (https://github.com/kuangliu/torchcv)
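
The camera selection rule can be sketched in a few lines. The actual per-image criteria are not spelled out in this transcript, so `passes_criteria` is a hypothetical placeholder predicate, and `keep_camera` is a made-up name.

```python
import random

def keep_camera(images, passes_criteria, sample_size=20, min_passed=14, seed=0):
    """Sample up to `sample_size` images and keep the camera if enough of them pass."""
    rng = random.Random(seed)
    sample = rng.sample(images, min(sample_size, len(images)))
    passed = sum(1 for img in sample if passes_criteria(img))
    return passed >= min_passed

# Toy usage with image ids and trivial stand-in predicates
images = list(range(100))
always_good = keep_camera(images, lambda img: True)
always_bad = keep_camera(images, lambda img: False)
```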
Appearance clustering
● Solves data redundancy
● Use fc6 layer of ImageNet-pretrained AlexNet
● Run K-means in the AlexNet output space
● Choose K=120 most representative images (by looking at the corresponding outputs)
-> 474 cameras, each 120 images
Imagenet classification with deep convolutional neural networks. A. Krizhevsky et al, 2012
Viewpoint reclustering
● Solves switching of cameras between views
● Uses MODS (image matching) in a greedy algorithm:
1. Pick a reference image
2. Find matching pairs
3. Create a new view; exclude its images from the original sequence
4. If the original sequence is not empty: repeat
● Keep the biggest view from each camera, 50 images each (if available)
-> 273 views
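
The greedy steps above can be sketched as follows. The `is_match` callback stands in for running MODS on an image pair and is a hypothetical placeholder; the toy "images" are strings whose first letter encodes the true viewpoint.

```python
def greedy_views(images, is_match):
    """Greedily split an image sequence into views using a pairwise matcher."""
    remaining = list(images)
    views = []
    while remaining:
        reference = remaining[0]                       # 1. pick a reference image
        view = [reference] + [img for img in remaining[1:]
                              if is_match(reference, img)]  # 2. find matching pairs
        views.append(view)                             # 3. create a new view
        remaining = [img for img in remaining if img not in view]  # exclude its images
    return views                                       # 4. loop ends when nothing is left

images = ["a1", "a2", "b1", "a3", "b2"]
views = greedy_views(images, lambda ref, img: ref[0] == img[0])
biggest = max(views, key=len)[:50]   # keep the biggest view, at most 50 images
```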
Registration
● Results still not satisfactory
● Why?
○ MODS often outputs a homography matrix only for a small part of the image
○ Need for a final manual check
-> Use GDB-ICP
● In each view
○ Run registration on pairs of images
○ If a single registration fails -> remove the whole view
-> 151 registered views
Manual pruning
● Several problems not detected so far
○ Dynamic scenes
○ Cloud-dominated scenes
○ Views with very similar content
-> 27 registered views, 50 images each
Patch selection
● Apply masks (crop out text etc.)
● Sampling of centers (response function)
● Random rotation (any angle)
● Random scale
Experiments: (de)registration
● Displace each patch randomly
○ observe the influence on precision
● Result:
○ Precise registration is important
● mAP - mean average precision
HPatches matching task, full split
Experiments: batch composition
● We use hard-in-batch triplet margin loss
● Composition of a batch influences precision
● Idea: choose a subset of views as the source for patches
● Intuition: tough pairs often come from the same image
-> Improvement
HPatches matching task, full split
Evaluation
● New state of the art in matching under illumination changes (to the best of our knowledge)
● Outperforms the recently proposed HardNetPS on the full split
● We propose the AMOS Patches test split for evaluating robustness to lighting and season-related conditions
Measurement region selector: orientation
Which patch should we describe?
Detector: x, y, scale
Should we rotate the patch?
Should we deform the patch?
Handcrafted: dominant orientation
D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, IJCV 2004
Learned orientation: CNN
Yi et al. Learning to Assign Orientations to Feature Points, CVPR 2016
If images are upright for sure: don't detect orientation
DoG + HardNet matches + FGINN union + RANSAC.
Found 1st image projection: blue, ground truth: green, inlier correspondences: yellow
Dominant gradient orientation: 123 inliers
Learned orientation: 140 inliers
Constant orientation: 181 inliers
Measurement region selector: AffNet (ECCV 2018)
AffNet: learning measurement region
Mishkin et al. Repeatability Is Not Enough: Learning Affine Regions via Discriminability. ECCV 2018
Does AffNet help? Yes, if the problem is hard
FGINN union + RANSAC.
Found 1st image projection: blue, ground truth: green, inlier correspondences: yellow
DoG + HardNet 2.0: 123 inliers
DoG + AffNet + HardNet 2.0: 165 inliers
AffNet: learning measurement region
• Find the affine shape that maximizes the difference between positive and hardest-in-batch negative examples
• Positive-only learning (Yi et al., CVPR 2015) leads to degenerate ellipses
• Triplet margin (HardNet): unstable when training affine shape
Local feature detector
[Figure: matching pipeline diagram with stages Detector, Measurement region selector, Descriptor, Matching, Geometrical verification (RANSAC)]
![Page 61: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/61.jpg)
The detector is often the failure point of the whole process
• Yet we still use 10-20-year-old stuff like SIFT or FAST, because nothing significantly better for practical purposes has been proposed
• So let's stick to the basics
Stylianou et al., WACV 2015. Characterizing Feature Matching Performance Over Long Time Periods
![Page 62: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/62.jpg)
SIFT is the DoG detector + SIFT descriptor
• Really, there is no such thing as a SIFT detector.
• But everyone has gotten so used to calling DoG "SIFT"
DoG filter is a simple blob template
https://docs.opencv.org/3.4.3/da/df5/tutorial_py_sift_intro.html
Gaussian scalespace, “stack of gradually smoothed versions” of original image
Detections on synthetic image
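To illustrate why a DoG filter behaves like a blob template, here is a small sketch in one dimension. The 1D setting, the scale ratio `k=1.6`, the kernel radius and the synthetic signal are all assumptions made for brevity; a real DoG detector works on 2D images and searches for extrema across a whole scale space.

```python
import math

def gaussian_kernel(sigma, radius):
    """Sampled, normalized 1D Gaussian on [-radius, radius]."""
    w = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def dog_kernel(sigma, k=1.6):
    """Difference of two Gaussians: positive centre, negative surround."""
    radius = int(math.ceil(3 * k * sigma))
    narrow = gaussian_kernel(sigma, radius)
    wide = gaussian_kernel(k * sigma, radius)
    return [a - b for a, b in zip(narrow, wide)]

def correlate(signal, kernel):
    """Slide the kernel over the signal, replicating border samples."""
    r = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - r, 0), len(signal) - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

# Synthetic signal: a bright 5-sample "blob" centred at index 20.
signal = [0.0] * 41
for i in range(18, 23):
    signal[i] = 1.0

response = correlate(signal, dog_kernel(sigma=2.0))
peak = max(range(len(response)), key=lambda i: response[i])
```

The response is strongest where the bright region lines up with the positive lobe of the kernel, so `peak` lands on the blob centre; on uniform regions the response is close to zero because the kernel sums to zero.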
![Page 63: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/63.jpg)
ORB: FAST detector + BRIEF descriptor
FAST is a corner detector based on segment test
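A minimal sketch of the segment test, assuming the common radius-3 circle of 16 pixels and requiring 9 contiguous pixels (often called FAST-9). The threshold, the value of `n` and the synthetic images are illustrative assumptions; production implementations add a fast rejection step and non-maximum suppression, which are omitted here.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, threshold=20, n=9):
    """Segment test: True if at least n contiguous circle pixels are all
    brighter than centre + threshold or all darker than centre - threshold."""
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # +1: brighter arc, -1: darker arc
        flags = [sign * (p - center) > threshold for p in ring]
        run = 0
        for flag in flags + flags:  # doubled list handles wrap-around
            run = run + 1 if flag else 0
            if run >= n:
                return True
    return False

def make_image(size, is_bright):
    """Synthetic image as a list of rows; bright = 200, dark = 10."""
    return [[200 if is_bright(x, y) else 10 for x in range(size)] for y in range(size)]

corner = make_image(9, lambda x, y: x <= 4 and y <= 4)  # bright quadrant
edge = make_image(9, lambda x, y: x <= 4)               # straight vertical edge
flat = make_image(9, lambda x, y: True)                 # uniform region
```

At pixel (4, 4) the corner image yields a dark arc of 11 contiguous circle pixels and passes the test, the straight edge yields only 7 and fails, and the flat image has none.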
![Page 64: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/64.jpg)
Joint detectors and descriptors
SuperPoint (CVPRW 2017)
DELF (ICCV 2017)
D2Net (CVPR 2019)
[Figure: matching pipeline diagram with stages Detector, Measurement region selector, Descriptor, Matching, Geometrical verification (RANSAC)]
![Page 65: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/65.jpg)
SuperPoint
DeTone et al., SuperPoint: Self-Supervised Interest Point Detection and Description. CVPRW2017
![Page 66: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/66.jpg)
DELF
Noh et al., Large-Scale Image Retrieval with Attentive Deep Local Features. ICCV 2017
“Attention” as weighting for global descriptor
![Page 67: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/67.jpg)
D2Net
Dusmanu et al., D2-Net: A Trainable CNN for Joint Description and Detection of Local Features. CVPR 2019
![Page 68: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/68.jpg)
Comparison on toy example
SuperPoint: 51 inliers
DoG + HardNet: 123 inliers
D2Net: 26 inliers, incorrect geometry
![Page 69: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/69.jpg)
All things together
SIFT + SNN match + OpenCV RANSAC: 27 inliers
SIFT + NoOri + HardNet + FGINN union match + CMP RANSAC: 179 inliers
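The two pipelines above differ, among other things, in the matching rule: SNN versus FGINN. The sketch below contrasts the two ratio tests, assuming Euclidean descriptor distance, a ratio threshold of 0.8 and a 10-pixel radius for "geometrically inconsistent"; these values, the function names and the toy data are illustrative assumptions. The `two_way_union` helper reflects one plausible reading of "union" (match in both directions and merge the results), not a confirmed description of the author's code.

```python
import math

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def match_ratio(desc1, desc2, kpts2, max_ratio=0.8, fginn_radius=None):
    """Ratio-test matching from image 1 to image 2, returns (i, j) pairs.
    fginn_radius=None: classic SNN, the ratio uses the second nearest neighbor.
    Otherwise (FGINN): the ratio uses the nearest descriptor whose keypoint
    lies farther than fginn_radius pixels from the first neighbor's keypoint."""
    matches = []
    for i, d in enumerate(desc1):
        order = sorted(range(len(desc2)), key=lambda j: dist(d, desc2[j]))
        if len(order) < 2:
            continue
        first, second = order[0], None
        for j in order[1:]:
            if fginn_radius is None or dist(kpts2[first], kpts2[j]) > fginn_radius:
                second = j
                break
        if second is None:
            continue  # no usable second neighbor: skipped in this sketch
        d2 = dist(d, desc2[second])
        if d2 > 0 and dist(d, desc2[first]) / d2 <= max_ratio:
            matches.append((i, first))
    return matches

def two_way_union(desc1, kpts1, desc2, kpts2, **kwargs):
    """Match 1->2 and 2->1, then take the union of both match sets."""
    fwd = set(match_ratio(desc1, desc2, kpts2, **kwargs))
    bwd = {(i, j) for j, i in match_ratio(desc2, desc1, kpts1, **kwargs)}
    return sorted(fwd | bwd)

# Toy data: image 2 contains two near-duplicate detections of the same blob.
desc1 = [[0.0, 0.0], [1.0, 0.05]]
kpts1 = [(10.0, 10.0), (200.0, 40.0)]
desc2 = [[0.1, 0.0], [0.12, 0.0], [1.0, 0.0]]
kpts2 = [(100.0, 100.0), (102.0, 101.0), (300.0, 50.0)]

snn = match_ratio(desc1, desc2, kpts2)
fginn = match_ratio(desc1, desc2, kpts2, fginn_radius=10.0)
```

With SNN the first feature is rejected because its two best candidates are near-duplicates of each other; FGINN ignores the duplicate that sits about 2 pixels away and keeps the match.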
![Page 70: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/70.jpg)
I really need to match this
• View synthesis: MODS
![Page 71: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/71.jpg)
MODS (controller and preprocessor)
MODS handles angular viewpoint differences up to:
• 85° for planar scenes
• 30° for structured scenes
D. Mishkin, J. Matas and M. Perdoch. MODS: Fast and Robust Method for Two-View Matching, CVIU 2015
[Diagram: MODS loop. Blocks: Images, Affine view synthesis, Det1-Desc1, Det2-Desc2, Match, RANSAC. On success: Match! On failure: Not match? Try more view synthesis]
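The control flow in the diagram can be sketched as a loop over progressively more expensive view-synthesis configurations that stops as soon as verification succeeds. Everything concrete here is a hypothetical placeholder: the `stages` list, the `match_stage` callable (standing in for view synthesis, detection, description, matching and RANSAC) and the `min_inliers` threshold are assumptions for illustration, not the settings of the MODS paper.

```python
def mods_match(img1, img2, stages, match_stage, min_inliers=15):
    """Try view-synthesis configurations from cheapest to most expensive.

    stages:      list of configurations, ordered by increasing cost
    match_stage: callable(img1, img2, stage) -> number of RANSAC inliers
    Returns a dict describing the first stage that produced enough inliers.
    """
    for step, stage in enumerate(stages, start=1):
        inliers = match_stage(img1, img2, stage)
        if inliers >= min_inliers:
            return {"matched": True, "stage": stage, "steps": step, "inliers": inliers}
    return {"matched": False, "stage": None, "steps": len(stages), "inliers": 0}

# Hypothetical stages: each adds more synthesized tilts.
stages = [{"tilts": [1]}, {"tilts": [1, 5]}, {"tilts": [1, 5, 9]}]

def fake_match_stage(img1, img2, stage):
    """Stub standing in for the real pipeline: more tilts, more inliers."""
    return 8 * len(stage["tilts"])

result = mods_match("image_a", "image_b", stages, fake_match_stage)
```

Easy pairs exit at the first, cheapest stage, so the extra cost of view synthesis is only paid for pairs that fail to match without it.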
![Page 72: Crafting and learning for image matching - CMPcmp.felk.cvut.cz/~mishkdmy/slides/EECVC2019... · CMP RANSAC + transfer check: finds 48 correct inliers in 0.005s. 𝐻2−1=𝐻1−2](https://reader034.fdocuments.in/reader034/viewer/2022042311/5ed968faf59b0f56f45f7072/html5/thumbnails/72.jpg)
• If you DO NOT need correspondences & camera pose → DO NOT use local features. Use global descriptor (ResNet101 GeM) + fast search (faiss)
• Step 0: try OpenCV SIFT
• Use proper RANSAC (private now, will clean up and open-source next week)
• Matching → use FGINN in two-way mode
• Need to be faster → ORB.
• Need to be more robust → use SIFT + HardNet 2.0
• Custom data → train on your own dataset
• Even more robust → use SIFT + AffNet + HardNet 2.0
• If images are upright, DO NOT DETECT the ORIENTATION
• Landmark data → DELF
Thank you for your attention
ducha_aiki
ducha-aiki