![Page 1: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/1.jpg)
Video-Based In Situ Tagging on Mobile Phones
Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011
![Page 2: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/2.jpg)
Outline
Introduction
Online Target Learning
Detection and Tracking
Experimental Results
Conclusion
![Page 3: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/3.jpg)
Introduction
Objective : Augment a real-world scene with minimal user intervention on a mobile phone.
“Anywhere Augmentation” Considerations:
Avoid reconstruction of 3D scene
Perspective patch recognition
Mobile phone processing power
Mobile phone accelerometers
Mobile phone Bluetooth connectivity
http://www.youtube.com/watch?v=Hg20kmM8R1A
![Page 4: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/4.jpg)
Introduction
The proposed method follows a standard procedure of target learning and detection.
Input Image
Online Learning
Real-time Detection
![Page 5: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/5.jpg)
![Page 6: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/6.jpg)
Online Target Learning
Input: Image of the target plane
Output: Patch data and camera poses
Assumptions: Known camera parameters; horizontal or vertical surface
![Page 7: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/7.jpg)
Online Target Learning
Input Image
Frontal View Generation
Blurred Patch Generation
Post-processing
![Page 8: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/8.jpg)
Frontal View Generation
We need a frontal view to create the patch data and their associated poses.
Targets whose frontal views are available.
![Page 9: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/9.jpg)
Frontal View Generation
However, frontal views are not always available in the real world.
Targets whose frontal views are NOT available.
![Page 10: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/10.jpg)
Frontal View Generation
Objective : Generate a fronto-parallel view image from the input image.
Approach : Exploit the phone’s built-in accelerometer.
Assumption : The patch lies on a horizontal or vertical surface.
![Page 11: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/11.jpg)
Frontal View Generation
The orientation of a target (H / V) is recommended based on the current pose of the phone.
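[Figure: viewing direction relative to the ground, with gravity G detected by the accelerometer. Labels: Vertical (between π/4 and -π/4 of Parallel to Ground), Horizontal (outside that range, on either side).]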
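The orientation recommendation can be sketched as a threshold on the camera's pitch relative to the ground. This is a minimal sketch, not the authors' implementation: the function name `recommend_orientation`, the choice of the device's -z axis as the viewing direction, and the handling of the exact ±π/4 boundary are all assumptions made for illustration.

```python
import math


def recommend_orientation(gravity, view_axis=(0.0, 0.0, -1.0)):
    """Recommend "vertical" or "horizontal" for the target surface.

    gravity   -- accelerometer gravity vector in device coordinates
    view_axis -- camera viewing direction in device coordinates
                 (assumed to be the device's -z axis)
    """
    g_norm = math.sqrt(sum(c * c for c in gravity))
    v_norm = math.sqrt(sum(c * c for c in view_axis))
    if g_norm == 0.0 or v_norm == 0.0:
        raise ValueError("gravity and view_axis must be non-zero")

    # Cosine of the angle between the viewing direction and gravity.
    cos_angle = sum(g * v for g, v in zip(gravity, view_axis)) / (g_norm * v_norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))

    # Pitch is 0 when the camera looks parallel to the ground,
    # +pi/2 when it looks straight down, -pi/2 when it looks straight up.
    pitch = math.asin(cos_angle)

    # Within pi/4 of parallel to the ground: the camera faces a wall.
    return "vertical" if abs(pitch) <= math.pi / 4 else "horizontal"
```

For example, a phone held upright (gravity along the device's -y axis) yields "vertical", while a phone lying flat with the camera pointing down yields "horizontal".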
![Page 12: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/12.jpg)
Frontal View Generation
Under the 1 degree of freedom assumption:
Frontal view camera: [I|0]
Captured view camera: [R|c]
T = -Rc
• Function to warp image to virtual frontal view. [12]
[12] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge, U.K.: Cambridge Univ. Press, 2000.
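For reference, the warp between two views of a plane can be written as the plane-induced homography from [12]. The symbols K (camera intrinsics), n (unit plane normal in the frontal camera frame), and d (plane offset, with the plane defined by n^T X + d = 0) are not on the slide and are introduced here as assumptions; t denotes the captured camera's translation, which the slide writes as T = -Rc.

```latex
% Frontal camera:  P  = K [ I | 0 ]
% Captured camera: P' = K [ R | t ],  t = -R c
% Target plane:    n^T X + d = 0  (frontal camera frame)
\[
  \mathbf{x}_{\text{captured}} \simeq H \, \mathbf{x}_{\text{frontal}},
  \qquad
  H = K \left( R - \frac{\mathbf{t}\,\mathbf{n}^{\top}}{d} \right) K^{-1}
\]
% The virtual frontal view is obtained by warping the captured image with H^{-1}:
\[
  \mathbf{x}_{\text{frontal}} \simeq H^{-1} \, \mathbf{x}_{\text{captured}}
\]
```

Here \simeq denotes equality up to scale in homogeneous coordinates.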
![Page 13: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/13.jpg)
Frontal View Generation
![Page 14: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/14.jpg)
Online Target Learning
Input Image
Frontal View Generation
Blurred Patch Generation
Post-processing
![Page 15: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/15.jpg)
Blurred Patch Generation
Objective: Quickly learn the appearance of a target surface.
Approach : Adopt the patch learning approach of “Gepard” [6]
Real-time learning of a patch on a desktop computer.
[6] S. Hinterstoisser, V. Lepetit, S. Benhimane, P. Fua, and N. Navab, “Learning real-time perspective patch rectification,” Int. J. Comput. Vis., vol. 91, pp. 107–130, Jan. 2011.
![Page 16: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/16.jpg)
Review: Gepard[6]
Fast patch learning by linearizing image warping with principal component analysis.
“Mean patch” as a patch descriptor.
Difficult to apply directly to a mobile phone platform:
Low performance of mobile phone CPU
Large amount of pre-computed data is required (about 90MB)
![Page 17: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/17.jpg)
Modified Gepard[6]
Remove the need for a fronto-parallel view: use the phone’s accelerometers and limit targets to 2 planes
Skip the feature point detection step: use larger patches for robustness instead
Replace how templates are constructed: by blurring instead
Add Bluetooth sharing of the AR configuration
![Page 18: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/18.jpg)
Blurred Patch Generation
Approach: Use a blurred patch instead of the mean patch
![Page 19: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/19.jpg)
Blurred Patch Generation
Generate blurred patches through multi-pass rendering on a GPU.
Faster image processing through a GPU’s parallelism.
![Page 20: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/20.jpg)
Blurred Patch Generation
1st Pass: Warping
Render the input patch from a certain viewpoint
Much faster than on CPU
![Page 21: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/21.jpg)
Blurred Patch Generation
2nd Pass: Radial blurring of the warped patch
Lets the blurred patch cover a range of poses close to the exact pose
![Page 22: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/22.jpg)
Blurred Patch Generation
3rd Pass: Gaussian blurring of the radial-blurred patch
Makes the blurred patch robust to image noise
![Page 23: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/23.jpg)
Blurred Patch Generation
• Fig. 7. Effectiveness of radial blur. Combining the radial blur and the Gaussian blur outperforms simple Gaussian blurring.
![Page 24: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/24.jpg)
Blurred Patch Generation
4th Pass: Accumulation of blurred patches in a texture unit
Reduces the number of readbacks from GPU memory to CPU memory
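The two blurring passes can be illustrated on the CPU with NumPy. This is only a rough sketch under stated assumptions, not the GPU shader code from the paper: the radial blur is approximated by averaging samples taken along the line from each pixel toward the patch centre, and the names `radial_blur`, `gaussian_blur`, `blur_patch` and the parameters `strength`, `samples`, and `sigma` are hypothetical. The warping and texture accumulation passes are not covered.

```python
import numpy as np


def radial_blur(patch, strength=0.1, samples=8):
    """Average samples taken along the line from each pixel toward the centre."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    acc = np.zeros((h, w), dtype=np.float64)
    for s in np.linspace(0.0, strength, samples):
        # Pull the sampling position toward the centre by a fraction s
        # (nearest-neighbour lookup, clamped to the patch borders).
        sy = np.clip(np.rint(cy + (ys - cy) * (1.0 - s)), 0, h - 1).astype(int)
        sx = np.clip(np.rint(cx + (xs - cx) * (1.0 - s)), 0, w - 1).astype(int)
        acc += patch[sy, sx]
    return acc / samples


def gaussian_blur(patch, ksize=11, sigma=2.0):
    """Separable Gaussian blur with edge padding; sigma is an assumed value."""
    r = ksize // 2
    x = np.arange(-r, r + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(patch.astype(np.float64), r, mode="edge")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


def blur_patch(warped_patch):
    """Apply the radial blur, then the Gaussian blur, to an already warped patch."""
    return gaussian_blur(radial_blur(warped_patch))
```

The default kernel size of 11 mirrors the 11x11 Gaussian kernel listed in the experimental settings; the other defaults are placeholders to be tuned.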
![Page 25: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/25.jpg)
Online Target Learning
Input Image
Frontal View Generation
Blurred Patch Generation
Post-processing
![Page 26: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/26.jpg)
Post-Processing
Downsampling: blurred patches from 128x128 to 32x32
Normalization: zero mean and standard deviation of 1
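A minimal sketch of this post-processing step in NumPy follows. The slides do not say how the downsampling is done, so block averaging is an assumption here, as is the function name `postprocess`.

```python
import numpy as np


def postprocess(patch, out_size=32):
    """Downsample a square patch by block averaging, then normalize it."""
    h, w = patch.shape
    if h % out_size or w % out_size:
        raise ValueError("patch dimensions must be multiples of out_size")
    fy, fx = h // out_size, w // out_size

    # Average each fy-by-fx block (assumed downsampling method).
    small = patch.astype(np.float64).reshape(out_size, fy, out_size, fx).mean(axis=(1, 3))

    std = small.std()
    if std == 0.0:
        raise ValueError("a constant patch cannot be normalized")

    # Zero mean and standard deviation of 1.
    return (small - small.mean()) / std
```

Normalizing this way makes the later patch comparison insensitive to uniform brightness and contrast changes.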
![Page 27: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/27.jpg)
Detection & Tracking
The user points the camera at the target.
A square patch at the center of the image is used for detection.
![Page 28: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/28.jpg)
Detection & Tracking
Initial pose is retrieved by comparing the input patch with the learned mean patches.
ESM-Blur[20] is applied for further pose refinement.
NEON instructions are used for faster pose refinement.
[20] Y. Park, V. Lepetit, and W. Woo, “ESM-blur: Handling and rendering blur in 3D tracking and augmentation,” in Proc. Int. Symp. Mixed Augment. Reality, 2009, pp. 163–166.
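The initial pose lookup can be sketched as a nearest-neighbour search over the learned patches. The similarity measure is an assumption: the sketch uses normalized cross-correlation, which reduces to a scaled dot product when every patch has zero mean and unit standard deviation. The name `best_matching_view` is hypothetical, and the returned index would be used to look up the camera pose stored for that view.

```python
import numpy as np


def best_matching_view(query, learned):
    """Return (index, score) of the learned patch most similar to the query.

    query   -- (H, W) patch, zero mean and unit standard deviation
    learned -- (N, H, W) stack of patches normalized the same way
    """
    q = query.ravel()
    flat = learned.reshape(learned.shape[0], -1)

    # For normalized patches, dot product / pixel count is the
    # normalized cross-correlation, in [-1, 1].
    scores = flat @ q / q.size
    idx = int(np.argmax(scores))
    return idx, float(scores[idx])
```

A low best score could be used to reject frames in which the target is not visible, before any pose refinement is attempted.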
![Page 29: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/29.jpg)
Experimental Results
Patch size: 128 x 128
Number of views used for learning: 225
Maximum radial blur range: 10 degrees
Gaussian blur kernel: 11x11
Memory requirement: 900 KB for a target
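The 900 KB figure is consistent with storing one 32x32 downsampled patch per view. The arithmetic below assumes 4 bytes per pixel (for example 32-bit floats), which the slides do not state.

```python
views = 225           # number of views used for learning
side = 32             # patch side after downsampling
bytes_per_value = 4   # assumption: one 32-bit float per pixel

total_bytes = views * side * side * bytes_per_value
total_kib = total_bytes / 1024

print(total_bytes)  # 921600
print(total_kib)    # 900.0
```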
![Page 30: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/30.jpg)
Experimental Results
![Page 31: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/31.jpg)
Experimental Results
![Page 32: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/32.jpg)
Experimental Results
![Page 33: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/33.jpg)
Experimental Results
![Page 34: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/34.jpg)
Experimental Results
| | iPhone 3GS / 4 | PC |
|---|---|---|
| CPU | 600MHz / 1GHz | Intel QuadCore 2.4 GHz |
| GPU | PowerVR SGX 535 | GeForce 8800 GTX |
| Renderer | OpenGL ES 2.0 | OpenGL 2.0 |
| Video | 480x360 | 640x480 |
![Page 35: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/35.jpg)
Experimental Results
More views mean more rendering.
Radial blur is slow on the mobile phone.
Possible speed improvement through shader optimization.
[Figure panels: PC, iPhone 3GS, iPhone 4]
![Page 36: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/36.jpg)
Experimental Results
Comparison with Gepard[6]
[6] S. Hinterstoisser, V. Lepetit, S. Benhimane, P. Fua, and N. Navab, “Learning real-time perspective patch rectification,” Int. J. Comput. Vis., vol. 91, pp. 107–130, Jan. 2011.
Fig. 11. Planar targets used for evaluation. (a) Sign-1. (b) Sign-2. (c) Car. (d) Wall. (e) City. (f) Cafe. (g) Book. (h) Grass. (i) MacMini. (j) Board. The patches delimited by the yellow squares are used as a reference patch.
![Page 37: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/37.jpg)
Experimental Results
Our approach performs slightly worse in terms of recognition rates, but it is better adapted to mobile phones.
![Page 38: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/38.jpg)
Experimental Results
The mean patches comparison takes about 3ms with 225 views.
The speed of pose estimation and tracking with ESM-Blur depends on the accuracy of the initial pose provided by patch detection.
![Page 39: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/39.jpg)
Limitations
Weak against repetitive textures and reflective surfaces.
Currently single target only.
![Page 40: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/40.jpg)
Conclusion
Potential applications:
AR tagging in the real world
AR apps “anywhere, anytime”
Future work:
More optimization on mobile phones
Detection of multiple targets at the same time
![Page 42: Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo IEEE TRANSACTIONS ON CURCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.](https://reader036.fdocuments.in/reader036/viewer/2022081602/5519dab85503468b0c8b4b1a/html5/thumbnails/42.jpg)
Thank you for listening.