CVPR Tutorial: First Person Vision - University of Minnesota
![Page 1](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/1.jpg)
CVPR Tutorial: First Person Vision
![Page 2](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/2.jpg)
INSIDE OUT: Riley’s First Date?, PIXAR
![Page 3](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/3.jpg)
Third person camera
![Page 4](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/4.jpg)
![Page 5](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/5.jpg)
External space
What is where? - D. Marr
![Page 6](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/6.jpg)
Person detection
External space
What is where? - D. Marr
![Page 7](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/7.jpg)
Person detection Ground plane
Side wall
Side wall
Object detection Surface normal estimation Object affordance
External space
What is where? - D. Marr
![Page 8](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/8.jpg)
Person detection Ground plane
Side wall
Side wall
Object detection Surface normal estimation Object affordance
Semantic segmentation Tracking
External space
What is where? - D. Marr
![Page 9](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/9.jpg)
Person detection Ground plane
Side wall
Side wall
Object detection Surface normal estimation Object affordance
Semantic segmentation Tracking
External space
What is where? - D. Marr
First person vision is not a moving third person view.
What is first person vision?
![Page 10](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/10.jpg)
Internal space
We move in order to see and we see in order to move.
- J. J. Gibson
?
![Page 11](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/11.jpg)
Internal space
We move in order to see and we see in order to move.
- J. J. Gibson
Vanishing line
My orientation
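The link between the vanishing line and "my orientation" can be made concrete with a standard projective-geometry relation: for a pinhole camera with intrinsic matrix K, a plane whose vanishing line is l has normal n ∝ Kᵀl in camera coordinates. The sketch below is only an illustration of that relation, not the method used in the tutorial. It assumes square pixels, zero skew, and a known focal length and principal point; the function names and numeric values are hypothetical.

```python
import math

def ground_normal_from_horizon(line, focal_px, cx, cy):
    """Unit normal of the ground plane in camera coordinates.

    `line` is the horizon (vanishing line) as (a, b, c) with a*x + b*y + c = 0
    in pixel coordinates. For K = [[f, 0, cx], [0, f, cy], [0, 0, 1]],
    K^T l = (f*a, f*b, cx*a + cy*b + c).
    """
    a, b, c = line
    n = (focal_px * a, focal_px * b, cx * a + cy * b + c)
    norm = math.sqrt(sum(v * v for v in n))
    return tuple(v / norm for v in n)

def tilt_deg(normal):
    """Angle in degrees between the optical axis (0, 0, 1) and the ground plane."""
    return math.degrees(math.asin(min(1.0, abs(normal[2]))))

# Hypothetical 1280x720 camera with focal length 1000 px.
F, CX, CY = 1000.0, 640.0, 360.0

# Horizon through the image center (y = 360): the camera looks parallel to the ground.
level = ground_normal_from_horizon((0.0, 1.0, -360.0), F, CX, CY)

# Horizon 1000 px above the image center (y = -640): the camera is pitched down.
pitched = ground_normal_from_horizon((0.0, 1.0, 640.0), F, CX, CY)

print(tilt_deg(level), tilt_deg(pitched))
```

The sign of the normal is ambiguous from the line alone, so the sketch reports only the magnitude of the tilt.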
![Page 12](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/12.jpg)
Internal space
We move in order to see and we see in order to move.
- J. J. Gibson
Interaction with me
![Page 13](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/13.jpg)
Person detection Ground plane
Side wall
Side wall
My motion Internal space
We move in order to see and we see in order to move.
- J. J. Gibson
![Page 14](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/14.jpg)
“First person vision is an embedded human-system symbiosis.”
- Takeo Kanade
![Page 15](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/15.jpg)
“First person vision is an embedded human-system symbiosis.”
- Takeo Kanade
First person vision is all about me.
![Page 16](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/16.jpg)
Why is first person vision ideal for understanding human behavior?
![Page 17](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/17.jpg)
Distance from camera, d: 0.03m 0.1m 1m 10m 30m
![Page 18](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/18.jpg)
Distance from camera, d: 0.03m 0.1m 1m 10m 30m
Third person 3-30m
Target
![Page 19](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/19.jpg)
Distance from camera, d: 0.03m 0.1m 1m 10m 30m
Number of pixels for head pose (HD resolution), ∝ 1/d²: 10² p 10³ p 10⁴ p 10⁵ p 10⁶ p
Third person 3-30m
Target
![Page 20](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/20.jpg)
Distance from camera, d: 0.03m 0.1m 1m 10m 30m
Number of pixels for head pose (HD resolution), ∝ 1/d²: 10² p 10³ p 10⁴ p 10⁵ p 10⁶ p
Second person
Target
![Page 21](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/21.jpg)
Distance from camera, d: 0.03m 0.1m 1m 10m 30m
Number of pixels for head pose (HD resolution), ∝ 1/d²: 10² p 10³ p 10⁴ p 10⁵ p 10⁶ p
Second person 0.5-3m
Target
![Page 22](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/22.jpg)
Distance from camera, d: 0.03m 0.1m 1m 10m 30m
Number of pixels for head pose (HD resolution), ∝ 1/d²: 10² p 10³ p 10⁴ p 10⁵ p 10⁶ p
First person
Target
![Page 23](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/23.jpg)
Distance from camera, d: 0.03m 0.1m 1m 10m 30m
Number of pixels for head pose (HD resolution), ∝ 1/d²: 10² p 10³ p 10⁴ p 10⁵ p 10⁶ p
First person < 0.3m
Target
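The 1/d² scaling on these slides follows from the pinhole camera model: an object of physical size s at distance d spans roughly f·s/d pixels along each axis, so its area in pixels falls off with the square of the distance. The sketch below illustrates that arithmetic only. The focal length (1000 px), head size (0.2 m) and 1920x1080 frame are assumed values chosen for illustration, not numbers taken from the tutorial, and `head_pixels` is a hypothetical helper.

```python
def head_pixels(distance_m, focal_px=1000.0, head_size_m=0.2, image_size=(1920, 1080)):
    """Approximate number of pixels covering a head at the given distance.

    Pinhole model: projected side length = focal_px * head_size_m / distance_m.
    The projection is clipped to the image, since a head closer than the
    field of view allows cannot cover more pixels than the sensor has.
    """
    width, height = image_size
    side = focal_px * head_size_m / distance_m
    return min(side, width) * min(side, height)

# Distances from the slide axis, in meters.
for d in (0.03, 0.1, 1.0, 10.0, 30.0):
    print(f"d = {d:>5} m -> about {head_pixels(d):,.0f} pixels")
```

Moving the camera from 10 m to 1 m multiplies the pixel count by 100 under these assumptions, which is the gap between the third person and second person ranges on the slides.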
![Page 24](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/24.jpg)
First person
Target
Second person
Target
Third person
Target
Noninvasiveness
Measurement accuracy 3D estimation error < 5cm
![Page 25](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/25.jpg)
First person
Target
Second person
Target
Third person
Target
Noninvasiveness
Measurement accuracy 3D estimation error < 5cm
Prediction Learning
![Page 26](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/26.jpg)
First person vs. Third person
![Page 27](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/27.jpg)
1. Attention Following
![Page 28](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/28.jpg)
1. Attention Following
![Page 29](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/29.jpg)
Group Attention Following
![Page 30](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/30.jpg)
2. Egocentric Spatial Organization
![Page 31](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/31.jpg)
2.3m
2.3m
2. Egocentric Spatial Organization
![Page 32](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/32.jpg)
2.3m
30cm
30cm
2. Egocentric Spatial Organization
![Page 33](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/33.jpg)
2.3m
Orientation
2. Egocentric Spatial Organization
![Page 34](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/34.jpg)
w/ prior w/o prior
Egocentric action-object detection
![Page 35](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/35.jpg)
Graphical Representation via Kinematics
![Page 36](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/36.jpg)
Vertices: V1 V2 V3 V4
Vertex attributes: Position, Orientation, Pose, Velocity, Role
Graphical Representation via Kinematics
![Page 37](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/37.jpg)
Vertices: V1 V2 V3 V4
Vertex attributes: Position, Orientation, Pose, Velocity, Role
Edges: E12 E13 E14 E23 E24 E34
Edge attributes: Distance, Relative orientation, Relative velocity, Social relation
Graphical Representation via Kinematics
Coach’s note
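The graph on these slides attaches kinematic attributes to each person (vertex) and pairwise attributes to each pair of people (edge). A minimal sketch of such a structure follows. It assumes 2D positions on the ground plane and a single heading angle per person, omits body pose, and encodes the social relation simply as the pair of roles; all names, roles and values are hypothetical and are not taken from the tutorial.

```python
import math
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Vertex:
    name: str
    position: tuple   # (x, y) on the ground plane, meters
    heading: float    # orientation, radians
    velocity: tuple   # (vx, vy), meters per second
    role: str         # e.g. a position on a team

def wrap_angle(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi

def build_edges(vertices):
    """Compute pairwise attributes for every unordered pair of vertices."""
    edges = {}
    for u, v in combinations(vertices, 2):
        edges[(u.name, v.name)] = {
            "distance": math.dist(u.position, v.position),
            "relative_orientation": wrap_angle(v.heading - u.heading),
            "relative_velocity": tuple(b - a for a, b in zip(u.velocity, v.velocity)),
            "social_relation": (u.role, v.role),  # placeholder encoding
        }
    return edges

# Four hypothetical players, matching V1..V4 on the slide.
players = [
    Vertex("V1", (0.0, 0.0), 0.0, (1.0, 0.0), "guard"),
    Vertex("V2", (3.0, 4.0), math.pi / 2, (0.0, 1.0), "forward"),
    Vertex("V3", (6.0, 0.0), math.pi, (-1.0, 0.0), "center"),
    Vertex("V4", (0.0, 5.0), 0.0, (0.0, 0.0), "guard"),
]
edges = build_edges(players)
print(sorted(edges))
```

Four vertices give six unordered pairs, which matches the six edges E12 through E34 listed on the slide.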
![Page 38](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/38.jpg)
V4 V3
V1 V2
![Page 39](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/39.jpg)
What can first person cameras tell us about me?
![Page 40](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/40.jpg)
What can first person cameras tell us about me?
1. Attention
![Page 41](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/41.jpg)
Personal attention: what am I looking at? [Li ICCV13]
![Page 42](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/42.jpg)
Social attention: what are we looking at?
![Page 43](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/43.jpg)
1. Attention 2. Kinematics
What can first person cameras tell us about me?
![Page 44](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/44.jpg)
Human kinematics I: Where are my body and the object?
![Page 45](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/45.jpg)
Human kinematics II: What does that mean to me?
![Page 46](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/46.jpg)
1. Attention 2. Kinematics 3. Control (sensorimotor)
What can first person cameras tell us about me?
![Page 47](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/47.jpg)
Visual Sensorimotor I: How do I control?
![Page 48](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/48.jpg)
3D reconstruction
Visual Sensorimotor II: What do I feel?
![Page 49](https://reader035.fdocuments.in/reader035/viewer/2022070920/5fb94e57c495bf201f32f2e7/html5/thumbnails/49.jpg)
1. Attention 2. Kinematics 3. Control (sensorimotor)
What can first person cameras tell us about me?