
Seminar Heidelberg University Mobile Human Detection Systems

Pedestrian Detection by Stereo Vision on Mobile Robots

Philip Mayer, Matriculation Number (Matrikelnummer): 3300646, 06.03.2017

Motivation

06.03.2017 Philip Mayer, Seminar, Mobile Human

Detection Systems, Heidelberg University 2

Fig.1: Pedestrians Within Bounding Box [6] Fig.2: Car Pedestrian Detection [7]

Outline

1. Problem Formulation

2. Solution Approach

3. Stereo Vision

4. Methods

5. Results

6. Summary and Conclusion

1. Problem Formulation

Given:
• Stereo Vision Depth Image
• Mobile Robot
• Unknown Background
• Cluttered Environment
• Crowded Places

Required:
• Pedestrian Detection, Even If Partially Occluded

2. Solution Approach

Fig.3: Depth Image [1]

Fig.4: Segmented Regions [1]

Fig.5: Candidates [1]

Fig.6: Detected Humans [1] Fig.7: Block Diagram Solution Approach

3. Stereo Vision

Fig.10: Stereo Vision – Geometric Setup [3] (labels in the figure: $P_{real}$, $x_{real}$, $y_{real}$, $z_{real}$, $x$, $y$, $x'$, $y'$, $\lambda$, $P$, $P'$)

• $A$, $A'$ – Optical Axes
• $O$, $O'$ – Lens Centers
• $B$ – Baseline
• $P_{real}$ – Point in real space
• $P$ – Projection of $P_{real}$ on Image 1
• $P'$ – Projection of $P_{real}$ on Image 2
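For a rectified stereo pair, this kind of setup leads to the textbook triangulation relation z = f · B / d, where B is the baseline and d is the disparity between the two projections. The slides do not state this formula, so the following is only a sketch under that assumption; the function name and the parameters `focal_px` and `baseline_m` are hypothetical.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth z = f * B / d for a rectified stereo pair (assumed textbook model)."""
    d = np.asarray(disparity_px, dtype=float)
    depth = np.full(d.shape, np.nan)   # NaN marks pixels with undefined depth
    valid = d > 0                      # zero disparity: no match or point at infinity
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth
```

Larger disparities map to smaller depths, which is why nearby pedestrians are resolved more finely than distant ones.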

3. Stereo Vision

Fig.8: Color Image 1 – Left Lens [5] Fig.9: Color Image 2 – Right Lens [5]

3. Stereo Vision

Distance to camera (color scale): 0.5 m to 8 m; pixels without a depth value are marked "Undefined"

Fig.11: Depth Image 1 – Left Lens [5] Fig.12: Depth Image 2 – Right Lens [5]

4. Methods Graph-Based Segmentation


Fig.3: Depth Image [1] Fig.4: Segmented Regions [1]

4. Methods Graph-Based Segmentation

Fig.13: Depth Image With Grid [1] (grid axes $i$, $j$ with origin $(0,0)$; square cells of side $\alpha$; last cell $E_{i_{max},j_{max}}$)

$$i_{max} = \frac{\text{image width } w}{\text{cell width } \alpha}, \qquad j_{max} = \frac{\text{image height } h}{\text{cell height } \alpha}$$

4. Methods Graph-Based Segmentation


Fig.14: Random Pixel Selection Within Depth Image Grid Cell

4. Methods Graph-Based Segmentation

Fig.15: Depth Image With Grid Points For Depth And Normals Graph [1] (grid axes $i$, $j$ with origin $(0,0)$)

Each grid cell $E_{i,j}$ is mapped to a point $P_{i,j}$ in 3D space:

$$E_{i,j} \rightarrow P_{i,j}, \qquad P_{i,j} = \begin{pmatrix} p_{i,j}^{x} \\ p_{i,j}^{y} \\ p_{i,j}^{z} \end{pmatrix}$$
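The grid sampling described on the last three slides can be sketched as follows: the image is divided into cells of side α, one pixel is drawn at random from each cell, and that pixel is lifted to a 3D point. The back-projection with a pinhole camera model and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions for illustration; the slides do not specify how the 3D coordinates are obtained.

```python
import numpy as np

def sample_grid_points(depth, alpha, fx, fy, cx, cy, rng=None):
    """Pick one random pixel per alpha x alpha cell and lift it to 3D.

    Returns an array of shape (j_max, i_max, 3); cells whose sampled pixel
    has no valid depth (<= 0) stay NaN.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = depth.shape
    i_max, j_max = w // alpha, h // alpha
    points = np.full((j_max, i_max, 3), np.nan)
    for j in range(j_max):
        for i in range(i_max):
            u = i * alpha + rng.integers(alpha)   # random column inside the cell
            v = j * alpha + rng.integers(alpha)   # random row inside the cell
            z = depth[v, u]
            if z > 0:
                # Pinhole back-projection (assumed camera model)
                points[j, i] = ((u - cx) * z / fx, (v - cy) * z / fy, z)
    return points
```

Working on one sample per cell instead of every pixel reduces the graph size by a factor of roughly α², at the cost of spatial resolution.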

4. Methods Graph-Based Segmentation

Fig.16: Depth Graph Weights Calculation (node $P_{i,j}$ with its neighbors $P_{i+1,j}$, $P_{i-1,j}$, $P_{i,j-1}$, $P_{i,j+1}$)

$$w_{Depth} = |z_1 - z_2|$$

$z_1$ = depth of $P_{i,j}$, $z_2$ = depth of $P_{i+1,j}$, $w$ = edge weight
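A minimal sketch of the depth graph weights, assuming the weight is the absolute depth difference between 4-connected neighbors and that the grid depths are stored in a 2D array `z` (a hypothetical layout):

```python
import numpy as np

def depth_edge_weights(z):
    """Edge weights between horizontally and vertically adjacent grid points."""
    z = np.asarray(z, dtype=float)
    w_right = np.abs(z[:, 1:] - z[:, :-1])   # edge (i, j) -> (i + 1, j)
    w_down = np.abs(z[1:, :] - z[:-1, :])    # edge (i, j) -> (i, j + 1)
    return w_right, w_down
```

A small weight means both points lie at a similar distance from the camera and are likely to belong to the same surface.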

4. Methods Graph-Based Segmentation

Fig.17: Depth Graph Normals Calculation ($P_{i,j}$ and its 8 neighbors $P_{i-1,j-1}$, $P_{i,j-1}$, $P_{i+1,j-1}$, $P_{i-1,j}$, $P_{i+1,j}$, $P_{i-1,j+1}$, $P_{i,j+1}$, $P_{i+1,j+1}$)

• 9 samples of $P$ in 3D space ($P_{i,j}$ plus its 8 neighbors)
• Least-squares plane fit through these samples gives the normal $n_{i,j}$
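One common way to obtain such a plane normal is a total least-squares fit via the singular value decomposition. The slides do not say which least-squares variant is used, so this is an illustrative choice rather than the author's method.

```python
import numpy as np

def plane_normal(points):
    """Unit normal of the best-fit plane through N >= 3 points (N x 3 array)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(centered)
    normal = vt[-1]
    return normal / np.linalg.norm(normal)
```

The sign of the returned normal is arbitrary; a consistent orientation (for example, towards the camera) would have to be enforced separately.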

4. Methods Graph-Based Segmentation

Fig.18: Normals Graph Weights Calculation (node $P_{i,j}$ with its neighbors $P_{i+1,j}$, $P_{i-1,j}$, $P_{i,j-1}$, $P_{i,j+1}$)

$$w_{Normal} = \cos^{-1}(v \cdot u)$$

$u$ = normal of $P_{i,j}$, $v$ = normal of $P_{i+1,j}$, $w$ = edge weight
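The normal weight is the angle between two unit normals. A short sketch, with clipping added as a numerical safeguard (an implementation detail not mentioned on the slide):

```python
import numpy as np

def normal_edge_weight(u, v):
    """Angle in radians between two unit normals u and v."""
    dot = float(np.dot(u, v))
    # Rounding can push the dot product slightly outside [-1, 1],
    # where arccos would return NaN.
    return float(np.arccos(np.clip(dot, -1.0, 1.0)))
```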

4. Methods Graph-Based Segmentation

Fig.19: Region Condition (diagram labels: region points in $G_{Depth}$, region points in $G_{Normal}$, region $r_i$)

• Regions $r_i \in R$
• Minimal size of a region is $\beta$ → filtering noise
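The slides do not give the exact merging criterion of the graph-based segmentation, so the following is a deliberately simplified sketch: grid points joined by an edge lighter than a fixed threshold `tau` (an assumption) are merged with a union-find structure, and regions smaller than β are discarded as noise. It operates on a single graph; combining the depth and normal graphs is not shown.

```python
import numpy as np

def segment_grid(w_right, w_down, tau, beta):
    """Label 4-connected grid points; regions with fewer than beta points get -1."""
    rows, cols = w_down.shape[0] + 1, w_right.shape[1] + 1
    parent = list(range(rows * cols))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for r in range(rows):
        for c in range(cols - 1):
            if w_right[r, c] < tau:
                union(r * cols + c, r * cols + c + 1)
    for r in range(rows - 1):
        for c in range(cols):
            if w_down[r, c] < tau:
                union(r * cols + c, (r + 1) * cols + c)

    roots = np.array([find(k) for k in range(rows * cols)])
    labels = np.full(rows * cols, -1)
    next_label = 0
    for root in np.unique(roots):
        members = roots == root
        if members.sum() >= beta:           # minimal region size filters noise
            labels[members] = next_label
            next_label += 1
    return labels.reshape(rows, cols)
```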

4. Methods Filtering and Merging


Fig.5: Candidates [1] Fig.4: Segmented Regions [1]

4. Methods Filtering and Merging

Fig.20: Region Attributes Calculation (bounding box of region $r_i$ in the $x$, $y$ image plane, spanning $x_1$ to $x_2$ and $y_1$ to $y_2$, with width $w$ and height $h$)

$$w = x_2 - x_1, \qquad h = y_2 - y_1$$

$$\mu_x = \frac{w}{2}, \qquad \mu_y = \frac{h}{2}, \qquad \mu_z = \text{mean depth } z(r_i)$$
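The region attributes can be computed directly from the coordinates of a region's points. This sketch follows the formulas on the slide literally and assumes the bounding box is the axis-aligned extent of the points; the function name and the dictionary keys are hypothetical.

```python
import numpy as np

def region_attributes(xs, ys, zs):
    """Bounding-box size, box half-extents and mean depth of one region."""
    xs, ys, zs = (np.asarray(a, dtype=float) for a in (xs, ys, zs))
    x1, x2 = xs.min(), xs.max()
    y1, y2 = ys.min(), ys.max()
    w, h = x2 - x1, y2 - y1
    return {"w": w, "h": h, "mu_x": w / 2, "mu_y": h / 2, "mu_z": zs.mean()}
```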

4. Methods Filtering and Merging

1. Select 3 points randomly from $r_i$, $n$ times → hypothesis plane $\pi_k$
2. Maximum number of points fitting the plane $\pi_k$:

$$\max_{k=1,\dots,n} \left| \{\, p \in r_i : \text{distance of } p \text{ to } \pi_k < \varepsilon \,\} \right|$$

Fig.21: Hypothesis Plane (region $r_i$ in $x$, $y$, $z$ space; legend: points above $\pi_k$, points below $\pi_k$, points with distance to $\pi_k$ < $\varepsilon$, 3 randomly selected points defining $\pi_k$)
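The two steps above amount to a RANSAC-style plane test. A minimal sketch that returns the best inlier count as a fraction of the region size; skipping degenerate (collinear) samples is an added safeguard, not something stated on the slide.

```python
import numpy as np

def max_inlier_fraction(points, n_iter, eps, rng=None):
    """Largest fraction of points within eps of a plane through 3 random points."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(points, dtype=float)
    best = 0
    for _ in range(n_iter):
        a, b, c = pts[rng.choice(len(pts), size=3, replace=False)]
        normal = np.cross(b - a, c - a)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                 # collinear sample defines no plane
            continue
        dist = np.abs((pts - a) @ normal) / norm
        best = max(best, int((dist < eps).sum()))
    return best / len(pts)
```

A human body is far from planar, so a high inlier fraction is a cue that a region is a wall, floor or table top.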

4. Methods Filtering and Merging

Finding a rule specifying valid ranges for:
• Mean Depth
• Height
• Width
• Minimum Inlier Fraction

Rule derived from positive examples in the training set → eliminates regions that cannot be humans
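The slides do not say how the rule is derived from the positive examples. One simple possibility, shown here purely as an assumption, is to take the observed minimum and maximum of each attribute and to reject any region that falls outside those ranges. The attribute order (mean depth, height, width, inlier fraction) is also hypothetical.

```python
import numpy as np

def learn_ranges(positive_attrs):
    """Per-attribute [lo, hi] ranges from positive examples (rows = examples)."""
    a = np.asarray(positive_attrs, dtype=float)
    lo, hi = a.min(axis=0), a.max(axis=0)
    hi[-1] = np.inf   # last column is the inlier fraction: only a minimum applies
    return lo, hi

def passes_rule(attrs, lo, hi):
    """True if every attribute of a region lies inside its learned range."""
    a = np.asarray(attrs, dtype=float)
    return bool(np.all((a >= lo) & (a <= hi)))
```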

4. Methods Filtering and Merging

Region too small but planar:
• $Size(r_i) < \beta$
• High number of fitting points on $\pi_k$
• Mean depth rule satisfied
→ Merging regions (merging condition):

$$\| \mu_{xz}(r_i) - \mu_{xz}(r_j) \| < \delta_{xz} \quad \text{and} \quad | \mu_y(r_i) - \mu_y(r_j) | < \delta_y$$

Important step, because segmentation can split one object into detached parts
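The merging condition can be checked as below. The sketch reads the two differences as distances (Euclidean norm in the x-z plane, absolute value along y), which is an interpretation of the slide; region means are passed as hypothetical (x, y, z) triples.

```python
import numpy as np

def should_merge(mu_i, mu_j, delta_xz, delta_y):
    """True if two regions are close in the x-z plane and along y."""
    mu_i = np.asarray(mu_i, dtype=float)
    mu_j = np.asarray(mu_j, dtype=float)
    dist_xz = np.linalg.norm(mu_i[[0, 2]] - mu_j[[0, 2]])
    dist_y = abs(mu_i[1] - mu_j[1])
    return bool(dist_xz < delta_xz and dist_y < delta_y)
```

Using a separate threshold for y allows vertically stacked parts, such as a torso and legs, to merge even when they are far apart in height.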

4. Methods Filtering and Merging

• Set of regions → set of (unscaled) candidates
• Classification needs scaled candidates → copy pixels of regions into a candidate image of size $w_c \times h_c$
  • If a pixel is copied: raw depth pixel
  • Undefined otherwise
• Candidates $c_i$: candidate image + bounding box → output candidate set $C$
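A sketch of the candidate image construction, assuming nearest-neighbor resampling of the bounding box to the fixed size (the slides do not name the scaling method) and a hypothetical `undefined` marker value of 0:

```python
import numpy as np

def candidate_image(depth, mask, bbox, w_c, h_c, undefined=0.0):
    """Scale the region inside bbox to h_c x w_c; non-region pixels are undefined.

    bbox = (x1, y1, x2, y2) with x2, y2 exclusive; mask marks region pixels.
    """
    x1, y1, x2, y2 = bbox
    # Nearest-neighbor source row/column for every target pixel
    rows = y1 + (np.arange(h_c) * (y2 - y1)) // h_c
    cols = x1 + (np.arange(w_c) * (x2 - x1)) // w_c
    src_depth = depth[np.ix_(rows, cols)]
    src_mask = mask[np.ix_(rows, cols)]
    return np.where(src_mask, src_depth, undefined)
```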

4. Methods Candidate Classification

Fig.4: Segmented Regions [1] Fig.6: Detected Humans [1]

4. Methods Candidate Classification

Fig.23: Candidate Image With Bounding Box And Fixed Size [1] (labels: bounding box, 8x8 pixel cell, 2x2 cell box)

Fig.22: Gradient Vector Calculation [2]

$$\Delta Depth_x = 222 - 55 = 167, \qquad \Delta Depth_y = 235 - 33 = 202$$

$$\vec{v}_G = \begin{pmatrix} \Delta Depth_x \\ \Delta Depth_y \end{pmatrix} = \begin{pmatrix} 167 \\ 202 \end{pmatrix}$$

4. Methods Candidate Classification

$$M = \sqrt{167^2 + 202^2} \approx 262.1, \qquad \Theta = \arctan\frac{167}{202} \approx 39.6^\circ$$

Fig.24: Histogram Of Oriented Depth (axes: angle [deg] vs. magnitude)
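A per-cell histogram of oriented depth gradients can be sketched as follows. The number of bins (9), the unsigned 0° to 180° range, the `arctan2(Δy, Δx)` orientation convention and the central-difference gradients from `np.gradient` are assumptions borrowed from common HOG practice, not details taken from the slides.

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """Magnitude-weighted orientation histogram of one depth cell."""
    gy, gx = np.gradient(cell.astype(float))      # gradients along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    # Map each angle to a bin; clamp the rare angle == 180.0 rounding case
    bins = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())
    return hist
```

For the gradient vector (167, 202) from the worked example, `np.hypot(167, 202)` reproduces the magnitude of about 262.1.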

4. Methods Candidate Classification

Fig.26: Candidate Image With Blocks For Normalization [1] (yellow: initial step; green: preceding step; 2x2 cell box)

• Blocks of 2x2 cells, shifted with 50% box overlap
• The 4 cell histograms of a block are used for normalization
• Vector of histograms → candidate descriptor for SVM
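Block normalization with 2x2 cell blocks and 50% overlap corresponds to sliding the block by one cell at a time. The sketch below assumes L2 normalization, which the slides do not specify, and takes the cell histograms as a hypothetical array of shape (rows, cols, bins).

```python
import numpy as np

def block_descriptor(cell_hists, eps=1e-6):
    """Concatenate L2-normalized 2x2-cell blocks (stride of one cell)."""
    n_rows, n_cols, _ = cell_hists.shape
    blocks = []
    for r in range(n_rows - 1):
        for c in range(n_cols - 1):
            block = cell_hists[r:r + 2, c:c + 2].ravel()   # 4 cell histograms
            blocks.append(block / np.sqrt(np.sum(block ** 2) + eps ** 2))
    return np.concatenate(blocks)
```

Because of the overlap, each interior cell contributes to four blocks, each time normalized against a different neighborhood.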

4. Methods Candidate Classification

Fig.27: Linear Support Vector Machine [4] (legend: positive example, negative example; labels A, B)

4. Methods Candidate Classification


• Depth image frames from training set

• Candidates labeled as positive or negative

Fig.28: Support Vector Machine Scheme [2]

- Set of Humans H

- Set of Candidates C
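To illustrate how labeled candidate descriptors lead to a linear classifier, here is a dependency-free sketch that trains a linear SVM by sub-gradient descent on the regularized hinge loss. The slides do not describe the training procedure or solver, so this method and the hyperparameters `lam`, `lr` and `epochs` are assumptions for illustration.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b for labels y in {-1, +1} by sub-gradient descent on hinge loss."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1            # examples violating the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / len(X)
        grad_b = -y[active].sum() / len(X)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def classify(X, w, b):
    """+1 for candidates classified as human, -1 otherwise."""
    return np.where(np.asarray(X, dtype=float) @ w + b >= 0, 1, -1)
```

In the pipeline above, each row of `X` would be one candidate descriptor and each positive prediction would be added to the set of humans H.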

5. Results

                                "Hallway"        "Café"
Distances [m]                   0.5 – 8          0.5 – 5
Occlusion Level                 Varying          Often
Environment                     Not Cluttered    Cluttered
Ergonomic Position of People    Upright          Various Poses

Two Sets Of Experiments:
1. Recall & Precision
2. Impact of varying number of training examples on Recall & Precision

5. Results

Hallway Dataset Café Dataset

Fig.29: Accuracy Results, (a) Hallway Data Set, (b) Café Data Set [1] (Equal Error Rate (EER) values annotated in the plots: 84, 84, 75, 75)

$$Recall = \frac{TP}{TP + FN}, \qquad Precision = \frac{TP}{TP + FP}$$

$TP$ = True Positives, $FN$ = False Negatives, $FP$ = False Positives
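The two measures follow directly from the detection counts. A small sketch; returning 0.0 for an empty denominator is a convention chosen here, not something defined on the slide.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, 8 correct detections with 2 false alarms and 8 missed pedestrians give a precision of 0.8 and a recall of 0.5.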

5. Results


Hallway Dataset Café Dataset

Fig.30: Impact On Accuracy By Reduction Of Positive Training Examples, (a) Hallway Data Set, (b) Café Data Set [1]

6. Summary & Conclusion

• Stereo Vision
• Segmentation Algorithm
• Filtering and Merging
• HOD Descriptor
• SVM
• Precision, Recall
• Impact of fewer SVM training examples on precision and recall

• Missing information: impact of resolution loss
• Comparison of datasets: environmental difference, different ergonomic positions
• Presented depth image: no reference to how depth information is encoded
• No units of measurement in the data sheet table

References

1. "Fast Human Detection for Indoor Mobile Robots Using Depth Images," 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, May 6-10, 2013.

2. L. Spinello and K. Arras, "People Detection in RGB-D Data," in Proceedings of IROS 2011, pp. 3838–3843. Permalink: http://ref.scielo.org/cmkfvr

4. Web page: https://en.wikipedia.org/wiki/Support_vector_machine

5. Web page: http://vision.middlebury.edu/stereo/data/scenes2003/

6. Web page: https://www.nextplatform.com/wp-content/uploads/2015/08/ped_det.png

7. Web page: https://www.extremetech.com/wp-content/uploads/2016/04/Autoliv-pedestrian-detection-640x395.jpg