TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial...
Transcript of TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial...
![Page 1: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/1.jpg)
TorontoCity: Seeing the World
with a Million Eyes
![Page 2: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/2.jpg)
Authors
Shenlong Wang, Min Bai, Gellert Mattyus, Hang Chu, Wenjie Luo, Bin Yang
Justin Liang, Joel Cheverie, Sanja Fidler, Raquel Urtasun
* Project Completed by Summer 2016
![Page 3: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/3.jpg)
Why Toronto? The best place to live in the world*
• Toronto 4
*According to 2015 Global Liveability Ranking
![Page 4: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/4.jpg)
Why Toronto? The best place to live in the world*
• Toronto 4
The places you are working at:
• Boston 36
• Pittsburgh 39
• San Francisco 49
• Los Angeles 51
*According to 2015 Global Liveability Ranking
![Page 5: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/5.jpg)
A dataset over 700 km2 region!
![Page 6: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/6.jpg)
From all the views!
![Page 7: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/7.jpg)
Dataset
Aerial
Data Source
![Page 8: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/8.jpg)
Dataset
Aerial
Ground Level Panorama
Data Source
![Page 9: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/9.jpg)
Dataset
Aerial LIDAR
Data Source
Ground Level Panorama
![Page 10: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/10.jpg)
Dataset
Aerial LIDAR Stereo
Data Source
Ground Level Panorama
![Page 11: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/11.jpg)
Dataset
Aerial LIDAR Stereo
Drone
Data Source
Ground Level Panorama
![Page 12: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/12.jpg)
Dataset
Aerial Airborne LIDAR
Data Source
Ground Level Panorama
LIDAR Stereo
Drone
![Page 13: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/13.jpg)
Why we need this?
• Mapping for Autonomous Driving
• Smart City
• Benchmarking:• Large-Scale Machine Learning / Deep Learning
• 3D Vision
• Remote Sensing
• Robotics
Source: Here 360
![Page 14: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/14.jpg)
Why we need this?
• Mapping for Autonomous Driving
• Smart City
• Benchmarking:• Large-Scale Machine Learning / Deep Learning
• 3D Vision
• Remote Sensing
• Robotics
Source: Toronto SmartCity Summit
![Page 15: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/15.jpg)
Why we need this?
• Mapping for Autonomous Driving
• Smart City
• Benchmarking:• Large-Scale Machine Learning / Deep Learning
• 3D Vision
• Remote Sensing
• Robotics
![Page 16: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/16.jpg)
Annotations
• Manual annotation? Impossible!• Suppose each 500x500 image costs $1 to annotate pixel-wise
labels, we need to pay $11M to create ground-truth only for the aerial images.
![Page 17: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/17.jpg)
Annotations
• Manual annotation? Impossible!• Suppose each 500x500 image costs $1 to annotate pixel-wise
labels, we need to pay $11M to create ground-truth only for the aerial images.
I’m not as
rich as
Jensen
![Page 18: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/18.jpg)
Annotations
• Manual annotation? Impossible!• Suppose each 500x500 image costs $1 to annotate pixel-wise
labels, we need to pay $11M to create ground-truth only for the aerial images.
• However, humans already collect rich knowledge about the world!
I’m not as
rich as
Jensen
![Page 19: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/19.jpg)
Annotations
• Manual annotation? Impossible!• Suppose each 500x500 image costs $1 to annotate pixel-wise
labels, we need to pay $1139200 to create ground-truth only for the aerial images.
• Humans already collect rich knowledge about the world!
Use maps!
I’m not as
rich as
Jensen
![Page 20: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/20.jpg)
Map as Annotations
HD Map
Maps
![Page 21: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/21.jpg)
Map as Annotations
3D BuildingHD Map
Maps
![Page 22: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/22.jpg)
Map as Annotations
3D BuildingHD Map Meta Data
Maps
![Page 23: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/23.jpg)
Together, the rich sources of data enable a plethora of exciting tasks!
![Page 24: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/24.jpg)
Building Footprint Extraction
![Page 25: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/25.jpg)
Road Curb and Centerline Extraction
![Page 26: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/26.jpg)
Building Instance Segmentation
![Page 27: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/27.jpg)
Zoning Prediction
InstitutionalResidential
Commercial
![Page 28: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/28.jpg)
Technical DifficultiesMis-alignment and Data Noise
Aerial-ground images mis-alignment from raw GPS location data
Road centerline is shifted
Building’s shape/location is not accurate
![Page 29: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/29.jpg)
Data Pre-processing and AlignmentAppearance based Ground-aerial Alignment
Before Alignment
After Alignment
![Page 30: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/30.jpg)
Data Pre-processing and AlignmentInstance-wise Aerial-map Alignment
Before alignment
![Page 31: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/31.jpg)
Data Pre-processing and AlignmentInstance-wise Aerial-map Alignment
After alignment
![Page 32: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/32.jpg)
Data Pre-processing and AlignmentRobust Road Surface Generation
Input Road Curb and Centreline (Noisy)
Polygonized Road Surface
![Page 33: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/33.jpg)
Pilot Study with Neural NetworksBuilding Contour and Road Curb/Centerline Extraction
GT ResNet
![Page 34: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/34.jpg)
Pilot Study with Neural NetworksSemantic Segmentation
Method Road Building Mean
FCN 74.94% 73.88% 74.41%
ResNet-56 82.72% 78.80% 80.76%
Metric: Intersection-over-union (IOU), higher is better
![Page 35: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/35.jpg)
Pilot Study with Neural NetworksBuilding Instance Segmentation
Input DWT
![Page 36: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/36.jpg)
Pilot Study with Neural NetworksBuilding Instance Segmentation
Metric: Weighted Coverage, AP, Precision-50%, Recall-50%, higher is better
MethodWeighted
Coverage
Average
PrecisionRecall-50% Precision-50%
FCN 41.92% 11.37% 21.50% 36.00%
ResNet-56 40.65% 12.13% 18.90% 45.36%
Deep
Watershed
Transform
56.22% 21.22% 67.16% 63.67%
![Page 37: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/37.jpg)
Pilot Study with Neural NetworksBuilding Instance Segmentation
Join the other talk today to know more about the deep watershed instance segmentation: Wednesday, May 10, 4:00 PM - 4:25 PM – Room 210G
MethodWeighted
Coverage
Average
PrecisionRecall-50% Precision-50%
FCN 41.92% 11.37% 21.50% 36.00%
ResNet-56 40.65% 12.13% 18.90% 45.36%
Deep
Watershed
Transform
56.22% 21.22% 67.16% 63.67%
![Page 38: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/38.jpg)
Pilot Study with Neural NetworksGround-view Road Segmentation
True Positive: Yellow; False Negative: Green; False Positive: Red
![Page 39: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/39.jpg)
Pilot Study with Neural NetworksGround-view Road Segmentation
Metric: Intersection-over-Union, higher is better
Method Non-Road IOU Road IOU Mean IOU
FCN 97.3% 95.8% 96.5%
ResNet-56 97.8% 96.6% 97.2%
![Page 40: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/40.jpg)
Pilot Study with Neural NetworksGround-view Zoning Classification
Top-1 Accuracy
Method From-ScratchPre-trained from
ImageNet
AlexNet 66.48% 75.49
GoogLeNet 75.08% 77.95%
ResNet 75.65% 79.33%
Metric: Top-1 Accuracy, higher is better
![Page 41: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/41.jpg)
• # of buildings: 397846
• Total area: 712.5 km2
• Total length of road: 8439 km
Statistics
![Page 42: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/42.jpg)
Building height distribution Zoning type distribution
Statistics
![Page 43: TorontoCity: Seeing the World with a Million Eyes€¦ · Ground Level Panorama. Dataset Aerial Airborne LIDAR Data Source Ground Level Panorama LIDAR Stereo Drone. Why we need this?](https://reader033.fdocuments.in/reader033/viewer/2022052102/603c665f751e8162b276f308/html5/thumbnails/43.jpg)
Conclusion
• We propose a large dataset with from different views and sensors
• Maps are used to create GT annotations
• In future we have many more exciting tasks to come
• Check our paper for more details: https://arxiv.org/abs/1612.00423
• Data available soon. Stay tuned and welcome to over-fit
Join the other talk today to know more about the deep watershed instance segmentation:
Wednesday, May 10, 4:00 PM - 4:25 PM – Room 210G