GPU Rasterization for Real-Time Spatial Aggregation over … · 2020-02-21 · GPU Rasterization...
GPU Rasterization for Real-Time Spatial Aggregation over Arbitrary Polygons
Eleni Tzirita Zacharatou, Harish Doraiswamy, Anastasia Ailamaki, Juliana Freire, Claudio Silva
Visual Data Exploration
• Urbane: a visual analytics tool developed with architects to better understand cities through various urban data sets
• Questions they are interested in:
  • Which part of the city has the most crime?
  • How does noise vary across the city?
  • What regions have better public transportation?
  • …
Urbane [Ferreira et al. 2015]
SIGMOD 2018 Best Demo Award
Visual Data Exploration
• Relies on spatial aggregation queries
SELECT COUNT(*)
FROM taxi T, neighborhoods N
WHERE T.pickup INSIDE N.geometry
  AND T.picktime in January 2009
GROUP BY N.id
Urbane [Ferreira et al. 2015]
Challenge: Interactive response times
Heatmap of taxi pickup locations
On-Demand Join & Aggregation
Query plan (figure):
• Inputs: Taxi (billions of points), Neighborhoods (hundreds of vertices)
• Selection (index): taxi.picktime in January 2014
• Join: taxi.pickup inside neighborhood.geometry, using Point-in-Polygon tests (Filter & Refine)
• Materialize the Join
• Aggregation: Group by: neighborhood.id; Function: count(pickup)
• Result
Unsuitable for interactive applications
SELECT COUNT(*)
FROM taxi T, neighborhoods N
WHERE T.pickup INSIDE N.geometry
  AND T.picktime in January 2009
GROUP BY N.id
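To make the cost of this plan concrete, here is a minimal Python sketch of a filter-and-refine join followed by the count. It is an illustration under stated assumptions, not the system's implementation: the names `bounding_box`, `point_in_polygon`, and `count_by_polygon`, the bounding-box filter, and the even-odd ray-casting test are all hypothetical choices made for brevity. Every point that survives the filter pays for a test whose cost grows with the number of polygon vertices.

```python
from typing import Dict, Iterable, Sequence, Tuple

Point = Tuple[float, float]


def bounding_box(poly: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a polygon."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(p: Point, poly: Sequence[Point]) -> bool:
    """Even-odd ray casting: count crossings of a ray going right from p."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_by_polygon(points: Iterable[Point],
                     polygons: Dict[str, Sequence[Point]]) -> Dict[str, int]:
    """COUNT(*) ... GROUP BY polygon id, using filter (bbox) and refine (PiP)."""
    boxes = {pid: bounding_box(poly) for pid, poly in polygons.items()}
    counts = {pid: 0 for pid in polygons}
    for p in points:
        for pid, poly in polygons.items():
            xmin, ymin, xmax, ymax = boxes[pid]
            # Filter: cheap rejection with the bounding box.
            if not (xmin <= p[0] <= xmax and ymin <= p[1] <= ymax):
                continue
            # Refine: exact test, linear in the number of vertices.
            if point_in_polygon(p, poly):
                counts[pid] += 1
    return counts


polygons = {
    "a": [(0, 0), (2, 0), (2, 2), (0, 2)],   # square
    "b": [(3, 0), (5, 0), (4, 2)],           # triangle
}
points = [(1, 1), (0.5, 1.5), (4, 0.5), (6, 6), (2.5, 1)]
result = count_by_polygon(points, polygons)
```

With billions of points, the refine step is the part that the raster-based approach on the following slides sets out to avoid.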
Spatial Aggregation: a Geometric Perspective
[Figure: input points, input polygon, and their spatial join]
Leverage the graphics pipeline of the GPU
Spatial join = “Drawing” on the same canvas
Raster Join: I. Draw the Points
[Figure: animation in which the points are drawn one by one onto a pixel grid; each pixel stores the number of points drawn into it.]
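The point-drawing step can be imitated on the CPU with NumPy. The sketch below is only a software stand-in for what the slides depict on the GPU: the function name `draw_points`, the `bounds` and `resolution` parameters, and the half-open pixel convention are assumptions made for this illustration.

```python
import numpy as np


def draw_points(points, bounds, resolution):
    """Accumulate per-pixel point counts on a (height, width) canvas.

    points:     iterable of (x, y) in world coordinates
    bounds:     (xmin, ymin, xmax, ymax) of the canvas in world coordinates
    resolution: (width, height) of the canvas in pixels
    """
    xmin, ymin, xmax, ymax = bounds
    width, height = resolution
    pts = np.asarray(points, dtype=float).reshape(-1, 2)

    # Map world coordinates to integer pixel indices.
    col = np.floor((pts[:, 0] - xmin) / (xmax - xmin) * width).astype(int)
    row = np.floor((pts[:, 1] - ymin) / (ymax - ymin) * height).astype(int)

    # Discard points that fall outside the canvas.
    on_canvas = (col >= 0) & (col < width) & (row >= 0) & (row < height)

    canvas = np.zeros((height, width), dtype=np.int64)
    # np.add.at applies every increment even when indices repeat,
    # so several points landing in one pixel are all counted.
    np.add.at(canvas, (row[on_canvas], col[on_canvas]), 1)
    return canvas


canvas = draw_points([(0.5, 0.5), (0.6, 0.4), (3.2, 1.1)],
                     bounds=(0.0, 0.0, 4.0, 4.0),
                     resolution=(4, 4))
```

After this step the individual points are no longer needed: the canvas alone carries the partial aggregates.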
Raster Join: II. Draw the Polygons
[Figure: animation in which the polygon is drawn over the point-count grid; the counts of the pixels it covers are added to a running total, which grows from 0 to 15.]
No Point-in-Polygon tests
Combines the aggregation with the join operation
Exploits the native support for drawing in GPUs
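A rough CPU analogue of the polygon-drawing step is sketched below. It assumes a count canvas like the one described above, already in pixel coordinates, and it decides pixel coverage by testing pixel centers, which is a common rasterization convention but an assumption here; `polygon_mask` and `aggregate_polygon` are hypothetical names. On the GPU this coverage decision is made by the rasterizer hardware, and no test is ever run per data point.

```python
import numpy as np


def polygon_mask(poly, height, width):
    """Boolean mask of the pixels whose centers lie inside `poly`.

    `poly` is a list of (x, y) vertices in pixel coordinates; pixel (row, col)
    covers [col, col + 1) x [row, row + 1) and has its center at +0.5.
    """
    xs, ys = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    inside = np.zeros((height, width), dtype=bool)
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if y1 == y2:
            continue  # a horizontal edge never straddles a row of centers
        # Even-odd rule, evaluated for all pixel centers at once.
        straddles = (y1 > ys) != (y2 > ys)
        x_cross = x1 + (ys - y1) * (x2 - x1) / (y2 - y1)
        inside ^= straddles & (xs < x_cross)
    return inside


def aggregate_polygon(canvas, poly):
    """Sum the per-pixel counts covered by the polygon (join + aggregation)."""
    height, width = canvas.shape
    return int(canvas[polygon_mask(poly, height, width)].sum())


# Toy canvas holding the values 0..15, and a square covering its 2x2 corner.
canvas = np.arange(16).reshape(4, 4)
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
total = aggregate_polygon(canvas, square)
```

The work here scales with the number of covered pixels rather than with the number of input points, which is the property the slide highlights.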
Bounding the Approximation Error
[Figure: the pixelated approximation (purple) of the input polygon drawn over the point-count grid.]
• Bound the Hausdorff distance between the approximate (purple) polygon Pa and the original polygon P: H(Pa, P) ≤ ε
• Smaller pixel size → higher accuracy.
Trade off accuracy for response time
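As a worked example of the trade-off, the sketch below derives a canvas size from ε. It rests on two assumptions that the slides do not spell out: that a pixel diagonal of at most ε is enough to meet the bound (so the pixel side is ε/√2), and that a canvas larger than the maximum supported resolution is split into tiles that are drawn separately. The function name `canvas_plan`, the default of 8192, and the 50 km × 40 km extent are hypothetical.

```python
import math


def canvas_plan(extent_m, eps_m, max_resolution=8192):
    """Pixel size, canvas resolution, and tile count for a given epsilon.

    extent_m:       (width, height) of the region in meters
    eps_m:          bound on the Hausdorff distance in meters
    max_resolution: largest canvas side, in pixels, drawn in one pass (assumed)
    """
    width_m, height_m = extent_m
    # Assumption: choose the pixel side so that the diagonal equals eps.
    pixel_side = eps_m / math.sqrt(2)
    cols = math.ceil(width_m / pixel_side)
    rows = math.ceil(height_m / pixel_side)
    # Assumption: oversized canvases are processed tile by tile.
    tiles_x = math.ceil(cols / max_resolution)
    tiles_y = math.ceil(rows / max_resolution)
    return {
        "pixel_side_m": pixel_side,
        "cols": cols,
        "rows": rows,
        "tiles": tiles_x * tiles_y,
    }


coarse = canvas_plan((50_000, 40_000), eps_m=10)
fine = canvas_plan((50_000, 40_000), eps_m=2)
```

Under these assumptions, tightening ε from 10 m to 2 m for the same region multiplies the pixel count by 25 and turns one drawing pass into twenty, which is where the extra response time comes from.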
Hybrid Raster Join: an Accurate Technique
• Blue pixels (completely inside the polygon): store count
• Grey pixels (polygon boundary): Point-in-Polygon (PiP) tests
Extra computation: identifying the boundary & performing PiP tests
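The hybrid idea can be sketched on the CPU as follows. This is an illustration under assumptions, not the GPU implementation: coordinates are already in pixel units, boundary pixels are found with a conservative edge-versus-pixel overlap test, and the names `inside_even_odd`, `boundary_pixels`, and `hybrid_count` are hypothetical. Interior pixels contribute their stored counts; only points that fall in boundary pixels are tested exactly.

```python
import math
import numpy as np


def inside_even_odd(xs, ys, poly):
    """Vectorized even-odd point-in-polygon test."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    inside = np.zeros(xs.shape, dtype=bool)
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        if y1 == y2:
            continue  # horizontal edges never straddle a query row
        straddles = (y1 > ys) != (y2 > ys)
        x_cross = x1 + (ys - y1) * (x2 - x1) / (y2 - y1)
        inside ^= straddles & (xs < x_cross)
    return inside


def boundary_pixels(poly, height, width):
    """Conservatively mark every pixel that a polygon edge may pass through."""
    mask = np.zeros((height, width), dtype=bool)
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        c0, c1 = max(math.floor(min(x1, x2)), 0), min(math.floor(max(x1, x2)), width - 1)
        r0, r1 = max(math.floor(min(y1, y2)), 0), min(math.floor(max(y1, y2)), height - 1)
        if c0 > c1 or r0 > r1:
            continue  # edge lies entirely off the canvas
        cols, rows = np.meshgrid(np.arange(c0, c1 + 1), np.arange(r0, r1 + 1))
        # Signed side of the edge's supporting line for the four pixel corners.
        sides = np.stack([(x2 - x1) * (rows + dy - y1) - (y2 - y1) * (cols + dx - x1)
                          for dx in (0, 1) for dy in (0, 1)])
        # A pixel is skipped only if all corners are strictly on one side.
        one_side = (sides > 0).all(axis=0) | (sides < 0).all(axis=0)
        mask[r0:r1 + 1, c0:c1 + 1] |= ~one_side
    return mask


def hybrid_count(points, poly, height, width):
    """Count points inside `poly`: stored counts inside, exact tests on the boundary."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    col = np.floor(pts[:, 0]).astype(int)
    row = np.floor(pts[:, 1]).astype(int)
    ok = (col >= 0) & (col < width) & (row >= 0) & (row < height)
    pts, col, row = pts[ok], col[ok], row[ok]

    canvas = np.zeros((height, width), dtype=np.int64)
    np.add.at(canvas, (row, col), 1)

    boundary = boundary_pixels(poly, height, width)
    cx, cy = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    interior = inside_even_odd(cx, cy, poly) & ~boundary

    # PiP tests only for the points that landed in boundary pixels.
    near = boundary[row, col]
    exact = inside_even_odd(pts[near, 0], pts[near, 1], poly)
    return int(canvas[interior].sum()) + int(exact.sum())


rng = np.random.default_rng(0)
pts = rng.uniform(0, 8, size=(500, 2))
triangle = [(0.5, 0.5), (6.5, 0.5), (0.5, 6.5)]
hybrid = hybrid_count(pts, triangle, 8, 8)
brute = int(inside_even_odd(pts[:, 0], pts[:, 1], triangle).sum())
```

Because a pixel that no edge passes through is either entirely inside or entirely outside the polygon, the hybrid count agrees with testing every point, while the number of exact tests shrinks as the pixels get smaller.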
Scaling with increasing data sizes
COUNT Taxi rides (points) GROUP BY NYC Neighborhoods (260 polygons)
[Figure: Execution Time (sec), split into Memory and Processing, for three techniques: Bounded Raster Join (Approximate, ε-bound = 10 meters), Hybrid Raster Join (Accurate), and GPU Baseline (Accurate)]
[Max resolution: 8192; Grid index: 1024² cells]
[Intel Core i7 Quad-Core CPU @ 2.80GHz, 16GB RAM, NVIDIA GTX 1060 mobile GPU, 6GB memory (using only 3GB)]
https://github.com/ViDA-NYU/raster-join
• CPU-GPU data transfer takes a significant amount of time
• Bounded Raster Join eliminates all Point-in-Polygon tests
Bounded Raster Join: Accuracy
COUNT Taxi rides (600M points) GROUP BY NYC Neighborhoods (260 polygons)
All points close to the diagonal → Negligible errors
ε-bound = 20 meters
The graph plots the accurate vs. the approximate values for each polygon.
Enabling Interactive Spatial Analytics
• Decompose complex spatial aggregation queries into graphics primitives
• Leverage advances in computer graphics to attain interactive speeds
• Big spatial data analytics interactively on commodity machines
Thank You!