Scaling Charts with Design and GPUs Leo Meyerovich (@LMeyerov) CEO of Graphistry.com | UC Berkeley...
1
Scaling Charts with Design and GPUs
Leo Meyerovich (@LMeyerov), CEO of Graphistry.com | UC Berkeley
Superconductor
2
Visibility
3
Visibility through
design + speed
4
Histogram of Voter Turnout by Town
[Figure: histogram; x-axis: Voter Turnout (0%, 25%, 50%, 75%, 100%); y-axis: # Towns]
Most towns had ~40% of people vote
Ballot box stuffing?
5
6
Legend: Opposition, Incumbent
Each tiny square shows town size (area) and vote (color)
7
Filter for towns w/ high turnout
8
Tag suspicious with black
9
10
For visibility,
speed design
11
[Figure: one time series per label: all, work project, hobby project, manager, female, 1 employee, 2-4 employees, 5-9 employees, 10-19 employees, 20-100 employees, 100-499 employees, 500 employees or more, is tech, manager peer, under 20, 30+, 20-30]
Problem: Plot 10+ Time Series Signals
12
Design: 200 Time Series Signals
[Figure: time axis from 0 s to 100 s]
13
Speed: Pan/Zoom Interactions
[Figure: time axis zoomed in to 37 s to 38 s]
14
CPU Bottlenecks: Naïve and Offline
Stages shown: Transform, Parse, Layout, Render
[Figure: stage timeline on an axis from 0 ms to 1600 ms]
Real-time is 30 ms
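To make the gap on this slide concrete, here is a small back-of-the-envelope sketch. It assumes the naïve pipeline fills the whole 0 ms to 1600 ms axis, which the transcript suggests but does not state; the function names are illustrative only.

```javascript
// Figures read off the slide; treating 1600 ms as the full naive pass is an assumption.
const naiveFrameMs = 1600;
const realTimeBudgetMs = 30;

// Speedup needed for one pipeline pass to fit in the real-time budget.
function requiredSpeedup(currentMs, budgetMs) {
  return currentMs / budgetMs;
}

// Frames per second achievable at a given per-frame cost.
function framesPerSecond(frameMs) {
  return 1000 / frameMs;
}

console.log(requiredSpeedup(naiveFrameMs, realTimeBudgetMs).toFixed(1)); // "53.3"
console.log(framesPerSecond(realTimeBudgetMs).toFixed(1)); // "33.3"
console.log(framesPerSecond(naiveFrameMs).toFixed(3)); // "0.625"
```

Under that assumption, the pipeline needs to run roughly 50 times faster before pan and zoom feel interactive.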
15
Optimize Binary Data, Multicore Layout, GPU Render
Stages shown: Prep, Layout, Render
[Figure: stage timeline on an axis from 0 ms to 1600 ms]
• Real-time interaction
• Stream from server: 12MB+/s
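The "binary data" idea can be sketched with JavaScript typed arrays. This is only an illustration of the general technique, not the talk's actual wire format: the point fields, the interleaved layout, and the helper names `packPoints` and `unpackPoints` are all assumptions.

```javascript
// Hypothetical sample data: the field names and values are assumptions.
const points = [
  { x: 0.5, y: 1.25 },
  { x: 2.0, y: -3.5 },
  { x: 4.75, y: 8.0 },
];

// Pack points as interleaved 32-bit floats: [x0, y0, x1, y1, ...].
function packPoints(pts) {
  const packed = new Float32Array(pts.length * 2);
  pts.forEach((p, i) => {
    packed[2 * i] = p.x;
    packed[2 * i + 1] = p.y;
  });
  return packed;
}

// A receiver can view the raw bytes directly, with no text parsing step.
function unpackPoints(buffer) {
  const view = new Float32Array(buffer);
  const pts = [];
  for (let i = 0; i < view.length; i += 2) {
    pts.push({ x: view[i], y: view[i + 1] });
  }
  return pts;
}

const packed = packPoints(points);
// The JSON text is ASCII here, so its length equals its size in bytes.
const jsonBytes = JSON.stringify(points).length;
console.log(packed.byteLength, jsonBytes);
console.log(unpackPoints(packed.buffer));
```

A fixed-width layout like this is smaller than the equivalent JSON and can be handed to a GPU buffer without a parse step, which is presumably what makes streaming at the quoted rate practical.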
16
Graphs: Placing Nodes and Edges
17
Direct Feedback on Settings
18
Uber: Trip Start to End
19
Direct Edge Placement: Overplotting
20
Speed Design Edge Bundling
21
22
web
23
Bare Metal in the Browser
Levels shown: Sequential, Multicore, GPU
Figures shown: 5X, 4+ cores, SIMD 4 lanes, 1024 lanes
24
SUPERCONDUCTOR: Parallel JS Viz Engine
[Figure: architecture diagram with two side-by-side pipelines]
BROWSER: webpage (HTML data, CSS styling, JS script); Parser, Selectors, Layout, Renderer, JavaScript VM; output: Pixels
SUPERCONDUCTOR.js: data viz (data, styling, widgets); Compiler; Parser.js, Selectors.CL, Layout.CL, Renderer.GL; runs on GPU
25
Layout as Parallel Tree Traversals
[Figure: tree down to a leaf; one traversal computes w,h at each node and a later one computes x,y; edges are marked as logical spawns and logical joins]
1. Works for all data sets
2. Compiler: CSS Schedule
Parallelism in each traversal!
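As a rough illustration of layout as tree traversals, the sketch below runs a bottom-up pass for w,h followed by a top-down pass for x,y. The layout rule (a parent stacks its children vertically) and the `leaf` and `box` constructors are assumptions made for the example. The code runs sequentially; the comments mark where sibling subtrees are independent and could in principle be spawned in parallel.

```javascript
// Hypothetical node constructors, used only for this sketch.
const leaf = (w, h) => ({ w, h, x: 0, y: 0, children: [] });
const box = (children) => ({ w: 0, h: 0, x: 0, y: 0, children });

// Pass 1 (bottom-up): a node's size depends only on its children's sizes.
function computeSizes(node) {
  if (node.children.length === 0) return; // leaf: w,h are inputs
  node.children.forEach(computeSizes); // sibling subtrees are independent (logical spawn)
  // Logical join: every child has been sized before the parent reads it.
  node.w = Math.max(...node.children.map((c) => c.w));
  node.h = node.children.reduce((sum, c) => sum + c.h, 0);
}

// Pass 2 (top-down): a node places its children, then each subtree proceeds independently.
function computePositions(node) {
  let offset = node.y;
  for (const child of node.children) {
    child.x = node.x;
    child.y = offset;
    offset += child.h;
  }
  node.children.forEach(computePositions); // sibling subtrees are independent
}

const root = box([leaf(10, 5), box([leaf(4, 2), leaf(6, 3)]), leaf(8, 1)]);
computeSizes(root);
computePositions(root);
console.log(root.w, root.h); // 10 11
console.log(JSON.stringify(root.children.map((c) => c.y))); // [0,5,10]
```

Splitting layout into passes with a fixed direction is what exposes the parallelism: within one pass, no node reads a value that a sibling subtree is still writing.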
26
GPU Traversals: Flat & Level-Synchronous
[Figure: a tree stored level by level (level 1 … level n) in arrays w, h, x, y]
Level-synchronous: a parallel for loop over each level
Flat: nodes in arrays, one array per attribute
Compiler handles transform of code & data
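A minimal sketch of the flat, level-synchronous form, assuming nodes are stored in breadth-first order so that each level is a contiguous index range and each node's children are adjacent. The array names (`levelStart`, `firstChild`, `childCount`) and the vertical-stacking layout rule are assumptions for illustration. The inner loop over a level is written as an ordinary loop here; on a GPU each iteration would be one lane of a parallel for.

```javascript
// Tree in breadth-first order: node 0 is the root, nodes 1-3 are its children,
// nodes 4-5 are the children of node 2. One array per attribute.
const levelStart = [0, 1, 4, 6]; // level k spans [levelStart[k], levelStart[k + 1])
const firstChild = Int32Array.from([1, -1, 4, -1, -1, -1]);
const childCount = Int32Array.from([3, 0, 2, 0, 0, 0]);
const w = Float32Array.from([0, 10, 0, 8, 4, 6]); // leaf sizes are inputs
const h = Float32Array.from([0, 5, 0, 1, 2, 3]);
const x = new Float32Array(6);
const y = new Float32Array(6);

// Bottom-up pass: deepest level first, one synchronization point per level.
function sizePass() {
  for (let level = levelStart.length - 2; level >= 0; level--) {
    // Iterations are independent: this loop is the "parallel for".
    for (let i = levelStart[level]; i < levelStart[level + 1]; i++) {
      if (childCount[i] === 0) continue;
      let maxW = 0;
      let sumH = 0;
      for (let c = firstChild[i]; c < firstChild[i] + childCount[i]; c++) {
        maxW = Math.max(maxW, w[c]);
        sumH += h[c];
      }
      w[i] = maxW;
      h[i] = sumH;
    }
  }
}

// Top-down pass: each node writes the positions of its own children only.
function positionPass() {
  for (let level = 0; level < levelStart.length - 1; level++) {
    for (let i = levelStart[level]; i < levelStart[level + 1]; i++) {
      let offset = y[i];
      for (let c = firstChild[i]; c < firstChild[i] + childCount[i]; c++) {
        x[c] = x[i];
        y[c] = offset;
        offset += h[c];
      }
    }
  }
}

sizePass();
positionPass();
console.log(w[0], h[0]); // 10 11
console.log(JSON.stringify(Array.from(y))); // [0,0,5,10,5,7]
```

Keeping one typed array per attribute means neighboring lanes read neighboring memory, which is the usual reason to prefer this layout on SIMD and GPU hardware.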
27
More Scalable Designs
Immens (Stanford), Nanocubes (AT&T), MapD (MIT), Abstract Rendering (Continuum), Synerscope
28
29
Achieve data visibility through hardware-accelerated designs
(and deploy on the web)
30
Visualize Magnitudes More Data in the Browser
Leo Meyerovich (@LMeyerov), CEO of Graphistry.com | UC Berkeley
Graphistry
33
Today’s Supercomputer-in-a-Pocket
[Figure: phone chip diagram with CPU cores 1-4 (4-way SIMD), GPGPU cores 1-4 (256-way SIMT), prefetch engine, L1d: 32KB, L2: 1MB, RAM: 2GB]
Phone: 16-lane CPU, 1024-lane GPU
Challenge: Parallelize Data Visualization
34
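Problem: Dynamic Memory Allocation on GPU?
[Figure: tree of draw calls: circ(…); square(…); rect(…); …; line(…); …; rect(…); …; oval(…); writing values such as 1.0 0.8 0.5 0.2 0 0.2 into a buffer]
function circ(x, y, r) {
  // dynamic allocation: the buffer size depends on the radius r
  var buffer = new Array(r * 10);
  for (var i = 0; i < r * 10; i++) buffer[i] = Math.cos(i);
  return buffer;
}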
35
Dynamic Allocation as SIMD Traversals
allocCirc(…) 4, allocRect(…) 6, allocLine(…) 6, allocRect(…) 7
fillCirc(…), fillRect(…), fillLine(…), fillRect(…)
1. Prefix sum for needed space
2. Allocate buffers
3. Distribute offsets & compute
4. Give OpenGL buffer pointer
[Figure: values such as 1.0 0.8 0.5 0.2 0 0.2 written by each call into its slice of one shared buffer]
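The four steps on this slide can be sketched as follows, using the sizes 4, 6, 6, 7 shown next to the alloc calls. The prefix sum here is a plain sequential loop standing in for a parallel scan, the cosine fill is placeholder data, and the last step (handing the buffer to OpenGL) is left out. Names such as `exclusivePrefixSum` and `fill` are illustrative only.

```javascript
// Sizes from the slide: each draw call reports how many slots it needs.
const sizes = [4, 6, 6, 7];

// Step 1: exclusive prefix sum. offsets[i] is where call i starts writing.
function exclusivePrefixSum(counts) {
  const offsets = new Array(counts.length);
  let total = 0;
  for (let i = 0; i < counts.length; i++) {
    offsets[i] = total;
    total += counts[i];
  }
  return { offsets, total };
}

const { offsets, total } = exclusivePrefixSum(sizes);

// Step 2: a single allocation sized for every call.
const buffer = new Float32Array(total);

// Step 3: each call writes only inside its own slice, so calls are independent.
function fill(call) {
  for (let j = 0; j < sizes[call]; j++) {
    buffer[offsets[call] + j] = Math.cos(j); // placeholder values
  }
}
sizes.forEach((_, call) => fill(call));

console.log(JSON.stringify(offsets), total); // [0,4,10,16] 23
```

Because the sizing pass and the fill pass are separated, no call needs to allocate while it runs, which is the property that lets the fill step run as a data-parallel traversal.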
36
CPU vs. GPU for Election Treemap: 5 traversals over 100K nodes
[Figure: bar chart; y-axis: Time (ms) with ticks at 1, 10, 100, 1,000, 10,000 and a 24fps reference line; groups: layout (4 passes), rendering pass, TOTAL; series: Naïve JS (Chrome 26), GPU (Safari + WebCL 11/3)]
WebCL: 30X
WebCL: 70X
COMBINED: 54X!