Post on 17-Dec-2015
Big Data Platforms
Mihai Budiu
, Oct 6 2014
2
My work• Ph.D. from Carnegie Mellon, 2003• Hardware synthesis• Reconfigurable hardware• Compilers and computer architecture
• Researcher at Microsoft Research Silicon Valley 2004-2014• Computer security• Cloud computing infrastructure:
• distributed computation platforms • monitoring and debugging• performance analysis
• Big data analysis and visualization • Large scale machine learning
3
500 Years Ago
Tycho Brahe(1546-1601)
Johannes Kepler(1571-1630)
4
The Laws of Planetary Motion
Tycho’s measurements Kepler’s laws
5
The Large Hadron Collider
25 PB/year WLHC Grid: 200K computing cores
6
Genetic Code
7
Astronomy
8
Weather
9
The Webs
Internet
Facebook friends graph
10
Big Data
11
Big Computers
12
Talk Outline
• Motivation• Dryad: A distributed runtime• DryadLINQ: A compiler for Dryad• Tools and applications• Sketch: A billion-row spreadsheet
13
Design Space
Throughput(batch)
Latency(interactive)
Internet
Datacenter
Data-parallel
Sharedmemory
DryadSearch
HPC
Grid
Transaction
Sketch
14
Dryad• Eurosys 2007• Continuously deployed in
Microsoft since 2006• Execution engine of Bing
analytics• > 105 machines•Many PB of data analyzed daily
Dryad painting by Evelyn de Morgan
15
Dryad = Execution Layer
Job (application)
Dryad
Cluster
Pipeline
Shell
Machine≈
16
2-D Piping• Unix Pipes: 1-D
grep | sed | sort | awk | perl
• Dryad: 2-D grep1000 | sed500 | sort1000 | awk500 | perl50
17
Virtualized 2-D Pipelines
18
Virtualized 2-D Pipelines
19
Virtualized 2-D Pipelines
20
Virtualized 2-D Pipelines
21
Virtualized 2-D Pipelines• 2D DAG• multi-machine• virtualized
22
Dryad Job Structure
grep
sed
sortawk
perlgrep
grepsed
sort
sort
awk
Inputfiles
Vertices (processes)
Outputfiles
ChannelsStage
23
Dryad System Architecture
Files, TCP, FIFO, Networkjob schedule
data plane
control plane
NS,Sched RE RERE
V V V
job manager cluster
GM code
vertex code
Staging1. Build
2. Send .exe
3. Start manager
5. Generate graph
7. Serializevertices
8. MonitorVertex execution
4. Querycluster resources
Nameserver6. Initialize vertices
Remoteexecutionservice
25
Talk Outline
• Motivation• Dryad: A distributed runtime• DryadLINQ: A compiler for Dryad• Tools and applications• Sketch: A billion-row spreadsheet
26
Distributed Collections
Partition
Collection
.Net objects
27
LINQ
Dryad
=> DryadLINQ
28
LINQ = .Net+ Queries
Collection<T> collection;bool IsLegal(Key);string Hash(Key);
var results = from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value};
29
Collection<T> collection;bool IsLegal(Key k);string Hash(Key);
var results = from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value};
DryadLINQ = LINQ + Dryad
C#
collection
results
C# C# C#
Vertexcode
Queryplan(Dryad job)Data
30
Language Summary
WhereSelectGroupByOrderByAggregateJoin
31
Very expressive
var result = input.SelectMany(r => Mapper(r)) .GroupBy(r => Key(r)) .Select(g => Reducer(g));
Map-Reduce
Distributed sorting
Iterative machine-learning (EM)
32
Talk Outline
• Motivation• Dryad: A distributed runtime• DryadLINQ: A compiler for Dryad• Tools and applications• Sketch: A billion-row spreadsheet
33
Debugging DryadLINQ jobs
34
Distributed performance counters
35
Training Kinect
Depth map Body parts
Classifier
Xbox GPU
36
Learn from Many Examples
DecisionTree
Classifier
Machine learning
37
Talk Outline
• Motivation• Dryad: A distributed runtime• DryadLINQ: A compiler for Dryad• Tools and applications• Sketch: A billion-row spreadsheet
Bandwidth hierarchy
39
Principles
• Visualizations are bounded data displays• All computations are sketches
• Sketch is a runtime for (1) running streaming (sketching) algorithms(2) implementing visualizations with bounded data renderings
40
Streaming algorithms
• Sketches = randomized streaming algorithms • Input = set of size n• Result same independent of the order• Memory = O(log(n))• Multi-pass
• Linear input transformations
4 billion rows on 155 machines
42
Spreadsheet operations• Browsing/scrolling• Filtering• Using predicates• Heavy hitters• Sampling
• Searching• Sorting• Computing new columns• Set operations (intersection, union, etc.)• Charting
Histograms
Heat Maps
Sketch distributed service
45data
Sketchservice
data
Sketchservice
data
Sketchservice
data
Sketchservice
46
DataSets = distributed objects
Network
46
Client
Servers
DataSet<T>
Application
T T T T T T T T T T T
47
Sketch Spreadsheet architecture
DataSet<Table>
SQL Server CSV Files Column store Cosmos Storage layer
Table operations
GUI
Distributed objects
Spreadsheet logic
Spreadsheet display
48
DataSet API
interface IDataSet<T> { IDataSet<S> Map<S>(Func<T,S> f); IDataSet<Pair<T,S>> Zip(IDataSet<S> other); R Sketch(ISketch<T, R> sketch);}interface ISketch<T,R> {
R Create(T data);R Combine(List<R> parts);
}
49
DataSet Implementations
Application
Network
Client Parallel
Proxy Proxy
GUI
Parallel
Local Local Local Local
Parallel
Local Local
Parallel
Datasetinterface
Rack aggregation
Core parallelism
Cluster parallelism
RMI layer
Proxy
ref ref ref
Parallel
Server 0 Server 1 Server n
Rack 0 Rack r
Address space
T T T T T T
Proxy
Local Local
Parallel
Proxy
Local Local
Parallel
T T S Sff
Map(f)
51
Sketch(s)
Proxy
Local Local
ParallelR R
R
R
s.Combine
T T
s.Create
interface ISketch<T,R> {R Create(T data);R Combine(List<R> parts);
}
52
Zip
Proxy
Local Local
Parallel
Proxy
Local Local
Parallel
T T S S
Proxy
Local Local
Parallel
T,S T,S
53
Histograms
CDF
2Dhistogram
54
Compute
Computing a histogram
Client
Server 1
Server n
Histogram
1D + 2Dcomposit
esketch
Datarangesketch
Render
Displayhistogra
m
User click tr th
ta
55
Some numbers
• Window Server 2012 R2 • 8-core 2.1GHz
AMD Opteron 2373 EE • > 16GB RAM• 3 x 1TB disks using RAID-0• 155 machines • 5 racks • 1Gbps Ethernet
56
1 2 4 8 16 24 32 64 128
155
0
100
200
300
400
500
600 No aggregation network
With aggregation network
Null Sketch
Machines
Tim
e (m
s)
57
Histogram computation
• 26M rows/machine• Scale-out
1 2 4 8 16 24 32 64 128
155
0200400600800
1000120014001600
machines
Tim
e (m
s)
58
Conclusions
• Big data is here to stay• Better tools are needed• Quest for high-level abstractions for
building distributed systems• Execution graphs• Distributed collections• Higher-order transformations• Distributed stateful objects• Sketching algorithms
59
Execution
Application
Data-Parallel Computation
60
Storage
Language
Map-Reduce
GFSBigTable
CosmosAzure
SQL Server
Dryad
DryadLINQScope
Sawzall,FlumeJava
Hadoop
HDFSS3
Pig, Hive≈SQL LINQ, SQLSawzall, Java