The DryadLINQ Approach to Distributed Data-Parallel Computing Yuan Yu Microsoft Research Silicon...
-
Upload
richard-eyles -
Category
Documents
-
view
218 -
download
2
Transcript of The DryadLINQ Approach to Distributed Data-Parallel Computing Yuan Yu Microsoft Research Silicon...
The DryadLINQ Approach to Distributed Data-Parallel Computing
Yuan Yu
Microsoft Research Silicon Valley
Distributed Data-Parallel Computing
• Dryad talk: the execution layer– How to reliably and efficiently execute distributed
data-parallel programs on a compute cluster?
• This talk: the programming model– How to write distributed data-parallel programs for
a compute cluster?
The Programming Model• Sequential, single machine programming abstraction• Same program runs on single-core, multi-core, or cluster• Preserve the existing programming environments
– Modern programming languages (C# and Java) are very good• Expressive language and data model• Strong static typing, GC, generics, …
– Modern IDEs (Visual Studio and Eclipse) are very good• Great debugging and library support
• Legacy code could be easily reused
Dryad and DryadLINQ
DryadLINQ provides automatic query plan generationDryad provides automatic distributed execution
Outline
• Programming model• DryadLINQ• Applications• Discussions and conclusion
LINQ• Microsoft’s Language INtegrated Query
– Available in .NET3.5 and Visual Studio 2008• A set of operators to manipulate datasets in .NET
– Support traditional relational operators• Select, Join, GroupBy, Aggregate, etc.
– Integrated into .NET programming languages• Programs can invoke operators• Operators can invoke arbitrary .NET functions
• Data model– Data elements are strongly typed .NET objects– Much more expressive than relational tables
• For example, nested data structures
LINQ Framework
PLINQ
Local machine
.Netprogram(C#, VB, F#, etc)
Execution engines
Query
Objects
LINQ-to-SQL
DryadLINQ
LINQ-to-Obj
LIN
Q p
rovi
der i
nter
face
Scalability
Single-core
Multi-core
Cluster
A Simple LINQ Query
IEnumerable<BabyInfo> babies = ...;
var results = from baby in babies where baby.Name == queryName && baby.State == queryState && baby.Year >= yearStart && baby.Year <= yearEnd orderby baby.Year ascending select baby;
A Simple PLINQ Query
IEnumerable<BabyInfo> babies = ...;
var results = from baby in babies.AsParallel() where baby.Name == queryName && baby.State == queryState && baby.Year >= yearStart && baby.Year <= yearEnd orderby baby.Year ascending select baby;
A Simple DryadLINQ Query
PartitionedTable<BabyInfo> babies = PartitionedTable.Get<BabyInfo>(“BabyInfo.pt”);
var results = from baby in babies where baby.Name == queryName && baby.State == queryState && baby.Year >= yearStart && baby.Year <= yearEnd orderby baby.Year ascending select baby;
DryadLINQ Data Model
Partition
Partitioned Table
.Net objects
Partitioned table exposes metadata information– type, partition, compression scheme, serialization, etc.
Demo
• It is just programming– The same familiar programming languages,
development tools, libraries, etc.
K-means Execution Graph
C0 ac
P1
P2
P3
acac
cc C1
acac
ac
cc C2
acac
ac
cc C3
K-means in DryadLINQ
public static Vector NearestCenter(Vector v, IEnumerable<Vector> centers) { return centers.Aggregate((r, c) => (r - v).Norm2() < (c - v).Norm2() ? r : c);}
public static IQueryable<Vector> Step(IQueryable<Vector> vectors, IQueryable<Vector> centers) { return vectors.GroupBy(v => NearestCenter(v, centers)) .Select(group => group.Aggregate((x,y) => x + y) / group.Count());}
var vectors = PartitionedTable.Get<Vector>("dfs://vectors.pt");var centers = vectors.Take(100);for (int i = 0; i < 10; i++) { centers = Step(vectors, centers);}centers.ToPartitionedTable<Vector>(“dfs://centers.pt”);
public class Vector { public double[] entries;
[Associative] public static Vector operator +(Vector v1, Vector v2) { … }
public static Vector operator -(Vector v1, Vector v2) { … }
public double Norm2() { …}
}
PageRank Execution Graph
N0
1ae
E1
E2
E3
aeae
cc
N0
2
N0
3
cc
cc
DD
D
N1
1 N1
2 N1
3
aeae
ae
cc
cc
cc
DD
D
N2
1 N2
2 N2
3
aeae
ae
cc
cc
cc
DD
D
N3
1 N3
2 N3
3
PageRank in DryadLINQ
var pages = PartitionedTable.Get<Page>(“dfs://pages.pt”); var ranks = pages.Select(page => new Rank(page.name, 1.0)); // repeat the iterative computation several times for (int iter = 0; iter < n; iter++) { ranks = Step(pages, ranks); }
ranks.ToPartitionedTable<Rank>(“dfs://ranks.pt”);
public struct Page { public UInt64 name; public Int64 degree; public UInt64[] links;
public Page(UInt64 n, Int64 d, UInt64[] l) { name = n; degree = d; links = l; }
public Rank[] Disperse(Rank rank) { Rank[] ranks = new Rank[links.Length]; double score = rank.rank / this.degree; for (int i = 0; i < ranks.Length; i++) { ranks[i] = new Rank(this.links[i], score); } return ranks; } }
public struct Rank { public UInt64 name; public double rank;
public Rank(UInt64 n, double r) { name = n; rank = r; } }
public static IQueryable<Rank> Step(IQueryable<Page> pages, IQueryable<Rank> ranks) { // join pages with ranks, and disperse updates var updates = from page in pages join rank in ranks on page.name equals rank.name select page.Disperse(rank);
// re-accumulate. return from list in updates from rank in list group rank.rank by rank.name into g select new Rank(g.Key, g.Sum());}
MapReduce in DryadLINQ
MapReduce(source, // sequence of Ts mapper, // T -> Ms keySelector, // M -> K reducer) // (K, Ms) -> Rs{ var map = source.SelectMany(mapper); var group = map.GroupBy(keySelector); var result = group.SelectMany(reducer); return result; // sequence of Rs}// Ex: Count the frequencies of words in a bookMapReduce(book, line => line.Split(' '), w => w.ToLower(), g => g.Count())
Dryad
DryadLINQ System Architecture
DryadLINQ
Client machine
(11)
Distributedquery plan
.NET program
Query Expr
Cluster
Output TablesResults
Input TablesInvoke
Query plan
Output Table
Dryad Execution
.Net Objects
LINQ query
foreach
Vertexcode
DryadLINQ
• Distributed execution plan generation– Static optimizations: pipelining, eager aggregation, etc.– Dynamic optimizations: data-dependent partitioning,
dynamic aggregation, etc.
• Vertex runtime– Single machine (multi-core) implementation of LINQ– Vertex code that runs on vertices– Data serialization code– Callback code for runtime dynamic optimizations– Automatically distributed to cluster machines
A Simple Example
• Count the frequencies of words in a book
var map = book.SelectMany(line => line.Split(' '));var group = map.GroupBy(w => w.ToLower());var result = group.Select(g => g.Count());
Naïve Execution Plan
M
D
MG
G
R
X
M
D
MG
G
R
X
M
D
Map
Distribute
Merge
GroupBy
Reduce
Consumer
map
redu
ce
MG
G
R
X
.....
…g.Count()
line.Split(' ')
Execution Plan Using Partial AggregationM
G1
D
M
G1
D
M
G1
D
MG
G2
C
MG
G2
C
GroupBy
InitialReduce
Merge
GroupBy
Combine
Merge
GroupBy
FinalReduce
Consumer
map
aggr
egati
on tr
eere
duce
IR IRIR
MG
G3
F
X
MG
G3
F
X
Map
Distribute
g.Count()
g.Sum()
g.Sum()
Challenge: Program Analysis Support
• The main sources of difficulty– Complicated data model– User-defined functions all over the places
• Requires sophisticated static program analysis at byte-code level– Mainly some form of flow analysis
• Possible with modern programming languages and runtimes, such as C#/CLR
Inferring Dataset Property
• Useful for query optimizations
Select(x => new { A=x.a+x.b, B=f(x.c) }
Hash(y => y.a+y.b)
Hash(z => z.A)
GroupBy(x => x.A)
No data partitioning at GroupBy
Inferring Dataset Property
• Useful for query optimizations
Select(x => Foo(x) }
Hash(y => y.a+y.b)
Hash(x => x.A)?
GroupBy(x => x.A)
Need IL-level data-flow analysis of Foo
Caching Query Results
• A cluster-wide caching service to support– Reuse of common subqueries– Incremental computations
• Cache: [Key Value]– Key is <q, d>, Value is the result of q(d)– Key requires a pretty good reachability analysis
• For correctness, Key must include “everything” reachable from q
• For performance, Key should only contain things q depends on
More Static Analysis • Purity checking
– All the functions called in DryadLINQ queries must be side-effect free
– DryadLINQ (and PLINQ) doesn’t enforce it• Metadata validity checking
– Partitioned table’s metadata contains the record type, partition scheme, serialization functions, …
– Need to determine if the metadata is valid– DryadLINQ doesn’t fully enforce it
• Static enforcement of program properties for security and privacy mechanisms
Examples of DryadLINQ Applications• Data mining
– Analysis of service logs for network security– Analysis of Windows Watson/SQM data– Cluster monitoring and performance analysis
• Graph analysis– Accelerated Page-Rank computation– Road network shortest-path preprocessing
• Image processing– Image indexing– Decision tree training– Epitome computation
• Simulation– light flow simulations for next-generation display research– Monte-Carlo simulations for mobile data
• eScience– Machine learning platform for health solutions– Astrophysics simulation
Decision Tree TrainingMihai Budiu, Jamie Shotton et al
Machinelearning
1M images x 10,000 pixels x 2,000 features x 221 tree nodes
Complexity >1020 objects
Decision Tree
label image
Learn a decision tree to classify pixels in a large set of images
30
Read, preprocessRedistribute
HistogramRegroup histograms on nodeCompute new tree layer
Compute new tree
Broadcast new tree
Initial empty tree
Final tree
Image inputs, partitioned
Sample Execution plan
Application Details
• Workflow = 37 DryadLINQ jobs• 12 hours running time on 235 machines• More than 100,000 processes• More than 100 days of CPU time• Recovers from several failures daily• 34,000 lines of .NET code
Client
Front EndCluster S
QM
Ser
vice
Datapoints collected on
client and Uploaded.
IIS Servers Check validity &
concatenate
File movers move incoming
data into inexpensive compressed
storage
Distributed Excution Engine
queries data based on user initiated ad-hoc queries or well defined queries for
reporting
Rich reports created with
customization capability
Dat
aFl
ow
DataMarts
Compressed Storage
SQM Client
Front EndReporting Portal
Extract data for storage in
custom schema with different
retention policy
V V V
Distributed Execution
Engine
DataMarts stores data for
reporting purposes
Custom Schema/Storage
Data validation and analysis
toolsets which query raw data
directly
Windows SQM Data Analysis Michal Strehovsky, Sivarudrappa Mahesh et al
The Language Integration Approach
• Single unified programming environment– Unified data model and programming language– Direct access to IDE and libraries– Different from SQL, HIVE, Pig Latin
• Multiple layers of languages and data models
• Works out very well, but requires good programming language supports– LINQ extensibility: custom operators/providers– .NET reflection, dynamic code generation, …
34
Combining with PLINQ
Query
DryadLINQ
PLINQ
subquery
The combination of PLINQ and DryadLINQ delivers computation to every core in the cluster
Acyclic Dataflow Graph
• Acyclic dataflow graph provides a very powerful computation model– Easy target for higher-level programming abstractions
such as DryadLINQ– Easy expression of many data-parallel optimizations
• We designed Dryad to be general and flexible– Programmability is less of a concern– Used primarily to support higher-level programming
abstractions– No major changes made to Dryad in order to support
DryadLINQ
Expectation Maximization (Gaussians)
36
• Generated by DryadLINQ• 3 iterations shown
Decoupling of Dryad and DryadLINQ• Separation of concerns
– Dryad layer concerns scheduling and fault-tolerance – DryadLINQ layer concerns the programming model
and the parallelization of programs– Result: powerful and expressive execution engine and
programming model
• Different from the MapReduce/Hadoop approach– A single abstraction for both programming model and
execution engine– Result: very simple, but very restricted execution
engine and language
38
ImageProcessing
Cosmos DFSSQL Servers
Software Stack
Windows Server
Cluster Services (Azure, HPC, or Cosmos)
Azure DFS
Dryad
DryadLINQ
Windows Server
Windows Server
Windows Server
CIFS/NTFS
MachineLearning
Graph Analysis
DataMining
Applications
…… eScience
Conclusion
• Single unified programming environment– Unified data model and programming language– Direct access to IDE and libraries
• An open and extensible system– Many LINQ providers out there
• Existing ones: LINQ-to-XML, LINQ-to-SQL, PLINQ, …• Very easy to write one for your app domain
– Dryad/DryadLINQ scales out all of them!
Availability
• Freely available for academic use– http://connect.microsoft.com/DryadLINQ– DryadLINQ source, Dryad binaries, documentation,
samples, blog, discussion group, etc.
• Will be available soon for commercial use– Free, but no product support