Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations...
-
date post
22-Dec-2015 -
Category
Documents
-
view
215 -
download
1
Transcript of Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations...
![Page 1: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/1.jpg)
data parallelism
Chris Olston
Yahoo! Research
![Page 2: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/2.jpg)
set-oriented computation
data management operations tend to be “set-oriented”, e.g.:– apply f() to each member of a set– compute intersection of two sets
easy to parallelize
parallel data management is parallel computing’s biggest success story
![Page 3: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/3.jpg)
history
relational datatabase systems (declarative set-oriented primitives)
parallel relational database systems
renaissance: map-reduce etc.
1970’s
1980’s
now
![Page 4: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/4.jpg)
architectures
shared-memory
shared-disk
shared-nothing (clusters) • message overheads• skew
• expense• scale
![Page 5: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/5.jpg)
early systems
XPRS (Berkeley, shared-memory)
Gamma (Wisconsin, shared-nothing)
Volcano (Colorado, shared-nothing)
Bubba (MCC, shared-nothing)
Terradata (shared-nothing)
Tandem non-stop SQL (shared-nothing)
![Page 6: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/6.jpg)
example
data:– pages(url, change_freq, spam_score, …)– links(from, to)
question:– how many inlinks from non-spam pages
does each page have?
![Page 7: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/7.jpg)
…pages … links
…
…
…
the answer
filter by spam score
join bypages.url = links.from
group by links.to
parallel evaluation
![Page 8: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/8.jpg)
parallelism opportunities
inter-job
intra-job– inter-operator
• pipeline• tree
– intra-operator• partition
f(g(X), h(Y))
f(g(X))
f(X) = f(X1) f(X2) … f(Xn)
![Page 9: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/9.jpg)
parallelism obstacles
data dependencies– e.g. set intersection w/asymmetric hashing:
must hash input1 before reading input2
resource contention– e.g. many nodes transmit to node X
simultaneously
startup & teardown costs
![Page 10: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/10.jpg)
metrics
speed-up
scale-upparallelism
thro
ughp
ut
ideal
data size + parallelism
thro
ughp
ut
ideal
![Page 11: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/11.jpg)
talk outline
introduction
query processing
data placement
recent systems
![Page 12: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/12.jpg)
query evaluation
key primitives:– lookup– sort– group– join
![Page 13: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/13.jpg)
lookup by key
data partitioned on function of key? – great!
otherwise– d’oh
…
partitioned data
![Page 14: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/14.jpg)
sort
…
…
…
sort local data
merge sorted runs
…
a-c w-z
problems with this approach?
![Page 15: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/15.jpg)
sort, improved
…
…
…
partition
receive & sort
…
a-c w-z
a key issue: avoiding skew– sample to
estimate data distribution
– choose ranges to get uniformity
![Page 16: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/16.jpg)
group
again, skew is an issueapproaches:– avoid (choose
partition function carefully)
– react (migrate groups to balance load)
…
…
…
partition
sort or hash
…
0 99
![Page 17: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/17.jpg)
join
alternatives:– symmetric repartitioning– asymmetric repartitioning– fragment and replicate– generalized f-and-r
![Page 18: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/18.jpg)
join: symmetric repartitioning
input B
… equality-based join
input A
… …
… … partitionpartition
![Page 19: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/19.jpg)
join: asymmetric repartitioning
input B
equality-based join
input A
…
…
…
… partition
(already suitably partitioned)
![Page 20: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/20.jpg)
join: fragment and replicate
input B
input A
…
…
![Page 21: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/21.jpg)
join: generalized f-and-r
input B
input A
![Page 22: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/22.jpg)
join: other techniques …
semi-join
bloom-join
![Page 23: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/23.jpg)
query optimization
degrees of freedom
objective functions
observations
approaches
![Page 24: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/24.jpg)
degrees of freedom
conventional query planning stuff:– access methods, join order, join algorithms,
selection/projection placement, …
parallel join strategy (repartition, f-and-r)
partition choices (“coloring”)
degree of parallelism
scheduling of operators onto nodes
pipeline vs. materialize between nodes
![Page 25: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/25.jpg)
objective functions
want:– low response time for jobs– low overhead (high system throughput)
these are at odds– e.g., pipelining two operators may decrease
response time, but incurs more overhead
![Page 26: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/26.jpg)
proposed objective functions
Hong:– linear combination of response time and overhead
Ganguly:– minimize response time, with limit on extra
overhead– minimize response time, as long as cost-benefit
ratio is low
![Page 27: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/27.jpg)
observations
response time metric violates “principle of optimality”– every subplan of an optimal plan is optimal– dynamic programming relies on this property– hence, so does System-R (w/“interesting orders” patch)
example:
A indexB index
A
A join B
2010
10
10
![Page 28: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/28.jpg)
approaches
two-phase [Hong]:1. find optimal sequential plan2. find optimal parallelization of above (“coloring”)
• optimal for shared-memory w/intra-operator parallelism only
one-phase (still open research):– model sources & deterrents of parallelism in cost formulae– can’t use DP, but can still prune search space using
partial orders (i.e., some subplans dominate others) [Ganguly]
![Page 29: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/29.jpg)
talk outline
introduction
query processing
data placement
recent systems
![Page 30: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/30.jpg)
data placement
degrees of freedom:– declustering degree– which set of nodes– map records to nodes
![Page 31: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/31.jpg)
declustering degree
spread table across how many nodes? – function of table size– determine empirically, for a given system
![Page 32: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/32.jpg)
which set of nodes
three strategies [Mehta]:– random (worst)– round-robin– heat-based (best, given accurate workload model)
(none take into account locality for joins)
![Page 33: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/33.jpg)
map records to nodes
avoid hot-spots– hash partitioning works fairly well– range partitioning with careful ranges better
[DeWitt]
add redundancy– chained declustering [DeWitt]:
1p2s
3p1s
2p3s
![Page 34: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/34.jpg)
talk outline
introduction
query processing
data placement
recent systems
![Page 35: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/35.jpg)
academia
C-store [MIT]– separate transactional & read-only systems– compressed, column-oriented storage– k-way redundancy; copies sorted on different keys
River & Flux [Berkeley]– run-time adaptation to avoid skew– high availability for long-running queries via
redundant computation
![Page 36: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/36.jpg)
Google (batch computation)
Map-Reduce:– grouped aggregation with UDFs– fault tolerance: redo failed operations– skew mitigation: fine-grained partitioning & redundant
execution of “stragglers”
Sawzall language:– SELECT-FROM-GROUPBY style queries– schemas (“protocol buffers”)– convert errors into undefined values– primitives for operating on nested sets
![Page 37: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/37.jpg)
Google (random access)
Bigtable:– single logical table, physically distributed
• horizontal partitioning
• sorted base + deltas, with periodic coalescing
– API: read/write cells, with versioning– one level of nesting: a top-level cell may contain a set
• e.g. set of incoming anchortext strings
![Page 38: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/38.jpg)
Yahoo!
Pig (batch computation):– relational-algebra-style query language– map-reduce-style evaluation
PNUTS (random access):– primary & secondary indexes– transactional semantics
![Page 39: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/39.jpg)
IBM, Microsoft
Impliance [IBM; still on drawing board]– 3 kinds of nodes: data, processing, xact mgmt– supposed to handle loosely structured data
Dryad [Microsoft]– computation expressed as logical dataflow graph
with explicit parallelism– query compiler superimposes graph onto cluster
nodes
![Page 40: Data parallelism Chris Olston Yahoo! Research. set-oriented computation data management operations tend to be “set-oriented”, e.g.: –apply f() to each.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d7f5503460f94a6226f/html5/thumbnails/40.jpg)
summary
big data = a good app for parallel computing
the game: – partition & repartition data– avoid hotspots, skew– be prepared for failures
still an open area!– optimizing complex queries, caching intermediate
results, horizontal vs. vertical storage, …