Topic 9: MR+


Cloud Computing Workshop 2013, ITU


9: MR+

Zubair Nabi

zubair.nabi@itu.edu.pk

April 19, 2013


Outline

1 Introduction

2 MR+

3 Implementation

4 Code-base


Introduction

Implicit MapReduce Assumptions

The input data has no structure

The distribution of intermediate data is balanced

Results materialize when all the map and reduce tasks complete

The number of values per key is small enough to be processed by a single reduce task

Processing at the reduce stage is, in most cases, a simple aggregation function


Zipf distributions are everywhere


Reduce-intensive applications

Image and speech correlation

Backpropagation in neural networks

Co-clustering

Tree learning

Computation of node diameter and radii in Tera-scale graphs

. . .


MR+

Design Goals

Negate skew in intermediate data

Exploit structure in input data

Estimate results

Favour commodity clusters

Maintain original functional model of MapReduce


Design

Maintains the simple MapReduce programming model

Instead of implementing MapReduce as a sequential two-stage architecture, MR+ allows map and reduce stages to interleave and iterate over intermediate results

Leading to a multi-level inverted tree of reduce workers


Architecture

Figure: Architectural comparison of MapReduce and MR+. (a) In MapReduce, a brick-wall separates the map phase from the reduce phase. (b) In MR+, a 5-10% estimation cycle at the start of the job prioritizes data.

Architectural Flexibility

1 Instead of waiting for all maps to finish before scheduling a reduce task, MR+ permits a model where a reduce task can be scheduled for every n invocations of the map function

2 A densely populated key can be recursively reduced by repeated invocation of the reduce function at multiple reduce workers


Advantages

Resilient to TCP Incast by amortizing data copying over the course of the job

Early materialization of partial results for queries with thresholds or confidence intervals

Finds structure in the data by running a sample cycle to learn the distribution of information, and prioritizes input data with respect to the user query


Programming Model

Retains the 2-stage MapReduce API

MR+ reducers can be likened to distributed combiners

Repeated invocation of the reducer by default rules out non-associative functions

But reducers can be designed in such a way that the non-associative operation is applied only at the very last reduce, as in the sketch below
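A minimal sketch of this design, not taken from the MR+ code-base: computing a per-key mean, where intermediate levels apply only the associative part (merging (sum, count) pairs) and the single last-level reducer applies the non-associative division.

```python
def reduce_partial(key, pairs):
    """Associative step: merge (sum, count) pairs. Safe to re-apply at any level."""
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return key, [(total, count)]

def reduce_final(key, pairs):
    """Non-associative step: applied only by the single last-level reducer."""
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return key, total / count

# Map emits (value, 1) pairs; any tree of reduce_partial calls followed
# by one reduce_final yields the exact mean:
_, a = reduce_partial("temp", [(21.5, 1), (18.0, 1)])
_, b = reduce_partial("temp", [(24.1, 1)])
print(reduce_final("temp", a + b))  # ('temp', 21.2)
```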


Implementation

Scheduling

Tasks are scheduled according to a configurable map_to_reduce_schedule_ratio parameter

For every map_to_reduce_schedule_ratio map tasks, 1 reduce task is scheduled

For instance, if map_to_reduce_schedule_ratio is 4, then the first reduce task is scheduled when 4 map tasks complete
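A hedged sketch of this scheduling rule (the real JobTracker logic is more involved; the callback name is illustrative):

```python
map_to_reduce_schedule_ratio = 4  # configurable, as above

completed_maps = 0
scheduled_reduces = 0

def on_map_completed(schedule_reduce):
    """Call once per finished map task; launches reduces at the configured ratio."""
    global completed_maps, scheduled_reduces
    completed_maps += 1
    if completed_maps // map_to_reduce_schedule_ratio > scheduled_reduces:
        scheduled_reduces += 1
        schedule_reduce(level=1)

# With a ratio of 4, the first reduce is scheduled after map 4 completes,
# the second after map 8, and so on.
```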


Level-1 reducers

Each reduce is assigned the output of map_to_reduce_ratio number of maps

The location of their inputs is communicated by the JobTracker

Each reduce task pulls its input via HTTP, as sketched below

After the reduce logic has been applied to all keys, the output is earmarked for level > 1 reducers
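A sketch of the level-1 pull step under stated assumptions: the JobTracker is taken to hand each reducer a list of HTTP URLs for its assigned map outputs, and payloads are taken to be JSON (JSON is used for serialization in the code-base, but the wire format here is a guess).

```python
import json
from urllib.request import urlopen

def fetch_map_outputs(urls):
    """Pull intermediate key/value lists from the given map-output URLs."""
    merged = {}
    for url in urls:
        with urlopen(url) as resp:
            for key, values in json.load(resp).items():  # assumed JSON payloads
                merged.setdefault(key, []).extend(values)
    return merged

def run_level1_reduce(urls, reduce_fn):
    """Apply the user's reduce logic to every key, then hand the result upward."""
    merged = fetch_map_outputs(urls)
    return {key: reduce_fn(key, values) for key, values in merged.items()}
```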


Level > 1 reducers

Assigned the input of reduce_input_ratio number of reduce tasks

Eventually all key/value pairs make their way to the final level, which has a single worker

This final reduce can also be used to apply any non-associative operation; a sketch of the resulting reduce tree follows
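A minimal sketch of the inverted reduce tree: each level merges the outputs of reduce_input_ratio lower-level reducers until one worker remains. In the real system the levels run on separate workers; here they are folded into one loop for illustration.

```python
def tree_reduce(partitions, reduce_fn, reduce_input_ratio=2, final_fn=None):
    """partitions: list of {key: [values]} dicts, one per level-1 reducer."""
    level = partitions
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), reduce_input_ratio):
            group, merged = level[i:i + reduce_input_ratio], {}
            for part in group:
                for key, values in part.items():
                    merged.setdefault(key, []).extend(values)
            # apply the (associative) reduce at this level
            next_level.append({k: [reduce_fn(k, v)] for k, v in merged.items()})
        level = next_level
    # the single final reducer may apply a non-associative operation
    final = final_fn or reduce_fn
    return {k: final(k, v) for k, v in level[0].items()}

# e.g. word count: tree_reduce(parts, lambda k, vs: sum(vs))
```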


Structural comparison

Figure: Structural comparison of MapReduce and MR+. (a) In MapReduce, ω map workers feed a shuffler behind the brick-wall, which assigns keys to θ reduce workers. (b) In MR+, ω map workers feed α = ω/mr level-1 reducers; each further level shrinks by the reduce ratio (β = α/rr, γ = β/rr, ...) down to a single final reducer.

Reduce Locality

MR+ does not rely on key/values for input assignment

Reduce inputs are assigned on the basis of locality:

1 Node-local

2 Rack-local

3 Any
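A sketch of this three-tier preference; `pending` is assumed to be a list of (input_id, host, rack) tuples known to the JobTracker, and the requesting reducer runs on (host, rack). The names are illustrative, not the code-base's.

```python
def pick_reduce_input(pending, host, rack):
    """Prefer node-local input, then rack-local, then any."""
    for input_id, in_host, _ in pending:
        if in_host == host:              # 1. node-local
            return input_id
    for input_id, _, in_rack in pending:
        if in_rack == rack:              # 2. rack-local
            return input_id
    return pending[0][0] if pending else None  # 3. any
```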


Fault Tolerance

Deterministic input assignment simplifies failure recovery in MapReduce

In case of MR+, if a map task or a level-1 reduce fails, it is simply re-executed

For level > 1 reduce tasks, MR+ implements three strategies, which expose the trade-off between computation and storage (sketched below):

1 Chain re-execution: The entire chain is re-executed

2 Local replication: The output of each reduce is replicated on the local file system of a rack-local neighbour

3 Distributed replication: The output of each reduce is replicated on the distributed file system
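The three strategies can be read as a persistence choice made when a level > 1 reducer finishes. This is a sketch under assumed names (persist_reduce_output, hdfs_client.upload are hypothetical), not the code-base's actual API.

```python
import pickle, shutil

def persist_reduce_output(output, strategy, path, neighbour_path=None, hdfs_client=None):
    with open(path, "wb") as f:
        pickle.dump(output, f)            # assumed: output always kept locally
    if strategy == "chain":               # 1. nothing extra; on failure the
        pass                              #    whole chain re-executes
    elif strategy == "local":             # 2. copy to a rack-local neighbour
        shutil.copy(path, neighbour_path)
    elif strategy == "distributed":       # 3. replicate on the DFS
        hdfs_client.upload(path)          # hypothetical client call
```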


Input Prioritization

User-defined map and reduce functions are applied to a sample_percentage amount of input, taken at random

This sampling cycle yields a representative distribution of the data

Used to exploit structure: data with semantic grouping or clusters of relevant information

The distribution is used to generate a priority queue to assign to map tasks

A full-fledged MR+ job is then run, in which map tasks read input from the priority queue


Input Prioritization (2)

Due to this prioritization, relevant clusters of information are processed first

As a result, the computation can be stopped mid-way if a threshold condition is satisfied; both steps are sketched below
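An end-to-end sketch of the estimation cycle and early stop: sample a sample_percentage of the splits, score every split against the user query, build a priority queue, and halt the full job once the threshold is met. Here score_split, process_split, and threshold are illustrative stand-ins, not names from the code-base.

```python
import heapq, random

def prioritize_splits(splits, sample_percentage, score_split):
    """Sample, learn the distribution, and rank all splits by relevance."""
    sample = random.sample(splits, max(1, len(splits) * sample_percentage // 100))
    queue = [(-score_split(s, sample), s) for s in splits]  # max-heap via negation
    heapq.heapify(queue)
    return queue

def run_prioritized(queue, process_split, threshold):
    """Process splits in priority order; stop mid-way once the threshold holds."""
    result = 0.0
    while queue:
        _, split = heapq.heappop(queue)
        result += process_split(split)
        if result >= threshold:   # relevant clusters come first, so stopping
            break                 # early still satisfies the query
    return result
```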



Code-base

Around 15,000 lines of Python code

Code implements both vanilla MapReduce and MR+

Written over the course of roughly 5 years at LUMS

Publicly available at: https://code.google.com/p/mrplus/source/browse/?name=BRANCH_VER_0_0_0_4_PY2x


Storage

Abstracts away the underlying storage system

Currently supports HDFS and Amazon's S3

Also supports the local OS file system (for unit testing)
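A sketch of what such a storage abstraction can look like; the class and method names are assumptions, not the code-base's actual interface.

```python
from abc import ABC, abstractmethod

class Store(ABC):
    @abstractmethod
    def read(self, path: str) -> bytes: ...
    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

class LocalStore(Store):
    """Local OS file system backend, handy for unit tests."""
    def read(self, path):
        with open(path, "rb") as f:
            return f.read()
    def write(self, path, data):
        with open(path, "wb") as f:
            f.write(data)

# An HDFS and an S3 backend would implement the same interface on top of
# their client libraries, so job code never names a concrete backend.
```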


Structure

Modular structure, so most of the code is re-used across MapReduce and MR+

Google Protobufs and JSON are used for serialization

All configuration options live within two files: siteconf.xml (site-wide) and jobconf.xml (job-specific)
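A sketch of loading the two configuration files; only the file names and a few option names appear in the slides, so the Hadoop-style <property> layout assumed here is a guess.

```python
import xml.etree.ElementTree as ET

def load_conf(path):
    """Parse <property><name>...</name><value>...</value></property> entries
    (an assumed, Hadoop-style layout) into a flat dict."""
    return {p.findtext("name"): p.findtext("value")
            for p in ET.parse(path).getroot().iter("property")}

# Site-wide defaults overridden by job-specific settings, e.g.:
# conf = {**load_conf("siteconf.xml"), **load_conf("jobconf.xml")}
# ratio = int(conf["map_to_reduce_schedule_ratio"])
```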
