Massively-Parallel Stream Processing Under QoS Constraints with Nephele
Björn Lohrmann, Daniel Warneke, and Odej Kao
Technische Universität Berlin
Background
22.06.2012 2Massively-Parallel Stream Processing Under QoS Constraints with Nephele
Nephele is part of the Stratosphere platform for
massively-parallel data processing
[Figure: Stratosphere stack. Labels: PACT program (in, map, red, match, out); PACTs Compiler; Nephele Runtime; Cloud / Cluster]
Open Source, downloadable at stratosphere.eu
Background
Nephele and PACTs currently focus on batch-job
workloads
-to-
What about streaming workloads?
Possible with Nephele, but (as of now) not with PACTs
May have different goals
Meet pipeline latency and throughput requirements
Maximize/minimize other custom metrics
Motivation
Live Processing of streamed data is an important issue
Proliferation of mobile devices capable of producing
streamed data (video, audio, other sensors)
Large Scale Deployments of Sensors in Science and
Industry
Examples: Smart Grids, Traffic Monitoring, Astronomy
Why not adapt today's massively-parallel frameworks?
Goals
Identify major aspects of massively-parallel
frameworks that affect QoS goals
Find general strategies to deal with QoS goals
Implement and evaluate them using the Nephele execution engine
Agenda
1. Highlight common design principles of massively-parallel frameworks
2. Explain implications for streamed workloads
3. Meeting latency requirements in Nephele
4. Experimental Results
Framework Design Principles
[Figure: tasks n and n+1 distributed across Compute Nodes X, Y and Z. Labels: Thread/Process, Input Buffer Queue, Output Buffer, Data Item]
Implications for Streaming Applications
Large output buffer = high throughput, high latency
Small output buffer = low throughput, low latency
Trade-off needs to be found to meet latency goals
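As a rough illustration of this trade-off, the time an item may wait in an output buffer can be approximated by the time the buffer takes to fill. The following is a minimal sketch, assuming a constant per-channel data rate and a buffer that is shipped only once it is full. The function name and the 8 KB/s rate are hypothetical and not taken from the talk.

```python
def buffer_fill_time(buffer_size_bytes: float, channel_rate_bytes_per_s: float) -> float:
    """Seconds needed to fill one output buffer at a constant channel rate.

    Assumption: the buffer is shipped only when full, so this is roughly the
    worst-case time the first item in the buffer waits before being sent.
    """
    if channel_rate_bytes_per_s <= 0:
        raise ValueError("channel rate must be positive")
    return buffer_size_bytes / channel_rate_bytes_per_s


# Hypothetical channel carrying 8 KB/s.
rate = 8 * 1024
large = buffer_fill_time(32 * 1024, rate)  # large buffer: few shipments, long wait
small = buffer_fill_time(1 * 1024, rate)   # small buffer: many shipments, short wait
print(f"32 KB buffer: {large:.3f} s, 1 KB buffer: {small:.3f} s")
```

Under these assumptions, shrinking the buffer cuts the waiting time proportionally, at the cost of shipping many more (smaller) buffers, which is where the throughput penalty comes from.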
Thread/Process Model
1 Task = 1 Thread model is flexible, but has overhead
Thread scheduling, synchronization, communication
Serialization may be necessary (bad for throughput and latency)
N Tasks = 1 Thread model can sometimes provide
better throughput and latency
Meeting Latency Requirements
QoS goal:
Meet latency constraint X, then maximize throughput
Based on observations we designed two strategies:
1. Adaptive Output Buffer Sizing
2. Dynamic Task Chaining
Both strategies
work autonomously (only latency constraint is required)
are applied on-demand at runtime
are applicable in systems with similar design principles
Adaptive Output Buffer Sizing
Only applied when the latency constraint is violated
For each channel
Determine output buffer latency (obl)
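If obl > threshold, decrease buffer size:
$size := \max(\epsilon,\ size \cdot r^{obl})$ with $r = 0.98$, $\epsilon = 200$
If obl < threshold, increase buffer size again:
$size := \min(\omega,\ size \cdot r^{obl})$ with $r = 1.1$, $\omega = 500 \cdot 10^{3}$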
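The per-channel rule on this slide can be sketched as a small function. This is an illustrative sketch, not the Nephele implementation: the function and parameter names are hypothetical, the unit of obl is left open, and the upper bound of 500,000 is an assumption based on how the slide's garbled constant reads.

```python
def adjust_buffer_size(size, obl, threshold, constraint_violated,
                       r_down=0.98, r_up=1.1, lower=200, upper=500_000):
    """Return the new output buffer size for one channel.

    size: current buffer size; obl: measured output buffer latency;
    threshold: obl threshold for this channel (same unit as obl).
    lower/upper are the size bounds; upper=500_000 is an assumption.
    """
    # The strategy is only applied while the latency constraint is violated.
    if not constraint_violated or obl == threshold:
        return size
    if obl > threshold:
        # Shrink geometrically; a larger obl shrinks the buffer faster.
        return max(lower, int(size * r_down ** obl))
    # obl below the threshold: grow the buffer again, up to the upper bound.
    return min(upper, int(size * r_up ** obl))


# Hypothetical channel: 32 KB buffer, obl of 10 against a threshold of 5.
new_size = adjust_buffer_size(32 * 1024, obl=10, threshold=5, constraint_violated=True)
print(new_size)
```

Because the exponent is the measured obl, channels that are far above the threshold converge to small buffers in a few rounds, while channels only slightly above it are adjusted gently.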
Task Chaining
Conditions:
Pipeline of unchained tasks
Sum of CPU utilizations is < 90% of capacity of one core
Only apply to longest chainable pipeline of tasks
[Figure: tasks n and n+1 on a compute node, shown twice (unchained and chained)]
Again, only applied when overall latency constraint is violated
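The chaining conditions can be illustrated with a short search for the longest chainable run of tasks. This is a minimal sketch under stated assumptions: the `Task` class, the function name, and the utilization numbers are hypothetical, and all tasks in the list are assumed to run on the same compute node in pipeline order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    cpu_utilization: float  # fraction of one core, 0.0 to 1.0 (hypothetical metric)
    chained: bool = False   # True if the task is already part of a chain


def longest_chainable(pipeline: List[Task], cap: float = 0.9) -> List[Task]:
    """Longest run of consecutive unchained tasks whose summed CPU
    utilization stays below `cap` (90% of one core). Returns [] if no
    run of at least two tasks qualifies."""
    best: List[Task] = []
    for start in range(len(pipeline)):
        total = 0.0
        for end in range(start, len(pipeline)):
            task = pipeline[end]
            if task.chained:
                break  # an already chained task ends the candidate run
            total += task.cpu_utilization
            if total >= cap:
                break  # adding this task would overload one core
            run = pipeline[start:end + 1]
            if len(run) >= 2 and len(run) > len(best):
                best = run
    return best


# Hypothetical utilizations for four tasks on one node.
pipeline = [Task("Decoder", 0.5), Task("Merger", 0.2),
            Task("Overlay", 0.3), Task("Encoder", 0.3)]
print([t.name for t in longest_chainable(pipeline)])
```

The brute-force search is quadratic in the pipeline length, which should be acceptable for the short per-node pipelines considered here; a sliding window would give the same result in linear time.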
Complete System Overview
[Figure: the Job Manager (JM, labeled "300ms") receives periodical measurements (latency, throughput) from the Task Managers (TMs) and sends them buffer size updates and chain commands]
Sample Application: Video Livestreaming
[Figure: video livestreaming job running on Node 1 to Node n. Tasks: Partitioner, Decoder, Merger, Overlay, Encoder, RTP Server]
Latency w/o Optimizations
Setup:
10 nodes, 80 cores
32 KB output buffer size
320 video streams
Results:
Latency oscillates around 4 s
Large buffers cause
Latency w/ Adaptive Buffer Sizing
Final Latency:
improvement)
Latency w/ Adaptive Buffer Sizing and Task Chaining (ABS+TC)
Final Latency:
improvement)
Conclusion and Future Work
Massively-parallel frameworks can be adapted to do
latency-constrained stream processing
Prototype implementation on Nephele showed up to
94% latency improvement on a video livestreaming job
Future Work
Distribute latency monitoring (better scalability)
Adapt PACT layer of Stratosphere to provide streaming
capabilities and latency awareness