PREDIcT: Towards Predicting the Runtime of Iterative Analytics
Adrian Popescu¹, Andrey Balmin², Vuk Ercegovac³, Anastasia Ailamaki¹
Predicting Runtime of Iterative Analytics
Requirements:
• # of iterations
• per-iteration resources (key features), i.e., for Bulk Synchronous Parallel (BSP): computation, messaging, synch
• cost model
Challenges:
• dependence on prior iterations
• variable resource requirements
[Figure: workers processing partitioned input over time; iteration 1 and iteration 2 each consist of computation, messaging, and synch phases]
PREDIcT at a Glance
• Cost model for BSP execution model
• Transformations:
  • Input dataset: sampling
  • Parameters: transform function
• Prediction methodology for iterative analytics on graphs: proportionality for resources, similarity for # of iterations
[Figure: resources per iteration, sample run vs. actual run]
Supported Analytics
Global convergence metric: e.g., an average, a ratio, a fixed point
• Ranking (e.g., PageRank)
• Graph processing (e.g., neighborhood estimation)
• Graph clustering (e.g., semi-clustering)
Example: PageRank
• PageRank of a page: given by the rank of its inbound pages
• Rank computation: iterative
• Convergence: RankChange < τG
1. graph structure: connectivity, degree ratio, diameter ⇒ Sampling technique
2. parameters: N, τG ⇒ Transform function
[Figure: example graph G with vertices 1 to 8]
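The convergence rule above can be illustrated with a small PageRank loop. This is a minimal sketch, not the Giraph implementation used in the talk: the damping factor of 0.85, the handling of dangling vertices, and the definition of RankChange as the average absolute per-vertex change are assumptions made for illustration, and the names `pagerank` and `out_edges` are hypothetical.

```python
def pagerank(out_edges, tau, damping=0.85, max_iters=1000):
    """Iterate PageRank until the average per-vertex rank change drops below tau.

    out_edges maps each vertex to the list of vertices it links to; every
    target must itself be a key. Returns (ranks, iterations_executed).
    """
    vertices = list(out_edges)
    n = len(vertices)
    ranks = {v: 1.0 / n for v in vertices}
    for iteration in range(1, max_iters + 1):
        incoming = {v: 0.0 for v in vertices}
        dangling = 0.0  # rank held by vertices without out-edges
        for v, targets in out_edges.items():
            if targets:
                share = ranks[v] / len(targets)
                for t in targets:
                    incoming[t] += share
            else:
                dangling += ranks[v]
        new_ranks = {
            v: (1.0 - damping) / n + damping * (incoming[v] + dangling / n)
            for v in vertices
        }
        # Assumed convergence metric: average absolute rank change.
        rank_change = sum(abs(new_ranks[v] - ranks[v]) for v in vertices) / n
        ranks = new_ranks
        if rank_change < tau:
            return ranks, iteration
    return ranks, max_iters
```

Running it with two thresholds on the same graph shows why the number of iterations depends on the parameter τG as well as on the graph structure: a tighter threshold can never converge earlier than a looser one.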
Sampling: Biased Random Jump
Sampling scale-free graphs: e.g., web graphs
• Variation of Random Jump (RJ) / random walk
• Seed vertices: k high out-degree nodes (hubs)
[Figure: graph G with its RJ sample (disconnected) and its BRJ sample (connected)]
BRJ: Improving connectivity at the same sampling ratio
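The slide only states that BRJ is a variation of Random Jump whose seed vertices are k high out-degree hubs. The sketch below is one plausible reading of that description, under stated assumptions: the walk follows out-edges, and with probability `jump_prob` (or at a dead end) it jumps back to one of the hubs, whereas plain RJ would jump to a uniformly random vertex. The function name, the default `jump_prob=0.15`, and the step cap are hypothetical choices, not values taken from the talk.

```python
import random


def biased_random_jump(out_edges, sampling_ratio, k=2, jump_prob=0.15, seed=0):
    """Sample a subgraph by a random walk that always restarts from hub vertices.

    out_edges maps each vertex to the list of vertices it links to.
    Returns the subgraph induced by the sampled vertices.
    """
    rng = random.Random(seed)
    vertices = list(out_edges)
    target = max(1, int(len(vertices) * sampling_ratio))
    # Seed vertices: the k vertices with the highest out-degree (hubs).
    hubs = sorted(vertices, key=lambda v: len(out_edges[v]), reverse=True)[:k]
    sampled = set()
    current = rng.choice(hubs)
    steps, max_steps = 0, 100 * len(vertices)  # cap in case few vertices are reachable
    while len(sampled) < target and steps < max_steps:
        sampled.add(current)
        neighbors = out_edges[current]
        if not neighbors or rng.random() < jump_prob:
            current = rng.choice(hubs)       # biased jump: restart at a hub
        else:
            current = rng.choice(neighbors)  # ordinary random-walk step
        steps += 1
    # Keep only edges whose endpoints were both sampled (induced subgraph).
    return {v: [t for t in out_edges[v] if t in sampled] for v in sampled}
```

Because every jump lands on a hub, the visited vertices stay clustered around well-connected regions, which matches the slide's claim of better connectivity at the same sampling ratio. If too few vertices are reachable from the hubs, the sample can come out smaller than requested.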
Transformations: Preserving Iterations
[Figure: graph G (vertices 1 to 8) and sample S (vertices 1, 3, 5, 8), Sampling Ratio (SR) = 50%]
• S maintains: connectivity, in/out degree ratio, effective diameter
• Average rank change: RankChange(S) proportional to RankChange(G)
• Convergence: RankChange(G) < τG
• Transform function T: τS = τG / SR
Sample and transform function preserve iterations
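A short worked example of the transform function T. One way to read the scaling, assuming ranks are normalized to sum to 1: the average rank is 1/N on G and 1/(SR·N) on S, so per-vertex ranks (and their changes) on the sample are larger by a factor of 1/SR, and the threshold is scaled accordingly. The function name `transform_threshold` is hypothetical.

```python
def transform_threshold(tau_g, sampling_ratio):
    """Scale the convergence threshold of the full graph G to the sample S.

    Implements tau_S = tau_G / SR from the slide.
    """
    if not 0.0 < sampling_ratio <= 1.0:
        raise ValueError("sampling_ratio must be in (0, 1]")
    return tau_g / sampling_ratio


# With SR = 50%, a full-graph threshold of 0.001 becomes 0.002 on the sample.
tau_s = transform_threshold(0.001, 0.5)
```

With SR = 1 the sample is the full graph and the threshold is unchanged, which is a quick sanity check on the direction of the scaling.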
Prediction
[Diagram: Sample run → Profiled features → Extrapolator → Scaled features → Cost Model F(X1,…,Xk) → Runtime of the estimated actual run]
Two extrapolation factors:
• on edges
• on vertices
Customized cost model for the Bulk Synchronous Parallel execution model: i.e., Giraph BSP
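The extrapolation step can be sketched as follows. The slide names two factors (on edges and on vertices) but does not say which feature uses which; the assignment below (active vertices scale with the vertex factor, message counts and bytes with the edge factor) is an assumption, and the feature names and dictionary layout are hypothetical.

```python
def extrapolate_features(profiled, sample_size, full_size):
    """Scale per-iteration features profiled on the sample run to the full graph.

    profiled: list of dicts, one per iteration of the sample run.
    sample_size / full_size: dicts with "vertices" and "edges" counts.
    """
    vertex_factor = full_size["vertices"] / sample_size["vertices"]
    edge_factor = full_size["edges"] / sample_size["edges"]
    scaled = []
    for it in profiled:
        scaled.append({
            # Assumption: vertex-driven work scales with the vertex factor.
            "active_vertices": it["active_vertices"] * vertex_factor,
            # Assumption: message traffic scales with the edge factor.
            "message_count": it["message_count"] * edge_factor,
            "message_bytes": it["message_bytes"] * edge_factor,
        })
    return scaled
```

The number of iterations is not scaled here: under the methodology above, the sample run and the transform function are meant to preserve it, so the scaled list has one entry per sample-run iteration.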

Cost Model: Translating Features into Time
• Computation: active vertices, message counts
• Messaging: message counts / sizes, locality of messages
• Synch: skew
• Each phase but synch: multivariate linear regression
• Synchronization: identifying critical path
[Figure: Bulk Synchronous Parallel model; workers process partitioned input in iteration 1 through computation, messaging, and synch phases]
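The two modelling choices above (a multivariate linear regression per phase, and a critical path for synchronization) can be sketched in plain Python. This is an illustration under assumptions, not the PREDIcT code: the regression is fitted with ordinary least squares via the normal equations, and the critical path is taken to be the slowest worker in an iteration. All function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_phase_model(features, times):
    """Least-squares fit of time = c0 + c1*x1 + ... + ck*xk for one phase."""
    rows = [[1.0] + [float(x) for x in f] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_phase_time(coef, feature_row):
    """Evaluate the fitted linear model on one feature vector."""
    return coef[0] + sum(c * x for c, x in zip(coef[1:], feature_row))


def iteration_time(per_worker_phase_times):
    """Assumed critical path: the worker with the largest summed phase time."""
    return max(sum(phases) for phases in per_worker_phase_times)
```

The fit fails with a division by zero when the feature columns are linearly dependent, so a real pipeline would need more training rows than features and some form of feature selection.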
Experimental Evaluation
• Setup: 10 machines, 6-core Intel X5660 CPUs, 48 GB RAM, 1 Gbps
• Datasets: real graph datasets: Wikipedia (Wiki), Twitter (TW), UK-2002 (UK), LiveJournal (LJ), with sizes in [1, 25] GB
• Representative algorithms: PageRank (PR), top-k ranking, and semi-clustering (SC)
• Default transformations: BRJ and Tr = (IDConf, τS = τG / SR)
• Metrics: signed relative error: RE = (Predicted - Actual) * 100% / Actual (i.e., "+" = over-prediction, "-" = under-prediction)
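The error metric is simple enough to state as code. The numbers in the usage lines are made up for illustration and are not results from the evaluation.

```python
def signed_relative_error(predicted, actual):
    """RE = (Predicted - Actual) * 100 / Actual, in percent.

    Positive values mean over-prediction, negative values under-prediction.
    """
    return (predicted - actual) * 100.0 / actual


# Illustrative values only: predicting 55 iterations when 50 were needed
# is a +10% error; predicting 45 is a -10% error.
over = signed_relative_error(55, 50)
under = signed_relative_error(45, 50)
```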
Predicting Features (Iterations)
Giraph BSP, 10 machines, real datasets in [1, 25] GB
[Figure: Predicting iterations for semi-clustering: τ = 0.01 (left) and τ = 0.001 (right)]
[Figure: Predicting key features for top-k ranking: predicting iterations (left) and predicting remote message bytes (right)]
[Figure: PageRank, sampling ratio = 0.1: number of iterations (axis 0 to 50) for LJ, Wiki, UK-2002, and Twitter; series: Actual, UpBound, PREDIcT]
PREDIcT reduces relative error from [104, 168]% to [0, 11]%
Predicting Time
Algorithms with variable work/iteration
• Cumulated impact of: # of iterations and per-iteration resources
[Figure: relative error of predicted time vs. sampling ratio (0.01 to 0.25) for LJ, Wiki, and UK-2002; panels: semi-clustering (error axis -0.7 to 0.6) and neighborhood estimation (error axis -0.2 to 0.5)]
[10, 30]% relative error for 15% sample
Impact
• PREDIcT: experimental methodology for estimating key features and runtime of iterative analytics on graphs
• Enables key feature prediction (through pluggable transformations) and runtime prediction (through the cost model)
• Accurate empirical solution:
  • Iterations: [0, 11]% relative error (as opposed to [104, 168]%)
  • Time: [10, 30]% relative error
http://dias.epfl.ch/predict
Thank you!
Backup slides
Cost Model: Model Fitting
• Training data: sample run + historical runs (if such runs exist)
• Customizable cost model (per input algorithm)
[Diagram: sample run and historical runs → pool of BSP features → model fitting (multivariate regression) → F(X1,…,Xk)]
Cost Model
Customized Cost Model for Bulk Synchronous Parallel Execution Model
• Bulk Synchronous Parallel execution model
• Specialized for network-intensive algorithms
• Each phase but sync: multivariate regression
• Synchronization modeled implicitly
Features per phase:
• compute: active vertices, message counts
• message: message counts, message sizes, locality of messages
• sync: partitioning scheme / skew
[Figure: one iteration across workers W1, W2, W3 with compute, message, and sync phases]
Feasibility Analysis
[Figure: runtime in seconds (axis 0 to 5000) of actual run vs. sample run for PR (UK), PR (TW), SC (UK), CC (UK), and CC (TW)]
Feasible for algorithms dominated by iteration time
Context: BSP Processing Model
• Vertex-centric model: each vertex performs local processing, then messaging
• Algorithms in BSP are inherently iterative
[Figure: Giraph BSP with workers W1 to W4; each Bulk Synchronous Parallel (BSP) iteration consists of compute, message, and sync phases]
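To make the vertex-centric model concrete, here is a single-process simulation of BSP supersteps. It is a sketch and does not use the Giraph API: the example algorithm (propagating the maximum value through the graph) and all names are chosen for illustration. Messages sent in one superstep only become visible after the barrier, in the next superstep.

```python
def run_bsp_max(out_edges, values, max_supersteps=100):
    """Propagate the maximum value along out-edges, one superstep at a time.

    out_edges maps each vertex to its out-neighbours; values maps each
    vertex to its initial value. Returns (final_values, supersteps_run).
    """
    values = dict(values)
    inbox = {v: [] for v in out_edges}
    active = set(out_edges)  # every vertex is active in superstep 0
    superstep = 0
    while active and superstep < max_supersteps:
        outbox = {v: [] for v in out_edges}
        for v in active:
            # Compute phase: local processing on the vertex's own state.
            new_value = max([values[v]] + inbox[v])
            changed = new_value > values[v]
            values[v] = new_value
            # Messaging phase: notify neighbours only when there is news.
            if superstep == 0 or changed:
                for t in out_edges[v]:
                    outbox[t].append(new_value)
        # Sync phase (barrier): messages are delivered for the next superstep,
        # and only vertices that received messages stay active.
        inbox = outbox
        active = {v for v, msgs in inbox.items() if msgs}
        superstep += 1
    return values, superstep
```

The loop ends when no vertex has pending messages, which is why the number of active vertices and the message counts per superstep vary over the run. These are the same quantities listed as key features on the cost model slides.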