Computer Science and Engineering
IPDPS’07 Predicting Performance for Grid-Based Datamining
A Performance Prediction Framework for Grid-Based Data Mining Applications
Leonid Glimcher
Gagan Agrawal
Motivating Scenario
Data Repository Clusters
Compute Clusters
User?
3 stages:
• Disk I/O
• Network
• Compute
Remote Data Analysis
• Remote data analysis
– Grid is a good fit
– Details can be very tedious
• Middleware abstracts away many development details
• Resource selection is crucial to performance
• Performance prediction facilitates resource selection
Presentation Road Map
• Problem statement and motivation
• Middleware background
• Our performance prediction approach
• Experimental evaluation
• Related work
• Conclusions
Problem Statement
Given:
• Parallel data processing application
• Execution time breakdown (profile)
• Configurations of available computing resources
• Dataset replicas in repositories of different sizes
Predict the application's execution time in order to select the right dataset replica and resource configuration.
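Once a predicted execution time exists for each candidate, selection reduces to picking the minimum. The sketch below illustrates only that final step; the `Candidate` type, the repository names, and the predicted times are all made up for illustration and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One (replica, resource configuration) pair with its predicted time."""
    replica: str            # hypothetical repository name
    data_nodes: int         # number of storage nodes serving the replica
    compute_nodes: int      # number of compute nodes in the configuration
    predicted_time: float   # predicted execution time in seconds (made up)


def select_configuration(candidates):
    """Return the candidate with the smallest predicted execution time."""
    return min(candidates, key=lambda cand: cand.predicted_time)


candidates = [
    Candidate("repo-a", 2, 4, 310.0),
    Candidate("repo-a", 2, 8, 190.0),
    Candidate("repo-b", 4, 8, 205.0),
]
best = select_configuration(candidates)
```

The quality of the choice depends entirely on how accurate the predicted times are, which is what the rest of the talk addresses.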
FREERIDE-G Design
FREERIDE-G Processing
Key observation: most data mining algorithms follow a canonical loop.
Middleware API:
• Subset of data to be processed
• Reduction object
• Local and global reduction operations
• Iterator
While( ) {
forall( data instances d) {
I = process(d)
R(I) = R(I) op d
}
…….
}
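The canonical loop above can be sketched in Python as follows. This is an illustration of the generalized-reduction pattern only, not the FREERIDE-G API: `generalized_reduction`, `global_reduction`, `process`, and `op` are hypothetical names, and the toy data simply sums integers by parity.

```python
from collections import defaultdict


def generalized_reduction(data, process, op, initial=0):
    """Local reduction: apply R(I) = R(I) op d for every data instance d."""
    reduction = defaultdict(lambda: initial)   # the reduction object R
    for d in data:
        i = process(d)                         # choose which element of R to update
        reduction[i] = op(reduction[i], d)     # op is assumed associative and commutative
    return dict(reduction)


def global_reduction(local_results, op):
    """Merge the per-node reduction objects into a single result."""
    merged = {}
    for local in local_results:
        for key, value in local.items():
            merged[key] = op(merged[key], value) if key in merged else value
    return merged


def add(a, b):
    return a + b


# Two simulated compute nodes each reduce their own subset, then the results merge.
node_a = generalized_reduction([1, 2, 3], lambda d: d % 2, add)
node_b = generalized_reduction([4, 5], lambda d: d % 2, add)
total = global_reduction([node_a, node_b], add)
```

Because the update operator is associative and commutative, the order in which instances are processed (and the way they are split across nodes) does not change the final reduction object.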
Performance Prediction Approach
• 3 phases of execution:
– Retrieval at data server
– Data delivery to compute node
– Parallel processing at compute node
• Special processing structure:
– Generalized reduction
$T_{exec} = T_{disk} + T_{network} + T_{compute}$
Needed profile information
• Numbers of storage nodes (n) and compute nodes (c)
• Available bandwidth between them (b) in the profile configuration
• Execution time breakdown: data retrieval (td), network communication (tn), and data processing (tc) components
• Dataset size (s)
• Reduction object information: maximum size and communication time
• Global reduction time
Data Retrieval and Communication Time
Data Retrieval:
Dataset size (s) and number of data hosts (n) for base profile and predicted configuration (s’ and n’).
Used to scale td.
Data Communication:
Also need dataset size and number of data hosts, as well as bandwidth (b and b’).
Used to scale tn.
$T_{network} = \frac{s'}{s} \times \frac{n}{n'} \times \frac{b}{b'} \times t_n$
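A minimal Python sketch of the two scaling rules, for illustration only. The network rule scales tn by dataset size, number of data hosts, and bandwidth, as described above. The disk rule is an assumption inferred from the slide text (td scaled by dataset size and number of data hosts, using the same linear form without the bandwidth term). Function and parameter names are hypothetical.

```python
def predict_disk_time(t_d, s, n, s_new, n_new):
    """Scale profiled retrieval time t_d to a new dataset size and data-host count.

    Assumed form: time grows linearly with data size and shrinks linearly
    with the number of data hosts.
    """
    return (s_new / s) * (n / n_new) * t_d


def predict_network_time(t_n, s, n, b, s_new, n_new, b_new):
    """Scale profiled communication time t_n; halving bandwidth doubles the time."""
    return (s_new / s) * (n / n_new) * (b / b_new) * t_n


# Made-up profile: 100 MB on 1 data host at 500 Kbps; target: 400 MB, 2 hosts, 250 Kbps.
disk = predict_disk_time(t_d=10.0, s=100, n=1, s_new=400, n_new=2)
network = predict_network_time(t_n=8.0, s=100, n=1, b=500, s_new=400, n_new=2, b_new=250)
```

With these invented inputs, four times the data on twice the hosts doubles the retrieval time, and the halved bandwidth doubles the network time again.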
Initial Data Processing Time Prediction
Dataset size (s) and number of compute nodes (c):
• Base profile (s, c)
• Predicted profile (s', c')
Used to scale up tc:
$T_{compute} = \frac{s'}{s} \times \frac{c}{c'} \times t_c$
Limitations (not modeled):
• Inter-processor communication time
• Global reduction time
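As a worked instance of the compute-time scaling above, take a base profile with s = 350 MB on c = 1 compute node and predict for s' = 1400 MB on c' = 4 nodes. The profiled compute time of 120 s is a made-up value chosen only to keep the arithmetic simple.

```latex
\[
T_{compute}
  = \frac{s'}{s} \times \frac{c}{c'} \times t_c
  = \frac{1400}{350} \times \frac{1}{4} \times 120\,\mathrm{s}
  = 4 \times 0.25 \times 120\,\mathrm{s}
  = 120\,\mathrm{s}
\]
```

Four times the data on four times the nodes leaves the prediction unchanged, because this initial model treats all of tc as perfectly parallel. Any serialized cost (reduction-object communication, global reduction) is therefore missing from the estimate.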
Modeling Interprocessor Communication
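• Parallel computation involves communication of the reduction object
• Communication time (Tro)
• Reduction object size (r)
• Interprocessor bandwidth (w)
• Latency (l)
• Reduction object size either remains constant or scales linearly
$T'_c = t_c - T_{ro}$
$\hat{T}_{ro} = w \times r + l$
$T_{compute} = \frac{s'}{s} \times \frac{c}{c'} \times T'_c + \hat{T}_{ro}$
Compare with the initial model: $T_{compute} = \frac{s'}{s} \times \frac{c}{c'} \times t_c$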
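The refinement can be sketched in Python as below. This is a hedged illustration, not the authors' implementation: the function name and all numeric inputs are hypothetical. One assumption is worth flagging: for the product w × r to be a time, `w` is treated here as a transfer cost per unit of data (the reciprocal of a bandwidth), and the caller supplies the reduction object size `r_new` for the target configuration.

```python
def predict_compute_with_ro(t_c, t_ro, s, c, s_new, c_new, w, r_new, latency):
    """Predict compute time while modeling reduction-object communication.

    t_c     profiled compute time (includes communication)
    t_ro    profiled reduction-object communication time
    w       assumed per-unit transfer cost (time per unit of data)
    r_new   reduction object size in the target configuration
    latency interprocessor latency l
    """
    parallel_part = t_c - t_ro                 # T'_c: the part that scales with s and c
    t_ro_hat = w * r_new + latency             # predicted communication time
    return (s_new / s) * (c / c_new) * parallel_part + t_ro_hat


# Made-up inputs: twice the data on four times the compute nodes.
predicted = predict_compute_with_ro(
    t_c=100.0, t_ro=4.0, s=1, c=1, s_new=2, c_new=4,
    w=0.5, r_new=2.0, latency=0.25,
)
```

Only the parallel part shrinks as compute nodes are added; the communication term is added back unscaled, which is what distinguishes this model from the initial one.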
Modeling Global Reduction
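• Global reduction time (Tg) is also serialized
• Depending on the application, global reduction time either:
– Scales linearly with the number of nodes but is independent of data size, or
– Is independent of the number of nodes but scales linearly with data size
$T''_c = t_c - T_{ro} - T_g$
$T_{compute} = \frac{s'}{s} \times \frac{c}{c'} \times T''_c + \hat{T}_{ro} + \hat{T}_g$
Compare with the reduction-communication model: $T_{compute} = \frac{s'}{s} \times \frac{c}{c'} \times T'_c + \hat{T}_{ro}$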
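A minimal sketch of the full compute model with both serialized terms, covering the two global-reduction behaviours listed above. The names are hypothetical, and two points are assumptions: "number of nodes" is read as the number of compute nodes, and linear scaling is taken relative to the base profile.

```python
def predict_compute_full(t_c, t_ro, t_g, s, c, s_new, c_new, t_ro_hat, g_scales_with):
    """Predict compute time with reduction communication and global reduction.

    g_scales_with: "nodes" if global reduction time grows linearly with the
    number of compute nodes, "data" if it grows linearly with dataset size.
    """
    parallel_part = t_c - t_ro - t_g           # T''_c: the perfectly parallel part
    if g_scales_with == "nodes":
        t_g_hat = t_g * (c_new / c)
    elif g_scales_with == "data":
        t_g_hat = t_g * (s_new / s)
    else:
        raise ValueError("g_scales_with must be 'nodes' or 'data'")
    return (s_new / s) * (c / c_new) * parallel_part + t_ro_hat + t_g_hat


# Made-up profile: t_c = 100, of which 4 is communication and 6 is global reduction.
by_nodes = predict_compute_full(100.0, 4.0, 6.0, 1, 1, 2, 4, 5.0, "nodes")
by_data = predict_compute_full(100.0, 4.0, 6.0, 1, 1, 2, 4, 5.0, "data")
```

Which branch applies is a property of the application, so it would have to be recorded alongside the profile.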
Modeling Across Heterogeneous Clusters
Need scaling factors for all 3 stages of computation (from a set of representative applications).
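$s_d = \left( \frac{T^{B}_{disk,1}}{T^{A}_{disk,1}} + \frac{T^{B}_{disk,2}}{T^{A}_{disk,2}} + \frac{T^{B}_{disk,3}}{T^{A}_{disk,3}} \right) / 3$
$\hat{T}^{B}_{exec} = s_d \times \hat{T}^{A}_{disk} + s_n \times \hat{T}^{A}_{network} + s_c \times \hat{T}^{A}_{compute}$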
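The cross-cluster step can be sketched as follows. The sketch assumes that the network and compute factors (s_n, s_c) are computed the same way as the disk factor s_d, as an average of per-application time ratios between cluster B and cluster A. All names and timings are invented for illustration.

```python
def scaling_factor(times_a, times_b):
    """Average ratio T_B / T_A over the representative applications."""
    ratios = [t_b / t_a for t_a, t_b in zip(times_a, times_b)]
    return sum(ratios) / len(ratios)


def predict_on_cluster_b(predicted_a, factors):
    """Scale each predicted stage time for cluster A and sum the stages."""
    return sum(factors[stage] * predicted_a[stage]
               for stage in ("disk", "network", "compute"))


# Made-up stage timings for three representative applications on clusters A and B.
factors = {
    "disk": scaling_factor([10.0, 20.0, 40.0], [5.0, 10.0, 20.0]),
    "network": scaling_factor([8.0, 16.0, 4.0], [8.0, 16.0, 4.0]),
    "compute": scaling_factor([100.0, 200.0, 50.0], [25.0, 50.0, 12.5]),
}
predicted_a = {"disk": 30.0, "network": 12.0, "compute": 80.0}
predicted_b = predict_on_cluster_b(predicted_a, factors)
```

Keeping one factor per stage lets a cluster be faster at computation but no faster at disk or network transfer, which a single overall speed ratio could not express.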
FREERIDE-G Applications
Data mining:
• K-means clustering
• KNN search
• EM clustering
Scientific data processing:
• Vortex extraction
• Molecular defect detection and categorization
Experimental Setup
Base: 700 MHz Pentiums connected through Myrinet LANai 7.0
Heterogeneous prediction: 2.4 GHz Opteron 250s connected through InfiniBand (1 Gb)
Goal: to correctly model changes in:
1. Parallel configuration
2. Dataset size
3. Network bandwidth
4. Underlying resources
$Error = \frac{|T_{exact} - T_{predicted}|}{T_{exact}}$
Modeling Parallel Performance
Errors for 3 approaches for:
1. Vortex detection, base:
• 1-1 configuration
• 710 MB dataset
2. Defect detection, base:
• 1-1 configuration
• 130 MB dataset
Results:
• Modeling reduction pays off
• Accurate predictions
[Figure: Vortex Detection (base: 1-1 configuration, 710 MB dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: no communication, reduction communication, global reduction.]
[Figure: Molecular Defect Detection (base: 1-1 configuration, 130 MB dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: no communication, reduction communication, global reduction.]
Modeling Dataset Size
[Figure: EM clustering (base: 1-1 configuration/350 MB, predicted: 1.4 GB dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: global reduction.]
[Figure: Molecular Defect Detection (base: 1-1 configuration/130 MB dataset; predicting: 1.8 GB dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: global reduction.]
Errors for 1 (best) approach for:
1. EM clustering (1.4 GB), base:
• 1-1 configuration
• 350 MB dataset
2. Defect detection (1.8 GB), base:
• 1-1 configuration
• 130 MB dataset
Results:
• Biggest error when the number of data nodes is the same as the number of compute nodes
• Accurate predictions
Impact of Network Bandwidth
Errors for 1 (best) approach for:
1. EM clustering (250 Kbps), base:
• 1-1 configuration
• 500 Kbps
2. Defect detection (250 Kbps), base:
• 1-1 configuration
• 500 Kbps
Results:
• Biggest error when the number of data nodes is the same as the number of compute nodes
• Modeling reduction is most accurate
[Figure: EM clustering (base: 4-4 configuration/1.4 GB dataset; predicting: 130 MB dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: global reduction.]
[Figure: Molecular Defect Detection (base: 4-4 configuration/1.8 GB dataset; predicting: 350 dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: global reduction.]
Predictions for different type of cluster
Errors for 1 (best) approach for:
1. Defect detection (1.8 GB), base:
• 1-1 configuration
• 710 MB dataset
2. EM clustering (700 MB), base:
• 8-8 configuration
• 350 MB dataset
Results:
• Scaling factors differ
• Largest error when the predicted configuration has the same number of compute nodes as the base
[Figure: Molecular Defect Detection (base: 4-4 configuration, 130 MB dataset; prediction: 1.8 GB dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: global reduction.]
[Figure: EM clustering (base: 8-8 configuration, 350 MB dataset; prediction: 700 MB dataset). Relative prediction error (%) by number of data nodes (1, 2, 4, 8) and compute nodes (up to 16 cn); series: global reduction.]
Existing Work
3 broad categories for resource allocation:
• Heuristic approach to mapping
• Prediction through modeling:
– Statistical estimation/prediction
– Analytical modeling of parallel application
• Simulation-based performance prediction
Summary
• Performance prediction approach
• Exploits similarities in application processing structure to produce very accurate results
• Approach accurately models changes in:
– Computing configuration
– Dataset size
– Network bandwidth
– Underlying compute resources