Thilina Gunarathne ([email protected]) Advisor: Prof. Geoffrey Fox ([email protected])
![Page 1: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/1.jpg)
SCALABLE PARALLEL COMPUTING ON CLOUDS : EFFICIENT AND SCALABLE ARCHITECTURES TO PERFORM PLEASINGLY PARALLEL,
MAPREDUCE AND ITERATIVE DATA INTENSIVE COMPUTATIONS ON CLOUD ENVIRONMENTS
Thilina Gunarathne
![Page 2: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/2.jpg)
Figure 1: A sample MapReduce execution flow
![Page 3: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/3.jpg)
Figure 2: Steps of a typical MapReduce computation
[Figure labels: Map Task, Reduce Task. Steps in order: task scheduling, data read, map execution, collect, spill, merge, shuffle, merge, reduce execution, write output.]
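The flow named in Figures 1 and 2 can be illustrated with a minimal in-memory sketch. This is not the Hadoop API or the implementation evaluated in the document: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_word_count`) are hypothetical, and word count is assumed only as a familiar example. Scheduling, spilling to disk, and output writing are left out.

```python
from collections import defaultdict


def map_phase(records, map_fn):
    """Apply map_fn to every input record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs


def shuffle_phase(pairs):
    """Group intermediate values by key, standing in for the shuffle and merge steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reduce_fn):
    """Apply reduce_fn to each key and its grouped values, in sorted key order."""
    return {key: reduce_fn(key, values) for key, values in sorted(groups.items())}


def word_count_map(line):
    # Emit (word, 1) for every word in the input line.
    return [(word, 1) for word in line.split()]


def word_count_reduce(word, counts):
    # Sum all the counts emitted for one word.
    return sum(counts)


def run_word_count(lines):
    pairs = map_phase(lines, word_count_map)
    groups = shuffle_phase(pairs)
    return reduce_phase(groups, word_count_reduce)
```

In a real framework each phase runs as distributed tasks; here the three phases are plain function calls so the data movement between them is easy to follow.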
![Page 4: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/4.jpg)
Figure 3: Structure of a typical data-intensive iterative application
![Page 5: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/5.jpg)
Figure 4: Multi-Dimensional Scaling SMACOF application architecture using iterative MapReduce
[Figure labels: BC: Calculate BX (Map, Reduce, Merge); X: Calculate invV(BX) (Map, Reduce, Merge); Calculate Stress (Map, Reduce, Merge); New Iteration; Optional Step.]
![Page 6: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/6.jpg)
Figure 5: Bio sequence analysis pipeline [14]
![Page 7: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/7.jpg)
Figure 6: Classic cloud processing architecture for pleasingly parallel computations
![Page 8: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/8.jpg)
Figure 7: Hadoop MapReduce based processing model for pleasingly parallel computations
![Page 9: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/9.jpg)
Figure 8: Cap3 application execution cost with different EC2 instance types
![Page 10: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/10.jpg)
Figure 9: Cap3 application compute time with different EC2 instance types
[Plot: Cap3 Compute Time. Y-axis: Compute Time (s), 0 to 2000.]
![Page 11: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/11.jpg)
Figure 10: Parallel efficiency of Cap3 application using the pleasingly parallel frameworks
![Page 12: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/12.jpg)
Figure 11: Cap3 execution time for single file per core using the pleasingly parallel frameworks
![Page 13: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/13.jpg)
Figure 12: Cost to process 64 BLAST query files on different EC2 instance types
![Page 14: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/14.jpg)
Figure 13: Time to process 64 BLAST query files on different EC2 instance types
[Plot: BLAST Compute Time. Y-axis: Compute Time (s), 0 to 2500.]
![Page 15: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/15.jpg)
Figure 14: Time to process 8 query files using BLAST application on different Azure instance types
![Page 16: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/16.jpg)
Figure 15: BLAST parallel efficiency using the pleasingly parallel frameworks
![Page 17: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/17.jpg)
Figure 16: BLAST average time to process a single query file using the pleasingly parallel frameworks
![Page 18: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/18.jpg)
Figure 17: Cost of using GTM interpolation application with different EC2 instance types
![Page 19: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/19.jpg)
Figure 18: GTM Interpolation compute time with different EC2 instance types
[Plot: GTM Compute Time. Y-axis: Compute Time (s), 0 to 600.]
![Page 20: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/20.jpg)
Figure 19: GTM Interpolation parallel efficiency using the pleasingly parallel frameworks
![Page 21: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/21.jpg)
Figure 20: GTM Interpolation performance per core using the pleasingly parallel frameworks
![Page 22: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/22.jpg)
Figure 21: MapReduceRoles4Azure: Architecture for implementing MapReduce frameworks on cloud environments using cloud infrastructure services
![Page 23: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/23.jpg)
Figure 22: Task decomposition mechanism of SWG pairwise distance calculation MapReduce application
![Page 24: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/24.jpg)
Figure 23: SWG MapReduce pure performance
![Page 25: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/25.jpg)
Figure 24: SWG MapReduce relative parallel efficiency
![Page 26: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/26.jpg)
Figure 25: SWG MapReduce normalized performance
![Page 27: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/27.jpg)
Figure 26: SWG MapReduce amortized cost for clouds
![Page 28: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/28.jpg)
Figure 27: Cap3 MapReduce scaling performance
![Page 29: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/29.jpg)
Figure 28: Cap3 MapReduce parallel efficiency
![Page 30: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/30.jpg)
Figure 29: Cap3 MapReduce computational cost in cloud infrastructures
![Page 31: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/31.jpg)
Figure 30: Twister4Azure iterative MapReduce programming model
[Figure labels: Job Start; Broadcast; Data Cache; Map and Combine (three parallel tasks); Reduce (two tasks); Merge; Add Iteration? (Yes: hybrid scheduling of the new iteration; No: Job Finish).]
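The stages named in Figure 30 (Map, Combine, Reduce, Merge, and the decision to add an iteration) can be sketched as a loop. This is a single-process illustration under stated assumptions, not Twister4Azure code: one-dimensional K-means is assumed as the example application, all function names are hypothetical, the list `cached_partitions` stands in for the loop-invariant data cache, and passing `centroids` to every map call stands in for the broadcast.

```python
def kmeans_map(partition, centroids):
    """Map + Combine: partial (sum, count) per nearest centroid for one cached partition."""
    partial = {}
    for x in partition:
        idx = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        s, n = partial.get(idx, (0.0, 0))
        partial[idx] = (s + x, n + 1)
    return partial


def kmeans_reduce(partials):
    """Reduce: add up the partial sums and counts for each centroid index."""
    totals = {}
    for partial in partials:
        for idx, (s, n) in partial.items():
            ts, tn = totals.get(idx, (0.0, 0))
            totals[idx] = (ts + s, tn + n)
    return totals


def kmeans_merge(totals, old_centroids, tol):
    """Merge: build the new centroids and decide whether to add another iteration."""
    new = [totals[i][0] / totals[i][1] if i in totals else c
           for i, c in enumerate(old_centroids)]
    change = max(abs(a - b) for a, b in zip(new, old_centroids))
    return new, change > tol


def iterative_kmeans(cached_partitions, centroids, tol=1e-9, max_iterations=100):
    for _ in range(max_iterations):
        # The current centroids are the data broadcast to every map task.
        partials = [kmeans_map(p, centroids) for p in cached_partitions]
        totals = kmeans_reduce(partials)
        centroids, add_iteration = kmeans_merge(totals, centroids, tol)
        if not add_iteration:
            break
    return centroids
```

The input partitions are read once and reused across iterations; only the small centroid list changes between iterations, which is what makes caching and broadcast worthwhile in this pattern.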
![Page 32: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/32.jpg)
Figure 31: Cache Aware Hybrid Scheduling
![Page 33: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/33.jpg)
Figure 32: Twister4Azure tree based broadcast over TCP with Azure Blob storage as the persistent backup.
[Figure labels: Blob Storage; Workers; nodes N1, N2, N3, N4, N5, N6, N10.]
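The idea in the Figure 32 caption, a tree-shaped broadcast with a persistent copy as backup, can be sketched as a small simulation. Everything here is an assumption made for illustration: the fixed fan-out tree layout, the `failed_links` parameter, and the dictionary standing in for Blob storage are hypothetical and do not reflect the actual Twister4Azure topology or API.

```python
def children(node, fan_out, num_nodes):
    """Children of `node` in a complete tree with the given fan-out (root is node 0)."""
    first = node * fan_out + 1
    return [c for c in range(first, first + fan_out) if c < num_nodes]


def tree_broadcast(data, num_nodes, fan_out, failed_links=frozenset()):
    """Simulate a level-by-level broadcast; nodes with a failed link read the backup copy."""
    blob_storage = {"broadcast": data}  # persistent copy, written before the broadcast
    received = {0: data}                # the root already holds the data
    source = {0: "root"}
    frontier = [0]
    while frontier:
        next_frontier = []
        for parent in frontier:
            for child in children(parent, fan_out, num_nodes):
                if (parent, child) in failed_links:
                    # Transfer from the parent failed: fall back to the persistent copy.
                    received[child] = blob_storage["broadcast"]
                    source[child] = "blob"
                else:
                    received[child] = received[parent]
                    source[child] = "tcp"
                next_frontier.append(child)
        frontier = next_frontier
    return received, source
```

With a fan-out of k, the number of forwarding levels grows roughly with the logarithm (base k) of the node count, while the backup copy lets a node obtain the data even when its parent cannot deliver it.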
![Page 34: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/34.jpg)
Figure 33: MDS weak scaling. Workload per core is constant. Ideal is a straight horizontal line.
![Page 35: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/35.jpg)
Figure 34: MDS Data size scaling using 128 Azure small instances/cores, 20 iterations
![Page 36: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/36.jpg)
Figure 35: Twister4Azure Map Task histogram for MDS of 204800 data points on 32 Azure Large Instances (only 10 of the 20 iterations are graphed).
Two adjoining bars represent an iteration (2048 tasks per iteration), where each bar represents a different application inside the iteration.
![Page 37: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/37.jpg)
Figure 36: Number of executing Map Tasks in the cluster at a given moment. Two adjoining bars represent an iteration.
![Page 38: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/38.jpg)
Figure 37: KMeans Clustering Scalability. Relative parallel efficiency of strong scaling using 128 million data points.
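The captions do not define relative parallel efficiency, so the following is a sketch under a common assumption: efficiency is measured against the smallest configuration that was run (base_cores, base_time) rather than against a single-core run. The function names are hypothetical. The second function expresses the weak scaling reading used in the captions, where a flat time curve is the ideal.

```python
def relative_parallel_efficiency(base_cores, base_time, cores, time):
    """Strong scaling: (p0 * T(p0)) / (p * T(p)); 1.0 means ideal scaling from the baseline."""
    return (base_cores * base_time) / (cores * time)


def weak_scaling_efficiency(base_time, time):
    """Weak scaling with constant workload per core: T(p0) / T(p); 1.0 is the flat ideal line."""
    return base_time / time
```

For example, if 32 cores take 100 s and 64 cores take 62.5 s on the same data, the relative parallel efficiency under this definition is 3200 / 4000 = 0.8.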
![Page 39: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/39.jpg)
Figure 38: KMeans Clustering Scalability. Weak scaling. Workload per core is kept constant (ideal is a straight horizontal line).
![Page 40: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/40.jpg)
Figure 39: Twister4Azure Map Task execution time histogram for KMeans Clustering of 128 million data points on 128 Azure small instances.
![Page 41: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/41.jpg)
Figure 40: Twister4Azure number of executing Map Tasks in the cluster at a given moment
![Page 42: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/42.jpg)
Figure 41: Performance of SW-G for randomly distributed inhomogeneous data with ‘400’ mean sequence length.
![Page 43: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/43.jpg)
Figure 42: Performance of SW-G for skewed distributed inhomogeneous data with ‘400’ mean sequence length.
![Page 44: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/44.jpg)
Figure 43: Performance of Cap3 for randomly distributed inhomogeneous data.
![Page 45: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/45.jpg)
Figure 44: Performance of Cap3 for skewed distributed inhomogeneous data
![Page 46: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/46.jpg)
Figure 45: Virtualization overhead of Hadoop SW-G on Xen virtual machines
![Page 47: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/47.jpg)
Figure 46: Virtualization overhead of Hadoop Cap3 on Xen virtual machines
![Page 48: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/48.jpg)
Figure 47: Sustained performance of cloud environments for MapReduce type of applications
![Page 49: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/49.jpg)
Figure 48: Execution traces of Twister4Azure MDS using in-memory caching on small instances.
(The taller bars represent the MDSBCCalc computation, while the shorter bars represent the MDSStressCalc computation; together they represent an iteration.)
![Page 50: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/50.jpg)
Figure 49: Execution traces of Twister4Azure MDS using Memory-Mapped file based caching on Large instances.
![Page 51: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/51.jpg)
Figure 50: MapReduce-MergeBroadcast computation flow
[Figure labels, in order: Map, Combine, Shuffle, Sort, Reduce, Merge, Broadcast.]
![Page 52: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/52.jpg)
Figure 51: Map-Collective primitives
![Page 53: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/53.jpg)
Figure 52: Map-AllGather Collective
![Page 54: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/54.jpg)
Figure 53: Map-AllReduce collective
![Page 55: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/55.jpg)
Figure 54: Example Map-AllReduce with Sum operation
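A Map-AllReduce with a Sum operation, as named in the caption of Figure 54, can be sketched in a few lines. This is a single-process illustration and not the H-Collectives or Twister4Azure API: the function name `map_allreduce_sum` and its parameters are hypothetical, and each map task is assumed to emit a fixed-length list of numbers.

```python
def map_allreduce_sum(partitions, map_fn):
    """Run map_fn on every partition, sum the outputs element-wise,
    and hand an identical copy of the summed result to every worker."""
    map_outputs = [map_fn(partition) for partition in partitions]
    # Element-wise sum across all map outputs (the "reduce" part).
    total = [sum(column) for column in zip(*map_outputs)]
    # Every worker ends up with the same combined value (the "all" part).
    return [list(total) for _ in partitions]
```

As a usage example, with `map_fn = lambda p: [sum(p), len(p)]` and partitions `[[1, 2], [3], [4, 5, 6]]`, the map outputs are `[3, 2]`, `[3, 1]` and `[15, 3]`, so each of the three workers receives `[21, 6]`. The point of the collective is that no separate reduce task or broadcast step is needed to get the combined value back to the workers.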
![Page 56: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/56.jpg)
Figure 55: MDS Hadoop using only the BC Calculation MapReduce job per iteration to highlight the overhead. 20 iterations, 51200 data points.
![Page 57: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/57.jpg)
Figure 56: MDS application implemented using Twister4Azure. 20 iterations. 51200 data points (~5GB).
![Page 58: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/58.jpg)
Figure 57: Hadoop MapReduce MDS-BCCalc histogram
![Page 59: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/59.jpg)
Figure 58: H-Collectives AllGather MDS-BCCalc histogram
![Page 60: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/60.jpg)
Figure 59: H-Collectives AllGather MDS-BCCalc histogram without speculative scheduling
![Page 61: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/61.jpg)
Figure 60: Hadoop K-means Clustering comparison with H-Collectives Map-AllReduce. Weak scaling. 500 Centroids (clusters). 20 Dimensions. 10 iterations.
![Page 62: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/62.jpg)
Figure 61: Hadoop K-means Clustering comparison with H-Collectives Map-AllReduce. Strong scaling. 500 Centroids (clusters). 20 Dimensions. 10 iterations.
![Page 63: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/63.jpg)
Figure 62: Twister4Azure K-means weak scaling with Map-AllReduce. 500 Centroids, 20 Dimensions, 10 iterations.
![Page 64: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/64.jpg)
Figure 63: Twister4Azure K-means Clustering strong scaling. 500 Centroids, 20 Dimensions, 10 iterations.
![Page 65: Thilina Gunarathne](https://reader036.fdocuments.in/reader036/viewer/2022062520/56816147550346895dd0c2b4/html5/thumbnails/65.jpg)
Figure 64: HDInsight KMeans Clustering compared with Twister4Azure and Hadoop
[Plot: x-axis: Num. Cores X Num. Data Points (32 x 32 M, 64 x 64 M, 128 x 128 M, 256 x 256 M); y-axis: Time (s), 0 to 1400; series: Hadoop AllReduce, Hadoop MapReduce, Twister4Azure AllReduce, Twister4Azure Broadcast, Twister4Azure, HDInsight (AzureHadoop).]