Parallelisation of Industrial Software for Hard Real-time ...
A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs
-
Upload
anders-l-madsen -
Category
Science
-
view
76 -
download
1
Transcript of A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs
![Page 1: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/1.jpg)
A New Method for Vertical Parallelisation of TAN Learning Based on
Balanced Incomplete Block Designs
Anders L. Madsen12 Frank Jensen1 Antonio Salmerón3 MartinKarlsen1 Helge Langseth4 Thomas D. Nielsen2
HUGIN EXPERT A/S, Aalborg, Denmark
Department of Computer Science, Aalborg University, Denmark
Department of Mathematics, University of Almería , Spain
Department of Computer and Information Science, Norwegian University of Science andTechnology, Norway
PGM, Sept 17-19, 2014 1
![Page 2: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/2.jpg)
Outline
1 Introduction
2 Learning TAN Using BIB Designs
3 Experimental Analysis
4 Conclusion & Future Work
PGM, Sept 17-19, 2014 2
![Page 3: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/3.jpg)
Introduction
A Bayesian network (BN) is an excellent knowledge representation,but learning a BN from data can be a challenge
I Data sets are increasing in size w.r.t. both variables and casesI The computational power of computers is increasing
Research ProblemLearning the structure of a Tree-Augmented Naive Bayes modelfrom massive amounts of data in parallel
This work is performed as part of the AMIDST project (no. 619209)
PGM, Sept 17-19, 2014 3
![Page 4: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/4.jpg)
Learning a TAN model
Let F be a set of features and C be the class variable
C
F1 F2 · · · Fn
The structure of a TAN can be constructed as ([11] and [3]):1. Compute mutual information I (Fi ,Fj |C ) for each pair, i 6= j .2. Build a complete graph G over F with edges annotated by
I (Fi ,Fj |C ).3. Build a maximal spanning tree T from G .4. Select a vertex and recursively direct edges outward from it.5. Add C as parent of each F ∈ F .
PGM, Sept 17-19, 2014 4
![Page 5: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/5.jpg)
Parallel TAN learning
I Step 1 (scoring all pairs w.r.t. MI) is the most time-consumingelement
I Both horizontal and vertical parallelization of learning ispossible
I Data should be distributed, we assume each variable has itsown file
ChallengeGiven a set of features F we need to distribute MI-computationssuch that each pair is scored exactly once
We use Balanced Incomplete Block (BIB) Designs to achievevertical parallelisation
PGM, Sept 17-19, 2014 5
![Page 6: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/6.jpg)
Balanced Incomplete Block (BIB) Designs[23]
Definition (Design)A design is a pair (X ,A) s. t. the following properties are satisfied:1. X is a set of elements called points, and2. A is a collection of nonempty subsets of X called blocks.
Definition (BIB design)Let v , k and λ be positive integers s. t. v > k ≥ 2. A (v , k , λ)-BIBdesign is a design (X ,A) s. t. the following properties are satisfied:1. |X | = v,2. each block contains exactly k points, and3. every pair of distinct points is contained in exactly λ blocks.
If v = b where b = |A|, then the design is symmetricPGM, Sept 17-19, 2014 6
![Page 7: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/7.jpg)
BIB Design Intuition
Let F be the features with |F| = 7 and let (v = 7, k = 3, λ = 1)be a symmetric BIB design, i.e., v = b, with blocks
{013}, {124}, {235}, {346}, {450}, {561}, {602}.
We observeI λ = 1 - we compute mutual information exactly onceI a symmetric BIB design, i.e., v = b
I k = 3 is the number of (subset) of features in each blockWe take the following approach
I create a process for each block to compute mutual informationI a point p represents Fp when |F| > v .I each process generates its block based on its rank
PGM, Sept 17-19, 2014 7
![Page 8: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/8.jpg)
Difference Sets for Symmetric BIB designs
BIB design Difference set k/v
(3,2,1) (0,1) 0.67(7,3,1) (0,1,3) 0.43(13,4,1) (0,1,3,9) 0.31(21,5,1) (0,1,4,14,16) 0.24(31,6,1) (0,1,3,8,12,18) 0.19(57,8,1) (0,1,3,13,32,36,43,52) 0.14(73,9,1) (0,1,3,7,15,31,36,54,63) 0.12(91,10,1) (0,1,3,9,27,49,56,61,77,81) 0.11(133,12,1) (0,9,10,12,26,30,67,74,82,109,114,120) 0.09(183,14,1) (0,12,19,20,22,43,60,71,76,85,89,115,121,168) 0.08
Difference sets for (v , k, λ)-BIB designs where v is the number of points and k is the
number of points in a block
PGM, Sept 17-19, 2014 8
![Page 9: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/9.jpg)
BIB Design and TAN Learning
All processes should compute the same number of I (Fi ,Fj |C )scores
I There is(n2
)= n(n − 1)/2 pairs where n = |F|
I We assume each variable is in one fileIf we have p processes and each process reads data from kvariables, then the following must be satisfied
p
(k
2
)≥
(n
2
).
Solving for k produces k ≥ n/√
p, and equality holds only whenp = 1.
PGM, Sept 17-19, 2014 9
![Page 10: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/10.jpg)
BIB Design Example
Consider the (7, 3, 1)-BIB design and assume |F| = 14I Each point represents two featuresI Each process is assigned six features
The seven blocks (b = 7) are:
{013}, {124}, {235}, {346}, {450}, {561}, {602}
The pairwise scoring is performed as
p = 1
0 1 3
X1 X2 X3 X4 X7 X8
x1 x2 x3 x4 x7 x8
· · ·
p = 7
6 0 2
X13 X14 X1 X2 X5 X6
x13 x14 x1 x2 x5 x6
Intra
Inter
PGM, Sept 17-19, 2014 10
![Page 11: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/11.jpg)
Experimental Setup
I Software implementation based on HUGIN software and usingMessage Passing Interface (MPI)
I Client-server architecture; a master and worker processesI Average run time (wall-clock) is computed over ten runs of the
algorithm on each dataset with specific number of processesI Clock starts before reading data and ends after all results are
received by master process
data set |X | N
Munin1 189 750,000Munin2 1,003 750,000Bank 1,823 1,140,000
PGM, Sept 17-19, 2014 11
![Page 12: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/12.jpg)
Three Different Test Computers
1. Ubuntu: a linux server running - 1 Intel Xeon(TM) E3-1270Processor (4 cores) and 32 GB RAM
2. Fyrkat: cluster with 80 nodes - each has 2 Intel Xeon (TM)X5260 Processors (2 cores) and 16GB RAM
3. Vilje: cluster with 1404 nodes - each has 2 Intel Xeon E5-2670Processors (8 cores) and 32GB
Test computers have different resource management systemsFor 2. and 3. we allocate one worker node for each process
PGM, Sept 17-19, 2014 12
![Page 13: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/13.jpg)
Ubuntu - Linux Server One Node
0
5
10
15
20
25
30
35
40
45
50
0 5 10 15 20 25
Av
erag
e ru
n t
ime
in s
eco
nd
s
Number of processors
Munin1 on Ubuntu
0
100
200
300
400
500
600
700
800
900
1000
0 5 10 15 20 25
Av
erag
e ru
n t
ime
in s
eco
nd
s
Number of processors
Munin2 on Ubuntu
Ubuntu: A linux server running Ubuntu - Intel Xeon(TM) E3-1270 Processor (4 cores) and 32 GB RAM
PGM, Sept 17-19, 2014 13
![Page 14: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/14.jpg)
Fyrkat - Cluster 80 Nodes
0
200
400
600
800
1000
1200
1400
0 5 10 15 20 25 30 35
Av
erag
e ru
n t
ime
in s
eco
nd
s
Number of processors
Munin2 on Fyrkat
0
1000
2000
3000
4000
5000
6000
7000
0 5 10 15 20 25 30 35
Av
erag
e ru
n t
ime
in s
eco
nd
s
Number of processors
Bank on Fyrkat
Fyrkat: 80 nodes - each has 2 Intel Xeon (TM) X5260 Processors (2 cores) and 16GB RAM
PGM, Sept 17-19, 2014 14
![Page 15: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/15.jpg)
Vilje - Cluster 1404 Nodes
0
200
400
600
800
1000
1200
1400
1600
1800
0 10 20 30 40 50 60 70 80 90 100
Av
erag
e ru
n t
ime
in s
eco
nd
s
Number of processors
Munin2 on Vilje
0
1000
2000
3000
4000
5000
6000
7000
0 20 40 60 80 100 120 140 160 180 200
Av
erag
e ru
n t
ime
in s
eco
nd
s
Number of processors
Bank on Vilje
Vilje: 1404 nodes - each has 2 Intel Xeon E5-2670 Processors (8 cores) and 32GB
PGM, Sept 17-19, 2014 15
![Page 16: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/16.jpg)
Speedup Factors - Vilje & Bank
Relative performance compared to case of using one processor
0
5
10
15
20
25
0 20 40 60 80 100 120 140 160 180 200
Sp
eed
up
fac
tor
Number of processors
Time
Files
Optimal
Time speed-up factors for reading data
Files theoretical number of files read byeach process (k/v · |X |)
Optimal square root of the number ofprocessors
0
20
40
60
80
100
120
140
160
180
0 20 40 60 80 100 120 140 160 180 200
Sp
eed
up
fac
tor
Number of processors
Score
Score speed-up factor for computingmutual information
PGM, Sept 17-19, 2014 16
![Page 17: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/17.jpg)
Conclusion & Future Work
ConclusionI Introduced a new algorithm for learning TAN structure using
parallel computingI Empirical evaluation shows that significant performance
improvements are possible
Future WorkI Apply principles to structure learning in general <> TANI Multithread implementation <> MultiprocessI Horizontal parallelization <> Vertical parallelization
PGM, Sept 17-19, 2014 17
![Page 18: A New Method for Vertical Parallelisation of TAN Learning Based on Balanced Incomplete Block Designs](https://reader033.fdocuments.in/reader033/viewer/2022052509/55b70ad8bb61ebf75c8b47ee/html5/thumbnails/18.jpg)
This project has received funding from the European Union’sSeventh Framework Programme for research, technologicaldevelopment and demonstration under grant agreement no 619209
PGM, Sept 17-19, 2014 18