Ensemble model and mpp
![Page 1: Ensemble model and mpp](https://reader034.fdocuments.in/reader034/viewer/2022052600/557c1ed4d8b42af2418b5293/html5/thumbnails/1.jpg)
Presenter: Juan-José van der Linden
Date: June 5, 2014
Note: DV, MPP
Company: QOSQO
Twitter: @delostilos
![Page 2: Ensemble model and mpp](https://reader034.fdocuments.in/reader034/viewer/2022052600/557c1ed4d8b42af2418b5293/html5/thumbnails/2.jpg)
SMP => MPP => AMPP
- SMP: Symmetric Multiprocessing
- MPP: Massively Parallel Processing
- AMPP: Asymmetric MPP (SMP + MPP)
![Page 3: Ensemble model and mpp](https://reader034.fdocuments.in/reader034/viewer/2022052600/557c1ed4d8b42af2418b5293/html5/thumbnails/3.jpg)
Primary key => distribution key
hub -< satellite join
- data redistribution
- join local in parallel

Hub:

| BK | SID |
|---|---|
| Ensemble | 1 |
| Dimensional | 2 |

Satellite:

| SID | LDTS | INFO |
|---|---|---|
| 1 | 2001-01-01 | My first DV |
| 1 | 2014-06-05 | DV Masters |
| 2 | 1997-08-02 | DM manifesto |

Nodes shown: Node 1, Node 2
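The redistribution step on this slide can be illustrated with a small simulation. This is only a sketch under assumptions not stated on the slide: the hub's primary key is taken to be SID, the satellite's primary key is taken to be (SID, LDTS), and `node_for` is a hypothetical hash-based placement rule, not the behaviour of any particular MPP product.

```python
import zlib

N_NODES = 2  # the slide shows Node 1 and Node 2


def node_for(key, n_nodes=N_NODES):
    """Hypothetical placement rule: deterministic hash of the key, modulo node count."""
    return zlib.crc32(str(key).encode("utf-8")) % n_nodes


# Rows from the slide.
hub = [("Ensemble", 1), ("Dimensional", 2)]  # (BK, SID)
satellite = [  # (SID, LDTS, INFO)
    (1, "2001-01-01", "My first DV"),
    (1, "2014-06-05", "DV Masters"),
    (2, "1997-08-02", "DM manifesto"),
]

# Assumption: each table is distributed on its own primary key.
hub_node = {sid: node_for(sid) for _bk, sid in hub}
sat_node = {(sid, ldts): node_for((sid, ldts)) for sid, ldts, _info in satellite}

# A satellite row has to be shipped whenever it sits on a different node
# than the hub row it joins to.
to_redistribute = [
    (sid, ldts) for (sid, ldts), node in sat_node.items() if node != hub_node[sid]
]
local = [key for key in sat_node if key not in to_redistribute]

print("rows to redistribute before the join:", to_redistribute)
print("rows already local to their hub row:", local)
```

Because the satellite key includes LDTS, rows with the same SID are hashed independently of the hub row, so nothing guarantees they land on the same node.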
![Page 4: Ensemble model and mpp](https://reader034.fdocuments.in/reader034/viewer/2022052600/557c1ed4d8b42af2418b5293/html5/thumbnails/4.jpg)
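Hub SID => distribution key
hub -< satellite join
- join local in parallel

Hub:

| BK | SID |
|---|---|
| Ensemble | 1 |
| Dimensional | 2 |

Satellite:

| SID | LDTS | INFO |
|---|---|---|
| 1 | 2001-01-01 | First DV |
| 1 | 2014-06-05 | DV Masters |
| 2 | 1997-08-02 | DM manifesto |

Nodes shown: Node 1, Node 2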
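A minimal sketch of why distributing both tables on the hub SID keeps the join local. The `node_for` function is a hypothetical hash-based placement rule used only for illustration; real MPP databases use their own hashing.

```python
import zlib

N_NODES = 2  # the slide shows Node 1 and Node 2


def node_for(key, n_nodes=N_NODES):
    """Hypothetical placement rule: deterministic hash of the key, modulo node count."""
    return zlib.crc32(str(key).encode("utf-8")) % n_nodes


# Rows from the slide.
hub = [("Ensemble", 1), ("Dimensional", 2)]  # (BK, SID)
satellite = [  # (SID, LDTS, INFO)
    (1, "2001-01-01", "First DV"),
    (1, "2014-06-05", "DV Masters"),
    (2, "1997-08-02", "DM manifesto"),
]

# Both tables use the hub SID as distribution key.
hub_node = {sid: node_for(sid) for _bk, sid in hub}

# Satellite rows that do NOT sit on the same node as their hub row.
misplaced = [row for row in satellite if node_for(row[0]) != hub_node[row[0]]]

print(misplaced)  # [] because equal SIDs always hash to the same node
```

Since the same function is applied to the same SID value on both sides, every node can join its own slice of the hub and the satellite without moving rows.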
![Page 5: Ensemble model and mpp](https://reader034.fdocuments.in/reader034/viewer/2022052600/557c1ed4d8b42af2418b5293/html5/thumbnails/5.jpg)
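Link SID => distribution key
Default L_SID, 1:N & N:M
- data redistribution
- join local in parallel

Link:

| H_MID | H_SID | L_SID |
|---|---|---|
| 1 | A | 1 |
| 1 | B | 2 |

Link satellite:

| L_SID | LDTS | LDTS_END | CURRENT |
|---|---|---|---|
| 1 | 2001-01-01 | 2006-01-01 | N |
| 1 | 2014-06-05 | 9999-12-31 | Y |
| 2 | 2006-01-01 | 2014-06-05 | N |

1:N => H_MID on link satellite
- join local in parallel

Link:

| H_MID | H_SID | L_SID |
|---|---|---|
| 1 | A | 1 |
| 1 | B | 2 |

Link satellite:

| L_SID | H_MID | H_SID | LDTS | LDTS_END |
|---|---|---|---|---|
| 1 | 1 | A | 2001-01-01 | 2006-01-01 |
| 1 | 1 | B | 2014-06-05 | 9999-12-31 |
| 2 | 1 | A | 2006-01-01 | 2014-06-05 |

H_MID is the ensemble identifier!

Nodes shown: Node 1, Node 2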
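The 1:N case can be sketched the same way. The rows are taken from the slide; `node_for` and `nodes_used` are hypothetical helpers, and the comparison assumes that the hub on the "1" side is itself distributed on H_MID.

```python
import zlib

N_NODES = 2  # the slide shows Node 1 and Node 2


def node_for(key, n_nodes=N_NODES):
    """Hypothetical placement rule: deterministic hash of the key, modulo node count."""
    return zlib.crc32(str(key).encode("utf-8")) % n_nodes


# Rows from the slide.
link = [(1, "A", 1), (1, "B", 2)]  # (H_MID, H_SID, L_SID)
link_satellite = [  # (L_SID, H_MID, H_SID, LDTS, LDTS_END)
    (1, 1, "A", "2001-01-01", "2006-01-01"),
    (1, 1, "B", "2014-06-05", "9999-12-31"),
    (2, 1, "A", "2006-01-01", "2014-06-05"),
]


def nodes_used(rows, key_index):
    """Set of nodes holding the rows when distributed on the given column."""
    return {node_for(row[key_index]) for row in rows}


# Default: distribute link and link satellite on L_SID.
by_l_sid = nodes_used(link, 2) | nodes_used(link_satellite, 0)

# 1:N variant: carry H_MID on the link satellite and distribute both on H_MID.
by_h_mid = nodes_used(link, 0) | nodes_used(link_satellite, 1)

# Assumption: the hub row for H_MID = 1 is distributed on H_MID as well.
hub_m_node = node_for(1)

print("nodes touched with L_SID as key:", sorted(by_l_sid))
print("nodes touched with H_MID as key:", sorted(by_h_mid))
print("all ensemble rows next to the hub row:", by_h_mid == {hub_m_node})
```

With H_MID as the shared distribution key, the hub row, its link rows, and the link satellite rows hash to one node, so the whole ensemble joins locally. With L_SID as the key, the link and its satellite are co-located with each other, but not necessarily with the hub.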
![Page 6: Ensemble model and mpp](https://reader034.fdocuments.in/reader034/viewer/2022052600/557c1ed4d8b42af2418b5293/html5/thumbnails/6.jpg)
Use the ensemble identifier if possible!
H_SID H_SID LDTS INFO
L_SID? H_SID H_MID H_SID? L_SID? LDTS INFO
Distribute data efficiently to ensure good performance in an MPP database.
- If the distribution is uneven, one node may become a bottleneck for the whole execution
Try to minimize data movement between nodes.
- Data redistribution may occur when joining tables
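One way to make "uneven distribution" concrete is a skew ratio: the row count of the fullest node divided by the average row count per node. The sketch below is illustrative only; `node_for` and `skew` are hypothetical helpers, not functions of any MPP product.

```python
import zlib
from collections import Counter


def node_for(key, n_nodes):
    """Hypothetical placement rule: deterministic hash of the key, modulo node count."""
    return zlib.crc32(str(key).encode("utf-8")) % n_nodes


def skew(keys, n_nodes):
    """Rows on the fullest node divided by the average rows per node.

    1.0 means perfectly even; n_nodes means every row sits on a single node.
    Expects at least one key.
    """
    keys = list(keys)
    counts = Counter(node_for(k, n_nodes) for k in keys)
    return max(counts.values()) / (len(keys) / n_nodes)


# A distribution key with a single value puts every row on one node.
print(skew(["same"] * 100, 4))  # 4.0

# A key with many distinct values spreads rows out; the exact ratio depends on the hash.
print(skew(range(100), 4))
```

A low-cardinality distribution key, or one dominated by a few values, pushes the ratio towards the node count, and the fullest node then sets the pace for the whole query.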
Ensemble