Efficient Local Resorting Techniques with Space Filling Curves
Applied to a Parallel Tsunami Simulation Model
Natalja Rakowsky and Annika Fuchs
AWI, Tsunami-Modelling-Group
The 10th International Workshop on Multiscale (Un-)structured Mesh Numerical Modelling for coastal, shelf and global ocean dynamics
Alfred Wegener Institute for Polar and Marine Research
Bremerhaven, 22 - 25 August 2011
N. Rakowsky, A. Fuchs SFC in TsunAWI IMUM 2011, Bremerhaven 1 / 35
Outline
introducing TsunAWI
motivation for resorting
construction of Hilbert space filling curve (SFC) ordering
comparison to other sortings
conclusions
The AWI Tsunami Model TsunAWI
TsunAWI in a nutshell
shallow water equations with inundation
unstructured $P_1 - P_1^{NC}$ finite element grid
explicit time stepping scheme
OpenMP parallel Fortran90 code
Most important application:
German-Indonesian Tsunami Early Warning System
3470 scenarios for different prototypic ruptures
3 h model time (10,800 timesteps of 1 s)
TsunAWI: example for a computational domain
regional grid for the Sunda Arc
The computational grid discretizes the domain with varying resolution:
50 m in areas of interest
500 m in all other coastal areas
15 km in the deep ocean
2,366,319 nodes
4,721,884 elements
TsunAWI: example for a computational domain
regional grid for the Sunda Arc, focus on Bali
TsunAWI: example for a computational domain
Original numbering of nodes as provided by the grid generator
adjacency matrix, original grid
Motivation for resorting
Data locality on the original grid is very poor.
E.g., each computation on all nodes of one element results in at least one cache miss.
Most time-consuming routines in every timestep:
compute velocity at nodes: v(node) = F(adjacent edges, elems)
compute velocity at edges: v(edge) = F(adjacent elems, nodes)
compute ssh (sea surface height): ssh(node) = F(adjacent elems, nodes)
compute gradient: $\mathrm{grad}_{x,y}(\mathrm{elem})$ = F(adjacent nodes)
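To make the locality problem concrete, the following Python sketch measures how far apart the node indices of a single element lie. It is only an illustration under stated assumptions: the structured 64 x 64 lattice, the random permutation standing in for an unfavourable numbering, and the helper names `triangulate` and `mean_index_span` are all hypothetical and are not TsunAWI code or data.

```python
import random


def triangulate(nx, ny):
    """Split an nx-by-ny lattice of nodes (row-major ids) into triangles."""
    elems = []
    for j in range(ny - 1):
        for i in range(nx - 1):
            a = j * nx + i          # lower-left node of the cell
            b, c, d = a + 1, a + nx, a + nx + 1
            elems.append((a, b, d))
            elems.append((a, d, c))
    return elems


def mean_index_span(elems):
    """Average distance between the largest and smallest node id of an element."""
    return sum(max(e) - min(e) for e in elems) / len(elems)


nx = ny = 64
elems = triangulate(nx, ny)

# Assumption: a random renumbering mimics a numbering without any locality.
perm = list(range(nx * ny))
random.Random(0).shuffle(perm)
scrambled = [tuple(perm[n] for n in e) for e in elems]

print("row-major numbering:", mean_index_span(elems))
print("scrambled numbering:", mean_index_span(scrambled))
```

A large span means that the nodes of one element are stored far apart, so a loop over elements touches the node arrays in scattered places.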
Ideas for resorting
An SFC like the Sierpinski curve in an adaptive grid (J. Behrens et al., KlimaCampus Uni Hamburg) could help.
But how to derive an SFC for a highly unstructured grid?
Construct the SFC like the 3D Hilbert curve in the particle code Gadget-2 (communication with T. Rung, TU Hamburg-Harburg)
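SFC construction
[Figure: node n in a square that is recursively subdivided into quadrants; at each level the quadrants are numbered 0 to 3.]
For all nodes n, calculate the index in the Hilbert curve as a quad number (base 4):
SFC index(n) = 132...
e.g. for 8 levels:
$\text{SFC index}(n) = 1 \cdot 4^8 + 3 \cdot 4^7 + 2 \cdot 4^6 + \dots$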
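The construction can be sketched in Python as follows. This is a minimal illustration, not the TsunAWI implementation: it uses the widely known bitwise formulation of the 2D Hilbert index, whose quadrant numbering and orientation may differ from the figure, and the names `hilbert_index`, `quad_digits` and `sfc_index` as well as the bounding-box quantization are assumptions made for this example.

```python
def hilbert_index(x, y, levels):
    """Hilbert index of integer cell (x, y) on a 2**levels by 2**levels grid."""
    n = 1 << levels
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)   # quadrant digit at this level, weighted by 4**k
        if ry == 0:                    # rotate/reflect so the sub-curve is oriented correctly
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def quad_digits(d, levels):
    """Write an index as a base-4 string, most significant quadrant first."""
    return "".join(str((d >> (2 * k)) & 3) for k in reversed(range(levels)))


def sfc_index(coords, levels=8):
    """Quantize node coordinates to the grid of the bounding square and index them."""
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    x0, y0 = min(xs), min(ys)
    extent = max(max(xs) - x0, max(ys) - y0) or 1.0
    n = 1 << levels
    result = []
    for x, y in coords:
        i = min(int((x - x0) / extent * n), n - 1)
        j = min(int((y - y0) / extent * n), n - 1)
        result.append(hilbert_index(i, j, levels))
    return result


# Four nodes at the corners of the unit square, one subdivision level.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
indices = sfc_index(coords, levels=1)
order = sorted(range(len(coords)), key=indices.__getitem__)
print(indices, order, quad_digits(hilbert_index(3, 0, 2), 2))
```

Nodes that fall into the same finest cell receive the same index; more levels reduce such ties at the cost of larger integers.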
SFC reordering
Reorder the nodes according to SFC index.
Reorder the elements either separately by an SFC, or numerically by node indices (more efficient for TsunAWI).
Edges are constructed in TsunAWI (sorted along the nodes)
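The reordering step might look like the following Python sketch. It assumes that an SFC index per node is already available and that elements are triples of node ids; the function name `reorder_mesh` and the choice of sorting elements by their sorted node triple are assumptions for illustration, since the slides do not specify the exact sort key.

```python
def reorder_mesh(sfc_index, elems):
    """Renumber nodes along the curve and sort elements by their new node ids.

    sfc_index[n] is the curve index of old node n;
    elems is a list of triples of old node ids.
    """
    n_nodes = len(sfc_index)
    # new_to_old[k] = old id of the node that becomes node k
    new_to_old = sorted(range(n_nodes), key=lambda n: sfc_index[n])
    old_to_new = [0] * n_nodes
    for new, old in enumerate(new_to_old):
        old_to_new[old] = new
    # Rewrite the connectivity in terms of the new node ids,
    # keeping the vertex order inside each element.
    renumbered = [tuple(old_to_new[n] for n in e) for e in elems]
    # Assumption: "numerically by node indices" is read as sorting by the
    # ascending node ids of each element.
    renumbered.sort(key=lambda e: sorted(e))
    return new_to_old, renumbered


sfc = [30, 10, 20, 0]                 # hypothetical curve indices of nodes 0..3
elements = [(0, 1, 2), (1, 2, 3)]
new_to_old, new_elements = reorder_mesh(sfc, elements)
print(new_to_old, new_elements)
```

Node-based arrays (coordinates, depth, and so on) would then be permuted with `new_to_old` so that they match the new numbering.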
SFC ordering of the nodes
for the TsunAWI regional Indonesian grid
adjacency matrix for SFC sorted grid
Comparison: RCM ordering
adjacency matrix
RCM (reverse Cuthill-McKee) ordering obtained via the adjacency matrix and Matlab symrcm for sparse matrices.
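For readers without Matlab, the idea behind RCM can be sketched in a few lines of Python. This is a plain textbook variant (breadth-first search from a minimum-degree node, neighbours visited by increasing degree, final order reversed) and is not the algorithm behind Matlab's symrcm in detail; the names `rcm_order` and `bandwidth` and the small path graph are made up for the example.

```python
from collections import deque


def rcm_order(adj):
    """adj maps node -> set of neighbours. Returns the list new position -> old node."""
    degree = {v: len(nb) for v, nb in adj.items()}
    visited = set()
    order = []
    # Start each connected component from a node of minimum degree.
    for start in sorted(adj, key=lambda v: (degree[v], v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unvisited neighbours, lowest degree first.
            for w in sorted(adj[v] - visited, key=lambda u: (degree[u], u)):
                visited.add(w)
                queue.append(w)
    order.reverse()   # the "reverse" in reverse Cuthill-McKee
    return order


def bandwidth(adj, order):
    """Largest distance between the positions of two adjacent nodes."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[v] - pos[w]) for v in adj for w in adj[v]), default=0)


# A path graph whose nodes carry scrambled labels.
labels = [3, 7, 0, 5, 1, 6, 2, 4]
adj = {v: set() for v in labels}
for a, b in zip(labels, labels[1:]):
    adj[a].add(b)
    adj[b].add(a)

print("bandwidth before:", bandwidth(adj, sorted(adj)))
print("bandwidth after :", bandwidth(adj, rcm_order(adj)))
```

A small bandwidth means that adjacent nodes end up close together in memory, which is the property the adjacency matrix plots visualize.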
Comparison: RCM ordering
for the TsunAWI regional Indonesian grid
Comparison: AMD ordering
adjacency matrix
AMD (approximate minimum degree) ordering obtained via the adjacency matrix and Matlab symamd for sparse matrices.
Comparison: AMD ordering
for the TsunAWI regional Indonesian grid
SFC compared to unsorted, RCM, SymAMD
computation time: IBM Power6
Computational time [seconds] per timestep on a cluster node
1× IBM Power6 (4 Cores, 2× hyperthreading)

| OMP_NUM_THREADS | 1 | 2 | 4 | 8 |
|---|---|---|---|---|
| orig. | 9.77 | 4.08 | 2.91 | 1.57 |
| RCM | 2.78 | 1.77 | 0.97 | 0.69 |
| AMD | 2.76 | 1.42 | 0.95 | 0.66 |
| SFC | 2.69 | 1.58 | 0.92 | 0.60 |
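The gain from resorting can be read off these timings with a little arithmetic. The Python sketch below only re-evaluates the numbers from the Power6 table; the helper names `speedup_vs_orig` and `parallel_efficiency` are illustrative, and the efficiency is computed per thread, which understates the per-core efficiency for the 8-thread run on 4 hyperthreaded cores.

```python
threads = [1, 2, 4, 8]
times = {  # seconds, copied from the IBM Power6 table
    "orig": [9.77, 4.08, 2.91, 1.57],
    "RCM": [2.78, 1.77, 0.97, 0.69],
    "AMD": [2.76, 1.42, 0.95, 0.66],
    "SFC": [2.69, 1.58, 0.92, 0.60],
}


def speedup_vs_orig(name):
    """Runtime of the original ordering divided by the runtime of `name`."""
    return [o / t for o, t in zip(times["orig"], times[name])]


def parallel_efficiency(name):
    """Single-thread time divided by (threads * multi-thread time)."""
    t1 = times[name][0]
    return [t1 / (t * p) for t, p in zip(times[name], threads)]


for name in ("RCM", "AMD", "SFC"):
    gains = ", ".join(f"{s:.2f}" for s in speedup_vs_orig(name))
    print(f"{name}: speedup over orig. for 1/2/4/8 threads = {gains}")
print("SFC efficiency:", [round(e, 2) for e in parallel_efficiency("SFC")])
```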
SFC compared to unsorted, RCM, SymAMD
Hardware counters: IBM Power6
IBM hardware counter hpmcount for 1000 timesteps on
1× IBM Power6 (4 Cores, 2× hyperthreading, OMP_NUM_THREADS=8)

| hpmcount event | L2 cache misses | Number of loads per load miss |
|---|---|---|
| orig. | 274,478,564,540 | 17.8 |
| RCM | 57,244,100,260 | 64.0 |
| AMD | 54,709,662,295 | 65.6 |
| SFC | 49,980,798,689 | 88.5 |
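Two derived quantities make the counters easier to compare: the factor by which each ordering reduces L2 misses relative to the original grid, and the fraction of loads that miss (the reciprocal of "loads per load miss"). The Python sketch below only re-evaluates the numbers from the table; the function names are illustrative.

```python
l2_misses = {  # copied from the hpmcount table (1000 timesteps, 8 threads)
    "orig": 274_478_564_540,
    "RCM": 57_244_100_260,
    "AMD": 54_709_662_295,
    "SFC": 49_980_798_689,
}
loads_per_miss = {"orig": 17.8, "RCM": 64.0, "AMD": 65.6, "SFC": 88.5}


def miss_reduction(name):
    """How many times fewer L2 misses than the original ordering."""
    return l2_misses["orig"] / l2_misses[name]


def load_miss_ratio(name):
    """Fraction of loads that miss, i.e. 1 / (loads per load miss)."""
    return 1.0 / loads_per_miss[name]


for name in l2_misses:
    print(f"{name:>4}: {miss_reduction(name):.2f}x fewer L2 misses, "
          f"{100 * load_miss_ratio(name):.1f}% of loads miss")
```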
SFC compared to unsorted, RCM, SymAMD
computation time: Intel Xeon Nehalem-EX
Computational time [seconds] for one timestep on
one blade SGI Altix UV (HLRN, ZIB Berlin and RRZN Hannover)
2× Intel Xeon 5570 (8 Cores, 2× hyperthreading)

| OMP_NUM_THREADS | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 32, No First Touch |
|---|---|---|---|---|---|---|---|---|
| orig. | 3.84 | 2.16 | 1.48 | 0.89 | 0.52 | 0.40 | 1.63 | 0.51 |
| RCM | 1.64 | 1.12 | 0.59 | 0.35 | 0.20 | 0.19 | 0.37 | 0.32 |
| AMD | 1.47 | 0.77 | 0.50 | 0.30 | 0.18 | 0.16 | 0.32 | 0.19 |
| SFC | 1.47 | 0.90 | 0.51 | 0.31 | 0.17 | 0.14 | 0.30 | 0.18 |
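Remark on OpenMP
importance of first touch for data locality
    allocate(array(dim))

    ! Variant without first touch: the master thread alone touches the whole array
    array(:) = 0.

    ! Variant with first touch: each thread initializes the part it works on later
    !$OMP PARALLEL DO
    do n=1,dim
       array(n) = 0.
    end do
    !$OMP END PARALLEL DO

    ! Computation loop, parallelized in the same way
    !$OMP PARALLEL DO
    do n=1,dim
       array(n) = ...
    end do
    !$OMP END PARALLEL DO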
Properties of resorting by an SFC
SFC is a very valuable method, because
it is cheap to compute
it provides good data locality on all levels of the memory hierarchy
as domain decomposition, it keeps interfaces small (though not optimal)
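The last point can be illustrated on a toy problem. The Python sketch below cuts a 16 x 16 lattice of cells into 16 equal parts, once as contiguous chunks of a row-major numbering and once as contiguous chunks along a Hilbert curve, and counts the neighbour pairs that end up in different parts. The structured lattice, the part count, and the names `hilbert_index` and `cut_edges` are assumptions for this example; an unstructured grid will behave less regularly.

```python
def hilbert_index(x, y, levels):
    """Hilbert index of cell (x, y) on a 2**levels by 2**levels grid (standard bitwise form)."""
    n = 1 << levels
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def cut_edges(part, n):
    """Count pairs of neighbouring cells that belong to different parts."""
    cuts = 0
    for y in range(n):
        for x in range(n):
            if x + 1 < n and part[(x, y)] != part[(x + 1, y)]:
                cuts += 1
            if y + 1 < n and part[(x, y)] != part[(x, y + 1)]:
                cuts += 1
    return cuts


levels, n_parts = 4, 16
n = 1 << levels
cells = [(x, y) for y in range(n) for x in range(n)]
chunk = len(cells) // n_parts

# Part id = position in the respective ordering, divided into equal chunks.
row_major = {c: (c[1] * n + c[0]) // chunk for c in cells}
hilbert = {c: hilbert_index(c[0], c[1], levels) // chunk for c in cells}

print("interface size, row-major chunks:", cut_edges(row_major, n))
print("interface size, Hilbert chunks  :", cut_edges(hilbert, n))
```

Contiguous chunks along the curve form compact patches, whereas chunks of the row-major numbering form long thin strips with correspondingly longer interfaces.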
Work to do
Influence of SFC ordering on
ILU-based preconditioners
fill-in
computational load
convergence rate
sparse matrix computations in general
SFC compared to generic partitioning algorithms (MeTiS, Scotch, ...)
TsunAWI
further optimize OpenMP parallelization
MPI parallelization