Partitioned Multi-Physics on Distributed Data via preCICE

Hans-Joachim Bungartz, Florian Lindner, Miriam Mehl, Klaudius Scheufele, Alexander Shukaev, Benjamin Uekermann

Universität Stuttgart, Technische Universität München

SPPEXA APM 2016, January 25, 2016
Multi-Physics and Exa-Scale

- many exciting applications need multi-physics
- more compute power ⇒ more physics?
- many sophisticated, scalable, single-physics legacy codes
- our goal: minimally invasive coupling, no deterioration of scalability
Our Example: Fluid-Structure-Acoustic Interaction

[Figure: coupled fluid, structure, and acoustic domains]

- FEAP - FASTEST - Ateles
- OpenFOAM - OpenFOAM - Ateles
- glue software: preCICE
Our Example: Fluid-Structure-Acoustic Interaction

[Diagram: coupling timeline; fluid-structure (FS) steps coupled with acoustic (A) subcycles]

- implicit or explicit coupling
- subcycling
Agenda

This talk: glue software preCICE. Next talk (Verena Krupp): application perspective.

1. Very brief introduction to preCICE
2. Realization on Distributed Data
3. Performance on Distributed Data
preCICE

- precise Code Interaction Coupling Environment
- developed in Munich and Stuttgart
- library approach, minimally invasive
- high-level API in C++, C, Fortran 77/90/95, Fortran 2003
- once an adapter is written ⇒ plug and play
- https://github.com/precice/precice
Big Picture

[Architecture diagram: each participant runs one adapter per rank. For solver A (ranks A.1 … A.N) and solver B (ranks B.1 … B.M), the first rank acts as master and the remaining ranks as slaves; preCICE provides steering, equation coupling, communication, and data mapping.]
Coupled Solvers

  Solver         Physics    Origin
  Ateles (APES)  CF, A      in-house (U Siegen)
  Alya System    IF, S      in-house (BSC)
  CalculiX       S          open-source (A*STAR)
  CARAT          S          in-house (TUM STATIK)
  COMSOL         S          commercial
  EFD            IF         in-house (TUM SCCS)
  FASTEST        IF         in-house (TU Darmstadt)
  Fluent         IF         commercial
  OpenFOAM       CF, IF, S  open-source (TU Delft)
  Peano          IF         in-house (TUM SCCS)
  SU2            CF         open-source

  (CF: compressible flow, IF: incompressible flow, S: structure, A: acoustics)
preCICE API

  turn_on()
  precice_create("FLUID", "precice_config.xml", index, size)
  precice_set_vertices(meshID, N, pos(dim*N), vertIDs(N))
  precice_initialize()
  while time loop ≠ done do
    while precice_action_required(readCheckPoint) do
      precice_write_bvdata(stressID, N, vertIDs, stress(dim*N))
      solve_timestep()
      precice_advance()
      precice_read_bvdata(displID, N, vertIDs, displ(dim*N))
    end while
  end while
  turn_off()
  precice_finalize()
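The call sequence above can be made runnable with a toy stand-in for the coupling library. `MockCoupling` and all of its methods are hypothetical illustrations of the adapter structure (create the interface, register vertices, then write, solve, advance, and read inside the coupling loop); the real preCICE API differs in names and details:

```python
class MockCoupling:
    """Hypothetical stand-in for the coupling library; it only mimics
    the shape of the adapter call sequence, not preCICE itself."""

    def __init__(self, participant, config_file, rank, size):
        self.timesteps_left = 3   # pretend the config requests 3 timesteps
        self.sub_iters = 0        # implicit coupling iterations per step
        self.data = {}

    def set_vertices(self, mesh, positions):
        return list(range(len(positions)))

    def initialize(self):
        pass

    def coupling_ongoing(self):
        if self.sub_iters == 2:   # previous timestep finished: reset counter
            self.sub_iters = 0
        return self.timesteps_left > 0

    def action_required(self, action):
        return self.sub_iters < 2  # pretend each step needs 2 iterations

    def write_data(self, name, ids, values):
        self.data[name] = list(values)

    def read_data(self, name, ids):
        return self.data.get(name, [0.0] * len(ids))

    def advance(self):
        self.sub_iters += 1
        if self.sub_iters == 2:
            self.timesteps_left -= 1

    def finalize(self):
        pass


def run_adapter():
    """Adapter skeleton following the slide: create, set vertices,
    initialize, couple inside the time loop, finalize."""
    cpl = MockCoupling("FLUID", "precice_config.xml", 0, 1)
    ids = cpl.set_vertices("FluidMesh", [(0.0, 0.0), (1.0, 0.0)])
    cpl.initialize()
    solves = 0
    while cpl.coupling_ongoing():                   # time loop
        while cpl.action_required("readCheckPoint"):
            cpl.write_data("Stress", ids, [1.0, 1.0])
            solves += 1                             # solve_timestep() goes here
            cpl.advance()
            displ = cpl.read_data("Displacement", ids)
    cpl.finalize()
    return solves
```

With this mock, `run_adapter()` executes 3 timesteps with 2 coupling iterations each, i.e. 6 solver calls.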
preCICE Config
Communication on Distributed Data
[Figure: a coupling surface shared between the partitioned meshes of solver A and solver B]
Communication on Distributed Data
[Diagram: solver A as acceptor (ranks 0-2) and solver B as requestor (ranks 0-4); each acceptor rank connects only to the requestor ranks it shares interface vertices with]
- kernel: 1:N communication based on either TCP/IP or MPI ports ⇒ no deadlocks at initialization (independent of the order at B)
- asynchronous communication (preferred over threads) ⇒ no deadlocks at communication
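The resulting connection pattern can be sketched as a toy communication map. `build_comm_map` and its inputs are hypothetical, assuming each participant knows which rank owns each interface vertex:

```python
def build_comm_map(owner_a, owner_b):
    """For each rank of participant A (acceptor), collect the ranks of
    participant B (requestor) that own at least one common interface
    vertex; only those pairs need a point-to-point connection.

    owner_a, owner_b: dict mapping vertex id -> owning rank.
    """
    comm = {}
    for vertex, rank_a in owner_a.items():
        rank_b = owner_b.get(vertex)
        if rank_b is not None:
            comm.setdefault(rank_a, set()).add(rank_b)
    return comm
```

Each acceptor rank thus ends up with the short list of requestor ranks it actually exchanges data with, regardless of the order in which the requestor side shows up.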
Interpolation on Distributed Data

Projection-based Interpolation
- first or second order
- example: consistent mapping from B to A
- parallelization: almost trivial

[Figure: nearest-neighbour mapping vs. nearest-projection mapping between meshes B and A]

Radial Basis Function Interpolation
- higher order
- parallelization: far from trivial (realized with PETSc)
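A consistent nearest-neighbour mapping is simple enough to sketch in a few lines; this hypothetical, brute-force version reads, for every destination vertex, the value at the closest source vertex. In the parallel setting each rank only needs the source vertices near its own piece of the interface, which is why the parallelization is almost trivial:

```python
def nearest_neighbor_map(src_pos, src_vals, dst_pos):
    """Consistent nearest-neighbour mapping: every destination vertex
    takes the value of its closest source vertex (brute force O(N*M))."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    result = []
    for p in dst_pos:
        j = min(range(len(src_pos)), key=lambda k: dist2(p, src_pos[k]))
        result.append(src_vals[j])
    return result
```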
Fixed-Point Acceleration on Distributed Data

Anderson Acceleration (IQN-ILS)

Find x ∈ D ⊂ R^n with H(x) = x, H: D → R^n

  initial value x^0
  x̃^0 = H(x^0) and R^0 = x̃^0 − x^0
  x^1 = x^0 + 0.1 · R^0
  for k = 1, 2, … do
    x̃^k = H(x^k) and R^k = x̃^k − x^k
    V^k = [ΔR^k_0, …, ΔR^k_{k−1}] with ΔR^k_i = R^i − R^k
    W^k = [Δx̃^k_0, …, Δx̃^k_{k−1}] with Δx̃^k_i = x̃^i − x̃^k
    decompose V^k = Q^k U^k
    solve the first k lines of U^k α = −(Q^k)^T R^k
    Δx̃ = W^k α
    x^{k+1} = x̃^k + Δx̃
  end for
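The scheme can be exercised on a toy fixed-point problem. This sketch (hypothetical helpers `anderson_ils` and `lstsq_normal`) follows the update x^{k+1} = x̃^k + Wα from the algorithm above, but solves the small least-squares system via normal equations instead of a QR decomposition and caps the history, which is enough for a tiny linear H:

```python
import math

def anderson_ils(H, x0, omega=0.1, tol=1e-9, max_iter=100):
    """Anderson acceleration (IQN-ILS flavour) for H(x) = x on tiny
    problems: build difference matrices V (residuals) and W (solver
    outputs), solve the least-squares system, update x = x~ + W a."""
    n = len(x0)
    x = list(x0)
    xt = H(x)                                   # x~^0 = H(x^0)
    R = [xt[i] - x[i] for i in range(n)]        # R^0 = x~^0 - x^0
    hist_xt, hist_R = [xt], [R]
    x = [x[i] + omega * R[i] for i in range(n)]  # relaxed first step
    for k in range(1, max_iter):
        xt = H(x)
        R = [xt[i] - x[i] for i in range(n)]
        if math.sqrt(sum(r * r for r in R)) < tol:
            return x
        # Most recent history columns (capped at n, unlike the full slide).
        cols = list(range(len(hist_R)))[-n:]
        V = [[hist_R[c][i] - R[i] for c in cols] for i in range(n)]
        W = [[hist_xt[c][i] - xt[i] for c in cols] for i in range(n)]
        a = lstsq_normal(V, [-r for r in R])     # min || V a + R^k ||
        dx = [sum(W[i][j] * a[j] for j in range(len(a))) for i in range(n)]
        x = [xt[i] + dx[i] for i in range(n)]    # x^{k+1} = x~^k + W a
        hist_xt.append(xt)
        hist_R.append(R)
    return x

def lstsq_normal(V, b):
    """Solve min ||V a - b|| via the normal equations (tiny systems only)."""
    m, n = len(V[0]), len(V)
    A = [[sum(V[i][p] * V[i][q] for i in range(n)) for q in range(m)]
         for p in range(m)]
    y = [sum(V[i][p] * b[i] for i in range(n)) for p in range(m)]
    for p in range(m):                       # Gaussian elimination, pivoting
        piv = max(range(p, m), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        y[p], y[piv] = y[piv], y[p]
        for r in range(p + 1, m):
            f = A[r][p] / A[p][p]
            for q in range(p, m):
                A[r][q] -= f * A[p][q]
            y[r] -= f * y[p]
    a = [0.0] * m
    for p in range(m - 1, -1, -1):           # back substitution
        a[p] = (y[p] - sum(A[p][q] * a[q] for q in range(p + 1, m))) / A[p][p]
    return a
```

For a linear contraction H(x) = Mx + b the iteration reaches the fixed point (I − M)⁻¹b in a handful of steps, far faster than plain under-relaxation.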
Fixed-Point Acceleration on Distributed Data

[Diagram: inserting a new column v into the decomposition V = QR yields updated factors Q̂, R̂ with [v, V] = Q̂ R̂]

Insert Column
In: Q ∈ R^{n×m}, R ∈ R^{m×m}, v ∈ R^n;  Out: Q̂ ∈ R^{n×(m+1)}, R̂ ∈ R^{(m+1)×(m+1)}

  for j = 1, …, m do
    r(j) = Q(:, j)^T v
    v = v − r(j) · Q(:, j)
  end for
  r(m+1) = ‖v‖_2
  Q̂(:, m+1) = r(m+1)^{−1} · v
  R̂ = [r, (R over 0)], i.e. r as first column, old R with a zero row appended
  Givens rotations G_{i,j} s.t. R̂ = G_{1,2} ··· G_{m,m+1} R̂ is upper triangular, Q̂ = Q̂ G_{m,m+1} ··· G_{1,2}
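The column insertion can be sketched in pure Python. `qr_insert_first` is a hypothetical helper following the recipe above: modified Gram-Schmidt against the existing Q, then Givens rotations from the bottom up to restore triangularity after v becomes the new first column:

```python
import math

def qr_insert_first(Q, R, v):
    """Insert v as the new first column of the decomposed matrix.

    Q: n x m with orthonormal columns, R: m x m upper triangular, so
    that A = Q R.  Returns (Qh, Rh) with Qh Rh = [v, A], Rh upper
    triangular.  Pure Python, row-major lists of lists.
    """
    n, m = len(Q), len(R)
    v = list(v)
    r = [0.0] * (m + 1)
    # Modified Gram-Schmidt: orthogonalize v against the columns of Q.
    for j in range(m):
        r[j] = sum(Q[i][j] * v[i] for i in range(n))
        for i in range(n):
            v[i] -= r[j] * Q[i][j]
    r[m] = math.sqrt(sum(x * x for x in v))
    # Extend Q by the normalized remainder of v.
    Qh = [Q[i] + [v[i] / r[m]] for i in range(n)]
    # Rh = [r, (R over 0)]: full first column, old R with a zero row below.
    Rh = [[r[i]] + (R[i][:] if i < m else [0.0] * m) for i in range(m + 1)]
    # Givens rotations G_{m,m+1}, ..., G_{1,2} zero the first column
    # below the diagonal while keeping Qh Rh unchanged.
    for i in range(m, 0, -1):
        a, b = Rh[i - 1][0], Rh[i][0]
        h = math.hypot(a, b)
        if h == 0.0:
            continue
        c, s = a / h, b / h
        for j in range(m + 1):              # rotate rows i-1, i of Rh
            x, y = Rh[i - 1][j], Rh[i][j]
            Rh[i - 1][j], Rh[i][j] = c * x + s * y, -s * x + c * y
        for k in range(n):                  # rotate columns i-1, i of Qh
            x, y = Qh[k][i - 1], Qh[k][i]
            Qh[k][i - 1], Qh[k][i] = c * x + s * y, -s * x + c * y
    return Qh, Rh
```

Updating the factors in place this way costs O(nm) per new column, instead of recomputing the full decomposition in every coupling iteration.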
Performance Tests, Initialization
#DOFs at interface: 2.6 · 10⁵, strong scaling

[Bar chart: time in ms (log scale, 10⁰ to 10⁵) for initialization phases I-V, communication, ILS, and NN mapping, for p = 128, 256, 512, 1024, 2048]
Performance Tests, Work per Timestep
#DOFs at interface: 2.6 · 10⁵, strong scaling

[Bar chart: time in ms per timestep for communication, ILS, and NN mapping, for p = 128, 256, 512, 1024, 2048]
Performance Tests, Traveling Pulse
DG solver Ateles, Euler equations; #DOFs: total 5.9 · 10⁹, at interface 1.1 · 10⁷; NN mapping and communication; strong scaling from 128 to 16384 processors per participant.

[Setup: two Ateles domains (left and right) coupled via preCICE]

Joint work with V. Krupp et al. (Universität Siegen)
Performance Tests, Work per Timestep
[Log-log plot: time per timestep in ms vs. number of processes per participant (NPP, 2⁷ to 2¹⁴), comparing Compute (Ateles) and Advance (preCICE)]
Summary

[Composite of earlier slides: the fluid-structure-acoustics application, the master-slave adapter architecture (steering, equation coupling, communication, data mapping), the preCICE API call sequence, and the strong-scaling plot of Compute (Ateles) vs. Advance (preCICE) time per timestep]