Compiling SIMULA: a historical study of technological genesis
High Performance Compu2ng Department, Simula Research Laboratory… · 2016. 1. 26. · Johannes...
Transcript of High Performance Compu2ng Department, Simula Research Laboratory… · 2016. 1. 26. · Johannes...
MonodomainSimula,onsofCardiacElectrophysiologyoverUnstructured3DHeartGeometry
Pa$entData
JohannesLangguthHighPerformanceCompu2ngDepartment,SimulaResearchLaboratory,Oslo,Norway
Par,,o
ning
ScalableHeterogeneousCPU-GPUComputa$onsforUnstructured
TetrahedralMeshes
MPIseparator
Nodeinteriorpart
MPIseparator
Nodeinteriorpart
Globalproblem
GPU1separator
GPU1interiorpart
GPU0separator
GPU0interiorpart
MPIseparator
CPUinteriorpart
CPU-GPUseparator
MPIseparator
Nodeinteriorpart
ff
3DTetrahedralMesh
QPI$32$GB/s$51.2$GB/s$ 51.2$GB/s$
8$GB/s$ 8$GB/s$
MPI$Infiniband$
HeterogeneousNodes
Compute(CPU(main(part(
Compute(MPI((separator(
Send(MPI(separator(to(other(nodes(
Receive(MPI(separator(from(other(node(
Compute(GPU(0(separator(
Compute(GPU(1(separator( Compute(GPU(1(main(part(
Compute(GPU(0(main(part(
Send(CPU<GPU((separator(to(GPU(0(
GPU(0(swap(
GPU(1(swap(
Send(CPU<GPU((separator(to(GPU(1(
Send(GPU(1((separator(to(GPU(0(
Send(GPU(0((separator(to(GPU(1(
CPU(swap(
Thread(
0(
1(
2(
15(
14(
3(
Time(during(compute(round(
13(
…(
Idle(Compute(CPU<GPU((separator(
Send(GPU(0(separator(to(host(
Send(GPU(1(separator(to(host(
Calcium
Hand
ling
-75
-50
-25
025
scalars
-87.2
40
CA Bt=150ms t=250ms t=500ms t=600ms
t=750ms t=1250ms t=1350mst=1100ms
0"
500"
1000"
1500"
2000"
2500"
3000"
16" 32" 64" 128"
GFLOPs'
GPU"only" Heterogeneous" Communica=on"Disabled"
ScalingPerformance