
Page 1: Integrating Trilinos Solvers to SEAM code

Dagoberto A.R. Justo – UNM
Tim Warburton – UNM
Bill Spotz – Sandia

Page 2: SEAM (NCAR) and Trilinos (Sandia Lab)

SEAM (NCAR): Spectral Element Atmospheric Method

Trilinos (Sandia Lab): AztecOO, Epetra, NOX, Ifpack, PETSc, Komplex

Page 3: AztecOO

Solvers
– CG, CGS, BiCGStab, GMRES, TFQMR

Preconditioners
– Diagonal Jacobi, Least Squares, Neumann, Domain Decomposition, Symmetric Gauss-Seidel

Matrix-free implementation
C++ (Fortran interface)
MPI
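To make "matrix-free" concrete, here is a minimal sketch of a preconditioned CG loop that never stores the matrix: the solver only receives two callbacks, one applying the operator and one applying a diagonal Jacobi preconditioner. This is not the AztecOO API or the SEAM code. It is written in Python for brevity, and the names `pcg`, `laplacian_matvec`, and `jacobi_precond`, as well as the 1D Laplacian test operator, are assumptions made only for illustration.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Operator = Callable[[Vector], Vector]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg(matvec: Operator, precond: Operator, b: Vector,
        tol: float = 1e-10, max_iter: int = 200) -> Tuple[Vector, int]:
    """Preconditioned CG for a symmetric positive definite operator.

    The operator is only reachable through `matvec`; no matrix is stored.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)                          # residual for the zero initial guess
    b_norm = dot(b, b) ** 0.5 or 1.0     # avoid dividing by zero when b == 0
    if dot(r, r) ** 0.5 <= tol * b_norm:
        return x, 0
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        ap = matvec(p)                   # the only place the operator is applied
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        z = precond(r)
        rz_next = dot(r, z)
        p = [zi + (rz_next / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_next
    return x, max_iter


def laplacian_matvec(v: Vector) -> Vector:
    """Hypothetical test operator: 1D Laplacian stencil (-1, 2, -1)."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def jacobi_precond(r: Vector) -> Vector:
    """Diagonal Jacobi: divide by the operator's diagonal (2 for this stencil)."""
    return [ri / 2.0 for ri in r]


x_true = [float(i) for i in range(1, 9)]
b = laplacian_matvec(x_true)
x, iterations = pcg(laplacian_matvec, jacobi_precond, b)
max_error = max(abs(a - e) for a, e in zip(x, x_true))
```

Passing the operator and the preconditioner as callbacks is what allows a solver library to work with an application that only knows how to apply its operator, which is the situation described for SEAM.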

Page 4: Implementation

SEAM code (F90): calls Pcg_solver
Pcg_solver (F90): calls Aztec_solvers( )
Sub Aztec_solvers (C): calls AZ_Iterate( )
AZ_Iterate( ) (Aztec) uses:
– Matrix_vector_C (C), wrapping Matrix_vector (F90)
– Prec_Jacobi_C (C), wrapping Prec_Jacobi (F90)

Page 5: Machines used

Pentium III Notebook (serial)
– Linux, LAM-MPI, Intel Compilers

Los Lobos at HPC@UNM
– Linux cluster
– 256 nodes
– IBM Pentium III 750 MHz, 256 KB L2 cache, 1 GB RAM
– Portland Group compiler
– MPICH for Myrinet interconnections

Page 6: Graphical Results from SEAM

[Figure: SEAM results, panels: Energy, Mass]

Page 7: Memory (in Mbytes per processor)

[Chart: memory per processor, axis from 0 to 30 Mbytes, for p=2, p=4, p=8, p=16. Series: SEAM 6x6x6, SEAM+Aztec 6x6x6, SEAM 12x12x6, SEAM+Aztec 12x12x6]

Page 8: Speed Up

From 1 to 160 processors.

Time of simulation
– 144 time iterations x 300 s = 12 h simulation

Verify results using mass, energy, …
– (Different result for 1 proc)
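The simulated time quoted on this slide can be checked directly, and the speed-up plots that follow are easier to read with the usual definitions in hand. The sketch below assumes the conventional definitions, speed-up as serial time divided by parallel time and efficiency as speed-up divided by processor count. The slides do not state which definition was used, and the function names are hypothetical.

```python
def simulated_hours(time_steps: int, step_seconds: float) -> float:
    """Total simulated time in hours for a fixed time step."""
    return time_steps * step_seconds / 3600.0


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Conventional speed-up: T(1) / T(p). Assumed, not taken from the slides."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds: float, parallel_seconds: float,
               processors: int) -> float:
    """Parallel efficiency: speed-up divided by the number of processors."""
    return speedup(serial_seconds, parallel_seconds) / processors


# 144 time iterations of 300 s each, as stated on the slide.
hours = simulated_hours(144, 300)
print(hours)  # 12.0
```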

Page 9: Speed Up – SEAM

[Figure: selecting # of elements ne=24x24x6]

Page 10: Speed Up – SEAM

[Figure: selecting order np=6]

Page 11: Speed Up – SEAM+Aztec

[Figure: best: cgs solver]

Page 12: Speed Up – SEAM+Aztec

[Figure: best: cgs solver + Least Squares preconditioner]

Page 13: Speed Up – SEAM+Aztec

[Figure: increasing np -> increases speedup]

Page 14: Upshot – SEAM

[Figure: one CG iteration]

Page 15: Upshot – SEAM

[Figure: matrix times vector communication]

Page 16: Upshot – SEAM+Aztec

[Figure: one CG iteration]

Page 17: Upshot – SEAM+Aztec

[Figure: matrix times vector communication]

Page 18: Upshot – SEAM+Aztec

[Figure: vector reduction]

Page 19: Time (24x24x6 elements, 2 proc.)

Solver            Iter.      Time (loop)   Time/iter
SEAM p=6          33.0 it    7.48 s        0.22 s/it
SEAM p=12         56.9 it    81.2 s        1.42 s/it
Cg p=6            87.1 it    28.2 s        0.32 s/it
Cgs p=6           74.1 it    28.6 s        0.38 s/it
Tfqmr p=6         75.2 it    31.1 s        0.41 s/it
Bicg p=6          94.1 it    29.4 s        0.31 s/it
Cgs ls p=6        35.1 it    42.0 s        1.19 s/it
CG Jacobi p=6     45.8 it    17.2 s        0.37 s/it
Cgs Jacobi p=6    31.7 it    15.3 s        0.48 s/it
Cgs p=12          60.4 it    274 s         4.53 s/it
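The Time/iter column is the loop time divided by the mean iteration count, and the table also allows a rough comparison of SEAM's own solver against the Aztec configurations. The sketch below recomputes both from the rows above. Which rows to compare is an assumption: it takes "SEAM p=6" as the baseline, the fastest Aztec run at p=6 for total loop time, and the unpreconditioned "Cg p=6" row for per-iteration cost.

```python
# Rows copied from the timing table:
# (solver, mean iterations, loop time in seconds, reported seconds per iteration)
ROWS = [
    ("SEAM p=6", 33.0, 7.48, 0.22),
    ("SEAM p=12", 56.9, 81.2, 1.42),
    ("Cg p=6", 87.1, 28.2, 0.32),
    ("Cgs p=6", 74.1, 28.6, 0.38),
    ("Tfqmr p=6", 75.2, 31.1, 0.41),
    ("Bicg p=6", 94.1, 29.4, 0.31),
    ("Cgs ls p=6", 35.1, 42.0, 1.19),
    ("CG Jacobi p=6", 45.8, 17.2, 0.37),
    ("Cgs Jacobi p=6", 31.7, 15.3, 0.48),
    ("Cgs p=12", 60.4, 274.0, 4.53),
]


def time_per_iteration(iterations: float, loop_seconds: float) -> float:
    """Average cost of one solver iteration."""
    return loop_seconds / iterations


loop_time = {name: seconds for name, _, seconds, _ in ROWS}
per_iter = {name: time_per_iteration(its, seconds)
            for name, its, seconds, _ in ROWS}

# Largest gap between the recomputed and the reported Time/iter values.
max_gap = max(abs(per_iter[name] - reported) for name, _, _, reported in ROWS)

# Assumed comparison: SEAM's own solver at p=6 against the Aztec runs at p=6.
baseline = "SEAM p=6"
aztec_p6 = [name for name in loop_time
            if name.endswith("p=6") and name != baseline]
fastest = min(aztec_p6, key=loop_time.get)

total_slowdown = loop_time[fastest] / loop_time[baseline]
cg_per_iter_slowdown = per_iter["Cg p=6"] / per_iter[baseline]

print(f"fastest Aztec run at p=6: {fastest}")
print(f"loop-time ratio vs {baseline}: {total_slowdown:.2f}")
print(f"Cg per-iteration ratio vs {baseline}: {cg_per_iter_slowdown:.2f}")
```

Under these assumptions the recomputed values agree with the reported column to within rounding, and the two ratios give one way to quantify how much slower the Aztec runs are in total and per iteration.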

Page 20: Conclusions & Suggested Future Efforts

SEAM+Aztec works!

SEAM+Aztec is 2x slower
– difference in CG algorithms

SEAM+Aztec time per iteration is 50% slower

0.1% of time lost in calls, preparation for Aztec.

More time -> better tune-up.

Domain decomposition preconditioners

Page 21: Conclusions & Suggested Future Efforts

SEAM + Aztec works!

More time -> better tune-up.