Integrating Trilinos Solvers to SEAM code


description

Integrating Trilinos Solvers to SEAM code. Dagoberto A.R. Justo – UNM, Tim Warburton – UNM, Bill Spotz – Sandia. SEAM (NCAR): Spectral Element Atmospheric Method. Trilinos (Sandia Lab): AztecOO, Epetra, Nox, Ifpack, PETSc, Komplex. AztecOO solvers: CG, CGS, BICGStab, GMRES, Tfqmr.

Transcript of Integrating Trilinos Solvers to SEAM code

Page 1: Integrating Trilinos Solvers to SEAM code

Integrating Trilinos Solvers to SEAM code

Dagoberto A.R. Justo – UNM

Tim Warburton – UNM

Bill Spotz – Sandia

Page 2: Integrating Trilinos Solvers to SEAM code

SEAM (NCAR): Spectral Element Atmospheric Method

Trilinos (Sandia Lab): AztecOO, Epetra, Nox, Ifpack, PETSc, Komplex

Page 3: Integrating Trilinos Solvers to SEAM code

AztecOO

Solvers
– CG, CGS, BICGStab, GMRES, Tfqmr

Preconditioners
– Diagonal Jacobi, Least Square, Neumann, Domain Decomposition, Symmetric Gauss-Seidel

Matrix Free implementation

C++ (Fortran interface)

MPI
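To make "matrix free" concrete, here is a minimal sketch of a Jacobi-preconditioned conjugate gradient loop in which the solver only ever sees two callbacks: one that applies the operator and one that applies the preconditioner. It is plain Python written for illustration and is not the AztecOO API. The names `pcg`, `apply_operator` and `apply_jacobi`, and the small tridiagonal test operator, are all assumptions introduced here.

```python
from typing import Callable, List, Tuple

Vector = List[float]

def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))

def pcg(matvec: Callable[[Vector], Vector],
        precond: Callable[[Vector], Vector],
        b: Vector, tol: float = 1e-10, max_iter: int = 200) -> Tuple[Vector, int]:
    """Preconditioned CG for a symmetric positive definite operator.

    The matrix is never stored: the solver only calls matvec(v) and precond(r).
    Stops when ||r||_2 <= tol * ||b||_2.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the zero initial guess
    b_norm = dot(b, b) ** 0.5 or 1.0
    if dot(r, r) ** 0.5 <= tol * b_norm:
        return x, 0
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Hypothetical test operator: A = tridiag(-1, DIAGONAL, -1), applied as a stencil.
N = 8
DIAGONAL = [2.0 + i for i in range(N)]

def apply_operator(x: Vector) -> Vector:
    y = []
    for i in range(N):
        value = DIAGONAL[i] * x[i]
        if i > 0:
            value -= x[i - 1]
        if i < N - 1:
            value -= x[i + 1]
        y.append(value)
    return y

def apply_jacobi(r: Vector) -> Vector:
    # Diagonal (Jacobi) preconditioner: divide each entry by the matrix diagonal.
    return [ri / di for ri, di in zip(r, DIAGONAL)]

solution, iterations = pcg(apply_operator, apply_jacobi, [1.0] * N)
print(iterations, solution)
```

Because the solver depends only on the two callbacks, swapping the preconditioner or the operator does not require touching the iteration itself.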

Page 4: Integrating Trilinos Solvers to SEAM code

Implementation

[Call diagram, boxes in order]

SEAM CODE (F90): calls Pcg_solver

Pcg_solver (F90): calls Aztec_solvers( )

Sub Aztec_solvers (C): calls AZ_Iterate( )

AZTEC: Matrix_vector_C (C) with Matrix_vector (F90); Prec_Jacobi_C (C) with Prec_Jacobi (F90)
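One way to read the diagram is as a chain of thin wrappers: the application keeps its own matrix-vector and Jacobi routines, and the library's iteration driver reaches them through C-side callbacks. The sketch below mimics that layering in plain Python. Every function is a stand-in named after a box in the diagram, `library_iterate` is a hypothetical placeholder for the role played by AZ_Iterate (a simple preconditioned Richardson iteration, not Aztec's algorithm), and the 3x3 operator is invented for the example.

```python
# Stand-ins for the application-side (F90) routines.
DIAGONAL = [4.0, 5.0, 6.0]          # hypothetical operator: tridiag(-1, DIAGONAL, -1)

def matrix_vector(x):
    """Role of Matrix_vector (F90): return A x without storing A."""
    n = len(x)
    return [DIAGONAL[i] * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def prec_jacobi(r):
    """Role of Prec_Jacobi (F90): divide by the diagonal."""
    return [ri / di for ri, di in zip(r, DIAGONAL)]

# Stand-ins for the C wrappers: adapt the calling convention (write into an
# output buffer) and forward to the application routines.
def matrix_vector_c(x, out):
    out[:] = matrix_vector(x)

def prec_jacobi_c(r, out):
    out[:] = prec_jacobi(r)

def library_iterate(matvec_cb, prec_cb, b, tol=1e-12, max_iter=500):
    """Hypothetical driver in the role of AZ_Iterate: x += M^-1 (b - A x)."""
    n = len(b)
    x, ax, z = [0.0] * n, [0.0] * n, [0.0] * n
    for iteration in range(max_iter):
        matvec_cb(x, ax)
        r = [bi - axi for bi, axi in zip(b, ax)]
        if max(abs(ri) for ri in r) <= tol:
            return x, iteration
        prec_cb(r, z)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_iter

def aztec_solvers(b):
    """Role of Sub Aztec_solvers: register the callbacks and run the driver."""
    return library_iterate(matrix_vector_c, prec_jacobi_c, b)

def pcg_solver(b):
    """Role of Pcg_solver: the entry point the application already calls."""
    x, _ = aztec_solvers(b)
    return x

print(pcg_solver([1.0, 2.0, 3.0]))
```

The point of the layering is that the application's entry point keeps its name and signature, so the rest of the code does not need to know which solver runs underneath.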

Page 5: Integrating Trilinos Solvers to SEAM code

Machines used

Pentium III Notebook (serial)
– Linux, LAM-MPI, Intel Compilers

Los Lobos at HPC@UNM
– Linux Cluster
– 256 nodes
– IBM Pentium III 750 MHz, 256 KB L2 Cache, 1 GB RAM
– Portland Group compiler
– MPICH for Myrinet interconnections

Page 6: Integrating Trilinos Solvers to SEAM code

Graphical Results from SEAM

[Figure: plots of Energy and Mass]

Page 7: Integrating Trilinos Solvers to SEAM code

Memory (in Mbytes per processor)

[Bar chart: vertical axis 0 to 30 Mbytes; horizontal axis p=2, p=4, p=8, p=16; series: SEAM 6x6x6, SEAM+Aztec 6x6x6, SEAM 12x12x6, SEAM+Aztec 12x12x6]

Page 8: Integrating Trilinos Solvers to SEAM code

Speed Up

From 1 to 160 processors.

Time of Simulation: 144 time iterations x 300 s = 12 h simulation

Verify results using mass, energy, …
– (Different result for 1 proc)
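The simulated-time arithmetic on this slide, together with the usual definitions of speed-up and parallel efficiency, can be written out as a short sketch. The slide does not state which definition of speed-up was plotted, so S(p) = T(1) / T(p) is an assumption here, and the timings in the example call are made-up placeholders, not measurements from the talk.

```python
TIME_STEPS = 144          # from the slide
STEP_SECONDS = 300        # from the slide

simulated_hours = TIME_STEPS * STEP_SECONDS / 3600
print(f"simulated time: {simulated_hours} h")

def speedup(t_serial: float, t_parallel: float) -> float:
    """Assumed definition: S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, procs: int) -> float:
    """Parallel efficiency: E(p) = S(p) / p."""
    return speedup(t_serial, t_parallel) / procs

# Placeholder timings for illustration only (not from the talk).
print(speedup(100.0, 25.0), efficiency(100.0, 25.0, 8))
```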

Page 9: Integrating Trilinos Solvers to SEAM code

Speed Up – SEAM

selecting # of elements ne=24x24x6

Page 10: Integrating Trilinos Solvers to SEAM code

Speed Up – SEAM

selecting order np=6

Page 11: Integrating Trilinos Solvers to SEAM code

Speed Up – SEAM+Aztec

best: cgs solver

Page 12: Integrating Trilinos Solvers to SEAM code

Speed Up – SEAM+Aztec

best: cgs solver + Least Square preconditioner

Page 13: Integrating Trilinos Solvers to SEAM code

Speed Up – SEAM+Aztec

increasing np -> increases speedup

Page 14: Integrating Trilinos Solvers to SEAM code

Upshot – SEAM

(One CG iteration)

Page 15: Integrating Trilinos Solvers to SEAM code

Upshot – SEAM

(Matrix times vector communication)

Page 16: Integrating Trilinos Solvers to SEAM code

Upshot – SEAM+Aztec

(One CG iteration)

Page 17: Integrating Trilinos Solvers to SEAM code

Upshot – SEAM+Aztec

(Matrix times vector communication)

Page 18: Integrating Trilinos Solvers to SEAM code

Upshot – SEAM+Aztec

(Vector Reduction)

Page 19: Integrating Trilinos Solvers to SEAM code

Time (24x24x6 elements, 2 proc.)

Solver            Iter.      Time (loop)   Time/iter
SEAM p=6          33.0 it    7.48 s        0.22 s/it
SEAM p=12         56.9 it    81.2 s        1.42 s/it
Cg p=6            87.1 it    28.2 s        0.32 s/it
Cgs p=6           74.1 it    28.6 s        0.38 s/it
Tfqmr p=6         75.2 it    31.1 s        0.41 s/it
Bicg p=6          94.1 it    29.4 s        0.31 s/it
Cgs ls p=6        35.1 it    42.0 s        1.19 s/it
CG Jacobi p=6     45.8 it    17.2 s        0.37 s/it
Cgs Jacobi p=6    31.7 it    15.3 s        0.48 s/it
Cgs p=12          60.4 it    274 s         4.53 s/it
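The Time/iter column appears to be the loop time divided by the iteration count, truncated to two decimals. The short check below recomputes it from the other two columns. The numbers are copied from the table; the helper names (`time_per_iteration`, `truncate_2`, `slowdown`) are introduced here for illustration.

```python
import math

# (solver label, iterations, loop time in seconds), copied from the table above.
ROWS = [
    ("SEAM p=6", 33.0, 7.48),
    ("SEAM p=12", 56.9, 81.2),
    ("Cg p=6", 87.1, 28.2),
    ("Cgs p=6", 74.1, 28.6),
    ("Tfqmr p=6", 75.2, 31.1),
    ("Bicg p=6", 94.1, 29.4),
    ("Cgs ls p=6", 35.1, 42.0),
    ("CG Jacobi p=6", 45.8, 17.2),
    ("Cgs Jacobi p=6", 31.7, 15.3),
    ("Cgs p=12", 60.4, 274.0),
]

def time_per_iteration(iterations: float, loop_seconds: float) -> float:
    return loop_seconds / iterations

def truncate_2(value: float) -> float:
    """Truncate (not round) to two decimals, which is what the table seems to do."""
    return math.floor(value * 100) / 100

def slowdown(label: str, baseline: str = "SEAM p=6") -> float:
    """Ratio of loop times between one row and a baseline row."""
    times = {name: secs for name, _, secs in ROWS}
    return times[label] / times[baseline]

for name, iters, secs in ROWS:
    print(f"{name:15s} {truncate_2(time_per_iteration(iters, secs)):.2f} s/it")
print(f"Cgs Jacobi vs SEAM (p=6) loop time: {slowdown('Cgs Jacobi p=6'):.2f}x")
```

Read this way, the fastest Aztec configuration in the table (Cgs with Jacobi, 15.3 s) takes roughly twice the loop time of native SEAM at p=6 (7.48 s), which is in line with the "2x slower" remark on the conclusions slide.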

Page 20: Integrating Trilinos Solvers to SEAM code

Conclusions & Suggested Future Efforts

SEAM+Aztec works!

SEAM+Aztec is 2x slower

difference in CG algorithms

SEAM+Aztec time-iteration is 50% slower

0.1% of time lost in calls, preparation for Aztec.

More time -> better tune-up.

Domain decomposition Preconditioners

Page 21: Integrating Trilinos Solvers to SEAM code

Conclusions & Suggested Future Efforts

SEAM + Aztec works!

More time -> better tune-up.