
Patrick Marchesiello

Brest, 13 January 2005

The ROMS model and its use on NYMPHEA

Centre IRD de Bretagne

ROMS History

Descendant of SPEM & SCRUM (relative of POM)
(Song & Haidvogel 1994; Barnier et al., 1998)

UCLA: more like developer’s code
(Shchepetkin et al., 1998, 2003, 2004; Marchesiello et al., 2001, 2003 …)
http://www.atmos.ucla.edu/cesr/ROMS_page.html

Rutgers: larger user community & support
http://marine.rutgers.edu/po/index.php?model=roms

IRD Brest & UCLA & INRIA
http://www.brest.ird.fr/Roms_tools

- AGRIF: Adaptive Grid Refinement In Fortran (Debreu 1999)
- Pre-processing tools (Penven, Marchesiello)

Collaborators and Users

FRANCE
IRD Brest: Penven, Marchesiello et al.
LMC Grenoble: Debreu et al.
LPO Brest: Le Gentil et al.

USA
UCLA: McWilliams, Shchepetkin, et al.
JPL: Chao et al.
Rutgers U.: Arango et al.

USERS
France: Brest, Paris, Toulouse, Noumea
Europe: Germany (U. Bremerhaven), Italy (JRC), Portugal (IPIMAR), Spain (AZTI)
Africa: Morocco (INRH), Senegal (LPA), South Africa (U. Cape Town)
America: California, Peru (IMARPE), Chile (U. Concepción), Brazil

ROMS Main features

• Hydrostatic, Boussinesq primitive equations
• Free surface
• Generalized vertical s-coordinate
• Horizontal curvilinear coordinates

• High order, low dispersion numerics
• Embedded domains: AGRIF
• Open boundary conditions
• Boundary layer parameterizations

• Parallelization: OpenMP, MPI
• Domain partitioning
• Optimized for vector computers

• Fortran 95
• UNIX/Linux
• C preprocessor
• NetCDF library, used for all I/O

Numerics: Motivation

Kantha and Clayson (2000) after Durran (1991)

Numerics: Strategy

High-order accurate methods: according to Sanderson (1998), the optimal choice (lowest cost for a given accuracy) for general ocean circulation models is 3rd- or 4th-order accurate methods.

With special care to:
• Numerical dispersion
• Pressure gradient
• Mode splitting
• Combination of methods

Numerics in ROMS
(Shchepetkin & McWilliams, 1998, 2003, 2004)

Horizontal (“C”) and vertical staggered grids

Time stepping
– Split-explicit barotropic and baroclinic modes with 2-way time filter
– Predictor-corrector Leapfrog-Adams-Moulton 3rd order scheme with feedback between momentum & tracer equations
– Non-uniform density in barotropic mode
– Conservative & constancy-preserving advection for tracers

Advection
– 3rd order upstream-biased (QUICK)

Vertical terms
– Parabolic spline reconstruction for horizontal pressure gradient and advection terms (equivalent 8th order)
– Implicit Crank-Nicolson scheme for vertical mixing terms
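To make the "3rd order upstream-biased (QUICK)" item concrete, here is a minimal sketch of the textbook QUICK face interpolation on a uniform, periodic 1D grid. It is not taken from the ROMS source: the function name `quick_face_value`, the periodic wrap-around and the 6/8, 3/8, -1/8 weights of the classic scheme are assumptions made for illustration, and the scheme actually coded in ROMS may differ in its details.

```python
def quick_face_value(phi, i, velocity):
    """Interpolate phi to the face between cells i and i+1 (hypothetical helper).

    Textbook QUICK: fit a parabola through two upstream cells and one
    downstream cell, then evaluate it at the face. The grid is assumed
    uniform and periodic; this is an illustration, not ROMS code.
    """
    n = len(phi)
    if velocity >= 0:
        # Flow goes from cell i to cell i+1: cells i-1 and i are upstream.
        far_upstream = phi[(i - 1) % n]
        upstream = phi[i % n]
        downstream = phi[(i + 1) % n]
    else:
        # Flow goes from cell i+1 to cell i: cells i+2 and i+1 are upstream.
        far_upstream = phi[(i + 2) % n]
        upstream = phi[(i + 1) % n]
        downstream = phi[i % n]
    return 0.75 * upstream + 0.375 * downstream - 0.125 * far_upstream


# A quadratic profile sampled at x = 0, 1, ..., 5.
profile = [float(x * x) for x in range(6)]
face_plus = quick_face_value(profile, 2, velocity=+1.0)
face_minus = quick_face_value(profile, 2, velocity=-1.0)
```

Because the interpolation is a parabola fit, a quadratic profile is reproduced exactly at the face whichever way the flow goes; the upstream bias only matters for less smooth fields.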

Numerics: Performance

[Figure: panels labelled "POG - 0.25 deg" and "ROMS – 0.25 deg"; map label: C. Blanc]

ROMS_AGRIF

• Each domain has its own input/output files
• Grid locations are specified in AGRIF_FixedGrids.in
• Works with OpenMP/MPI
• Forcings and initial conditions are generated with an interactive MATLAB tool: “nesting gui”

Sample grid specification (AGRIF_FixedGrids.in):

    2
    20 45 34 59 3 3 3
    30 55 70 89 3 3 2
    0
    1
    10 30 20 40 5 3 5
    0

The same model (executable) runs on grids with different space/time resolutions
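The grid specification on this slide can be read as a nested hierarchy. The sketch below parses it under an assumed layout that the slide itself does not spell out: each level starts with a count of child grids, followed by one line per child holding `imin imax jmin jmax` and three refinement factors (x, y, time), followed by the same structure for each child in turn. The class `ChildGrid` and the functions `read_children` and `parse_fixed_grids` are hypothetical names, not part of AGRIF.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class ChildGrid:
    # Assumed column meaning: index bounds in the parent grid,
    # then refinement factors in x, y and time.
    imin: int
    imax: int
    jmin: int
    jmax: int
    refine_x: int
    refine_y: int
    refine_t: int
    children: List["ChildGrid"] = field(default_factory=list)


def read_children(lines: Iterator[str]) -> List[ChildGrid]:
    """Read one level: a count, that many grid lines, then each grid's own level."""
    count = int(next(lines))
    grids = [ChildGrid(*map(int, next(lines).split())) for _ in range(count)]
    for grid in grids:
        grid.children = read_children(lines)
    return grids


def parse_fixed_grids(text: str) -> List[ChildGrid]:
    """Parse the whole file text, ignoring blank lines."""
    lines = iter(line.strip() for line in text.splitlines() if line.strip())
    return read_children(lines)


EXAMPLE = """2
20 45 34 59 3 3 3
30 55 70 89 3 3 2
0
1
10 30 20 40 5 3 5
0
"""

top_level = parse_fixed_grids(EXAMPLE)
```

Read this way, the slide's example describes two child grids in the parent domain, the second of which contains one further grid refined by a factor of 5 in x and in time.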

Nymphea

Implementation

Compilation
– Software required: Fortran 95, Unix, C preprocessor, NetCDF library
– Compilation interface in ROMS which defines machine-dependent options (Tru64 UNIX)

Parallelisation
– OpenMP: 1 node of 4 processors
– MPI: for process studies (S. Le Gentil); needs work for realistic applications

Applications
– Realistic: coastal regions of West Africa (Morocco and Senegal), Iroise Sea, Bay of Brest
– Process studies at high resolution

ROMS_AGRIF for West Africa

[Figure: nested-grid maps labelled "W. Africa 25 km" and "Sahara 5 km" (242*252*32 points, dt=720s); map labels: C. Vert, C. Blanc; data labels: Mercator, Levitus, Clipper]

PERFORMANCES: COST

CONFIGURATION
• 2 embedded grids with refinement coef=5
• Size (child grid): 242*252*32 points with dt=720s
• Duration of simulation: 10 model years
• Processors: 1 node of 4 Alpha EV68 processors (1 GHz)
• Parallelization with OpenMP
• Partitioning: 4*8

Cost: c = 6 × 10^-6 CPU seconds / grid point / time step
(Total run time = 15 days)

Comparisons:
• PC Xeon 2.8 GHz: c = 1 × 10^-5
• SGI/CRAY Origin2000: c = 8 × 10^-5
• Earth Simulator (NEC SX): c = 5 × 10^-7
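As a rough cross-check of these figures, the cost per grid point per time step can be turned into an elapsed-time estimate. The sketch below makes several assumptions that are not stated on the slide: only the child grid is counted, a model year has 365 days, and the CPU seconds are shared evenly over the 4 processors. The function name `wall_clock_days` is hypothetical.

```python
SECONDS_PER_DAY = 86400


def wall_clock_days(cost, nx, ny, nz, dt, model_years, n_proc, days_per_year=365):
    """Estimate elapsed days from a cost in CPU seconds / grid point / time step.

    Assumes perfect sharing of the CPU time over n_proc processors and
    ignores any other grid, I/O or queuing time.
    """
    n_points = nx * ny * nz
    n_steps = model_years * days_per_year * SECONDS_PER_DAY / dt
    cpu_seconds = cost * n_points * n_steps
    return cpu_seconds / n_proc / SECONDS_PER_DAY


# Child grid from the slide: 242*252*32 points, dt=720s, 10 model years, 4 processors.
nymphea_days = wall_clock_days(6e-6, 242, 252, 32, 720, 10, 4)
print(f"Estimated elapsed time on Nymphea: {nymphea_days:.1f} days")
```

Under these assumptions the estimate lands close to the 15 days quoted above, which suggests the child grid dominates the cost of the run.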

PERFORMANCES: SCALABILITY

• Nymphea: 95 % for 1-4 proc.
• SGI/CRAY Origin2000: 95 % with saturation above 128 proc.
• Earth Simulator: 95-60 % for 1-512 proc.

[Figure legend: OMP opt. part.; OMP (1 sub/proc); MPI (1 sub/proc)]
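If the percentages on this slide are read as parallel efficiency (speedup divided by the number of processors), which is an assumption since the slide does not define them, the implied speedups follow directly. The function name `implied_speedup` is hypothetical.

```python
def implied_speedup(efficiency, n_proc):
    """Speedup implied by a parallel efficiency, assuming efficiency = speedup / n_proc."""
    return efficiency * n_proc


# Values read off the slide, at the upper end of each processor range.
nymphea_speedup = implied_speedup(0.95, 4)
origin_speedup = implied_speedup(0.95, 128)
earth_simulator_speedup = implied_speedup(0.60, 512)
```

With this reading, 4 processors on Nymphea behave like about 3.8 ideal processors, while 512 processors on the Earth Simulator at 60 % behave like about 307.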

Partitioning

[Figure: Effect of partitioning on Tiki, Pentium III (dual processor). x-axis: domain partition in latitude and longitude (2 X 2, 6 X 6, 8 X 8, 10 X 10, 10 X 20, 20 X 30, 20 X 20); y-axis: seconds/iteration (0 to 30)]

Senegal ideal case on Nymphea (P. Estrade)

• Domain: 150*500*40 with dt=480s
• Partitioning 1*1: Cost = 7.5 × 10^-6
• Partitioning 1*64: Cost = 6 × 10^-6
(units = CPU s / grid point / time step)

25 % gain due to optimal cache use

New Caledonia region on PC (J. Lefêvre)

• Domain: 159*171*20 with dt=480s

100 % gain due to optimal cache use

[Figure axes: NSUB_X, NSUB_E]
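The idea behind the partitioning results can be sketched as follows: the horizontal domain is cut into NSUB_X by NSUB_E tiles, so each tile is small enough to fit more of its working set in cache. This is an illustrative sketch, not the ROMS tiling code. The helper names `tile_bounds` and `partition` are hypothetical, and mapping the "1*64" partitioning to NSUB_X=1, NSUB_E=64 over a 150 by 500 grid is an assumption.

```python
def tile_bounds(n_points, n_tiles):
    """Split range(n_points) into n_tiles contiguous, nearly equal (start, end) chunks."""
    base, extra = divmod(n_points, n_tiles)
    bounds = []
    start = 0
    for t in range(n_tiles):
        size = base + (1 if t < extra else 0)  # the first `extra` tiles get one more point
        bounds.append((start, start + size))
        start += size
    return bounds


def partition(nx, ny, nsub_x, nsub_e):
    """Return a list of ((x_start, x_end), (y_start, y_end)) tiles."""
    return [(xb, yb) for yb in tile_bounds(ny, nsub_e) for xb in tile_bounds(nx, nsub_x)]


# Senegal ideal case: 150*500 horizontal points, assumed NSUB_X=1 and NSUB_E=64.
tiles = partition(150, 500, 1, 64)

# Relative gain implied by the two costs quoted for this case.
gain = 7.5e-6 / 6e-6 - 1.0
```

Each tile here spans the full 150 points in one direction but only 7 or 8 points in the other, and the ratio of the two quoted costs gives the 25 % gain stated for the Senegal case.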

CONCLUSION

ROMS is well optimized (code and methods) and adapted to Nymphea, which allows large runs to be performed in a reasonable time without excessive queuing time

The model is ready for faster, more numerous processors (provided AGRIF is fully tested with MPI)

More storage would be welcome