Running CCSM, Tony Craig, CCSM Software Engineering Group, ccsm@ucar.edu (www.ccsm.ucar.edu)

Post on 17-Dec-2015


www.ccsm.ucar.edu

Running CCSM

Tony Craig
CCSM Software Engineering Group

ccsm@ucar.edu

Outline

• General review of CCSM

• Setting up and running a simple case

• Datasets

• Production

• Modifying source code

• Errors

• Tools

• Performance

Review of CCSM

• Five components / ten models
  – Atmosphere (3) : atm, datm, latm
  – Ocean (2) : ocn, docn
  – Land (2) : lnd, dlnd
  – Ice (2+) : ice, ice (prescribed mode), ice (mixed layer ocean mode), dice
  – Coupler (1) : cpl

• Communication via MPI between components and coupler only

• Each component runs on multiple processors via MPI, OpenMP, MPI/OpenMP

Component parallelization

• atm : MPI, OpenMP, or MPI/OpenMP

• lnd : MPI, OpenMP, or MPI/OpenMP

• ice : MPI only

• ocn : MPI only

• cpl : OpenMP only

• The data models (datm, docn, dice, dlnd, and latm) : serial only, 1 processor

Configurations

• A = datm, dlnd, docn, dice, cpl

• B = atm, lnd, ocn, ice, cpl

• C = datm, dlnd, ocn, dice, cpl

• D = datm, dlnd, docn, ice, cpl

• F = atm, lnd, docn, ice (prescribed mode), cpl

• G = latm, dlnd, ocn, ice, cpl

• H = atm, dlnd, docn, dice, cpl

• I = datm, lnd, docn, dice, cpl

• K = atm, lnd, docn, dice, cpl

• M = latm, dlnd, docn, ice (ml ocn mode), cpl

Resolutions

• atm/lnd/datm/dlnd = T42, T31

• ocn/ice/docn/dice = gx1v3, gx3, gx3v4

• latm = T62

• Scientifically validated combinations
  – B, T42_gx1v3 = b20.007 control run (test.a1 case)
  – B, T31_gx3v4 = paleo control run (test.a2 case)

“Available” configurations

Configurations (columns): A B C D F G H I K M

Resolutions (rows), one * per available configuration:

T42_gx1v3 : * * * * * * * *
T31_gx3 : * * * * * * *
T31_gx3v4 : *
T62_gx1v3 : * *
T62_gx3 : * *

Legend: * = supported (subject to change); separate markers denote the b20.007 control and the paleo control.

Platforms

• IBM

• SGI

• Compaq*

Review of scripts

• Main script (test.a1.run)
  – Sets primary ccsm environment variables
  – Calls $model.setup.csh
    • Gets input datasets
    • Builds components
  – Runs model
  – Archives
  – Harvests

Setting up a simple case

• Use the GUI!
  – The GUI modifies the scripts and creates a new case for you
  – Input $CASE, $CSMROOT, $CSMDATA, $EXEROOT
  – Input resolution
  – Input configuration (A-M)
  – Sets processor layout based on configuration (first guess)
  – Sets some batch environment variables
  – Works well in the NCAR environment; other sites require tuning after the scripts are generated

Setting up a simple case, without GUI

• Create a new case directory under scripts and copy over the test.a1 files

• Rename file test.a1.run to $CASE.run
  – Edit $CASE, $CSMROOT, $CSMDATA, $EXEROOT, $ARCROOT
  – Edit batch environment parameters
  – Edit $GRID
  – Edit $SETUPS
  – Edit $NTASKS, $NTHRDS

$NTASKS, $NTHRDS, batch

• $NTASKS is the total number of MPI tasks for each component

• $NTHRDS is the number of OpenMP threads per MPI task

• $NTASKS*$NTHRDS = total number of processors for each component

• Tuning is required to get optimal load balance

• Batch parameters should match the processors used; consistency is important. task_geometry (LoadLeveler) is very powerful

Component parallelization

• atm : MPI, OpenMP, or MPI/OpenMP

• lnd : MPI, OpenMP, or MPI/OpenMP

• ice : MPI only, NTHRDS=1

• ocn : MPI only, NTHRDS=1

• cpl : OpenMP only, NTASKS=1

• The data models (datm, docn, dice, dlnd, and latm) : serial only, 1 processor, NTASKS=1, NTHRDS=1
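The per-component rules above can be written as a small consistency check. This is a minimal sketch, not part of the CCSM scripts: the function name `layout_errors` and the list-based layout are illustrative assumptions, and only the rules themselves come from the slide.

```python
# Hypothetical checker for the NTASKS/NTHRDS rules listed on the slide.
# Names and data layout are assumptions made for illustration only.
DATA_MODELS = {"datm", "dlnd", "docn", "dice", "latm"}


def layout_errors(setups, ntasks, nthrds):
    """Return a list of rule violations for parallel setups/tasks/threads lists."""
    errors = []
    for name, tasks, threads in zip(setups, ntasks, nthrds):
        if name in DATA_MODELS and (tasks != 1 or threads != 1):
            errors.append(f"{name}: data models are serial (NTASKS=1, NTHRDS=1)")
        elif name in ("ice", "ocn") and threads != 1:
            errors.append(f"{name}: MPI only, so NTHRDS must be 1")
        elif name == "cpl" and tasks != 1:
            errors.append(f"{name}: OpenMP only, so NTASKS must be 1")
    return errors


# A layout with threaded ice and a two-task coupler breaks two rules.
for message in layout_errors(["datm", "dlnd", "ocn", "ice", "cpl"],
                             [1, 1, 64, 16, 2],
                             [1, 1, 1, 2, 4]):
    print(message)
```

A check like this could be run before submitting a job, since an inconsistent layout is otherwise only discovered at run time.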

Main script configuration summary

• B case

MODELS ( atm lnd ocn ice cpl)

SETUPS ( atm lnd ocn ice cpl)

NTASKS ( 8 2 40 8 1)

NTHRDS ( 4 4 1 1 4)

• datm/dlnd/ocn/ice case

MODELS ( atm lnd ocn ice cpl)

SETUPS ( datm dlnd ocn ice cpl)

NTASKS ( 1 1 64 16 1)

NTHRDS ( 1 1 1 1 4)
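As a worked example of the rule that $NTASKS*$NTHRDS gives the processors used by each component, the two layouts above can be totalled as follows. This is only an arithmetic sketch; the helper name `processors` is an assumption and does not appear in the CCSM scripts.

```python
# Processors per component = NTASKS * NTHRDS (order: atm lnd ocn ice cpl).
def processors(ntasks, nthrds):
    """Multiply task and thread counts component by component."""
    return [tasks * threads for tasks, threads in zip(ntasks, nthrds)]


b_case = processors([8, 2, 40, 8, 1], [4, 4, 1, 1, 4])
print(b_case, sum(b_case))        # [32, 8, 40, 8, 4] 92

data_case = processors([1, 1, 64, 16, 1], [1, 1, 1, 1, 4])
print(data_case, sum(data_case))  # [1, 1, 64, 16, 4] 86
```

The totals (92 and 86 processors) are what the batch request would need to match.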

$RUNTYPE

• Startup: initial startup of the model using arbitrary initialization
  – set $CASE, $BASEDATE

• Continue: continuation of a case, bit-for-bit guaranteed, uses model restart files
  – set $CASE

• Branch: start a new case as a bit-for-bit continuation of another case, uses model restart files, requires continuous date
  – set $CASE, $REFCASE, $REFDATE

• Hybrid: start a new case, not a bit-for-bit continuation, uses model initial files in atm and land, can change starting date
  – set $CASE, $BASEDATE, $REFCASE, $REFDATE
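The variables each $RUNTYPE needs can be kept in a lookup table and checked before a run. The sketch below is hypothetical: the table is transcribed from the slide, while the function name `missing_settings`, the dictionary representation of the environment, and the example case names are assumptions.

```python
# Variables that must be set for each run type, as listed on the slide.
REQUIRED = {
    "startup": ["CASE", "BASEDATE"],
    "continue": ["CASE"],
    "branch": ["CASE", "REFCASE", "REFDATE"],
    "hybrid": ["CASE", "BASEDATE", "REFCASE", "REFDATE"],
}


def missing_settings(runtype, env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED[runtype] if not env.get(name)]


# Hypothetical branch case that forgot its reference case and date.
print(missing_settings("branch", {"CASE": "mycase"}))  # ['REFCASE', 'REFDATE']
```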

Coupler namelist

• Stop_option : ndays, nmonths, newmonth, halfyear, newyear, newdecade

• Stop_n : integer (ndays, nmonths)

• Rest_freq : ndays, monthly, quarterly, halfyear, yearly

• Rest_n : integer (ndays)

• Diag_freq : daily, weekly, biweekly, monthly, quarterly, yearly, ndays

• Diag_n : integer (ndays)

• info_bcheck : integer
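The option lists above lend themselves to a simple validity check before a run is submitted. This is a sketch under stated assumptions: the allowed values are copied from the slide, but the lowercase key spelling, the dictionary input, and the function `check_namelist` are illustrative and are not the coupler's own input handling.

```python
# Allowed values per option, transcribed from the slide (keys lowercased
# here as an assumption about spelling).
ALLOWED = {
    "stop_option": {"ndays", "nmonths", "newmonth", "halfyear", "newyear", "newdecade"},
    "rest_freq": {"ndays", "monthly", "quarterly", "halfyear", "yearly"},
    "diag_freq": {"daily", "weekly", "biweekly", "monthly", "quarterly", "yearly", "ndays"},
}

# Option values that also need an integer count, and the name of that count.
NEEDS_COUNT = {
    "stop_option": ("stop_n", {"ndays", "nmonths"}),
    "rest_freq": ("rest_n", {"ndays"}),
    "diag_freq": ("diag_n", {"ndays"}),
}


def check_namelist(settings):
    """Return a list of problems found in a dict of coupler settings."""
    problems = []
    for key, allowed in ALLOWED.items():
        value = settings.get(key)
        if value not in allowed:
            problems.append(f"{key}={value!r} is not one of {sorted(allowed)}")
            continue
        count_key, count_values = NEEDS_COUNT[key]
        if value in count_values and not isinstance(settings.get(count_key), int):
            problems.append(f"{key}={value!r} needs an integer {count_key}")
    return problems


# Hypothetical settings: stop after 12 months, yearly restarts, monthly diagnostics.
print(check_namelist({"stop_option": "nmonths", "stop_n": 12,
                      "rest_freq": "yearly", "diag_freq": "monthly"}))  # []
```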

Data Sets

• Types
  – Grid files, binary
  – Namelist input, ascii
  – Initial datasets, binary/netcdf
  – Restart datasets, binary
  – History datasets, netcdf
  – Log files, ascii

• inputdata directory
  – This is usually pointed to by $CSMDATA

Data Flow, Input

• Everything is copied to $EXEROOT

• Tools and scripts attempt to automate most of the “get input files”

• Main script variables include $CSMDATA, $LFSINP, $LMSINP, $MACINP, $RFSINP, $RMSINP

[Diagram: the setup scripts copy input from the Mass Store, $ARCROOT/restart, $CSMDATA (inputdata), and scripts/$CASE into $EXEROOT]

Data Flow, Output

• Output files are moved out of $EXEROOT

• Harvesting is a separate process

• Writing of restart files is coordinated by the coupler

• Writing of history files is not coordinated between components; monthly average is the default

• Main script variables include $LMSOUT, $MACOUT, $RFSOUT

[Diagram: scripts move output from $EXEROOT to $ARCROOT (archiving) and from $ARCROOT to the Mass Store (harvesting)]

Log Files

• Each component produces a log file, $model.log.$LID

• $LID is a system date stamp

• Date stamps are the same on all log files for a run

• Log files are written into the $EXEROOT/$model directories during execution

• Log files are copied to $SCRIPTS/logs at the end of a run

• Separate stdout and stderr files sometimes contain additional output

Archiving, ccsm_archive

• Archiving means moving model output to a separate area on a local disk; it is done by ccsm_archive

• Local disk area is set by $ARCROOT in the main script

• Benefits
  – Allows separation of running and harvesting
  – Mass storage availability does not prevent continued execution of the model
  – Allows users to run in volatile temporary space
  – Supports simple harvesting in a clustered machine environment (like nirvana)

Harvesting, $CASE.har

• Harvesting means copying model output to the local mass store

• Separate script in scripts/$CASE, $CASE.har

• Typically submitted in batch, can also be run interactively

• Submitted by the main script after the model run, off by default

• Sources ccsm_joe for important environment variables

• Harvests all files in $ARCROOT/{atm,lnd,ocn,ice,cpl}

• Verifies an accurate copy on the mass store before removing files

• Can scp files to remote machines
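The "verify before removing" step can be illustrated with a short sketch. It is not the $CASE.har script: an ordinary directory stands in for the mass store, a SHA-256 comparison is a swapped-in verification method (the slides do not say how the copy is verified), and the function names are assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Checksum of a file's contents (assumed verification method)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def harvest(arcroot, store, components=("atm", "lnd", "ocn", "ice", "cpl")):
    """Copy files from arcroot/<component> to store/<component>, verify, then delete."""
    harvested = []
    for comp in components:
        src_dir = Path(arcroot) / comp
        if not src_dir.is_dir():
            continue
        dst_dir = Path(store) / comp
        dst_dir.mkdir(parents=True, exist_ok=True)
        for src in sorted(p for p in src_dir.iterdir() if p.is_file()):
            dst = dst_dir / src.name
            shutil.copy2(src, dst)
            # Remove the archived file only after the copy is confirmed identical.
            if sha256(src) == sha256(dst):
                src.unlink()
                harvested.append(str(dst))
    return harvested
```

Files whose copies do not verify are left in place, so a later harvest can retry them.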

Exact Restart

• CCSM can stop and restart exactly

• The coupler controls the frequency of restart file writes

• Restart files guarantee bit-for-bit continuity at a checkpoint boundary

• rpointer files are updated in the scripts/$CASE directory after each run

Restart file management (1)

• ccsm_archive
  – In scripts/$CASE
  – Called from the main script after the model run is complete, commented out by default
  – $ARCROOT/restart contains the latest full set of restart files
  – ccsm_archive copies the full set of restart datasets into $ARCROOT/restart after each run
  – ccsm_archive then tars up that restart set into the $ARCROOT/restart.tars directory
  – These tar files can be large; regular cleanup is required

Restart file management (2)

• ccsm_getrestart
  – In scripts/tools
  – Called from the main script before the model run starts, commented out by default
  – Copies the latest set of restart files from $ARCROOT/restart to the appropriate directories

• To “back up” a model run to a previous model date
  – Assumes both ccsm_archive and ccsm_getrestart have been active in the main script
  – Delete all files in $ARCROOT/restart
  – Untar an $ARCROOT/restart.tars file into $ARCROOT/restart
  – Resubmit
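The back-up steps above (empty $ARCROOT/restart, then untar an older restart set into it) might be sketched as follows. This is a hypothetical helper, not a CCSM tool: the function name, the tar file name passed in, and the assumption that the tar members are stored relative to the restart directory are all illustrative.

```python
import tarfile
from pathlib import Path


def back_up_restart(arcroot, tar_name):
    """Replace the contents of arcroot/restart with the set stored in a restart tar."""
    restart_dir = Path(arcroot) / "restart"
    tar_path = Path(arcroot) / "restart.tars" / tar_name
    # Check the tar exists before deleting anything.
    if not tar_path.is_file():
        raise FileNotFoundError(tar_path)
    for old in restart_dir.iterdir():
        if old.is_file():
            old.unlink()
    with tarfile.open(tar_path) as archive:
        archive.extractall(restart_dir)
    return sorted(p.name for p in restart_dir.iterdir())
```

Checking for the tar file first avoids ending up with an empty restart directory if the chosen file name is wrong. The job would then be resubmitted as usual.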

Auto-Resubmit

• RESUBMIT file in scripts/$CASE directory
  – Contains a single integer
  – If the integer is >0, the main script resubmits itself and decrements the integer

• Runaway jobs
  – FIRST! Set the value in the RESUBMIT file to 0
  – Attempt to kill running jobs

Production

• Modify the coupler namelist in cpl.setup.csh: set run length and restart frequency, turn down diagnostic frequency, set info_bcheck to 0

• Run a startup, hybrid, or branch $RUNTYPE case

• Transition to the continue $RUNTYPE

• Turn on archiving, harvesting, and ccsm_getrestart

• Edit the RESUBMIT file to initiate auto-resubmission

Monitoring a run

• Monitor the batch jobs using llq, bjobs, qstat

• Verify that runs complete successfully; check for timing information at the end of a log file

• tail -f $EXEROOT/cpl/cpl.log*

• If runs are not succeeding,
  – tail each log file
  – grep for ENDRUN in atm and lnd log files
  – Check stdout and stderr files for component messages or system messages
  – Look for core files in $EXEROOT/$model
  – Look for zero length files in $EXEROOT/$model
  – Check email
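Several of the checks above can be automated for one $EXEROOT/$model directory. The following is a sketch only: the function name and report layout are assumptions, log files are matched by the $model.log.$LID pattern from the Log Files slide, and treating any file whose name starts with "core" as a core file is also an assumption.

```python
from pathlib import Path


def scan_component_dir(model_dir):
    """Report logs containing ENDRUN, core files, and zero-length files."""
    report = {"endrun": [], "core": [], "empty": []}
    for path in sorted(Path(model_dir).iterdir()):
        if not path.is_file():
            continue
        if path.stat().st_size == 0:
            report["empty"].append(path.name)
        elif path.name.startswith("core"):
            report["core"].append(path.name)
        elif ".log." in path.name and "ENDRUN" in path.read_text(errors="replace"):
            report["endrun"].append(path.name)
    return report
```

Running it over each component directory gives a quick first look before reading the individual log files by hand.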

Modifying source code

• Modifying files in the ccsm models directory is not recommended

• Create directories under scripts/$CASE
  – src.atm, src.lnd, src.ocn, src.ice, src.cpl
  – Copy a subset of model source code to these directories and modify it
  – Has highest priority with respect to build

• Benefits include
  – Release source code remains unmodified and available
  – Allows implementation of case dependent code modifications
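The "highest priority" rule means a file in scripts/$CASE/src.$model shadows the release file of the same name. A minimal sketch of that lookup order follows; it is not the CCSM build system, and the function name and the `release_dirs` argument are assumptions.

```python
from pathlib import Path


def resolve_source(filename, case_dir, component, release_dirs):
    """Return the first match, searching the case src directory before release code."""
    search_path = [Path(case_dir) / f"src.{component}"]
    search_path += [Path(d) for d in release_dirs]
    for directory in search_path:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

Because the release directories are searched last, deleting the copy under src.$model restores the released version of that file.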

Multiple Machine Support

• Should run on blackforest, babyblue, and ute “out of the box”

• “Other” machines include seaborg, nirvana, eagle, falcon, cheetah

• Supported platforms are indicated in $OS, $SITE, $MACH, $ARCH environment variables in the main script

• See also scripts/tools/test.a1.mods.$MACH for suggested changes to test.a1.run for “other” machines.

Running on a “New” Machine

• Main script
  – Set batch queue commands
  – Add new $OS, $SITE, $MACH, $ARCH options
  – Set standard CCSM path names, $CSMROOT, …
  – Harvester submission issues
  – Set data movement variables, $LMSINP, …

• Harvester script
  – May require modification

• Tools
  – May need to modify ccsm_msread, ccsm_mswrite

• Build
  – Modify models/bld/Macros.$OS file

ccsm_joe

• Created by main script

• Updated every time the main script runs

• Case dependent

• Records important ccsm environment variables

• Can be “sourced” by other scripts to inherit ccsm environment variables

Interactive/Batch Issues

• Can run the main script interactively

• Typically used to build and pre-stage initial data

• Uncomment the “exit” command in the main script to stop the script before it starts ccsm execution

• Batch environment is highly site dependent
  – NQS
  – LoadLeveler
  – LSF
  – PBS

Common Errors (1)

• Model won’t build
  – Try rebuilding clean
  – Remove all obj directories; these are $OBJROOT/model/obj, which is normally equivalent to $EXEROOT/model/obj
  – When rebuilding, make sure $SETBLD is true in the main script

• Model won’t continue due to a restart problem
  – Determine the cause of the problem: quota, hardware, script, zero length files, rpointer problems
  – Fix if possible
  – Back up to the latest “good” restart dataset
  – Rerun

Common Errors (2)

• Ice model stops due to mp transport error
  – Double ndte in the ice.setup.csh ice model namelist
  – Back up to the latest “good” restart dataset
  – Run past the previous stop date
  – Reset the ndte value

• Ocean model non-convergence
  – Add about 10% to the number of model timesteps/hour in ocn.setup.csh, DT_COUNT
  – Back up to the latest “good” restart dataset
  – Run past the previous stop date
  – Reset DT_COUNT
  – Non-convergence on the first timestep is a special case
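The two temporary adjustments above are simple arithmetic, sketched here so the "about 10%" step is unambiguous. The function names and the starting values in the example are hypothetical and are not defaults from ice.setup.csh or ocn.setup.csh.

```python
def raised_dt_count(dt_count, percent=10):
    """Add about `percent` percent to DT_COUNT, always by at least one step."""
    return dt_count + max(1, round(dt_count * percent / 100))


def doubled_ndte(ndte):
    """Temporary ndte value for getting past an ice mp transport error."""
    return 2 * ndte


# Hypothetical starting values, for illustration only.
print(raised_dt_count(20))  # 22
print(doubled_ndte(120))    # 240
```

In both cases the original value is restored once the run has passed the previous stop date.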

Tools

• Under scripts/tools
  – ccsm_getfile : hierarchical search for file
  – ccsm_getinput : hierarchical search for input file
  – ccsm_msread : copies a file from local mass store
  – ccsm_mswrite : copies a file to local mass store
  – ccsm_checkenvs : echoes ccsm environment variables, used to create ccsm_joe
  – ccsm_getrestart : copies restart files from $ARCROOT/restart to the appropriate $EXEROOT and scripts/$CASE directories

Performance

• This is complicated!

• Issues
  – Performance of components and system as a function of resolution and configuration
  – Scalability of individual components, scaling efficiency of individual components
  – Task/thread counts
  – Components sharing nodes, overloading nodes with multiple components, overloading threads, overloading tasks
  – Load balance of coupled system

Component Timings

[Figure: seconds per simulated day (axis 0 to 300) versus number of processors (4, 8, 16, 32, 64) for atm, lnd, ice, and ocn]

CCSM Load Balancing

Processors: 40 ocean, 32 atm, 16 ice, 12 land, 4 cpl (104 total)

[Figure: component timings in seconds per day for this processor layout; values shown: 9.4, 3.0, 6.2, 15.0, 8.6, 40.4, 53.2, 10.0, 10.0, 55, 3, 2, 5]

Component/Hardware layout

• Machine: set of nodes

• Nodes: groups of processors that share memory

• Processors: individual computing elements

• General rules
  – Do not oversubscribe processors; place only 1 MPI task or 1 thread on each processor
  – Minimize the number of nodes used for a given component and processor requirement
  – Multiple components can share a node as long as there is no oversubscription of processors
  – Test several decompositions, layouts, and task/thread combinations to try to optimize performance

Summary

• CCSM is a complicated multi-executable climate model; expect there to be “spin-up” time

• CCSM is a scientific research code

• There are many possible components, configurations, platforms, and resolutions; we are unable to test everything

• Users are responsible for validating their science

• NCAR can help with software/configuration problems, ccsm@ucar.edu

• Please report bugs, fixes, improvements, and ports to new hardware so we can incorporate those changes: ccsm@ucar.edu