Universität Karlsruhe (TH), Rechenzentrum
How to use the System
SSCK Workshop – Introduction to HP XC6000 Cluster, Karlsruhe, March 9 – 11, 2005
Hartmut Häfner, SSCK
Universität Karlsruhe (TH)
haefner@rz.uni-karlsruhe.de
SSCK Workshop, Karlsruhe, March 9, 2005
Interactive Login
Available Services (1/2)
[Diagram: access from outside to the XC1 passes the HWW firewall; only ssh (scp) and passive ftp are let through.]
» No print manager
» No exported file system
Available Services (2/2)
» Login to the HP XC6000 Cluster
ssh <user-id>@hwwxc1.hww.de
» or, from within the University of Karlsruhe,
ssh <user-id>@xc1.rz.uni-karlsruhe.de
» SSH2 from RZ-administrated workstations
ssh2 -p 22 <user-id>@hwwxc1.hww.de
File Systems (1/2)
[Diagram: cluster topology. The nodes are connected by the Quadrics QsNet II interconnect (single rail); each node has its own local $TMP file system; the global $HOME and $WORK file systems (10 TB) are attached via an FC network.]
File Systems (2/2)
environment variable  global/local  permanent/temporary  quotas             backup
$HOME                 global        permanent            no, but monitored  yes
$WORK                 global        one week             no                 no
$TMP                  local         temporary            no                 no
» global – all nodes access the parallel file system HP SFS, which is based on Lustre
» local – each node has its own file system
» permanent – files are stored permanently
» temporary – files are removed at the end of the job or session
Moving Files between HP XC and Workstations
» Either by the command scp or by passive ftp
scp <user-id>@ws.institute.uni-karlsruhe.de:mydata $HOME
ftp ws.institute.uni-karlsruhe.de
Module Concept
» module is the user interface to the Modules package.
» Typically, modulefiles instruct the module command to set or alter environment variables such as PATH, MANPATH, etc.
» Syntax is: module [switches] [sub-command] [modulefile…|path…|directory…]
» Important switches are:
– --force, -f Force active dependency resolution. Modules named in a prereq command inside a modulefile are then loaded automatically.
– --verbose, -v Enable verbose messages during module command execution.
Further switches control the amount of output of the module command.
Modules (1/2)
» module help [modulefile...] – Print the usage of each subcommand. If an argument is given, print the module-specific help information for the modulefile.
» module add|load modulefile [modulefile...] – Load modulefile into the shell environment.
» module unload|rm modulefile [modulefile...] – Remove modulefile from the shell environment.
» module switch|swap modulefile1 modulefile2 – Replace the loaded modulefile1 with modulefile2.
» module display|show modulefile [modulefile...] – Display information about the modulefile.
» module list – List loaded modules.
» module avail [path...] – List all available modulefiles in the current MODULEPATH.
» module purge – Unload all loaded modulefiles.
Further subcommands add directories to MODULEPATH and add|remove modulefiles to|from the shell-dependent startup files.
Modules (2/2)
Modulefile           Description
dot                  adds the current directory to your environment variable PATH
intel-compilers/7.1  loads Intel Fortran and C/C++ compilers in version 7.1
intel-compilers      loads Intel Fortran and C/C++ compilers in version 8.1
nag-compilers        loads NAG Fortran90/95 compiler in version 4.2
hp-mpi               loads HP MPI in version 2.0
intel-debuggers      loads Intel debugger in version 8.0
ddt                  loads graphical Streamline debugger DDT in version 1.8
mkl                  loads Intel MKL library in version 7.2
mlib/7.1             loads HP MLIB library in version 7.1
mlib                 loads HP MLIB library in version 8.0
naglib               loads NAG Fortran library in version 8.0
Modulefiles – containing modifications to the environment
» A modulefile is a file containing Tcl code plus extensions for the Modules package.
» A modulefile contains the changes to a user's environment that are needed to access an application.
» Modulefiles can also be used to implement site policies regarding the access and use of applications.
» Modulefiles also hide the notion of different types of shells. From the modulefile writer's perspective, one set of information takes care of every type of shell.
» Change your default module environment by inserting module add <modulefile> into the setup file .bash_profile.
» Add your own modulefiles by extending the $MODULEPATH environment variable.
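As an illustration, a modulefile is plain Tcl; a minimal sketch for a hypothetical application (the name and paths are invented, not an actual modulefile on the cluster):

```tcl
#%Module1.0
## Hypothetical modulefile for an application "myapp"
proc ModulesHelp { } {
    puts stderr "Adds myapp to PATH and MANPATH"
}
# make the binaries and man pages visible
prepend-path PATH    /opt/myapp/bin
prepend-path MANPATH /opt/myapp/man
# convenience variable pointing at the application root
setenv MYAPP_HOME /opt/myapp
```

Placed in a directory listed in $MODULEPATH, it becomes loadable with module add myapp.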
Compilers (1/4)
» Fortran: 2 Intel compilers (ifort in V8.1 and efc in V7.1), NAG compiler (f95), GNU compiler (g77, Fortran77 only)
» C/C++: 2 Intel compilers (icc in V8.1 and ecc in V7.1), GNU compiler (gcc)
» General options: -c, -I<path>, -g, -O{0,1,2,3}, -L<path>, -l<library>, -o <name>
» NAG Fortran compiler – best choice to check the Fortran90/95 conformity of your program
» Important specific options of the NAG Fortran compiler
– -Ounsafe performs possibly unsafe optimizations
– -dusty allows the compilation of "legacy" software (errors become warnings)
– -ieee=full|nonstd|stop controls the IEEE arithmetic facilities
– -C compiles code with all possible runtime checks
– -mtrace traces memory allocation and deallocation
– -gline compiles code to generate a traceback in case of runtime errors
– -gc enables automatic garbage collection in the executable
– -thread_safe compiles code for safe execution in a multi-threaded environment
– -static prevents linking with shared libraries
Compilers (2/4)
» Intel Fortran suffix names

Command  File name suffix        Source format    Language level
ifort    .f .ftn .for .i         -fixed -72       -nostand
ifort    .F .FTN .FOR .fpp .FPP  -fixed -72 -fpp  -nostand
ifort    .f90 .i90               -free            -nostand
ifort    .F90                    -free -fpp       -nostand

» NAG Fortran suffix names

Command  File name suffix  Source format  Language level
f95      .f .ftn .for      -fixed         Fortran95 standard; -strict95 for strict Fortran95 code; -dusty for "legacy" code
f95      .F                -fixed -fpp
f95      .f90 .f95         -free
f95      .F90 .F95         -free -fpp
Compilers (3/4)
» Change the compiler by a simple module command (by default the Intel compiler in version 8.1 is used):
module add|load intel-compilers/7.1
» Using different compilers
– don't use explicit compiler names
– use the $FC environment variable for the Fortran compiler
– use the $CC environment variable for the C/C++ compiler
Compilers (4/4)
» Compiling Fortran90/95 source code with the Intel compiler
ifort -c -O3 my_prog.f90
» Compiling Fortran90/95 source code with an arbitrary Fortran compiler
$FC -c -O3 my_prog.f90
» Compiling C source code with the Intel compiler
icc -c -O3 my_prog.c
» Compiling C++ source code with an arbitrary C/C++ compiler
$CC -c -O3 my_prog.C
Linking
» Special compiler scripts (compile and) link MPI programs (these scripts do not work with the GNU compilers):
mpicc – (compile and) link C programs
mpicc.mpich – (compile and) link C programs in MPICH compatibility mode
mpiCC – (compile and) link C++ programs
mpiCC.mpich – (compile and) link C++ programs in MPICH compatibility mode
mpif77 or mpif90 – (compile and) link Fortran programs
If MPICH compatibility mode is required, call mpif77.mpich or mpif90.mpich.
» Example: linking Fortran90/95 object code compiled with the Intel compiler
mpif90 -o my_prog my_prog.o sub1.o sub2.o
Benchmarks – Measurements on Itanium2 (1.5 GHz) on the HP XC6000 Cluster
All rates in MFLOP/s; for each vector length the minimum and maximum measured rate is given (Min/Max).

Vector length  Addition   Mult.      Division  Linked Triad  Vector Triad  Dot Product
1              51/52      51/52      26/26     99/100        106/107       106/107
10             177/182    177/178    131/131   284/366       351/352       647/650
100            734/742    734/739    254/256   1031/1452     1061/1237     1316/1325
10^3           1444/1458  1118/1454  280/288   2906/2928     2032/2044     1470/1482
10^4           1201/1486  1119/1420  281/285   2396/2891     1514/1750     1490/1496
10^5           1030/1046  1028/1042  286/288   2135/2147     1610/1629     1409/1413
10^6           156/161    150/160    145/154   299/318       252/262       766/777
10^7           163/168    164/166    157/159   329/333       271/274       766/776
Peak           3000       3000       -----     6000          6000          6000
Max. L2-c.     2000       2000       -----     4000          3000          6000
Max. L3-c.     2000       2000       -----     4000          3000          6000
Max. mem       267        267        -----     534           400           800
η_a,L2         0.49       0.48       -----     0.49          0.34          0.25
η_a,L3         0.35       0.35       -----     0.36          0.27          0.24
η_a,mem        0.06       0.06       -----     0.06          0.05          0.13
What is remarkable?
The dot product runs very slowly!
The scattering of the performance rates when the data are stored in the L2 cache is very high (up to 40 percent!).
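The efficiency rows of the table can be reproduced from the measured rates. A minimal sketch (assuming η is the best measured rate in the respective memory level divided by the peak rate; values taken from the addition column):

```python
# Reproducing the efficiency factors for the addition column
# (assumption: eta = best measured rate / peak rate, rates in MFLOP/s).
peak = 3000.0       # peak addition rate of the Itanium2 (1.5 GHz)
best_l2 = 1458.0    # best rate with data in the L2 cache (vector length 10^3)
best_l3 = 1046.0    # best rate with data in the L3 cache (vector length 10^5)
best_mem = 168.0    # best rate with data in main memory (vector length 10^7)

eta_l2 = best_l2 / peak    # ~0.49, as in the table
eta_l3 = best_l3 / peak    # ~0.35
eta_mem = best_mem / peak  # ~0.06
```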
Benchmarks – Ping Pong within a node
Neighbor send/receive speed test
--- Multiple simple Ping/Pong ---
Clock overhead is 0.1736E-07 secs per snd/rcv.

  bytes      ms      MB/s
      0   0.001     0.000
      4   0.001     4.590
      8   0.001     7.875
     16   0.001    15.526
     32   0.001    34.528
     64   0.001    73.807
    128   0.001   127.790
    256   0.001   209.114
    512   0.001   436.936
   1024   0.002   674.397
   2048   0.007   308.211
   4096   0.007   550.674
   8192   0.010   834.013
  16384   0.014  1181.921
  32768   0.022  1507.639
  65536   0.036  1835.203
 131072   0.071  1854.967
 262144   0.126  2074.492
 524288   0.254  2060.727
1048576   0.502  2089.745
Neighbor send/receive speed test
--- Multiple double Ping/Pong ---
Clock overhead is 0.2670E-08 secs per snd/rcv.

  bytes      ms      MB/s
      0   0.003     0.000
      4   0.004     1.131
      8   0.003     2.381
     16   0.003     4.744
     32   0.004     8.936
     64   0.003    19.438
    128   0.003    37.425
    256   0.004    65.514
    512   0.004   134.188
   1024   0.004   253.168
   2048   0.006   343.425
   4096   0.008   541.139
   8192   0.011   729.931
  16384   0.018   914.383
  32768   0.033  1002.130
  65536   0.064  1021.981
 131072   0.124  1055.018
 262144   0.233  1127.460
 524288   0.486  1078.485
1048576   0.911  1151.049
Benchmarks – Ping Pong between nodes
Neighbor send/receive speed test
--- Multiple simple Ping/Pong ---
Clock overhead is 0.1736E-07 secs per snd/rcv.

  bytes      ms     MB/s
      0   0.003    0.000
      4   0.003    1.441
      8   0.003    2.905
     16   0.003    5.828
     32   0.003   11.605
     64   0.004   16.514
    128   0.004   30.021
    256   0.006   45.949
    512   0.006   87.778
   1024   0.006  161.227
   2048   0.008  271.353
   4096   0.010  408.196
   8192   0.015  546.295
  16384   0.025  659.058
  32768   0.045  735.468
  65536   0.084  781.339
 131072   0.164  797.490
 262144   0.320  818.153
 524288   0.660  794.346
1048576   1.266  828.447
Neighbor send/receive speed test
--- Multiple double Ping/Pong ---
Clock overhead is 0.2666E-08 secs per snd/rcv.

  bytes      ms     MB/s
      0   0.009    0.000
      4   0.009    0.443
      8   0.009    0.899
     16   0.009    1.739
     32   0.009    3.497
     64   0.010    6.495
    128   0.010   12.508
    256   0.012   22.125
    512   0.012   43.344
   1024   0.013   80.759
   2048   0.016  129.800
   4096   0.021  197.767
   8192   0.031  267.897
  16384   0.050  329.511
  32768   0.089  367.656
  65536   0.172  381.144
 131072   0.334  392.161
 262144   0.673  389.346
 524288   1.313  399.440
1048576   2.816  372.366
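The MB/s column of these ping-pong tables is consistent with simply dividing the message size by the measured transfer time (with 1 MB = 10^6 bytes); a small sketch:

```python
def bandwidth_mb_s(nbytes, ms):
    """Bandwidth in MB/s (1 MB = 10^6 bytes) for nbytes sent in ms milliseconds."""
    return nbytes / (ms * 1000.0)

# 1 MB messages, times taken from the tables above:
bw_intra = bandwidth_mb_s(1048576, 0.502)  # within a node: ~2089 MB/s
bw_inter = bandwidth_mb_s(1048576, 1.266)  # between nodes: ~828 MB/s
```

For large messages the shared-memory path within a node is thus roughly 2.5 times faster than the interconnect between nodes.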
Benchmarks – Overlap for short messages between nodes
Neighbor send/receive overlap test – Short messages
(all times in seconds, ol_fac in percent)

Message length and vector length during computation: 10

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
1        103548        12267030      1.03    1.03    1.90   0.16  15.6
3        103548        12267030      1.02    3.08    3.94   0.17  16.6

Message length and vector length during computation: 100

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
1        69722         5725738       1.03    1.03    1.72   0.34  32.9
3        69722         5725738       1.03    3.09    3.78   0.35  33.8

Message length and vector length during computation: 1000

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
1        28496         979641        1.03    1.03    1.42   0.64  62.1
3        28496         979641        1.03    3.09    3.49   0.63  61.4
Benchmarks – Overlap for long messages between nodes
Neighbor send/receive overlap test
--- Long messages ---
(all times in seconds, ol_fac in percent)

Message length and vector length during computation: 10000

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
1        4670          68699         1.03    1.03    1.13   0.92  89.7
3        4670          68699         1.03    3.00    3.23   0.80  78.0

Message length and vector length during computation: 100000

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
1        503           6101          1.06    1.05    1.13   0.98  92.2
3        503           6101          1.06    3.13    3.19   1.00  94.0

Message length and vector length during computation: 1000000

Bal_fac  Rep_fac_comm  Rep_fac_comp  T_comm  T_comp  T_all  T_ol  ol_fac
1        49            101           1.05    1.06    1.30   0.82  78.0
3        49            101           1.05    3.18    3.35   0.88  83.7
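The overlap columns of these tables follow a simple relation: the hidden communication time is T_ol = T_comm + T_comp - T_all, and ol_fac expresses it as a percentage of T_comm. A sketch, using the rounded values printed for message length 10000, Bal_fac 1 (small deviations come from the rounding in the tables):

```python
def overlap(t_comm, t_comp, t_all):
    """Return the overlapped time and the overlap factor in percent of T_comm."""
    t_ol = t_comm + t_comp - t_all  # communication time hidden behind computation
    return t_ol, 100.0 * t_ol / t_comm

# message length 10000, Bal_fac 1: T_comm=1.03, T_comp=1.03, T_all=1.13
t_ol, ol_fac = overlap(1.03, 1.03, 1.13)  # ~0.92 s hidden, ~90 %
```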
Debugging with DDT
» Commands
module add ddt
ddt hello
HP MPI – Execution of Parallel Programs
» The syntax to start a parallel application interactively is
mpirun [mpirun_options] <program>
or
mpirun [mpirun_options] -f <appfile>
mpirun Options        Brief Explanation
-n # or -np #         MPI job is run on # processors (option is ignored in batch mode)
-m block or -m cycle  MPI processes will be mapped blockwise or cyclically to the processors
-T                    prints user and system time for each MPI rank
-1sided               enables one-sided communication
-i <fname:[options]>  enables runtime instrumentation profiling for all processes
-stdio=<options>      specifies standard I/O options (refer to the HP MPI User's Guide)
-mpich                runs the application in MPICH compatibility mode
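For the -f form, the appfile lists one host/executable group per line; a minimal sketch (hostnames and process counts are made up for illustration):

```
# appfile: 4 processes on each of two nodes, same executable
-h node001 -np 4 ./my_prog
-h node002 -np 4 ./my_prog
```

mpirun -f appfile then starts all eight processes as a single MPI job.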
HP MPI – Environment Variables
» Many environment variables control HP MPI, for example:

HP MPI Env. Variables  Brief Explanation
MPI_FLAGS              modifies the general behaviour of MPI, e.g.
                       – l reports memory leaks caused by not freeing memory
                       – f forces MPI errors to be fatal, ignoring the programmer's choice of error handlers
                       – z enables zero-buffering mode (MPI_SEND and MPI_RSEND behave like MPI_SSEND)
MPI_INSTR              enables counter instrumentation for profiling HP MPI applications
MPIRUN_OPTIONS         sets mpirun options
.....
Numerical Libraries
» HP XC Mathematical LIBrary (MLIB)
» Intel Mathematical Kernel Library (MKL)
» NAG Libraries (non-commercial users)
» LINear SOLver package (LINSOL)
Well Established Open Source Libraries
» BLAS
– BLAS{1,2,3} included in HP XC MLIB and Intel MKL
» LAPACK
– included in HP XC MLIB and Intel MKL
• contains many functions for the solution of linear systems and eigenvalue problems for dense and banded matrices
» ScaLAPACK
– included in HP XC MLIB
• contains the above-mentioned functions for parallel computers
» Metis
– included in HP XC MLIB
• a special implementation of the graph partitioning and matrix reordering library
HP XC MLIB (1/2)
» Functions from several areas: linear equations, least squares, eigenvalue problems, singular value decomposition, vector and matrix computations, convolutions and Fourier Transforms
» Four components: VECLIB, LAPACK, ScaLAPACK and SuperLU_DIST
» VECLIB includes all BLAS{1,2,3} and sparse BLAS subroutines, sparse linear equation solvers, sparse eigenvalue and eigenvector solvers, FFTs, correlation and convolution subprograms, random number generators and METIS V4.0.1
» Load before use:
module add hp-mlib/7.1 for Intel compiler V7.1 and
module add hp-mlib for Intel compiler V8.1
HP XC MLIB (2/2)
» Appropriate options at link time:
– VECLIB
$FC -L$MLIBPATH -lveclib -openmp -o myprog myprog.f90
– LAPACK
$FC -L$MLIBPATH -llapack -openmp -o myprog myprog.f90
– ScaLAPACK
mpif90 -L$MLIBPATH -lscalapack -openmp -o myprog myprog.f90
– SuperLU_DIST
mpif90 -L$MLIBPATH -lsuperlu_dist -openmp -o myprog myprog.f90
» More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-mlib
Intel MKL (1/2)
» Many components:
– BLAS,
– Sparse BLAS,
– LAPACK,
– direct sparse solver PARDISO,
– Vector Mathematical Library (VML) for core mathematical functions on vector arguments,
– Vector Statistical Library (VSL) for generating vectors of pseudorandom numbers,
– general Discrete Fourier Transform functions (DFT) and
– a subset of FFTs
» Load before use: module add mkl
Intel MKL (2/2)
» Appropriate options at link time:
– BLAS, FFT, VML, VSL etc.
$FC -L$MKLPATH -lmkl_ipf -lguide -lpthread -o myprog myprog.f90
– LAPACK
$FC -L$MKLPATH -lmkl_lapack -lmkl_ipf -lguide -lpthread -o myprog myprog.f90
– PARDISO
mpif90 -L$MKLPATH -lmkl_solver -lmkl_ipf -lguide -lpthread -o myprog myprog.f90
» More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-mkl
NAG Libraries
» NAG Fortran, NAG Fortran90 and NAG C libraries are available only for non-commercial customers
» Load before use:
module add naglib/7.1 and module add mkl/7.1 for Intel compiler V7.1;
module add naglib and module add mkl for Intel compiler V8.1
» Appropriate options at compile and link time:
– NAG Fortran Library
$FC myprog.f90 -I$NAGLIBPATH/interface_blocks -L$NAGLIBPATH \
  -lnag-mkl -L$MKLPATH -lmkl_lapack -lmkl_ipf -lguide -lpthread
– NAG Fortran90 Library
$FC myprog.f90 -I$NAGLIBPATH/nag_mod_dir -L$NAGLIBPATH \
  -lnagfl90-noblas -L$MKLPATH -lmkl_lapack -lmkl_ipf -lguide -lpthread
– NAG C Library
$CC myprog.c -I$NAGLIBPATH/include -L$NAGLIBPATH/nagc
» More details: http://www.rz.uni-karlsruhe.de/ssc/hpxc-nag
LINSOL
» LINSOL is a program package to solve large sparse linear systems
– many iterative solvers
– several polyalgorithms
– (I)LU direct solvers as preconditioners
– optimized for workstations (cache reuse), vector computers and parallel computers (MPI)
– supports 7 different storage patterns for sparse matrices (automatic optimization for the architecture of the computer)
» Load before use:
module add linsol
» Appropriate options at compile and link time:
mpif90 -L$LINSOLPATH -llinsol -lMPI myprog.o for running an MPI job
$FC -L$LINSOLPATH -llinsol -lnocomm myprog.o for running a serial job
» More details: http://www.rz.uni-karlsruhe.de/produkte/linsol