1 Friday, October 06, 2006 Measure twice, cut once. -Carpenter’s Motto.
Friday, October 06, 2006
Measure twice, cut once.
- Carpenter’s Motto
Sources of overhead
• Inter-process communication
• Idling
• Replicated computation
Ts: The original single-processor serial time.
Tis: The additional serial time spent on average for
• Inter-processor communications
• Setup
• Depends on N.
Tp: The original single-processor parallelizable time.
Tip: The additional time spent on average by each processor
• Setup
• Idle time
Simplified expression
$$S(N) = \frac{T_s + T_p}{T_s + N\,T_{is} + T_p/N + T_{ip}}$$
[Figure: speedup S(N) versus N (processors), N from 1 to 140, speedup axis from 0 to 160. Curves for Tp=10, Tp=100, Tp=1000, Tp=10000, Tp=100000, plus a linear reference. Parameters: Ts=10, Tip=1, Tis=0]
Communication time negligible compared to computation. What you would expect from Amdahl’s law alone.
The straight line is a reference for linear speedup.
[Figure: speedup S(N) versus N (processors), N from 1 to 140, speedup axis from 0 to 160. Curves for Tp=10, Tp=100, Tp=1000, Tp=10000, Tp=100000, plus a linear reference. Parameters: Ts=10, Tip=1, Tis=10]
Adding a small serial overhead time (Tis). Beyond a certain N, adding more processors results in lower speedup.
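One way to see where the curves turn over is to minimize the denominator of the simplified expression with respect to N. The derivation below is a sketch that treats N as a continuous variable and assumes the linear N·Tis overhead term; the worked numbers use Tp=1000 as a sample value.

```latex
\begin{align*}
D(N) &= T_s + N\,T_{is} + \frac{T_p}{N} + T_{ip} \\
\frac{dD}{dN} &= T_{is} - \frac{T_p}{N^2} = 0
  \quad\Longrightarrow\quad N^{*} = \sqrt{\frac{T_p}{T_{is}}} \\
\text{Example: } T_p &= 1000,\; T_{is} = 10,\; T_s = 10,\; T_{ip} = 1
  \;\Longrightarrow\; N^{*} = 10, \\
S(N^{*}) &= \frac{10 + 1000}{10 + 100 + 100 + 1} = \frac{1010}{211} \approx 4.79
\end{align*}
```

Under this model, using more than about 10 processors for that parameter set only increases the run time.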
[Figure: speedup S(N) versus N (processors), N from 1 to 140, speedup axis from 0 to 160. Curves for Tp=10, Tp=100, Tp=1000, Tp=10000, Tp=100000, plus a linear reference. Parameters: Ts=10, Tip=1, Tis=1]
Quadratic dependence of the overhead on N, e.g. when every processor speaks to all others.
Adding processors won’t provide additional speedup unless the problem is scaled up as well.
Calculations with a small Tp/Tis ratio should not be distributed over a large number of processors.
Scaling a problem
Does number of tasks scale with the problem size?
An increase in problem size should increase the number of tasks rather than the size of individual tasks.
It should be possible to solve larger problems when more processors are available.
What can we tell from our observations?
We implemented an algorithm on parallel computer X and achieved a speedup of 10.8 on 12 processors with problem size N=100.
Region of observation is too narrow. What if N=10 or N=1000?
T is the execution time, P is the number of processors, and N is the problem size.
$$T = N + \frac{N^2}{P}$$
$$T = \frac{N + N^2}{P} + 100$$
$$T = \frac{N + N^2}{P} + 0.6\,P^2$$
All of these algorithms achieve a speedup of about 10.8 when P=12 and N=100.
Addition example
Speedup: Ratio of the time taken to solve a problem on a single processor to the time required to solve it on a parallel computer with p identical processing elements.
Speedup for addition example?
Speedup: Comparison with the best known serial algorithm.
Efficiency:
Fraction of time which a processor spends doing useful work.
E = S/p
Cost: Product of the parallel runtime and the number of processors.
Cost: pTp (Note: Tp here stands for the parallel runtime, that is, the time from the moment the parallel computation starts to the moment the last processing element finishes execution.)
Cost optimal:
If the cost of solving a problem on a parallel computer has the same asymptotic growth, as a function of input size, as the fastest known sequential algorithm on a single processor.
Cost for addition example: O(n log n)
Not cost optimal, since sequential addition takes O(n).
Effect of non-cost-optimality
If overhead increases sub-linearly with respect to problem size, efficiency can be kept fixed by increasing both the problem size and the number of processors.
Scalable parallel systems
The ability to utilize an increasing number of processing elements effectively.
Scalability and cost-optimality are related
A scalable system can always be made cost-optimal if the number of processing elements and the size of the problem are chosen carefully.
Speedup Anomalies
Speedup that is greater than linear: super-linear
Cache effects.
Each processor has a small amount of cache. When a problem is executed on a greater number of processors, more of its data can be placed in cache and, as a result, total computation time will tend to decrease.
If the reduction in computation time due to this cache effect offsets the increases in communication and idle time from the use of additional processors, then super-linearity results.
Similarly, the increased physical memory available in a multiprocessor may reduce the cost of memory accesses by avoiding the need for virtual memory paging.
Search anomalies.
If a search tree contains solutions at varying depths, then multiple depth-first searches will, on average, explore fewer tree nodes before finding a solution than will a sequential depth-first search.
Message Passing
• Partitioned address space
• Data explicitly decomposed and placed by programmer
• Locality of access
• Cooperation for send/receive operations
• Structured and static requirements
Most message-passing programs are written using the SPMD (single program, multiple data) model.
The need for a standard.
The Message Passing Interface (MPI) standard is the de facto industry standard for parallel applications.
Designed by leading industry and academic researchers.
MPI is a library that is widely used to parallelize scientific and compute-intensive programs.
LAM (Indiana University) and MPICH (Argonne National Laboratory, Chicago) are popular open-source implementations of the MPI library.
Implementations of MPI (such as LAM, MPICH) provide an API of library calls that allow users to pass messages between nodes of a parallel application.
They run on a wide variety of systems, from desktop workstations and clusters to large supercomputers (and everything in between).
MPI: the Message Passing Interface
The minimal set of MPI routines.
MPI_Init Initializes MPI.
MPI_Finalize Terminates MPI.
MPI_Comm_size Determines the number of processes.
MPI_Comm_rank Determines the label of calling process.
MPI_Send Sends a message.
MPI_Recv Receives a message.
Starting and Terminating the MPI Library
MPI_Init is called prior to any calls to other MPI routines. Its purpose is to initialize the MPI environment.
MPI_Finalize is called at the end of the computation, and it performs various clean-up tasks to terminate the MPI environment.
The prototypes of these two functions are:
int MPI_Init(int *argc, char ***argv)
int MPI_Finalize()
MPI_Init also strips off any MPI related command-line arguments.
All MPI routines, data-types, and constants are prefixed by “MPI_”.
The return code for successful completion is MPI_SUCCESS. These are declared in mpi.h.
Hello World MPI Program
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world! I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
LAM
Before any MPI programs can be executed, the LAM run-time environment must be launched. This is typically called “booting LAM.”
A text file is required that lists the hosts on which to launch the LAM run-time environment. This file is typically referred to as a “boot schema”, “hostfile”, or “machinefile.”
Sample machinefile
hpcc.lums.edu.pk
compute-0-0.local
compute-0-1.local
compute-0-2.local
compute-0-3.local
compute-0-4.local
compute-0-5.local
compute-0-6.local
Your accounts have been set up, and the following files have been copied to your home directory:
• ssh_script
• machinefile
• hellompi.c
First time commands
(Logout of all old sessions and re-login)
source ssh_script
Warning: Permanently added 'compute-0-0.local' (RSA) to the list of known hosts.
/bin/bash
Warning: Permanently added 'compute-0-1.local' (RSA) to the list of known hosts.
/bin/bash
Warning: Permanently added 'compute-0-2.local' (RSA) to the list of known hosts.
/bin/bash
Warning: Permanently added 'compute-0-3.local' (RSA) to the list of known hosts.
/bin/bash
Warning: Permanently added 'compute-0-4.local' (RSA) to the list of known hosts.
/bin/bash
Warning: Permanently added 'compute-0-5.local' (RSA) to the list of known hosts.
/bin/bash
Warning: Permanently added 'compute-0-6.local' (RSA) to the list of known hosts.
/bin/bash
source ssh_script
/bin/bash
/bin/bash
/bin/bash
/bin/bash
/bin/bash
/bin/bash
/bin/bash
lamboot -v machinefile
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
n-1<13857> ssi:boot:base:linear: booting n0 (hpcc.lums.edu.pk)
n-1<13857> ssi:boot:base:linear: booting n1 (compute-0-0.local)
n-1<13857> ssi:boot:base:linear: booting n2 (compute-0-1.local)
n-1<13857> ssi:boot:base:linear: booting n3 (compute-0-2.local)
n-1<13857> ssi:boot:base:linear: booting n4 (compute-0-3.local)
n-1<13857> ssi:boot:base:linear: booting n5 (compute-0-4.local)
n-1<13857> ssi:boot:base:linear: booting n6 (compute-0-5.local)
n-1<13857> ssi:boot:base:linear: booting n7 (compute-0-6.local)
n-1<13857> ssi:boot:base:linear: finished
lamnodes
n0 hpcc.lums.edu.pk:1:origin,this_node
n1 compute-0-0.local:1:
n2 compute-0-1.local:1:
n3 compute-0-2.local:1:
n4 compute-0-3.local:1:
n5 compute-0-4.local:1:
n6 compute-0-5.local:1:
n7 compute-0-6.local:1:
mpicc hellompi.c -o hello
mpirun -np 8 hello
Hello, world! I am 0 of 8
Hello, world! I am 4 of 8
Hello, world! I am 2 of 8
Hello, world! I am 6 of 8
Hello, world! I am 3 of 8
Hello, world! I am 5 of 8
Hello, world! I am 7 of 8
Hello, world! I am 1 of 8
lamhalt
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
lamwipe machinefile
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
lamnodes
-----------------------------------------------------------------------------
It seems that there is no lamd running on the host hpcc.lums.edu.pk.
This indicates that the LAM/MPI runtime environment is not operating.
The LAM/MPI runtime environment is necessary for the "lamnodes" command.
Please run the "lamboot" command the start the LAM/MPI runtime
environment. See the LAM/MPI documentation for how to invoke
"lamboot" across multiple machines.
Sequence whenever you want to run an MPI program
1. Compile using mpicc
2. Start LAM runtime environment using lamboot
3. Run MPI program using mpirun
4. When you are done, shut down LAM universe using lamhalt and lamwipe
5. If a parallel job crashes, lamclean can be useful for removing all running programs