Friday, October 13, 2006
Transcript of Friday, October 13, 2006
The biggest difference between time and space is that you can't reuse time.
- M. Furst
machinefile is a text file (also called a boot schema file) containing the following:

hpcc.lums.edu.pk
compute-0-0.local
compute-0-1.local
compute-0-2.local
compute-0-3.local
compute-0-4.local
compute-0-5.local
compute-0-6.local
hpcc.lums.edu.pk
compute-0-0.local
compute-0-1.local
compute-0-2.local
compute-0-3.local
compute-0-4.local
compute-0-5.local
compute-0-6.local

lamboot -v machinefile launches the LAM runtime environment.
mpirun -np 4 hello launches 4 copies of hello.
Scheduling of the copies is implementation-dependent.
LAM schedules in round-robin fashion over the nodes, according to the number of CPUs listed per node.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world! I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

[Figure: with np=4, one copy of the program runs on each of hpcc.lums.edu.pk, compute-0-0.local, compute-0-1.local, and compute-0-2.local.]
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, namelen;
    char name[100];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &namelen);
    printf("Rank:%d Name:%s\n", rank, name);
    MPI_Finalize();
    return 0;
}
mpirun -np 4 pname
Rank:0 Name:hpcc.lums.edu.pk
Rank:2 Name:compute-0-1.local
Rank:1 Name:compute-0-0.local
Rank:3 Name:compute-0-2.local
mpirun -np 4 pname
Rank:0 Name:hpcc.lums.edu.pk
Rank:2 Name:compute-0-1.local
Rank:1 Name:compute-0-0.local
Rank:3 Name:compute-0-2.local
Processes on remote nodes have their stdout redirected to that of mpirun.
mpirun -np 8 pname
Rank:0 Name:hpcc.lums.edu.pk
Rank:2 Name:compute-0-1.local
Rank:1 Name:compute-0-0.local
Rank:3 Name:compute-0-2.local
Rank:4 Name:compute-0-3.local
Rank:5 Name:compute-0-4.local
Rank:6 Name:compute-0-5.local
Rank:7 Name:compute-0-6.local
mpirun -np 16 pname
Rank:0 Name:hpcc.lums.edu.pk
Rank:8 Name:hpcc.lums.edu.pk
Rank:1 Name:compute-0-0.local
Rank:3 Name:compute-0-2.local
Rank:11 Name:compute-0-2.local
Rank:7 Name:compute-0-6.local
Rank:4 Name:compute-0-3.local
Rank:2 Name:compute-0-1.local
Rank:5 Name:compute-0-4.local
Rank:6 Name:compute-0-5.local
Rank:9 Name:compute-0-0.local
Rank:15 Name:compute-0-6.local
Rank:12 Name:compute-0-3.local
Rank:10 Name:compute-0-1.local
Rank:13 Name:compute-0-4.local
Rank:14 Name:compute-0-5.local
Suppose the boot schema file contains:

hpcc.lums.edu.pk cpu=2
compute-0-0.local cpu=2
compute-0-1.local cpu=2
compute-0-2.local cpu=2
compute-0-3.local cpu=2
compute-0-4.local cpu=2
compute-0-5.local cpu=2
compute-0-6.local cpu=2
mpirun -np 8 pname
Rank:0 Name:hpcc.lums.edu.pk
Rank:1 Name:hpcc.lums.edu.pk
Rank:4 Name:compute-0-1.local
Rank:2 Name:compute-0-0.local
Rank:6 Name:compute-0-2.local
Rank:3 Name:compute-0-0.local
Rank:7 Name:compute-0-2.local
Rank:5 Name:compute-0-1.local
mpirun -np 16 pname
Rank:0 Name:hpcc.lums.edu.pk
Rank:1 Name:hpcc.lums.edu.pk
Rank:8 Name:compute-0-3.local
Rank:2 Name:compute-0-0.local
Rank:6 Name:compute-0-2.local
Rank:10 Name:compute-0-4.local
Rank:14 Name:compute-0-6.local
Rank:4 Name:compute-0-1.local
Rank:12 Name:compute-0-5.local
Rank:3 Name:compute-0-0.local
Rank:7 Name:compute-0-2.local
Rank:9 Name:compute-0-3.local
Rank:13 Name:compute-0-5.local
Rank:11 Name:compute-0-4.local
Rank:15 Name:compute-0-6.local
Rank:5 Name:compute-0-1.local
mpirun C hello launches one copy of hello on every CPU that was listed in the boot schema.
mpirun N hello launches one copy of hello on every node in the LAM universe (disregarding the CPU counts).
int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank % 2 == 0) {
        printf("Rank:%d, I am EVEN\n", rank);
    } else {
        printf("Rank:%d, I am ODD\n", rank);
    }
    MPI_Finalize();
    return 0;
}
mpirun -np 8 rpdt
Rank:0, I am EVEN
Rank:2, I am EVEN
Rank:1, I am ODD
Rank:5, I am ODD
Rank:3, I am ODD
Rank:7, I am ODD
Rank:6, I am EVEN
Rank:4, I am EVEN
Point-to-point communication

MPI_Send (void *buf, int count, MPI_Datatype datatype,
          int dest, int tag, MPI_Comm comm)

MPI_Recv (void *buf, int count, MPI_Datatype datatype,
          int source, int tag, MPI_Comm comm,
          MPI_Status *status)
int main(int argc, char *argv[])
{
    int rank, size, source = 0, dest = 1, tag = 12;
    float sent = 23.65, recv;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0) {
        MPI_Send(&sent, 1, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);
        printf("I am %d of %d Sent %f\n", rank, size, sent);
    } else {
        MPI_Recv(&recv, 1, MPI_FLOAT, source, tag, MPI_COMM_WORLD, &status);
        printf("I am %d of %d Received %f\n", rank, size, recv);
    }
    MPI_Finalize();
    return 0;
}
lamboot -v mf2
LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
n-1<10818> ssi:boot:base:linear: booting n0 (hpcc.lums.edu.pk)
n-1<10818> ssi:boot:base:linear: booting n1 (compute-0-0.local)
n-1<10818> ssi:boot:base:linear: finished

mpirun -np 2 sendrecv
I am 0 of 2 Sent 23.650000
I am 1 of 2 Received 23.650000

mf2 is a text file containing the following:
hpcc.lums.edu.pk
compute-0-0.local
What will happen if I use np >2 ?
What will happen if I use np = 1 ?
MPI_Recv is a blocking receive operation. MPI allows two different implementations of MPI_Send: buffered and un-buffered.
MPI programs must run correctly regardless of which of the two methods is used to implement MPI_Send.
Such programs are called safe.
int a[10], b[10], myrank;
MPI_Status status;
...
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
    MPI_Send(a, 10, MPI_INT, 1, 1, MPI_COMM_WORLD);
    MPI_Send(b, 10, MPI_INT, 1, 2, MPI_COMM_WORLD);
}
else if (myrank == 1) {
    MPI_Recv(b, 10, MPI_INT, 0, 2, MPI_COMM_WORLD, &status);
    MPI_Recv(a, 10, MPI_INT, 0, 1, MPI_COMM_WORLD, &status);
}
...

Note: count is given in entries of datatype, not in bytes.
Avoiding Deadlocks
If MPI_Send is blocking and non-buffered, there is a deadlock.
int main(int argc, char *argv[])
{
    int rank, size, source = 0, dest = 1;
    float sent[5] = {10, 20, 30, 40, 50};
    float recv;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0) {
        MPI_Send(&sent[0], 1, MPI_FLOAT, dest, 12, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[0]);
        MPI_Send(&sent[1], 1, MPI_FLOAT, dest, 13, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[1]);
        MPI_Send(&sent[2], 1, MPI_FLOAT, dest, 14, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[2]);
    } else {
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 12, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 13, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 14, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
    }
    MPI_Finalize();
    return 0;
}
Rank:0 Sent 10.000000
Rank:0 Sent 20.000000
Rank:0 Sent 30.000000
Rank:1 Received 10.000000
Rank:1 Received 20.000000
Rank:1 Received 30.000000
    if (rank == 0) {
        MPI_Send(&sent[0], 1, MPI_FLOAT, dest, 14, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[0]);
        MPI_Send(&sent[1], 1, MPI_FLOAT, dest, 13, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[1]);
        MPI_Send(&sent[2], 1, MPI_FLOAT, dest, 12, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[2]);
    } else {
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 12, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 13, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 14, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
    }

NOTE: This is unsafe: it depends on whether system buffering is provided or not.
Rank:0 Sent 10.000000
Rank:0 Sent 20.000000
Rank:0 Sent 30.000000
Rank:1 Received 30.000000
Rank:1 Received 20.000000
Rank:1 Received 10.000000
NOTE: This is unsafe: it depends on whether system buffering is provided or not.

    if (rank == 0) {
        MPI_Send(&sent[0], 1, MPI_FLOAT, dest, 14, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[0]);
        MPI_Send(&sent[1], 1, MPI_FLOAT, dest, 14, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[1]);
        MPI_Send(&sent[2], 1, MPI_FLOAT, dest, 13, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[2]);
        MPI_Send(&sent[3], 1, MPI_FLOAT, dest, 12, MPI_COMM_WORLD);
        printf("Rank:%d Sent %f\n", rank, sent[3]);
    } else {
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 12, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 13, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 14, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
        MPI_Recv(&recv, 1, MPI_FLOAT, source, 14, MPI_COMM_WORLD, &status);
        printf("Rank:%d Received %f\n", rank, recv);
    }
Rank:0 Sent 10.000000
Rank:0 Sent 20.000000
Rank:0 Sent 30.000000
Rank:0 Sent 40.000000
Rank:1 Received 40.000000
Rank:1 Received 30.000000
Rank:1 Received 10.000000
Rank:1 Received 20.000000
Sending and Receiving Messages
MPI allows specification of wildcard arguments for both source and tag.
If source is set to MPI_ANY_SOURCE, then any process of the communication domain can be the source of the message.
If tag is set to MPI_ANY_TAG, then messages with any tag are accepted.
On the receive side, the received message must be of length less than or equal to the specified count.
Example
Numerical Integration
Numerical Integration (Serial)

#include <stdio.h>

main() {
    float integral, a, b, h, x;
    int n, i;
    float f(float x); /* Function we're integrating */

    printf("Enter a, b, and n\n");
    scanf("%f %f %d", &a, &b, &n);
    h = (b-a)/n;
    integral = (f(a) + f(b))/2.0;
    x = a;
    for (i = 1; i <= n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    printf("With n = %d trapezoids, our estimate\n", n);
    printf("of the integral from %f to %f = %f\n", a, b, integral);
} /* main */
Numerical Integration (Parallel)

main(int argc, char** argv) {
    int my_rank;      /* My process rank */
    int p;            /* The number of processes */
    float a = 0.0;    /* Left endpoint */
    float b = 1.0;    /* Right endpoint */
    int n = 1024;     /* Number of trapezoids */
    float h;          /* Trapezoid base length */
    float local_a;    /* Left endpoint for my process */
    float local_b;    /* Right endpoint for my process */
    int local_n;      /* Number of trapezoids for my calculation */
    float integral;   /* Integral over my interval */
    float total;      /* Total integral */
    int source;       /* Process sending its integral */
    int dest = 0;     /* All messages go to 0 */
    int tag = 0;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    h = (b-a)/n;    /* h is the same for all processes */
    local_n = n/p;  /* So is the number of trapezoids */
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;
    integral = Trap(local_a, local_b, local_n, h);

    /* Add up the integrals calculated by each process */
    if (my_rank == 0) {
        total = integral;
        for (source = 1; source < p; source++) {
            MPI_Recv(&integral, 1, MPI_FLOAT, source, tag,
                     MPI_COMM_WORLD, &status);
            total = total + integral;
        }
    } else {
        MPI_Send(&integral, 1, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);
    }

    /* Print the result */
    if (my_rank == 0) {
        printf("With n = %d trapezoids, our estimate\n", n);
        printf("of the integral from %f to %f = %f\n", a, b, total);
    }

    /* Shut down MPI */
    MPI_Finalize();
} /* main */
float Trap(float local_a, float local_b, int local_n, float h) {
    float integral;   /* Store result in integral */
    float x;
    int i;
    float f(float x); /* function */

    integral = (f(local_a) + f(local_b))/2.0;
    x = local_a;
    for (i = 1; i <= local_n-1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral*h;
    return integral;
}
Avoiding Deadlocks
Consider the following piece of code, in which process i sends a message to process i+1 (modulo the number of processes) and receives a message from process i-1 (modulo the number of processes).

int a[10], b[10], npes, myrank;
MPI_Status status;
...
MPI_Comm_size(MPI_COMM_WORLD, &npes);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
MPI_Send(a, 10, MPI_INT, (myrank+1)%npes, 1, MPI_COMM_WORLD);
MPI_Recv(b, 10, MPI_INT, (myrank-1+npes)%npes, 1, MPI_COMM_WORLD, &status);
...
Once again, we have a deadlock if MPI_Send is blocking.
Avoiding Deadlocks
We can break the circular wait to avoid deadlocks as follows:

int a[10], b[10], npes, myrank;
MPI_Status status;
...
MPI_Comm_size(MPI_COMM_WORLD, &npes);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank%2 == 1) {
    MPI_Send(a, 10, MPI_INT, (myrank+1)%npes, 1, MPI_COMM_WORLD);
    MPI_Recv(b, 10, MPI_INT, (myrank-1+npes)%npes, 1, MPI_COMM_WORLD, &status);
} else {
    MPI_Recv(b, 10, MPI_INT, (myrank-1+npes)%npes, 1, MPI_COMM_WORLD, &status);
    MPI_Send(a, 10, MPI_INT, (myrank+1)%npes, 1, MPI_COMM_WORLD);
}
...
Sending and Receiving Messages Simultaneously
To exchange messages, MPI provides the following function:

int MPI_Sendrecv(void *sendbuf, int sendcount, MPI_Datatype senddatatype,
                 int dest, int sendtag,
                 void *recvbuf, int recvcount, MPI_Datatype recvdatatype,
                 int source, int recvtag,
                 MPI_Comm comm, MPI_Status *status)
All-to-All broadcast in Hypercube

[Figure: a 3-D hypercube with nodes 0-7; initially each node holds only its own message.]
[Figure: after the exchange along dimension 0, nodes 0 and 1 each hold {0,1}, nodes 2 and 3 hold {2,3}, nodes 4 and 5 hold {4,5}, and nodes 6 and 7 hold {6,7}.]
[Figure: after the exchange along dimension 1, nodes 0-3 each hold {0,1,2,3} and nodes 4-7 each hold {4,5,6,7}.]
[Figure: after the exchange along dimension 2, every node holds all messages {0,1,2,3,4,5,6,7}.]
There is a possibility of deadlock if the exchange is implemented with plain blocking sends and receives and system buffering is not provided.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

#define MAXMSG 100
#define SINGLEMSG 10

int main(int argc, char *argv[])
{
    int i, rank, size, d = 3, partner, tag = 11;
    char *result, *received;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* received must hold up to MAXMSG bytes, since later rounds carry
       concatenated messages */
    received = (char *) malloc(MAXMSG);
    result = (char *) malloc(MAXMSG);

    if (argc != (size + 1)) {
        perror("Command line arguments missing");
        MPI_Finalize();
        exit(1);
    }

    strcpy(result, argv[rank+1]);
    for (i = 0; i < d; i++) {
        partner = rank ^ (1 << i);
        MPI_Sendrecv(result, strlen(result)+1, MPI_CHAR, partner, tag,
                     received, MAXMSG, MPI_CHAR, partner, tag,
                     MPI_COMM_WORLD, &status);
        printf("I am node %d: Sent %s\t Received %s\n", rank, result, received);
        strcat(result, received);
    }
    printf("I am node %d: My final result is %s\n", rank, result);

    MPI_Finalize();
    return 0;
}
int main(int argc, char *argv[])
{
    // initializations
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    received = (char *) malloc(SINGLEMSG + 1);
    result = (char *) malloc(MAXMSG);

    // error checks
    strcpy(result, argv[rank+1]);
    for (i = 0; i < d; i++) {
        partner = rank ^ (1 << i);
        MPI_Sendrecv(result, strlen(result)+1, MPI_CHAR, partner, tag,
                     received, MAXMSG, MPI_CHAR, partner, tag,
                     MPI_COMM_WORLD, &status);
        printf("I am node %d: Sent %s\t Received %s\n", rank, result, received);
        strcat(result, received);
    }
    printf("I am node %d: My final result is %s\n", rank, result);
    MPI_Finalize();
    return 0;
}
mpirun -np 8 hbroadcast "one " "two " "three " "four " "five " "six " "seven " "eight "

I am node 0: Sent one Received two
I am node 4: Sent five Received six
I am node 5: Sent six Received five
I am node 1: Sent two Received one
I am node 3: Sent four Received three
I am node 2: Sent three Received four
I am node 0: Sent one two Received three four
I am node 3: Sent four three Received two one
I am node 7: Sent eight Received seven
I am node 1: Sent two one Received four three
I am node 2: Sent three four Received one two
I am node 7: Sent eight seven Received six five
I am node 6: Sent seven Received eight
I am node 0: Sent one two three four Received five six seven eight
I am node 0: My final result is one two three four five six seven eight
I am node 5: Sent six five Received eight seven
I am node 6: Sent seven eight Received five six
I am node 7: Sent eight seven six five Received four three two one
I am node 3: Sent four three two one Received eight seven six five
I am node 5: Sent six five eight seven Received two one four three
I am node 4: Sent five six Received seven eight
I am node 3: My final result is four three two one eight seven six five
I am node 1: Sent two one four three Received six five eight seven
I am node 1: My final result is two one four three six five eight seven
I am node 5: My final result is six five eight seven two one four three
I am node 6: Sent seven eight five six Received three four one two
I am node 4: Sent five six seven eight Received one two three four
I am node 4: My final result is five six seven eight one two three four
I am node 2: Sent three four one two Received seven eight five six
I am node 7: My final result is eight seven six five four three two one
I am node 2: My final result is three four one two seven eight five six
I am node 6: My final result is seven eight five six three four one two
mpirun -np 8 hbroadcast "one " "two " "three " "four " "five " "six " "seven " "eight " | grep "node 0 "
I am node 0: Sent one Received two
I am node 0: Sent one two Received three four
I am node 0: Sent one two three four Received five six seven eight
I am node 0: My final result is one two three four five six seven eight
mpirun -np 8 hbroadcast "one " "two " "three " "four " "five " "six " "seven " "eight " | grep "node 7 "
I am node 7: Sent eight Received seven
I am node 7: Sent eight seven Received six five
I am node 7: Sent eight seven six five Received four three two one
I am node 7: My final result is eight seven six five four three two one
MPI_Bsend Buffered blocking send
Permits the programmer to allocate the required amount of buffer space into which data can be copied until it is delivered.
Insulates against the problems associated with insufficient system buffer space.
Routine returns after the data has been copied from application buffer space to the allocated send buffer.
Must be used with the MPI_Buffer_attach routine.
MPI_Ssend Synchronous blocking send
Send a message and block until the application buffer in the sending task is free for reuse and the destination process has started to receive the message.
MPI_Rsend Blocking ready send.
Should only be used if the programmer is certain that the matching receive has already been posted.
Non-blocking:
Non-blocking send and receive routines behave similarly: they return almost immediately.
They do not wait for any communication events to complete, such as message copying from user memory to system buffer space or the actual arrival of a message.
It is unsafe to modify the application buffer until you know for a fact that the requested non-blocking operation was actually performed by the library. There are "wait" routines used to do this.
Non-blocking communications are primarily used to overlap computation with communication and exploit possible performance gains.
[Figure: timing diagrams comparing blocking non-buffered and non-blocking non-buffered send/receive protocols.]
In the case of a non-blocking buffered send operation, the sender does not wait for the message to be copied from user memory (the application buffer) into system buffer space.
The programmer must check when it is safe to touch the application buffer again.
Non-blocking: MPI_Isend
Identifies an area in memory to serve as a send buffer. Processing continues immediately without waiting for the message to be copied out of the application buffer.
A communication request handle is returned for handling the pending message status.
The program should not modify the application buffer until subsequent calls to MPI_Wait or MPI_Test indicate that the non-blocking send has completed.

MPI_Isend (&buf, count, datatype, dest, tag, comm, &request)
Non-blocking: MPI_Irecv
Identifies an area in memory to serve as a receive buffer. Processing continues immediately without actually waiting for the message to be received and copied into the application buffer.
A communication request handle is returned for handling the pending message status. The program must use calls to MPI_Wait or MPI_Test to determine when the non-blocking receive operation completes and the requested message is available in the application buffer.

MPI_Irecv (&buf, count, datatype, source, tag, comm, &request)
MPI_Wait blocks until a specified non-blocking send or receive operation has completed.
MPI_Wait (&request,&status)
MPI_Waitany (count,&array_of_requests,&index,&status)
MPI_Waitall (count,&array_of_requests,&array_of_statuses)
MPI_Waitsome (incount,&array_of_requests,&outcount,&array_of_indices,&array_of_statuses)
![Page 64: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/64.jpg)
64
MPI_Test checks the status of a specified non-blocking send or receive operation.
The "flag" parameter is returned logical true (1) if the operation has completed, and logical false (0) if not.
MPI_Test (&request,&flag,&status)
MPI_Testany (count,&array_of_requests,&index,&flag,&status)
MPI_Testall (count,&array_of_requests,&flag,&array_of_statuses)
MPI_Testsome (incount,&array_of_requests,&outcount,&array_of_indices,&array_of_statuses)
![Page 65: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/65.jpg)
65
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int numtasks, rank, next, prev, buf[2], tag1 = 1, tag2 = 2;
    MPI_Request reqs[4];
    MPI_Status stats[4];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Neighbours in a ring: wrap around at both ends. */
    prev = rank - 1;
    next = rank + 1;
    if (rank == 0) prev = numtasks - 1;
    if (rank == (numtasks - 1)) next = 0;

    MPI_Irecv(&buf[0], 1, MPI_INT, prev, tag1, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(&buf[1], 1, MPI_INT, next, tag2, MPI_COMM_WORLD, &reqs[1]);
    MPI_Isend(&rank, 1, MPI_INT, prev, tag2, MPI_COMM_WORLD, &reqs[2]);
    MPI_Isend(&rank, 1, MPI_INT, next, tag1, MPI_COMM_WORLD, &reqs[3]);

    /* { do some work } */

    MPI_Waitall(4, reqs, stats);
    MPI_Finalize();
    return 0;
}
![Page 66: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/66.jpg)
66
MPI_Issend Non-blocking synchronous send. Similar to MPI_Isend(), except MPI_Wait() or MPI_Test() indicates when the destination process has begun receiving the message (i.e., the matching receive has started).
MPI_Ibsend Non-blocking buffered send. Similar to MPI_Bsend(), except MPI_Wait() or MPI_Test() indicates when the message has been copied out of the application buffer into the attached buffer. Must be used with the MPI_Buffer_attach routine.
MPI_Irsend Non-blocking ready send. Similar to MPI_Rsend(), except MPI_Wait() or MPI_Test() indicates when the send has completed. Should only be used if the programmer is certain that the matching receive has already been posted.
![Page 67: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/67.jpg)
67
Collective communication
A collective communication must involve all processes in the scope of a communicator.
It is the programmer's responsibility to ensure that all processes within a communicator participate in any collective operation.
All processes in the communicator must specify the same root (source or target) process.
![Page 68: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/68.jpg)
68
Collective communication
MPI_Barrier Creates a barrier synchronization in a group. Each task, on reaching the MPI_Barrier call, blocks until all tasks in the group have reached the same MPI_Barrier call.
MPI_Barrier (comm)
![Page 69: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/69.jpg)
69
Collective communication
MPI_Bcast Broadcasts a message from one process to all other processes in the group.
MPI_Bcast (&buffer,count,datatype,source,comm)
count and datatype must match on all processes.
![Page 70: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/70.jpg)
70
Collective communication
MPI_Scatter Distributes distinct messages from a single source task to each task in the group.
MPI_Scatter (&sendbuf,sendcnt,sendtype,&recvbuf,recvcnt,recvtype,source,comm)
Process i receives sendcnt contiguous elements starting at location i*sendcnt of sendbuf.
sendcnt is the number of elements sent to each individual process.
![Page 71: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/71.jpg)
71
Collective communication
MPI_Gather Gathers distinct messages from each task in the group to a single destination task. This routine is the reverse operation of MPI_Scatter.
MPI_Gather (&sendbuf,sendcnt,sendtype,&recvbuf,recvcount,recvtype,root,comm)
Data is stored in recvbuf in rank order: data from process i is stored in recvbuf at location i*sendcnt.
Information about the receive buffer applies only to the root (recipient) process and is ignored by all others.
![Page 72: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/72.jpg)
72
Collective communication
MPI_Allgather Concatenates data to all tasks in a group. Each task in the group, in effect, performs a one-to-all broadcast within the group.
![Page 73: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/73.jpg)
73
Collective communication
MPI_Reduce Applies a reduction operation across all tasks in the group and places the result in one task.
![Page 74: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/74.jpg)
74
![Page 75: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/75.jpg)
75
Collective communication
MPI_Allreduce Applies a reduction operation and places the result in all tasks in the group. This is equivalent to an MPI_Reduce followed by an MPI_Bcast.
![Page 76: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/76.jpg)
76
Collective communication
MPI_Alltoall Each task in the group performs a scatter operation, sending a distinct message to every task in the group, in order by index.
![Page 77: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/77.jpg)
77
#include <stdio.h>
#include <mpi.h>

#define SIZE 4

int main(int argc, char *argv[])
{
    int numtasks, rank, sendcount, recvcount, source;
    float sendbuf[SIZE][SIZE] = { { 1.0,  2.0,  3.0,  4.0},
                                  { 5.0,  6.0,  7.0,  8.0},
                                  { 9.0, 10.0, 11.0, 12.0},
                                  {13.0, 14.0, 15.0, 16.0} };
    float recvbuf[SIZE];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

    if (numtasks == SIZE) {
        source = 1;
        sendcount = SIZE;
        recvcount = SIZE;
        MPI_Scatter(sendbuf, sendcount, MPI_FLOAT, recvbuf, recvcount,
                    MPI_FLOAT, source, MPI_COMM_WORLD);
        printf("rank= %d Results: %f %f %f %f\n", rank,
               recvbuf[0], recvbuf[1], recvbuf[2], recvbuf[3]);
    } else
        printf("Must specify %d processors. Terminating.\n", SIZE);

    MPI_Finalize();
    return 0;
}
![Page 78: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/78.jpg)
78
mpirun -np 4 scatter
rank= 0 Results: 1.000000 2.000000 3.000000 4.000000
rank= 1 Results: 5.000000 6.000000 7.000000 8.000000
rank= 2 Results: 9.000000 10.000000 11.000000 12.000000
rank= 3 Results: 13.000000 14.000000 15.000000 16.000000
![Page 79: Friday, October 13, 2006](https://reader034.fdocuments.net/reader034/viewer/2022051401/5681437c550346895daffbc7/html5/thumbnails/79.jpg)
79
Collective communication
MPI_Gatherv / MPI_Allgatherv Allow a different number of data elements to be sent by each process.
MPI_Scatterv Allows different amounts of data to be sent to different processes.
MPI_Alltoallv Allows different amounts of data to be sent to and received from each process.
…