1 Parallel Computing—Introduction to Message Passing Interface (MPI)
Posted: 15-Jan-2016
Slide 1: Parallel Computing—Introduction to Message Passing Interface (MPI)
Slide 2: Two Important Concepts

• Two fundamental concepts of parallel programming are:
  • Domain decomposition
  • Functional decomposition
Slide 3: Domain Decomposition
Slide 4: Functional Decomposition
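Domain decomposition splits the problem's data (for example, an array) across processes, with each process applying the same operation to its own block. The slides illustrate this with diagrams; as a minimal MPI-free sketch, the usual block-partitioning rule can be written as a pure function of the process rank (class and method names here are illustrative, not from the slides):

```java
// Sketch: block domain decomposition of an n-element array over p processes.
// Each rank owns a contiguous chunk; any remainder goes to the lowest ranks.
public class BlockDecomposition {
    // Returns {start, end} (end exclusive) of the block owned by `rank`.
    static int[] block(int n, int p, int rank) {
        int base = n / p;          // minimum elements per process
        int rem  = n % p;          // the first `rem` ranks get one extra element
        int start = rank * base + Math.min(rank, rem);
        int end   = start + base + (rank < rem ? 1 : 0);
        return new int[]{start, end};
    }

    public static void main(String[] args) {
        int n = 10, p = 4;
        for (int rank = 0; rank < p; rank++) {
            int[] b = block(n, p, rank);
            System.out.println("rank " + rank + " owns [" + b[0] + ", " + b[1] + ")");
        }
    }
}
```

In an actual MPI program each process would call such a function with its own rank to decide which slice of the data to work on.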
Slide 5: Message Passing Interface (MPI)

• MPI is a standard (an interface, or an API):
  • It defines a set of methods that application developers use to write their applications
  • MPI libraries implement these methods
  • MPI itself is not a library—it is a specification document that implementations follow
  • MPI 1.2 is the most popular version of the specification
• Reasons for its popularity:
  • Software and hardware vendors were involved
  • Significant contributions came from academia
  • MPICH served as an early reference implementation
  • MPI compilers are simply wrappers around widely used C and Fortran compilers
• History:
  • The first draft specification was produced in 1993
  • MPI-2.0, introduced in 1999, added many new features to MPI
  • Bindings are available for C, C++, and Fortran
• MPI is a success story:
  • It is the most widely adopted programming paradigm on IBM Blue Gene systems
  • There are at least two production-quality MPI libraries:
    • MPICH2 (http://www-unix.mcs.anl.gov/mpi/mpich2/)
    • OpenMPI (http://open-mpi.org)
  • There is even a Java library: MPJ Express (http://mpj-express.org)
Slide 6: Message Passing Model

• The message passing model allows processors to communicate by passing messages:
  • Processors do not share memory
  • Data transfer between processors requires cooperative operations performed by each processor:
    • One processor sends the message while the other receives it
Slide 7: Distributed Memory Cluster

[Diagram: eight processes (Proc 0–Proc 7), each with its own CPU and memory, exchanging messages over an interconnect such as Ethernet (LAN), Myrinet, or Infiniband.]
Slide 8: Writing a "Hello World" MPI Program

• MPI is very simple:
  • Initialize the MPI environment:
    • MPI_Init(&argc, &argv);    // C code
    • MPI.Init(args);            // Java code
  • Send or receive a message:
    • MPI_Send(..);              // C code
    • MPI.COMM_WORLD.Send(..);   // Java code
  • Finalize the MPI environment:
    • MPI_Finalize();            // C code
    • MPI.Finalize();            // Java code
Slide 9: Hello World in C

#include <stdio.h>
#include <string.h>
#include "mpi.h"

..

// Initialize MPI
MPI_Init(&argc, &argv);

// Find out the `id' or `rank' of the current process
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); // get the rank

// Get the total number of processes
MPI_Comm_size(MPI_COMM_WORLD, &p); // get the total process count

// Print the rank of the process
printf("Hello World from process no %d\n", my_rank);

MPI_Finalize();

..
Slide 10: Hello World in Java

import java.util.*;
import mpi.*;

..

// Initialize MPI
MPI.Init(args); // start up MPI

// Get the total number of processes and the rank
size = MPI.COMM_WORLD.Size();
rank = MPI.COMM_WORLD.Rank();

System.out.println("Hello World <" + rank + ">");

MPI.Finalize();

..
Slide 11: After Initialization

import java.util.*;
import mpi.*;

..

// Initialize MPI
MPI.Init(args); // start up MPI

// Get the total number of processes and the rank
size = MPI.COMM_WORLD.Size();
rank = MPI.COMM_WORLD.Rank();

..
Slide 12: What is size?

• The total number of processes in a communicator:
  • In the slide's example, the size of MPI.COMM_WORLD is 6

import java.util.*;
import mpi.*;

..

// Get the total number of processes
size = MPI.COMM_WORLD.Size();

..
Slide 13: What is rank?

• The "unique" identity (id) of a process in a communicator:
  • Each of the six processes in MPI.COMM_WORLD has a distinct rank or id

import java.util.*;
import mpi.*;

..

// Get the rank of this process
rank = MPI.COMM_WORLD.Rank();

..
Slide 14: Running "Hello World" in C

• Write the parallel code
• Start the MPICH2 daemon
• Write the machines file
• Start the parallel job
Slide 17: Running "Hello World" in Java

• The code is executed on a cluster called "Starbug":
  • One head-node, "holly", and eight compute-nodes
• Steps:
  • Write the machines file
  • Bootstrap the MPJ Express (or any MPI library) runtime
  • Write the parallel application
  • Compile and execute
Slide 19: Write the machines file

[screenshot]
Slide 20: Bootstrap the MPJ Express runtime

[screenshot]
Slide 21: Write the parallel program

[screenshot]
Slide 22: Compile and execute

[screenshot]
Slide 23: Single Program Multiple Data (SPMD) Model

import java.util.*;
import mpi.*;

public class HelloWorld {
  public static void main(String[] args) throws Exception {
    MPI.Init(args); // start up MPI

    int size = MPI.COMM_WORLD.Size();
    int rank = MPI.COMM_WORLD.Rank();

    if (rank == 0) {
      System.out.println("I am Process 0");
    } else if (rank == 1) {
      System.out.println("I am Process 1");
    }

    MPI.Finalize();
  }
}
Slide 24: Single Program Multiple Data (SPMD) Model

import java.util.*;
import mpi.*;

public class HelloWorld {
  public static void main(String[] args) throws Exception {
    MPI.Init(args); // start up MPI

    int size = MPI.COMM_WORLD.Size();
    int rank = MPI.COMM_WORLD.Rank();

    if (rank % 2 == 0) {
      System.out.println("I am an even process");
    } else {
      System.out.println("I am an odd process");
    }

    MPI.Finalize();
  }
}
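Because the same program runs on every process, the SPMD branch above is just a function of the rank. A small MPI-free sketch (class and method names are my own, not from the slides) that exercises the even/odd logic locally:

```java
// Sketch: the even/odd SPMD branch as a pure function of rank, so it can be
// checked without starting an MPI job.
public class SpmdBranch {
    // Same test the slide performs on MPI.COMM_WORLD.Rank()
    static String messageFor(int rank) {
        if (rank % 2 == 0) {
            return "I am an even process";
        } else {
            return "I am an odd process";
        }
    }

    public static void main(String[] args) {
        // Simulate four ranks of a four-process job
        for (int rank = 0; rank < 4; rank++) {
            System.out.println("rank " + rank + ": " + messageFor(rank));
        }
    }
}
```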
Slide 25: Point-to-Point Communication

• The most fundamental facility provided by MPI
• Basically, "exchange messages between two processes":
  • One process (the source) sends the message
  • The other process (the destination) receives the message
Slide 26: Point-to-Point Communication

• It is possible to send messages of each basic datatype:
  • floats, integers, doubles, …
• Each message carries a "tag"—an identifier used to match sends with receives (e.g. Tag1, Tag2 in the slide's figure)
Slide 27: Point-to-Point Communication

[Diagram: eight processes (Process 0–Process 7); a message of integers is sent to Process 4 with a tag, within COMM_WORLD.]
Slide 28: Blocking and Non-blocking

• There are blocking and non-blocking versions of the send and receive methods
• Blocking versions:
  • A process calls send() or recv(); these methods return only when the message has been physically sent or received
• Non-blocking versions:
  • A process calls isend() or irecv(); these methods return immediately
  • The user can check the status of the message by calling test() or wait()
  • Note the "i" in isend() and irecv()
• Non-blocking versions allow overlapping of computation and communication:
  • How much overlap is actually achieved also depends on the "quality" of the implementation
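The overlap idea can be illustrated without MPI. In the sketch below (an analogy only, not the MPI API), a java.util.concurrent.Future plays the role of the request handle that isend()/irecv() return: the transfer is started, useful computation proceeds while it is in flight, and only the final wait blocks.

```java
import java.util.concurrent.*;

// Analogy for non-blocking communication: submit the "transfer", compute
// while it runs, then block on the handle only when the result is needed.
public class OverlapSketch {
    static String sendAndCompute() throws Exception {
        ExecutorService comm = Executors.newSingleThreadExecutor();

        // "isend": start the simulated transfer; returns a handle immediately.
        Future<String> request = comm.submit(() -> {
            Thread.sleep(50);            // stands in for network transfer time
            return "message delivered";
        });

        // Overlap: do useful computation while the transfer is in flight.
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) sum += i;

        // "wait": block only now, when the message is actually needed.
        String result = request.get();
        comm.shutdown();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendAndCompute());
    }
}
```

In real MPI code the same shape appears as isend()/irecv() followed by computation and then wait() on the returned request object.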
Slide 29:

[Diagram: timelines for a sender and a receiver. In the "blocking" case the CPU waits inside send() and recv(). In the "non-blocking" case isend() and irecv() return immediately, the CPU performs other work, and only the later wait() call blocks.]
Slide 30: Modes of Send

• The MPI standard defines four modes of send:
  • Standard
  • Synchronous
  • Buffered
  • Ready
Slide 31: Standard Mode (eager send protocol, used for small messages)

[Diagram: the sender transmits a control message to the receiver, immediately followed by the actual data; no acknowledgement is awaited.]
Slide 32: Synchronous Mode (rendezvous protocol, used for large messages)

[Diagram: the sender transmits a control message, waits for an acknowledgement from the receiver, and only then sends the actual data.]
Slide 33: Performance Evaluation of Point-to-Point Communication

• Normally, ping-pong benchmarks are used to measure:
  • Latency: how long does it take to send N bytes from sender to receiver?
  • Throughput: how much bandwidth is achieved?
• Latency is a useful measure for studying the performance of "small" messages
• Throughput is a useful measure for studying the performance of "large" messages
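A ping-pong benchmark times a message bouncing between two processes and derives both metrics from the round-trip time. A minimal sketch of the arithmetic (the numbers in main() are illustrative placeholders, not measurements from the slides):

```java
// Sketch: converting ping-pong timings into latency and throughput.
public class PingPongMetrics {
    // One-way latency is conventionally half the measured round-trip time.
    static double latencyUs(double roundTripUs) {
        return roundTripUs / 2.0;
    }

    // Throughput in Mbit/s for `bytes` transferred one way in `oneWayUs` microseconds.
    static double throughputMbps(long bytes, double oneWayUs) {
        double bits = bytes * 8.0;
        double seconds = oneWayUs / 1_000_000.0;
        return bits / seconds / 1_000_000.0;
    }

    public static void main(String[] args) {
        // e.g. a 250 us round trip for a tiny message, and a 1 MB message
        // that takes 100,000 us (0.1 s) one way
        System.out.println("latency (us): " + latencyUs(250.0));
        System.out.println("throughput (Mbit/s): " + throughputMbps(1_000_000, 100_000.0));
    }
}
```

Small messages mostly reflect latency (fixed per-message overhead), while large messages mostly reflect throughput (sustained bandwidth), which is why the slides that follow plot the two separately.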
Slide 34: Latency on Fast Ethernet

Slide 35: Throughput on Fast Ethernet

Slide 36: Latency on Gigabit Ethernet

Slide 37: Throughput on Gigabit Ethernet

Slide 38: Latency on Myrinet

Slide 39: Throughput on Myrinet