Distributed Memory Programming with MPI

Dilum Bandara · 49 slides · Jan 08, 2024

About This Presentation

Introduction to distributed memory programming with MPI, with C examples of MPI functions, the Single-Program Multiple-Data (SPMD) model, and collective communication.


Slide Content

Distributed Memory Programming with MPI. Slides extended from An Introduction to Parallel Programming by Peter Pacheco. Dilum Bandara, [email protected]. (Slide content © 2010, Elsevier Inc. All rights reserved.)

Distributed Memory Systems. We discuss developing programs for these systems using MPI. MPI – Message Passing Interface – a set of library functions that can be called from C, C++, & Fortran.

Why MPI? Standardized & portable message-passing system. One of the oldest libraries, with widespread adoption. Minimal requirements on underlying hardware. Explicit parallelization: achieves high performance & scales to a large no. of processors, but is intellectually demanding.

Our First MPI Program
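The code figure for this slide is not included in this extract. Below is a minimal C sketch of an MPI hello-world consistent with the compilation and output shown on the next slides; names such as greeting and MAX_STRING are illustrative, not necessarily those on the original slide.

    #include <stdio.h>
    #include <string.h>
    #include <mpi.h>

    const int MAX_STRING = 100;

    int main(void) {
        char greeting[MAX_STRING];
        int comm_sz;   /* number of processes */
        int my_rank;   /* my process rank     */

        MPI_Init(NULL, NULL);
        MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

        if (my_rank != 0) {
            /* every non-zero rank sends a greeting to process 0 */
            sprintf(greeting, "Greetings from process %d of %d!", my_rank, comm_sz);
            MPI_Send(greeting, strlen(greeting) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        } else {
            /* process 0 prints its own greeting, then receives & prints the others */
            printf("Greetings from process %d of %d!\n", my_rank, comm_sz);
            for (int q = 1; q < comm_sz; q++) {
                MPI_Recv(greeting, MAX_STRING, MPI_CHAR, q, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("%s\n", greeting);
            }
        }

        MPI_Finalize();
        return 0;
    }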

Compilation: mpicc -g -Wall -o mpi_hello mpi_hello.c. mpicc is a wrapper script that invokes the compiler; -g produces debugging information; -Wall turns on all warnings; -o mpi_hello creates an executable with this file name (as opposed to the default a.out); mpi_hello.c is the source file.

Execution: mpiexec -n <no of processes> <executable>. For example, mpiexec -n 1 ./mpi_hello runs with 1 process, and mpiexec -n 4 ./mpi_hello runs with 4 processes.

Execution (Cont.). mpiexec -n 1 ./mpi_hello prints: Greetings from process 0 of 1! With mpiexec -n 4 ./mpi_hello the output is: Greetings from process 0 of 4! Greetings from process 1 of 4! Greetings from process 2 of 4! Greetings from process 3 of 4!

MPI Programs. Need to include the mpi.h header file. Identifiers defined by MPI start with “MPI_”; the first letter following the underscore is uppercase for function names & MPI-defined types, which helps to avoid confusion.

6 Golden MPI Functions
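The table on this slide is not reproduced in this extract. Judging from the following slides, the six functions are MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_Send, & MPI_Recv. Their C prototypes are roughly as follows (parameter names are descriptive only):

    int MPI_Init(int *argc_p, char ***argv_p);
    int MPI_Finalize(void);
    int MPI_Comm_size(MPI_Comm comm, int *comm_sz_p);
    int MPI_Comm_rank(MPI_Comm comm, int *my_rank_p);
    int MPI_Send(void *msg_buf_p, int msg_size, MPI_Datatype msg_type,
                 int dest, int tag, MPI_Comm communicator);
    int MPI_Recv(void *msg_buf_p, int buf_size, MPI_Datatype buf_type,
                 int source, int tag, MPI_Comm communicator, MPI_Status *status_p);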

MPI Components. MPI_Init tells MPI to do all necessary setup, e.g., allocate storage for message buffers & decide the rank of each process. argc_p & argv_p are pointers to the argc & argv arguments of main( ). The function returns an error code.

MPI Components (Cont.). MPI_Finalize tells MPI we’re done, so it can clean up anything allocated for this program.

Communicators. A communicator is a collection of processes that can send messages to each other; messages from other communicators are ignored. MPI_Init defines a communicator that consists of all processes created when the program is started, called MPI_COMM_WORLD.

Communicators (Cont.). MPI_Comm_size returns the no. of processes in the communicator; MPI_Comm_rank returns the rank of the process making the call (my rank).

Single-Program Multiple-Data (SPMD). We compile one program. Process 0 does something different: it receives messages & prints them, while the other processes do the work. An if-else construct makes our program SPMD. We can run this program on any no. of processes, e.g., 4, 8, 32, 1000, …

Communication (MPI_Send). msg_buf_p, msg_size, & msg_type determine the content of the message. dest – the destination process’s rank. tag – used to distinguish messages that are identical in content.

Data Types
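The table of predefined MPI datatypes on this slide is not reproduced in this extract. For reference, the standard C-related ones (MPI datatype, corresponding C type) include:

    MPI_CHAR            signed char
    MPI_SHORT           signed short int
    MPI_INT             signed int
    MPI_LONG            signed long int
    MPI_LONG_LONG       signed long long int
    MPI_UNSIGNED_CHAR   unsigned char
    MPI_UNSIGNED_SHORT  unsigned short int
    MPI_UNSIGNED        unsigned int
    MPI_UNSIGNED_LONG   unsigned long int
    MPI_FLOAT           float
    MPI_DOUBLE          double
    MPI_LONG_DOUBLE     long double
    MPI_BYTE, MPI_PACKED  (no single corresponding C type)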

Communication (Cont.) (MPI_Recv). A receiver can pass MPI_ANY_SOURCE as the source argument to accept messages from any sender, in their order of arrival.

Message Matching. A message sent by process q (MPI_Send with dest = r) is received by process r (MPI_Recv with src = q) when the communicator & tag arguments also match.

Receiving Messages. A receiver can get a message without knowing the amount of data in the message, the sender of the message, or the tag of the message. How can those be found out?

How Much Data am I Receiving?
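The code on this slide is not reproduced here. The usual answer is the MPI_Status object filled in by MPI_Recv, together with MPI_Get_count, as in this sketch (recv_and_inspect and its variable names are illustrative):

    #include <stdio.h>
    #include <mpi.h>

    /* Receive up to max_count doubles from any sender; report what actually arrived. */
    void recv_and_inspect(double *recv_buf, int max_count) {
        MPI_Status status;
        int count;

        MPI_Recv(recv_buf, max_count, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);

        MPI_Get_count(&status, MPI_DOUBLE, &count);   /* amount of data in message */
        int sender = status.MPI_SOURCE;               /* sender of message         */
        int tag    = status.MPI_TAG;                  /* tag of message            */

        printf("received %d doubles from process %d with tag %d\n", count, sender, tag);
    }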

Issues With Send & Receive. Exact behavior is determined by the MPI implementation: MPI_Send may behave differently with regard to buffer size, cutoffs, & blocking. Typically there is a cutoff: if the message size is less than the cutoff, the message is buffered; if it is greater than or equal to the cutoff, MPI_Send will block. MPI_Recv always blocks until a matching message is received. MPI preserves message ordering from a given sender. Know your implementation; don’t make assumptions!

Trapezoidal Rule

Trapezoidal Rule (Cont.)
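The figures for these two slides are not reproduced in this extract. The underlying formula is the standard composite trapezoidal rule for approximating an integral with n trapezoids of width h:

    \[
    h = \frac{b-a}{n}, \qquad x_i = a + i\,h, \qquad
    \int_a^b f(x)\,dx \;\approx\; h\left[\frac{f(x_0)}{2} + f(x_1) + f(x_2) + \cdots + f(x_{n-1}) + \frac{f(x_n)}{2}\right].
    \]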

Serial Pseudo-code
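The pseudo-code itself is missing from this extract; a serial C sketch of the computation (function and variable names are illustrative) might look like:

    double f(double x) { return x * x; }   /* example integrand */

    /* Approximate the integral of f over [a, b] using n trapezoids of width h. */
    double Trap(double a, double b, int n, double h) {
        double approx = (f(a) + f(b)) / 2.0;
        for (int i = 1; i <= n - 1; i++)
            approx += f(a + i * h);
        return approx * h;
    }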

Parallel Pseudo-Code

Tasks & Communications for Trapezoidal Rule

First Version

First Version (Cont.)

First Version (Cont.)
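The code on the three “First Version” slides is not reproduced in this extract. A sketch of the overall structure, pieced together from the surrounding slides (names such as Trap, local_int, and the hard-coded a, b, n are illustrative), is roughly:

    #include <stdio.h>
    #include <mpi.h>

    double f(double x) { return x * x; }   /* example integrand */

    double Trap(double a, double b, int n, double h) {   /* serial rule on a sub-interval */
        double approx = (f(a) + f(b)) / 2.0;
        for (int i = 1; i <= n - 1; i++)
            approx += f(a + i * h);
        return approx * h;
    }

    int main(void) {
        int my_rank, comm_sz;
        int n = 1024;                       /* a, b, n hard-coded here for simplicity */
        double a = 0.0, b = 3.0;

        MPI_Init(NULL, NULL);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

        double h = (b - a) / n;             /* width of each trapezoid                */
        int local_n = n / comm_sz;          /* trapezoids per process (assumes comm_sz divides n) */
        double local_a = a + my_rank * local_n * h;   /* this process's sub-interval  */
        double local_b = local_a + local_n * h;
        double local_int = Trap(local_a, local_b, local_n, h);

        if (my_rank != 0) {
            MPI_Send(&local_int, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        } else {
            double total_int = local_int;
            for (int source = 1; source < comm_sz; source++) {
                MPI_Recv(&local_int, 1, MPI_DOUBLE, source, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                total_int += local_int;
            }
            printf("With n = %d trapezoids, the estimate of the integral from %f to %f is %.15e\n",
                   n, a, b, total_int);
        }

        MPI_Finalize();
        return 0;
    }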

Collective Communication

Collective Communication. A tree-structured global sum.

Alternative Tree-Structured Global Sum. Which is optimal? Can we do better?

MPI_Reduce
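The slide’s code is not shown in this extract. MPI_Reduce’s C prototype and a typical global-sum call (variable names are illustrative) are roughly:

    int MPI_Reduce(void *input_data_p, void *output_data_p, int count,
                   MPI_Datatype datatype, MPI_Op op,
                   int dest_process, MPI_Comm comm);

    /* e.g., add every process's local_int into total_int on process 0 */
    MPI_Reduce(&local_int, &total_int, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);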

Predefined Reduction Operators
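The table of operators on this slide is not reproduced in this extract. For reference, the reduction operators predefined by MPI include:

    MPI_MAX     maximum               MPI_LOR     logical or
    MPI_MIN     minimum               MPI_BOR     bitwise or
    MPI_SUM     sum                   MPI_LXOR    logical exclusive or
    MPI_PROD    product               MPI_BXOR    bitwise exclusive or
    MPI_LAND    logical and           MPI_MAXLOC  maximum & location of maximum
    MPI_BAND    bitwise and           MPI_MINLOC  minimum & location of minimum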

Collective vs. Point-to-Point Communications. All processes in the communicator must call the same collective function. e.g., a program that attempts to match a call to MPI_Reduce on one process with a call to MPI_Recv on another process is erroneous; the program will hang or crash. The arguments passed by each process to an MPI collective communication must be “compatible”. e.g., if one process passes in 0 as dest_process & another passes in 1, then the outcome of a call to MPI_Reduce is erroneous; the program is likely to hang or crash.

Collective vs. P-to-P Communications (Cont.). The output_data_p argument is only used on dest_process; however, all of the processes still need to pass in an actual argument corresponding to output_data_p, even if it’s just NULL. Point-to-point communications are matched on the basis of tags & communicators. Collective communications don’t use tags; they are matched solely on the basis of the communicator & the order in which they’re called.

MPI_Allreduce. Useful when all processes need the result of a global sum to complete some larger computation.

MPI_Allreduce (Cont.). A global sum followed by distribution of the result.
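For reference, MPI_Allreduce’s C prototype is like MPI_Reduce’s but with no dest_process argument, since every process receives the result; roughly:

    int MPI_Allreduce(void *input_data_p, void *output_data_p, int count,
                      MPI_Datatype datatype, MPI_Op op, MPI_Comm comm);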

Butterfly-Structured Global Sum. Processes exchange partial results.

Broadcast. Data belonging to a single process is sent to all of the processes in the communicator.
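The slide’s code is not included in this extract. MPI_Bcast’s C prototype and a typical use (broadcasting a problem size n from process 0, as an illustrative example) are roughly:

    int MPI_Bcast(void *data_p, int count, MPI_Datatype datatype,
                  int source_proc, MPI_Comm comm);

    /* every process calls this; afterwards all processes have process 0's value of n */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);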

Tree-Structured Broadcast

Data Distributions – Compute a Vector Sum. Serial implementation.
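The serial code is not reproduced here; a sketch (names are illustrative):

    /* Serial vector addition: x = y + z, all vectors of length n */
    void Vector_sum(double x[], double y[], double z[], int n) {
        for (int i = 0; i < n; i++)
            x[i] = y[i] + z[i];
    }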

Partitioning Options. Block partitioning – assign blocks of consecutive components to each process. Cyclic partitioning – assign components in a round-robin fashion. Block-cyclic partitioning – use a cyclic distribution of blocks of components.

Parallel Implementation
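The parallel code is missing from this extract. With block partitioning, each process simply adds its own local pieces, e.g. (a sketch, names illustrative):

    /* Each process adds its local_n components: local_x = local_y + local_z */
    void Parallel_vector_sum(double local_x[], double local_y[],
                             double local_z[], int local_n) {
        for (int local_i = 0; local_i < local_n; local_i++)
            local_x[local_i] = local_y[local_i] + local_z[local_i];
    }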

MPI_Scatter. Can be used in a function that reads in an entire vector on process 0 but sends only the needed components to each of the other processes.
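MPI_Scatter’s C prototype and a block-distribution call (variable names such as a, local_a, & local_n are illustrative) are roughly:

    int MPI_Scatter(void *send_buf_p, int send_count, MPI_Datatype send_type,
                    void *recv_buf_p, int recv_count, MPI_Datatype recv_type,
                    int src_proc, MPI_Comm comm);

    /* process 0 holds the whole vector a; every process receives its block of local_n elements */
    MPI_Scatter(a, local_n, MPI_DOUBLE,
                local_a, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);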

Reading & Distributing a Vector

MPI_Gather. Collects all of the components of a vector onto process 0; process 0 can then process all of the components.
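MPI_Gather’s C prototype (recv_count is the no. of elements received from each process) is roughly:

    int MPI_Gather(void *send_buf_p, int send_count, MPI_Datatype send_type,
                   void *recv_buf_p, int recv_count, MPI_Datatype recv_type,
                   int dest_proc, MPI_Comm comm);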

MPI_Allgather. Concatenates the contents of each process’s send_buf_p & stores the result in each process’s recv_buf_p. recv_count is the amount of data being received from each process.
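Its C prototype (note there is no dest_proc, since every process receives the concatenated result) is roughly:

    int MPI_Allgather(void *send_buf_p, int send_count, MPI_Datatype send_type,
                      void *recv_buf_p, int recv_count, MPI_Datatype recv_type,
                      MPI_Comm comm);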

Summary. Source: https://computing.llnl.gov/tutorials/mpi/