MHPCC

Message Passing Overview


Note: This tutorial is no longer being maintained. Some information contained in it may be out of date and/or no longer valid. Hyperlinks may also be invalid.

Table of Contents

  1. Message Passing Model
  2. Message Passing Libraries
    1. Terminology
    2. Point-to-Point Communication
    3. Collective Communication
    4. Performance Guidelines
  3. Library Comparison
    1. SP2 Performance Results
    2. Recommendations
  4. Acknowledgements and References


Message Passing Model


The message passing model is one of several computational models for conceptualizing program operations. The message passing model is defined as:

Other models include:

Note: these models are machine/architecture independent; any of the models can be implemented on any hardware given appropriate operating system support. An effective implementation is one which closely matches its target hardware.


Message Passing Model (Cont.)


The message passing model has gained wide use in the field of parallel computing due to advantages that include:

The principal drawback of message passing is the responsibility it places on the programmer. The programmer must explicitly implement a data distribution scheme and all interprocess communication and synchronization. In so doing, it is the programmer's responsibility to resolve data dependencies and avoid deadlock and race conditions.


Message Passing Libraries


The set of communication operations allowed by an implementation of the message passing model forms the components of a message passing library. Examples of message passing libraries include public domain packages that do not target a specific machine (PICL, PVM, PARMACS, P4, MPI, etc.) as well as machine-dependent vendor implementations (MPL, NX, CMMD, etc.).

The common components of message passing libraries include:

Until recently, users of message passing libraries had to choose between using public domain packages for improved code portability, and vendor implementations for improved performance on a given machine. The MPI (Message Passing Interface) Library has been developed to meet the dual goals of portability and performance on a wide range of machines.


Message Passing Libraries: Terminology


The terminology used by the various message passing libraries has not been standardized. Occasionally, communication routines from different libraries that have the same semantics are described in conflicting terms. (Example: pvm_send() from the PVM library and csend() from the NX library.) To promote standardization, the terminology used by the MPI standard is presented here.

Buffering

Temporary copying of messages performed by the system as part of its transmission protocols. Copying occurs between user buffer space (defined by the process) and system buffer space (defined by the library).

Blocking Communication

A communication routine is blocking if the completion of the call is dependent on certain "events". For sends, the data must be successfully sent or safely copied so that the buffer that contained the data is available for reuse. For receives, the data must be safely stored in the receive buffer so that it is ready for use.

Nonblocking Communication

A communication routine is nonblocking if the call returns without waiting for any communication events to complete (such as copying of the message from user memory to system memory, or arrival of the message).

Synchronous Communication

Communication in which the sender does not return until the matching receive has been posted on the destination process.

Asynchronous Communication

Communication in which the sender and receiver place no constraints on each other in terms of completion.
Note: MPI uses nonblocking routines to provide this capability.
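
The distinctions above can be sketched with a toy channel. This is a minimal illustration only, assuming Python threads stand in for processes and a queue.Queue stands in for system buffer space. The Channel class and its method names are hypothetical and do not belong to MPI or to any library named here. The synchronous send is approximated by waiting until the receiver has actually taken the message, which is slightly stronger than waiting for the receive to be posted.

```python
import queue
import threading
import time


class Channel:
    """Toy one-way channel (hypothetical); the queue models system buffer space."""

    def __init__(self, capacity):
        self.system_buffer = queue.Queue(maxsize=capacity)

    def buffered_send(self, user_buffer):
        # Blocking send: returns once the data has been copied out of the
        # user buffer, so the caller may reuse that buffer immediately.
        self.system_buffer.put(list(user_buffer))

    def synchronous_send(self, user_buffer):
        # Returns only after the receiver has taken the message.
        self.system_buffer.put(list(user_buffer))
        self.system_buffer.join()

    def nonblocking_send(self, user_buffer):
        # Returns at once with a handle; the copy happens in the background,
        # so the caller must wait on the handle before reusing user_buffer.
        done = threading.Event()

        def transfer():
            self.system_buffer.put(list(user_buffer))
            done.set()

        threading.Thread(target=transfer, daemon=True).start()
        return done

    def recv(self):
        # Blocking receive: returns once the message is ready for use.
        message = self.system_buffer.get()
        self.system_buffer.task_done()
        return message


def demo():
    ch = Channel(capacity=4)

    data = [1, 2, 3]
    ch.buffered_send(data)
    data[0] = 99                      # safe: the message was already copied
    first = ch.recv()

    posted = threading.Event()

    def receiver():
        time.sleep(0.05)
        posted.set()                  # the receive is about to be posted
        ch.recv()

    t = threading.Thread(target=receiver)
    t.start()
    ch.synchronous_send([4, 5, 6])
    receive_was_posted = posted.is_set()
    t.join()

    handle = ch.nonblocking_send([7, 8, 9])
    handle.wait()                     # completion check before buffer reuse
    third = ch.recv()
    return first, receive_was_posted, third
```

In this sketch the buffered send lets the caller overwrite its data right away, while the synchronous send cannot return before the receiving thread has reached its receive.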


Message Passing Libraries: Point-to-Point Communication


The basic components of any message passing library are its point-to-point communications routines for data transfer between two processes, the send and receive operations.

The basic send and receive operations look like:

where

Typical message passing libraries subdivide the basic sends and receives into two types:


Point-to-Point Communication (Cont.)


Blocking Routines

Blocking routines are the simplest but can be "unsafe":

    Process 0          Process 1
    ---------          ---------
     bsend(1)           bsend(0)
     brecv(1)           brecv(0)

Completion depends on the size of the message and the amount of system buffering. If the message size exceeds the system buffer space, the sends may not complete and the receives will not be reached. This situation is called deadlock, where two or more processes cannot proceed with computation because they depend on each other for a result they cannot get.

Note: Unsafe programs should be viewed as incorrect, and steps should be taken to ensure program correctness.
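
The dependence on buffer space can be reproduced in a small simulation. The sketch below is not written against any real message passing library: it assumes two Python threads stand in for the two processes, a bounded queue.Queue stands in for each process's system buffer, and a message is modelled as a number of chunks. The function name unsafe_exchange and the timeout are illustrative assumptions; the timeout exists only so the sketch can report a deadlock instead of hanging.

```python
import queue
import threading


def unsafe_exchange(message_chunks, buffer_chunks, timeout=0.5):
    """Both processes send first, then receive; return True if both finish.

    buffer_chunks must be at least 1 (queue.Queue treats 0 as unbounded).
    """
    # inbox[r] models the bounded system buffer holding messages for rank r.
    inbox = [queue.Queue(maxsize=buffer_chunks) for _ in range(2)]
    finished = [False, False]

    def process(rank):
        other = 1 - rank
        try:
            for chunk in range(message_chunks):        # bsend(other)
                inbox[other].put(chunk, timeout=timeout)
        except queue.Full:
            # A real blocking send would wait forever here; the timeout
            # only lets the sketch report the deadlock and exit.
            return
        for _ in range(message_chunks):                # brecv(other)
            inbox[rank].get()
        finished[rank] = True

    threads = [threading.Thread(target=process, args=(r,)) for r in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(finished)
```

With a message of 2 chunks and room for 4, both sends complete and the exchange finishes. With a message of 8 chunks and the same buffer, neither send can complete, so neither receive is ever reached.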


Point-to-Point Communication (Cont.)


Possible solutions to unsafe programs
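
One commonly cited remedy is to order the calls so that one process receives while the other sends. The sketch below illustrates that idea only, assuming Python threads in place of processes and bounded queues in place of system buffers; ordered_exchange and its helpers are hypothetical names, not routines from any library discussed here.

```python
import queue
import threading


def ordered_exchange(message_chunks, buffer_chunks):
    """Rank 0 sends then receives; rank 1 receives then sends."""
    # inbox[r] models the bounded system buffer holding messages for rank r.
    inbox = [queue.Queue(maxsize=buffer_chunks) for _ in range(2)]
    received = [[], []]

    def send_all(rank, dest):
        for chunk in range(message_chunks):
            inbox[dest].put((rank, chunk))     # blocks while the buffer is full

    def recv_all(rank):
        for _ in range(message_chunks):
            received[rank].append(inbox[rank].get())

    def process(rank):
        other = 1 - rank
        if rank == 0:
            send_all(rank, other)
            recv_all(rank)
        else:
            # Receiving first drains rank 0's message, so its send can
            # complete even when the message exceeds the buffer space.
            recv_all(rank)
            send_all(rank, other)

    threads = [threading.Thread(target=process, args=(r,)) for r in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Because the two sides never both wait in a send at the same time, the exchange completes regardless of how the message size compares with the buffer size.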


Point-to-Point Communication (Cont.)


Nonblocking sends and receives:
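
A nonblocking send returns a handle immediately, and the caller later waits on that handle before reusing the send buffer. The sketch below shows how this lets both processes run the same code without deadlock. It is an illustration under stated assumptions: Python threads stand in for processes, bounded queues stand in for system buffers, and the names isend and nonblocking_exchange are hypothetical.

```python
import queue
import threading


def nonblocking_exchange(message_chunks, buffer_chunks):
    """Both ranks start a nonblocking send, receive, then wait on the send."""
    # inbox[r] models the bounded system buffer holding messages for rank r.
    inbox = [queue.Queue(maxsize=buffer_chunks) for _ in range(2)]
    received = [[], []]

    def isend(rank, dest):
        """Start the transfer in the background and return a handle."""
        def transfer():
            for chunk in range(message_chunks):
                inbox[dest].put((rank, chunk))
        handle = threading.Thread(target=transfer)
        handle.start()
        return handle

    def process(rank):
        other = 1 - rank
        handle = isend(rank, other)            # returns immediately
        for _ in range(message_chunks):        # blocking receive
            received[rank].append(inbox[rank].get())
        handle.join()                          # wait: send has completed

    threads = [threading.Thread(target=process, args=(r,)) for r in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Each rank reaches its receive while its own send is still in progress, so the two transfers drain each other even when the message is larger than the buffer.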


Point-to-Point Communication (Cont.)


Some libraries have additional send operations, such as

These may be available in both blocking and nonblocking forms.

The synchronous and buffered send routines can have a negative impact on performance and should generally be avoided:

The basic point-to-point communication routines in many libraries are designed to handle data stored contiguously in memory. In this case, there are usually additional routines to handle non-contiguous data.
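
When no such routine is used, non-contiguous data is typically gathered into a contiguous buffer before sending and scattered back after receiving. The sketch below shows the idea for one column of a matrix stored row by row in a flat list. The function names pack_column and unpack_column are illustrative assumptions, not routines from any particular library.

```python
def pack_column(flat_matrix, n_cols, col):
    """Gather one column of a row-major matrix into a contiguous list."""
    # Column elements sit n_cols apart in memory, starting at offset col.
    return flat_matrix[col::n_cols]


def unpack_column(packed, flat_matrix, n_cols, col):
    """Scatter a contiguous list back into one column of a row-major matrix."""
    # The strided slice must have exactly len(packed) elements.
    flat_matrix[col::n_cols] = packed
```

For a 3 x 4 matrix holding the values 0 through 11, packing column 1 yields the three values 1, 5, and 9 in one contiguous list that a basic send routine could transfer.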


Message Passing Libraries: Collective Communication


Collective operations are coordinated among a "group" of processes
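
Two familiar collective operations, a broadcast and a sum reduction, can be sketched on top of point-to-point transfers. This is a simplified illustration, assuming Python threads stand in for the processes of the group and unbounded queues stand in for message delivery. The function run_group, the broadcast value 10, and the naive root-to-all pattern are assumptions made for the example; real libraries generally use more efficient communication patterns.

```python
import queue
import threading


def run_group(size, root=0):
    """Broadcast a value from root, then sum-reduce value + rank to root."""
    inbox = [queue.Queue() for _ in range(size)]
    values = [None] * size
    total = []

    def process(rank):
        # Broadcast: root sends the same value to every other process.
        if rank == root:
            value = 10
            for dest in range(size):
                if dest != root:
                    inbox[dest].put(value)
        else:
            value = inbox[rank].get()
        values[rank] = value

        # Reduce: every process contributes; root accumulates the sum.
        contribution = value + rank
        if rank == root:
            acc = contribution
            for _ in range(size - 1):
                acc += inbox[root].get()
            total.append(acc)
        else:
            inbox[root].put(contribution)

    threads = [threading.Thread(target=process, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return values, total[0]
```

Every process in the group takes part in each operation, which is what distinguishes a collective call from a point-to-point one.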


Message Passing Libraries: Performance Guidelines


There are some basic steps that can be taken to improve the performance of programs written using a message passing library.


Library Comparison: SP2 Performance Results


The three main criteria for choosing a message passing library are:

Latency (the time to transmit a zero-length message) and bandwidth vary greatly from machine to machine. The best performance on a specific machine is typically obtained from the native message passing library written specifically for that machine.

Measured latency and bandwidth for the MPI, MPL, PVMe and PVM libraries on the SP2.

    Message Passing Library                 Latency (usecs)   Bandwidth (MB/sec)   Portable?
    -------------------------------------   ---------------   ------------------   ---------
    MPI                                                  43                   34   yes
    MPL                                                  45                   34   no
    MPICH                                                58                   33   yes
    PVMe (interrupts off)                                83                   31   yes
    PVMe                                                220                   27   yes
    PVM (RouteDirect w/in place packing)                642                   12   yes
    PVM (DontRoute w/in place packing)                 1450                    3   yes
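
Latency and bandwidth figures like these are often combined into a rough estimate of transfer time. The sketch below assumes the simple linear model time = latency + size / bandwidth and takes 1 MB to mean 10^6 bytes; both are assumptions made for illustration, not statements about how the measurements above were taken. The function name transfer_time_us is hypothetical.

```python
def transfer_time_us(message_bytes, latency_us, bandwidth_mb_per_s):
    """Estimate transfer time in microseconds under a linear cost model.

    Assumes 1 MB = 10**6 bytes, so 1 MB/sec equals 1 byte per microsecond
    and bytes divided by MB/sec comes out directly in microseconds.
    """
    return latency_us + message_bytes / bandwidth_mb_per_s
```

Under this model, a 3400-byte message with the MPI figures (43 usecs, 34 MB/sec) is estimated at 143 usecs. For short messages the latency term dominates, which is why the libraries with high latency compare poorly even where their bandwidth is similar.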


Library Comparison: Recommendations


The following recommendations concern the choice of MPL, PVM, or MPI on the IBM SP2.

MPL
====
Message Passing Library - SP2 native message passing library


Library Comparison: Recommendations (Cont.)


PVM
====
Parallel Virtual Machine - public domain package available from Oak Ridge National Lab


Library Comparison: Recommendations (Cont.)


MPI
====
Message Passing Interface


Acknowledgements and References



Maui High Performance Computing Center. All rights reserved.

Revised: 20 August 1997 webmaster@mhpcc.hpc.mil