MPI Performance Topics Exercise


  1. Log in to the SP machine

    Make sure that you are logged into your assigned SP node with your assigned userid for this exercise. Ask the instructor if you have any questions.

  2. Verify the environment variable $WORKSHOP

    The $WORKSHOP variable defines the root location for the workshop files, and may vary from workshop to workshop. Find out if this has already been set up for your workshop:

    echo $WORKSHOP

    If this environment variable is not set, check with the instructor for the correct location. Then, depending upon your shell, set $WORKSHOP:

    csh/tcsh
    setenv WORKSHOP instructor/specified/path
    
    bsh/ksh
    WORKSHOP=instructor/specified/path; export WORKSHOP
    

  3. Copy the example files

    In your SP home directory, create a subdirectory for the MPI Performance Topics test codes and cd to it.

    mkdir ~/mpi_performance
    cd ~/mpi_performance

    Then, copy the exercise files to your mpi_performance subdirectory (only C examples are available):

    cp $WORKSHOP/mpi_performance/samples/* ~/mpi_performance

  4. List the contents of your mpi_performance subdirectory

    You should notice the following files:

    Example File       Description
    buffsend.c         Buffered sends (not used)
    datatypes.c        Derived datatypes
    datatypes2.c       Derived datatype example
    eager_vs_rend.c    Eager and rendezvous protocols
    msgsize.c          Effect of message size on bandwidth
    persist.c          Persistent communications (not used)
    persist2.c         Persistent communications (not used)
    unsafe.c           Unsafe program due to dependence upon system buffer space

  5. System buffer depletion: unsafe MPI program

    The unsafe.c sample code demonstrates dependence upon limited MPI system buffer space. Review the code, and then compile it with mpcc:

    mpcc unsafe.c -o unsafe

    For the first execution, run the code using two tasks with IP communications, accepting the default system buffer space (approx. 3 MB):

    unsafe -procs 2 -euilib ip

    It should fail after several iterations.

    Now try running the same code with the maximum IBM MPI system buffer space:

    unsafe -procs 2 -euilib ip -buffer_mem 64000000

    What happens this time? Feel free to terminate the execution (CTRL-C) after you're convinced of what is happening.

  6. Eager vs. rendezvous protocols

    Review the eager_vs_rend.c sample code, and then compile it with mpcc.

    mpcc eager_vs_rend.c -o eager_vs_rend

    For the first execution, run your executable with the eager limit set to the maximum message size for the eager protocol:

    eager_vs_rend -procs 2 -euilib ip -eager_limit 64000

    Note the average times for each message size.

    Now try running the code with a small eager limit:

    eager_vs_rend -procs 2 -euilib ip -eager_limit 2000

    Note the difference in timings between this execution and the first one. Which execution shows better performance, and for which message sizes? Why? Try experimenting with several other eager limit sizes if desired.
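    As an aid to interpreting the timings, the following is a minimal sketch of how an eager limit typically divides messages between the two protocols. It is not taken from eager_vs_rend.c; the names protocol_t, choose_protocol and trips_to_deliver are hypothetical, and the sketch assumes that a message no larger than the eager limit is sent immediately, while a larger one first exchanges a request and an acknowledgement with the receiver.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; not part of eager_vs_rend.c. */
typedef enum { PROTO_EAGER, PROTO_RENDEZVOUS } protocol_t;

/* Assumption: messages no larger than the eager limit are sent eagerly;
   anything larger must wait for a handshake with the receiver. */
protocol_t choose_protocol(size_t msg_bytes, size_t eager_limit)
{
    return (msg_bytes <= eager_limit) ? PROTO_EAGER : PROTO_RENDEZVOUS;
}

/* One-way network trips before the payload has been delivered:
   eager sends the data right away (1 trip); rendezvous sends a request,
   waits for the acknowledgement, and then sends the data (3 trips). */
int trips_to_deliver(size_t msg_bytes, size_t eager_limit)
{
    return choose_protocol(msg_bytes, eager_limit) == PROTO_EAGER ? 1 : 3;
}
```

    Under this model, a 32000-byte message needs one trip with -eager_limit 64000 but three trips with -eager_limit 2000, which is one plausible reason for the timings to differ most for messages that fall between the two limits.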

  7. Effect of message size on bandwidth

    The MPI Performance Topics tutorial provided several graphs that demonstrated the significant effect of message size on communication bandwidth. Using the msgsize.c sample code, demonstrate this for yourself.

    Review the code and notice the section at the top where the START and FINISH constants are set. These define the starting and ending message sizes which will be tested. Then compile the sample code:

    mpcc msgsize.c -o msgsize

    Run the executable with 2 tasks and IP communications, and notice the output:

    msgsize -procs 2 -euilib ip

    Now edit the msgsize.c file and change the START and FINISH constants to reflect a range of larger sized messages. You may also wish to change the INCR constant proportionately. Compile and run the code, again observing the output.

    Try other ranges of message sizes if desired. If you try message sizes of 1 MB or larger, you should notice that the bandwidth levels off. The greatest differences in bandwidth occur with smaller messages.
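    The shape of the curve can be reasoned about with a simple cost model in which every message pays a fixed startup latency plus a per-byte transfer cost. The sketch below is an assumption for illustration, not the method used in msgsize.c; the function names and any latency or peak bandwidth values passed to them are hypothetical rather than measurements from the SP.

```c
/* Simple cost model (an assumption for illustration):
   time = fixed latency + bytes / peak bandwidth. */
double transfer_time(double bytes, double latency_s, double peak_bytes_per_s)
{
    return latency_s + bytes / peak_bytes_per_s;
}

/* Bandwidth actually achieved for one message of the given size.
   Small messages are dominated by latency; large messages approach
   the peak bandwidth but never exceed it. */
double effective_bandwidth(double bytes, double latency_s, double peak_bytes_per_s)
{
    return bytes / transfer_time(bytes, latency_s, peak_bytes_per_s);
}
```

    With an assumed latency of 100 microseconds and an assumed peak of 100 MB/s, a 1000-byte message achieves less than a tenth of the peak, a 10000-byte message about half, and a 10 MB message more than 99 percent, which matches the leveling off described above.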

  8. Derived datatypes: Case 1

    The datatypes2.c sample code demonstrates a clear case where the use of derived datatypes can significantly improve performance.

    Review the code and note how the same data is being sent by two different methods. The first method employs an MPI derived datatype. The second method does individual sends/receives instead. Then, compile the code:

    mpcc datatypes2.c -o datatypes2

    Run the executable with 2 tasks and IP communications, and notice the output.

    datatypes2 -procs 2 -euilib ip

  9. Derived datatypes: Case 2

    The datatypes.c sample code is slightly more complex than the previous example in that it demonstrates four different ways of sending the same data. It also demonstrates that the use of derived datatypes may not always be optimal. Review the code and note the four different ways the data is being sent:

    1. Vector derived datatype
    2. Struct derived datatype
    3. User "hand" packing
    4. Individual sends/receives

    Use mpcc to compile this sample code:

    mpcc datatypes.c -o datatypes

    Then run the executable with 2 tasks and IP communications.

    datatypes -procs 2 -euilib ip

    NOTE: The results for the first three methods will appear quickly; however, the fourth method (worst case) will take several minutes to complete.
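    For readers unfamiliar with the third method, user "hand" packing means copying non-contiguous elements into a contiguous buffer so that they can travel in a single message, and copying them back out on the receiving side. The following is a minimal sketch of that idea for elements spaced a fixed stride apart. The function names are hypothetical, and the data layout used in datatypes.c may differ.

```c
#include <stddef.h>

/* Hypothetical helper: copy `count` doubles spaced `stride` elements
   apart in `src` into the contiguous buffer `packed`, which could then
   be sent as one contiguous message. */
void pack_strided(const double *src, size_t count, size_t stride, double *packed)
{
    for (size_t i = 0; i < count; i++)
        packed[i] = src[i * stride];
}

/* Hypothetical helper: the inverse operation on the receiving side,
   scattering the contiguous buffer back into strided positions. */
void unpack_strided(const double *packed, size_t count, size_t stride, double *dst)
{
    for (size_t i = 0; i < count; i++)
        dst[i * stride] = packed[i];
}
```

    For example, column 1 of a 3 x 4 row-major array a can be packed with pack_strided(a + 1, 3, 4, buf). The copy costs CPU time on both sides, which is the trade-off to keep in mind when comparing this method with the derived datatype methods.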

This concludes the MPI Performance Topics lab exercise.