MPI Performance Topics Exercise
Make sure that you are logged into your assigned SP node with your assigned userid for this exercise. Ask the instructor if you have any questions.
The $WORKSHOP variable defines the root location for the workshop files, and may vary from workshop to workshop. Find out whether it has already been set up for your workshop:
echo $WORKSHOP
If this environment variable is not set, check with the instructor for the correct location. Then, depending upon your shell, set $WORKSHOP:
| Shell | Command |
|---|---|
| csh/tcsh | setenv WORKSHOP instructor/specified/path |
| bsh/ksh | export WORKSHOP=instructor/specified/path |
In your SP home directory, create a subdirectory for the MPI Performance Topics test codes and cd to it.
mkdir ~/mpi_performance
cd ~/mpi_performance
Then, copy the exercise files to your mpi_performance subdirectory (only C examples are available):
cp $WORKSHOP/mpi_performance/samples/* ~/mpi_performance
You should notice the following files:
| Example File | Description |
|---|---|
| buffsend.c | Buffered sends (not used) |
| datatypes.c | Derived datatypes |
| datatypes2.c | Derived datatype example |
| eager_vs_rend.c | Eager and rendezvous protocols |
| msgsize.c | Effect of message size on bandwidth |
| persist.c | Persistent communications (not used) |
| persist2.c | Persistent communications (not used) |
| unsafe.c | Unsafe program due to dependence upon system buffer space |
The unsafe.c sample code demonstrates dependence upon limited MPI system buffer space. Review the code, and then compile it with mpcc:
mpcc unsafe.c -o unsafe
For the first execution, run the code using two tasks with IP communications, accepting the default system buffer space (approx. 3 MB):
unsafe -procs 2 -euilib ip
It should fail after several iterations.
Now try running the same code with the maximum IBM MPI system buffer space:
unsafe -procs 2 -euilib ip -buffer_mem 64000000
What happens this time? Feel free to terminate the execution (CTRL-C) after you're convinced of what is happening.
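As a rough aid for thinking about buffer dependence, the following C sketch simulates a sender that posts messages faster than a receiver consumes them. It is not taken from unsafe.c: the function name `first_overflow_iteration`, the message size, and the receive rate are all hypothetical, and real MPI buffering is more involved than this counter.

```c
#include <assert.h>

/*
 * Toy model (an assumption, not the logic of unsafe.c):
 * each iteration the sender buffers one message of msg_bytes;
 * the receiver drains one message every recv_every iterations.
 * Returns the first iteration at which the buffered bytes exceed
 * buffer_bytes, or -1 if that never happens within max_iters.
 */
long first_overflow_iteration(long buffer_bytes, long msg_bytes,
                              int recv_every, long max_iters)
{
    assert(buffer_bytes > 0 && msg_bytes > 0);
    assert(recv_every > 0 && max_iters > 0);

    long buffered = 0;
    for (long i = 1; i <= max_iters; i++) {
        buffered += msg_bytes;          /* sender posts one message */
        if (buffered > buffer_bytes)
            return i;                   /* buffer space exhausted */
        if (i % recv_every == 0)
            buffered -= msg_bytes;      /* receiver consumes one message */
    }
    return -1;                          /* no overflow observed */
}
```

In this toy model, a larger buffer postpones the overflow but does not remove the imbalance between sender and receiver; only a receiver that keeps pace (here, `recv_every` equal to 1) avoids it.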
Review the eager_vs_rend.c sample code, and then compile it with mpcc.
mpcc eager_vs_rend.c -o eager_vs_rend
For the first execution, run your executable with the maximum message size for eager protocol:
eager_vs_rend -procs 2 -euilib ip -eager_limit 64000
Note the average times for each message size.
Now try running the code with a small eager limit:
eager_vs_rend -procs 2 -euilib ip -eager_limit 2000
Note the difference in timings between this execution and the first one. Which ones reflect better performance? Why? Try experimenting with several other eager limit sizes if desired.
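One way to reason about these timings is a simple cost model, sketched below in C. It assumes that messages no larger than the eager limit are sent immediately, while larger messages first complete a handshake with the receiver. The names `select_protocol` and `modeled_send_time`, the exact boundary test, and the three-latency handshake cost are illustrative assumptions, not values taken from eager_vs_rend.c or from the IBM MPI library.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PROTO_EAGER, PROTO_RENDEZVOUS } protocol_t;

/* Assumed rule: messages up to the eager limit go eagerly. */
protocol_t select_protocol(size_t msg_bytes, size_t eager_limit)
{
    return (msg_bytes <= eager_limit) ? PROTO_EAGER : PROTO_RENDEZVOUS;
}

/*
 * Hypothetical cost model:
 *   eager:      one latency plus the time on the wire
 *   rendezvous: request + acknowledgement + data, so three latencies
 *               plus the time on the wire
 */
double modeled_send_time(size_t msg_bytes, size_t eager_limit,
                         double latency_s, double bytes_per_s)
{
    assert(latency_s >= 0.0 && bytes_per_s > 0.0);

    double wire_time = (double)msg_bytes / bytes_per_s;
    if (select_protocol(msg_bytes, eager_limit) == PROTO_EAGER)
        return latency_s + wire_time;
    return 3.0 * latency_s + wire_time;
}
```

The model only suggests where handshake overhead could appear; it ignores the buffer memory that eager sends consume on the receiver, so it should not be read as a prediction of the measured averages.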
The MPI Performance Topics tutorial provided several graphs which demonstrated the significant effect of message size on communications bandwidth. Using the msgsize.c sample code, demonstrate this for yourself.
Review the code and notice the section at the top where the START and FINISH constants are set. These define the starting and ending message sizes which will be tested. Then compile the sample code:
mpcc msgsize.c -o msgsize
Run the executable with 2 tasks and IP communications, and notice the output:
msgsize -procs 2 -euilib ip
Now edit the msgsize.c file and change the START and FINISH constants to cover a range of larger messages. You may also wish to change the INCR constant proportionately. Compile and run the code, again observing the output.
Try other ranges of message sizes if desired. If you try message sizes of 1 MB and larger, you should notice that the bandwidth levels off. The greatest differences in bandwidth occur with smaller messages.
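The leveling off can be illustrated with a common first-order model in which transfer time is a fixed latency plus the message size divided by a peak rate. The C sketch below is not part of msgsize.c; the function names, the latency, and the peak bandwidth are assumed values chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical model: time = latency + bytes / peak bandwidth.
 * Effective bandwidth is then bytes divided by that time.
 */
double effective_bandwidth(double bytes, double latency_s,
                           double peak_bytes_per_s)
{
    assert(bytes > 0.0 && latency_s >= 0.0 && peak_bytes_per_s > 0.0);

    double transfer_time = latency_s + bytes / peak_bytes_per_s;
    return bytes / transfer_time;
}

/* Print the modeled bandwidth for message sizes doubling from start to finish. */
void print_bandwidth_table(double start, double finish,
                           double latency_s, double peak_bytes_per_s)
{
    for (double n = start; n <= finish; n *= 2.0) {
        double bw = effective_bandwidth(n, latency_s, peak_bytes_per_s);
        printf("%10.0f bytes  %8.2f MB/s\n", n, bw / 1.0e6);
    }
}
```

For example, calling `print_bandwidth_table(1024.0, 4194304.0, 1.0e-4, 1.0e8)` prints modeled bandwidths that rise steeply for small sizes and flatten near the assumed peak, because the fixed latency dominates small transfers and becomes negligible for large ones.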
The datatypes2.c sample code demonstrates a clear case where the use of derived datatypes can significantly improve performance.
Review the code and note how the same data is being sent by two different methods. The first method employs an MPI derived datatype. The second method does individual sends/receives instead. Then, compile the code:
mpcc datatypes2.c -o datatypes2
Run the executable with 2 tasks and IP communications, and notice the output.
datatypes2 -procs 2 -euilib ip
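A plausible reason for the difference is per-message overhead: many small sends each pay a startup cost, while a single send of a derived datatype pays it once. The C sketch below compares the two under that assumption. The function names and all parameters are hypothetical, and the model ignores any cost of packing non-contiguous data, which is one way a derived datatype can lose its advantage.

```c
#include <assert.h>

/* Modeled time for `count` separate sends of elem_bytes each:
 * every send pays the startup latency. */
double time_individual_sends(int count, double elem_bytes,
                             double latency_s, double bytes_per_s)
{
    assert(count > 0 && elem_bytes > 0.0);
    assert(latency_s >= 0.0 && bytes_per_s > 0.0);

    return count * (latency_s + elem_bytes / bytes_per_s);
}

/* Modeled time for one send carrying all `count` elements:
 * the startup latency is paid once. Packing cost is ignored (assumption). */
double time_single_send(int count, double elem_bytes,
                        double latency_s, double bytes_per_s)
{
    assert(count > 0 && elem_bytes > 0.0);
    assert(latency_s >= 0.0 && bytes_per_s > 0.0);

    return latency_s + (count * elem_bytes) / bytes_per_s;
}
```

With 1000 eight-byte elements, a 100 microsecond latency, and 100 MB/s, the modeled individual sends take roughly 0.1 s against roughly 0.0002 s for the single send; the measured ratio in the exercise depends on the actual system.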
The datatypes.c sample code is slightly more complex than the previous example: it demonstrates four different ways of sending the same data. It also shows that derived datatypes are not always the optimal choice. Review the code and note the four different ways the data is being sent.
Use mpcc to compile this sample code:
mpcc datatypes.c -o datatypes
Then run the executable with 2 tasks and IP communications.
datatypes -procs 2 -euilib ip
NOTE: The results for the first three methods will appear quickly; however, the fourth method (worst case) will take several minutes to complete.
This concludes the MPI Performance Topics lab exercise.