| Parallel Operating Environment (POE) Exercise |
Make sure that you are logged into your assigned SP node with your assigned userid for this exercise. Ask the instructor if you have any questions.
The $WORKSHOP variable defines the root location for the workshop files, and may vary from workshop to workshop. Find out whether it has already been set up for your workshop:
echo $WORKSHOP
If this environment variable is not set, check with the instructor for the correct location. Then, depending upon your shell, set $WORKSHOP:
| csh/tcsh | setenv WORKSHOP instructor/specified/path |
| bsh/ksh | export WORKSHOP=instructor/specified/path |
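For ksh/bsh users, the check and the assignment can be combined so that an existing value is never overwritten. This is only a minimal sketch: `instructor/specified/path` is the same placeholder used in the table above and must be replaced with the location your instructor gives you.

```shell
# Keep an existing WORKSHOP value; otherwise fall back to the placeholder.
# Replace the placeholder with the path supplied by your instructor.
: "${WORKSHOP:=instructor/specified/path}"
export WORKSHOP
echo "WORKSHOP is set to: $WORKSHOP"
```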
Your instructor should have already explained the overall configuration of the system you are using, noting which nodes/pools are available for workshop use.
Find out which type of AIX authorization is being used for the nodes your job will run on.
When done, make sure that the file resides in your home directory, is called .rhosts, and has write permission for your userid only.
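One way to satisfy the permission requirement is sketched below in ksh/bsh syntax. The function name `secure_rhosts` is hypothetical and not part of POE. It applies mode 600, which is an assumption on our part: that mode gives your userid read and write access and removes all access for everyone else, which is stricter than the requirement stated above.

```shell
# Hypothetical helper (not part of POE): make a file readable and
# writable by its owner only, then show the resulting permissions.
# With no argument it acts on .rhosts in your home directory.
secure_rhosts() {
    file=${1:-$HOME/.rhosts}
    if [ ! -f "$file" ]; then
        echo "$file does not exist" >&2
        return 1
    fi
    chmod 600 "$file"
    ls -l "$file"
}
```

Calling `secure_rhosts` with no argument checks and fixes ~/.rhosts. The permission string printed by ls should begin with `-rw-------`.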
mkdir ~/poe
cd ~/poe
| C: | cp $WORKSHOP/poe/samples/C/* ~/poe |
| Fortran: | cp $WORKSHOP/poe/samples/Fortran/* ~/poe |
You should notice two files:
| Language | File Name | Description |
|---|---|---|
| C: | poe_hello.c | Simple MPI program which prints a task's rank and hostname. C version. |
| C: | poe_bandwidth.c | An MPI communications bandwidth test between two nodes. C version. |
| Fortran: | poe_hello.f | Simple MPI program which prints a task's rank and hostname. Fortran version. |
| Fortran: | poe_bandwidth.f | An MPI communications bandwidth test between two nodes. Fortran version. |
Depending upon your language preference, use one of the IBM parallel compilers to compile the poe_hello program.
| C: | mpcc -o poe_hello poe_hello.c |
| Fortran: | mpxlf -o poe_hello poe_hello.f |
In this step you'll set a few POE environment variables. Specifically, those which answer the three questions:
Depending upon your shell, set the following environment variables as shown:
| Description | csh/tcsh | ksh/bsh |
|---|---|---|
| Request 4 nodes for 4 tasks | setenv MP_PROCS 4 | export MP_PROCS=4 |
| Non-specific allocation (let the Resource Manager do it) | setenv MP_RESD yes | export MP_RESD=yes |
| Set poolid to the workshop nodes pool number. Use the | setenv MP_RMPOOL poolid | export MP_RMPOOL=poolid |
| Use IP communications since you're running interactively with other users | setenv MP_EUILIB ip | export MP_EUILIB=ip |
| Use the high performance switch network interface | setenv MP_EUIDEVICE css0 | export MP_EUIDEVICE=css0 |
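For convenience, the ksh/bsh column of the table can be entered as one block and then verified. This is a sketch only: `poolid` is the placeholder from the table and must be replaced with the pool number for your workshop nodes. The final line uses standard `env`, `grep`, and `sort` to list the MP_ variables; it is not a POE command.

```shell
# ksh/bsh syntax. "poolid" is a placeholder; replace it with the pool
# number for your workshop nodes.
export MP_PROCS=4
export MP_RESD=yes
export MP_RMPOOL=poolid
export MP_EUILIB=ip
export MP_EUIDEVICE=css0

# List every MP_ variable now in the environment to confirm the settings.
env | grep '^MP_' | sort
```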
poe_hello
Total number of tasks = 4
Hello! From task 1 on host node1.abc.edu
Hello! From task 2 on host node8.abc.edu
Hello! From task 3 on host node23.abc.edu
Hello! From task 0 on host node3.abc.edu
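Notice that the tasks in the sample output do not report in rank order (1, 2, 3, 0), because each task writes its line independently. If you want the lines in rank order, a small filter such as the following ksh/bsh sketch can help. The function name `sort_by_task` is hypothetical, and the sketch assumes the output format shown in the sample, where the rank is the fourth whitespace-separated field.

```shell
# Hypothetical helper: keep only the "Hello!" lines and sort them
# numerically on the fourth field, which holds the task rank.
sort_by_task() {
    grep '^Hello!' | sort -k4,4n
}

# Demonstration using the sample output from this exercise.
sort_by_task <<'EOF'
Total number of tasks = 4
Hello! From task 1 on host node1.abc.edu
Hello! From task 2 on host node8.abc.edu
Hello! From task 3 on host node23.abc.edu
Hello! From task 0 on host node3.abc.edu
EOF
```

On a live run you would pipe the program through the filter instead, for example `poe_hello | sort_by_task`, assuming the program writes these lines to standard output.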
| C: | mpcc -o poe_bandwidth poe_bandwidth.c |
| Fortran: | mpxlf -o poe_bandwidth poe_bandwidth.f |
| csh/tcsh | setenv MP_PROCS 2 |
| ksh/bsh | export MP_PROCS=2 |
poe_bandwidth
As the program runs, it will display the effective communications bandwidth between nodes for a given message size. Because the MP_EUILIB variable has not been changed since the previous run, what you will be seeing is the bandwidth for Internet Protocol (IP) over the high performance switch.
| csh/tcsh | setenv MP_EUILIB us |
| ksh/bsh | export MP_EUILIB=us |
This is because there may be others in the workshop using nodes in User Space mode at the same time as you. Remember that there can only be one User Space job per node if your POE version is less than 2.4. If you get this error message, just try running again in a few seconds/minutes.
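The advice to try again after a short wait can be automated with a small retry loop. The ksh/bsh sketch below is generic shell, not a POE feature. The function name `retry` is hypothetical, and the sketch assumes that the failing command returns a non-zero exit status when it cannot run, which you should confirm on your system.

```shell
# Hypothetical helper: run a command up to MAX times, waiting DELAY
# seconds between attempts.
# Usage: retry MAX DELAY command [arguments...]
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

For example, `retry 5 30 poe_bandwidth` would make at most five attempts, pausing 30 seconds after each failure.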
POE has a number of other environment variables which may be useful. Try running the poe_hello code again after setting the following:
| Shell | Command | Description |
|---|---|---|
| csh/tcsh | setenv MP_PROCS 4 | Use 4 tasks/nodes again |
| csh/tcsh | setenv MP_EUILIB ip | Go back to Internet protocol |
| csh/tcsh | setenv MP_LABELIO yes | Prepend I/O with the task rank |
| csh/tcsh | setenv MP_SAVEHOSTFILE myhosts | Save the names of the nodes used in a file called "myhosts" |
| ksh/bsh | export MP_PROCS=4 | Use 4 tasks/nodes again |
| ksh/bsh | export MP_EUILIB=ip | Go back to Internet protocol |
| ksh/bsh | export MP_LABELIO=yes | Prepend I/O with the task rank |
| ksh/bsh | export MP_SAVEHOSTFILE=myhosts | Save the names of the nodes used in a file called "myhosts" |
What happens? Look closely at the screen output and compare it to what you saw the first time you ran the code. Also, check the contents of the file "myhosts". It should confirm what you see on the screen as output.
Generally speaking, there aren't many cases where you'll need to "manually" select which nodes should be used to run your POE job. This step demonstrates how to do it, should you ever have the need.
| Shell | Command | Description |
|---|---|---|
| csh/tcsh | setenv MP_RESD no | Turn off selection by the Resource Manager - just to be sure |
| csh/tcsh | setenv MP_HOSTFILE hostfile | Specify the host file you created |
| ksh/bsh | export MP_RESD=no | Turn off selection by the Resource Manager - just to be sure |
| ksh/bsh | export MP_HOSTFILE=hostfile | Specify the host file you created |
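The host file named in the table is a plain text list of node names. The ksh/bsh sketch below assumes the simple layout of one host name per line, one line per task, and it reuses the node names from the earlier sample output purely as placeholders. Substitute nodes that are actually available for workshop use, and check with your instructor if your POE version expects a different layout.

```shell
# Write a host list file named "hostfile" in the current directory.
# The node names are placeholders taken from the earlier sample output.
cat > hostfile <<'EOF'
node1.abc.edu
node8.abc.edu
node23.abc.edu
node3.abc.edu
EOF

# ksh/bsh syntax: bypass the Resource Manager and point POE at the file.
export MP_RESD=no
export MP_HOSTFILE=hostfile

# Count the non-empty lines as a quick sanity check against MP_PROCS.
echo "Host file contains $(grep -c . hostfile) entries"
```

The number of entries should be at least the number of tasks you request with MP_PROCS.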
This concludes the POE exercise.