| SP Parallel Programming Workshop |
| Parallel Operating Environment (POE P2SC) |
| What is the Parallel Operating Environment? |
|
| POE Definitions |
|
Before learning how to use POE, understanding some basic definitions may be useful.
| Using POE: Overview |
|
POE can be used both interactively and within a batch (scheduler) system to compile, load, and run parallel jobs. The typical progression of steps is outlined below and discussed in more detail in the following sections.
| Understanding Your System Configuration |
|
Some hints/commands which may be used to answer these questions are discussed below.
llstatus: This LoadLeveler command has several options. If the -l (lowercase "L") option is used, a great amount of detailed information will be displayed for all nodes in the system. If this command is not in your path, try looking in your site's LoadLeveler directory, typically /usr/lpp/LoadL/full/bin/ or /usr/lpp/LoadL/bin/.
lscfg: This AIX command should be available on all AIX systems. It can be used to obtain detailed configuration information (memory, disk, adapters, etc.) for a specific node. Note: you must be logged into, or have rsh access to, a node in order to use this command.
| Establishing Authorization |
|
Only AIX authorization will be covered here.
hostname1 userid
hostname2 userid
hostname3 userid
hostname4 userid
...
where "hostname" is the actual name of an SP node and "userid" is your actual login userid on that node. If your userid is the same on all hosts, you may omit it.
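
For more than a handful of nodes, the entries can be generated rather than typed by hand. The following is a minimal sketch: the function name `make_rhosts_entries`, the userid, and the node names are hypothetical, and it assumes the "hostname userid" layout described above (for rsh-based AIX authorization this file is typically `~/.rhosts`).

```shell
#!/bin/sh
# Print one "hostname userid" entry per host.
# Usage: make_rhosts_entries USERID HOST [HOST ...]
make_rhosts_entries() {
    userid=$1
    shift
    for host in "$@"; do
        printf '%s %s\n' "$host" "$userid"
    done
}

# Example with a hypothetical userid and node names. Review the output
# before appending it to your authorization file.
make_rhosts_entries jsmith node1 node2 node3
```

The function only prints to standard output, so nothing is modified until you redirect the result yourself.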
| Compiling and Linking a Parallel Program |
|
[compiler] | [options] | [source_files] |
| For example: | ||
mpxlf | -g -O3 -qlist -o myprog | myprog.f |
| IBM Compiler Invocation Commands | | |
|---|---|---|
| Serial | xlc / cc | ANSI C compiler; subset of C++ compiler |
| | xlC | C++ compiler |
| | xlf / f77 | Fortran 77 compatible; subset of Fortran 90 compiler |
| | xlf90 | Full Fortran 90 with IBM extensions |
| Pthreads | xlc_r | xlc for use with pthreads programs |
| | xlC_r | xlC for use with pthreads programs |
| | xlf_r | xlf for use with pthreads programs |
| | xlf90_r | xlf90 for use with pthreads programs |
| MPI | mpcc | Compiler script for parallel C programs using MPI |
| | mpCC | Compiler script for parallel C++ programs using MPI |
| | mpxlf | Compiler script for parallel Fortran 77 programs using MPI |
| MPI with Pthreads | mpcc_r | Parallel C compiler script for hybrid MPI/pthreads programs |
| | mpCC_r | Parallel C++ compiler script for hybrid MPI/pthreads programs |
| | mpxlf_r | Parallel Fortran 77 compiler script for hybrid MPI/pthreads programs |
| HPF | xlhpf | High Performance Fortran; subset of xlhpf90 |
| | xlhpf90 | High Performance Fortran 90 |

| Option | Description |
|---|---|
| -bmaxdata:bytes -bmaxstack:bytes | Required for large memory use. Default data and stack (combined) size is 256 MB. |
| -c | Compile only, producing a ".o" file. Does not link object files. |
| -g | Produce information required by debuggers and some profiler tools |
| -I (upper case i) | Names directories for additional include files. |
| -L | Specifies pathnames where additional libraries reside. Directories are searched in the order of their occurrence on the command line. |
| -l (lower case L) | Names additional libraries to be searched. |
| -O -O2 -O3 -O4 | Various levels of optimization |
| -o | Specifies the name of the executable (a.out by default) |
| -p -pg | Generate profiling support code |
| -qarch=arch -qtune=arch | Permits maximum optimization for the SP processor architecture being used. Can significantly improve performance at the expense of portability. |
| -qautodouble=setting | Automatic conversion of single precision to double precision, or double precision to extended precision |
| -qlanglvl=level | Specifies which language standard (or superset or subset of a standard) to check against for nonconformance |
| -qlist -qsource -qxref | Compiler listing/reporting options |
/etc/*cfg*
/usr/lpp/ppe.poe/lib/poe.cfg
| Setting Up The Execution Environment |
|
| Setting POE Environment Variables | |
setenv MP_PROCS 64
export MP_PROCS=64
This can be done in several ways:
| Basic POE Environment Variables | |
| How many tasks / nodes do I require? |
| How will nodes be allocated: should I choose them myself or let LoadLeveler choose them for me automatically? |
Batch systems typically override/ignore user settings for this environment variable. Additional details on using the host list file are available here.
| Which communications protocol and network interface should I use? |
| Example Basic Environment Variable Settings | |
| csh/tcsh | ksh/bsh |
|---|---|
| setenv MP_PROCS 4 | export MP_PROCS=4 |
| setenv MP_RMPOOL 0 | export MP_RMPOOL=0 |
| setenv MP_RESD YES | export MP_RESD=YES |
| setenv MP_HOSTFILE "NULL" | export MP_HOSTFILE="NULL" |
| setenv MP_EUILIB us | export MP_EUILIB=us |
| setenv MP_EUIDEVICE css0 | export MP_EUIDEVICE=css0 |
| csh/tcsh | ksh/bsh |
|---|---|
| setenv MP_NODES 4 | export MP_NODES=4 |
| setenv MP_TASKS_PER_NODE 4 | export MP_TASKS_PER_NODE=4 |
| setenv MP_RMPOOL 2 | export MP_RMPOOL=2 |
| setenv MP_RESD YES | export MP_RESD=YES |
| setenv MP_HOSTFILE "NULL" | export MP_HOSTFILE="NULL" |
| setenv MP_EUILIB us | export MP_EUILIB=us |
| setenv MP_EUIDEVICE css0 | export MP_EUIDEVICE=css0 |
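
The example above sets MP_NODES and MP_TASKS_PER_NODE instead of MP_PROCS. Assuming POE derives the total task count as nodes multiplied by tasks per node when MP_PROCS is unset, a quick check of the resulting job size might look like this sketch (the helper name `total_tasks` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: total tasks implied by a node count and a
# tasks-per-node count (MP_PROCS is assumed to be unset).
# Usage: total_tasks NODES TASKS_PER_NODE
total_tasks() {
    echo $(( $1 * $2 ))
}

MP_NODES=4
MP_TASKS_PER_NODE=4
echo "total tasks: $(total_tasks "$MP_NODES" "$MP_TASKS_PER_NODE")"   # prints "total tasks: 16"
```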
| csh/tcsh | ksh/bsh |
|---|---|
| setenv MP_PROCS 32 | export MP_PROCS=32 |
| unsetenv MP_RMPOOL | unset MP_RMPOOL |
| setenv MP_RESD no | export MP_RESD=no |
| setenv MP_HOSTFILE myhosts | export MP_HOSTFILE=myhosts |
| setenv MP_EUILIB ip | export MP_EUILIB=ip |
| setenv MP_EUIDEVICE css0 | export MP_EUIDEVICE=css0 |
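
The last example names a host list file, myhosts, and asks for 32 tasks. The sketch below writes such a file, assuming the simple layout of one node name per line with one line per task. The function name `make_hostfile` and the node names are hypothetical; substitute the names of nodes you are authorized to use.

```shell
#!/bin/sh
# Write a host list file with one node name per line, cycling through
# the given nodes until NTASKS lines have been written.
# Usage: make_hostfile FILE NTASKS NODE [NODE ...]
make_hostfile() {
    file=$1
    ntasks=$2
    shift 2
    [ "$#" -gt 0 ] || return 1      # need at least one node name
    : > "$file"                     # create or truncate the file
    i=0
    while [ "$i" -lt "$ntasks" ]; do
        for node in "$@"; do
            [ "$i" -lt "$ntasks" ] || break
            printf '%s\n' "$node" >> "$file"
            i=$((i + 1))
        done
    done
}

# Hypothetical node names; 32 entries to match MP_PROCS=32.
make_hostfile myhosts 32 node1 node2 node3 node4
wc -l < myhosts
```

Cycling through the nodes spreads consecutive tasks across different nodes. Listing each node several times in a row instead would group consecutive tasks on the same node.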
| Miscellaneous POE Environment Variables | |
A list of some commonly used, or potentially useful, POE environment variables appears below. A complete list of the POE environment variables can be viewed quickly in the POE man page. A much fuller discussion is available in the "IBM Parallel Environment for AIX: Operation and Use, Volume 1" manual.
|
MP_PROCS MP_NODES MP_TASKS_PER_NODE MP_RMPOOL MP_EUIDEVICE MP_HOSTFILE MP_SAVEHOSTFILE
MP_PMDSUFFIX | MP_RESD MP_RETRY MP_RETRYCOUNT MP_ADAPTER_USE MP_CPU_USE |
| Invoking the Executable |
|
[executable_name] | [POE_option_flags] | [executable_arguments] |
| For example: | ||
myprog | -procs 3 |
|
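
The `-procs 3` flag in the example above is the command-line counterpart of the MP_PROCS environment variable. POE option flags generally follow the same pattern: drop the `MP_` prefix and lowercase the rest. The sketch below applies that rule mechanically; the helper name `mp_var_to_flag` is hypothetical, and the POE man page remains the place to confirm that a given flag exists.

```shell
#!/bin/sh
# Map a POE environment variable name to its command-line flag form:
# drop the "MP_" prefix, lowercase the rest, and prepend "-".
# Usage: mp_var_to_flag MP_NAME
mp_var_to_flag() {
    lower=$(printf '%s' "${1#MP_}" | tr '[:upper:]' '[:lower:]')
    printf '%s\n' "-$lower"
}

mp_var_to_flag MP_PROCS      # prints -procs
mp_var_to_flag MP_HOSTFILE   # prints -hostfile
```

A flag given on the command line applies only to that invocation, which makes it convenient for one-off overrides of a value already set in the environment.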
C Shell: setenv MP_PGMMODEL mpmd
Korn Shell: export MP_PGMMODEL=mpmd
0:node1> master
1:node2> worker
2:node3> worker
3:node4> worker
4:node5> worker
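
Typing a program name at each prompt becomes tedious as the task count grows. POE can instead read the program names from a commands file named by MP_CMDFILE (or the -cmdfile flag). The sketch below writes a file matching the session above, assuming one program name per line in task order; the function name `make_cmdfile` and the file name poe.cmds are hypothetical.

```shell
#!/bin/sh
# Write a commands file naming one program per task, in task order:
# task 0 runs "master", tasks 1..(NTASKS-1) run "worker".
# Usage: make_cmdfile FILE NTASKS
make_cmdfile() {
    file=$1
    ntasks=$2
    echo master > "$file"
    i=1
    while [ "$i" -lt "$ntasks" ]; do
        echo worker >> "$file"
        i=$((i + 1))
    done
}

# Five tasks, as in the session above.
make_cmdfile poe.cmds 5
cat poe.cmds
```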
poe cp ~/input.file /tmp/input.file
poe my_serial_job
poe rm /tmp/input.file
| CPU and Communications Adapter Usage |
|
| The remainder of this section is intended primarily for interactive POE usage. SP batch systems, in general, do not permit users to share nodes, making the rest of the information in this section largely irrelevant. |
| Terminating a POE Job |
|
| poekill | [progname] | [poe_options] |
| For example: | |||
poe | poekill | myprog |
|
| poekill | myprog | |
where myhosts is a host list file containing the names of the nodes where your tasks are running.
| Run-time Analysis Tools |
|
| Parallel File Copy Utilities |
|
Note: see the associated hyperlinked man page for examples of each utility's use.
| Parallel File Copy Utilities | |
|---|---|
mcp | Copies a single file from the home node to a number of remote nodes. |
mcpscat | Copies a number of files from task 0 and scatters them in sequence to all tasks, in round-robin order. |
mcpgath | Copies a number of files from all tasks back to task 0. |
mprcp | Copies a file from the home node to a list of remote hosts. |
| Site Specific Information and Recommendations |
|
This section covers site specific details for POE usage at the MHPCC.
#@ requirements = (Adapter == "hps_user")
#@ environment = MP_EUILIB=us;MP_INFOLEVEL=3;MP_LABELIO=yes
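
For context, the two keywords above would normally sit inside a complete LoadLeveler job command file. The sketch below writes a minimal one. The function name `make_llcmd`, the file names, and the program name are hypothetical, and the keywords that set the node or processor count are deliberately left out because they vary by LoadLeveler version and site policy.

```shell
#!/bin/sh
# Write a minimal LoadLeveler job command file.
# Usage: make_llcmd FILE PROGRAM
make_llcmd() {
    cat > "$1" <<EOF
#!/bin/sh
#@ job_type = parallel
#@ output = myjob.out
#@ error = myjob.err
#@ requirements = (Adapter == "hps_user")
#@ environment = MP_EUILIB=us;MP_INFOLEVEL=3;MP_LABELIO=yes
#@ queue
poe $2
EOF
}

# Hypothetical job file and program names.
make_llcmd myjob.cmd myprog
cat myjob.cmd
```

A file like this would then be submitted with `llsubmit myjob.cmd`, after adding whatever sizing and class keywords your site requires.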
| References and More Information |
|
| Appendix A: Programming Considerations |
|
poe myprog \"this is one arg\"
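
The escaped quotes in the command above matter because the local shell does not treat them as grouping characters. The sketch below shows the effect without requiring POE. It assumes the argument string is parsed a second time before the program starts, and simulates that second pass with `eval`; the helper name `show_args` is hypothetical.

```shell
#!/bin/sh
# Print the argument count followed by each argument in brackets.
show_args() {
    printf '%d:' "$#"
    for a in "$@"; do
        printf ' [%s]' "$a"
    done
    printf '\n'
}

# Escaped quotes reach the command as literal characters, so the
# local shell still splits on spaces: four arguments.
show_args \"this is one arg\"

# A second round of shell parsing (simulated here with eval) sees
# the literal quotes and groups the words into one argument.
set -- \"this is one arg\"
eval "show_args $*"
```

The first call reports four arguments and the second reports one, which is why the quotes must survive the first round of parsing.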
The message passing library uses an interval timer to manage message
traffic, specifically to ensure that messages progress even when message
passing calls are not being made. When this interval timer expires, a
SIGALRM signal is sent to the program, interrupting whatever computation
is in progress. The message passing library has a signal handler set,
and normally handles the signal and returns to the user's program without
the program's knowledge. However, the following library and system calls
are interrupted and do not complete normally. The user is responsible for
testing whether an interrupt occurred and recovering from the interrupt.
In many cases, this is accomplished by just retrying the call.