SP Parallel Programming Workshop
LoadLeveler



  Table of Contents
  1. What Is LoadLeveler?
  2. LoadLeveler Overview
  3. Basic LoadLeveler Tasks
    1. Building a Job Command File
    2. Submitting a Job
    3. Displaying Job Status
    4. Changing a Job's Priority
    5. Holding / Releasing a Job
    6. Displaying a Machine's Status
    7. Canceling a Job
    8. Displaying the Central Manager
  4. Submitting Multiple Jobs
  5. Using the Job Command File as the Executable
  6. Submitting Parallel Jobs - General Notes
  7. Submitting MPI Parallel Jobs
  8. Submitting PVM Parallel Jobs
  9. Submitting PVMe Parallel Jobs
  10. LoadLeveler Internals
  11. How LoadLeveler Schedules Parallel Jobs
  12. LoadLeveler at the MHPCC
  13. LoadLeveler Job Command File Keywords Reference
  14. LoadLeveler Commands Reference
  15. References and More Information
  16. Exercise


 
What Is LoadLeveler?


 
LoadLeveler Overview


 
Basic LoadLeveler Tasks
Building a Job Command File


 
Basic LoadLeveler Tasks
Submitting a Job


 
Basic LoadLeveler Tasks
Displaying Job Status


 
Basic LoadLeveler Tasks
Changing a Job's Priority


 
Basic LoadLeveler Tasks
Holding a Job


Releasing a Held Job

 
Basic LoadLeveler Tasks
Displaying a Machine's Status


 
Basic LoadLeveler Tasks
Canceling a Job


 
Basic LoadLeveler Tasks
Displaying the Central Manager


 
Submitting Multiple Jobs


 
Using the Job Command File as the Executable


 
Submitting Parallel Jobs - General Notes


 
Submitting MPI Parallel Jobs


 
LoadLeveler Internals


 
How LoadLeveler Schedules Parallel Jobs

Note: This section does not apply to MHPCC users. The MHPCC has implemented its own batch scheduler, which supersedes the LoadLeveler scheduling mechanisms. Please see the "LoadLeveler at the MHPCC" section of this tutorial for details.



 
LoadLeveler at the MHPCC

Important: The MHPCC has implemented a batch scheduler which replaces most of LoadLeveler's scheduling mechanisms. Additionally, there are a number of site-specific details unique to the MHPCC. Users should become familiar with this section before attempting to submit batch jobs on the MHPCC systems.

  1. LoadLeveler Documentation: IBM's LoadLeveler manuals are available by using InfoExplorer.

  2. MHPCC Scheduler Documentation: Some general information is available from the MHPCC Technical Documentation WWW page. Most other documentation for users is included here. Curious users may also review the scheduler configuration file and documentation located in /u/loadl/bqs, though most of it will probably be of little use to users.

  3. Job Scheduling: The MHPCC scheduler is not a FIFO scheduler. Some jobs may "jump" ahead of others for various reasons including backfilling (see below) and job priority. There are many scheduling parameters which are configurable by the MHPCC and subject to change. Additionally, "special scheduling" is used to reserve nodes for a particular user or group. (Note: A full discussion of job scheduling is beyond the scope of this tutorial).

  4. showq: This utility replaces the llq command. It lists the running, queued and non-queued jobs.

  5. Path Variable: Include the location of the LoadLeveler executables in your path. For example, if your login shell is the C Shell, add the following line to your .cshrc file and then run "source .cshrc" for it to take effect.
    
        set path=($path /usr/lpp/LoadL/full/bin)
      

  6. Batch Classes: All batch nodes are configured as a single batch class. Users, therefore, do not need to specify a batch class in their LoadLeveler command file.

  7. Number of Nodes per Job: Users may routinely specify anywhere from 1 to 128 nodes to run their job.

  8. Account_no: The account_no keyword is mandatory at MHPCC. Your job WILL NOT be scheduled unless your LoadLeveler command file contains a valid account number specific to your project. Please review account_no for further details.

  9. Time Limits: The maximum amount of time permitted for a job to run is determined by the minimum number of nodes required for that job. The table below is current as of 8/8/97. For the most recent table, please see the MHPCC Configuration Summary.

    Time Limit (hrs)   Minimum # of processors
    8                  65 - 128
    16                 33 - 64
    24                 5 - 32
    36                 1 - 4

  10. Special Requests: Users who desire to run jobs with longer time limits or more nodes may submit a Special Request Form.

  11. llsubmit: This command includes a front-end screen that validates command files for a number of possible errors. Job command files that are rejected will usually be accompanied by an error message stating why they were rejected.

  12. llprio: Due to various constraints imposed by the scheduler backfilling feature, llprio may not be effective for users attempting to order their own jobs. The llhold command may be used instead.

  13. Job Prioritization: Job priority is entirely controlled by the MHPCC scheduler's configuration parameters and is subject to change. In general:

    1. Jobs are first screened for eligibility using site configurable parameters that define MHPCC scheduling approaches and fairness policies. Additionally, MHPCC system administrators are able to set higher/lower priorities for any given job.

    2. Jobs are then passed along to a "feasibility pool," where they are prioritized based on job-related parameters such as the quality of service, size, length, and queue time.

    3. Prioritized jobs are then scheduled for execution and assigned nodes on an immediate or future start time.

  14. Backfilling: The MHPCC scheduler routinely runs smaller/shorter jobs ahead of larger/longer jobs if the available nodes would otherwise sit idle. One of three backfill algorithms may be used:

    1. FIRSTFIT - Backfills jobs in the order it finds them in the queue.

    2. BESTFIT - Selects the job from the queue that best fits the backfill window.

    3. GREEDY - Identifies all combinations of jobs that can run in the window, analyzing the quality of each. Jobs contained in the best schedule are backfilled.

    Currently (8/8/97), the scheduler is using the FIRSTFIT algorithm to backfill jobs. Users can take advantage of this by setting their wall_clock_limit keyword to the shortest amount of time required by their job.

  15. showbf: Use this utility to view the current backfill "window".

  16. wall_clock_limit: This keyword is required for a command script to be accepted by llsubmit at the MHPCC. This works to the user's advantage by allowing for a backfill capability.

  17. Using large memory: AIX compilers have a maximum default of 256 MB for program data and stack size. To obtain more than this, you must do the following:

    1. Compile your program with the -bmaxdata: option. For example, the following compile command will allocate a 512 MB data segment:
       xlf -bmaxdata:512000000 -o myprog myprog.f
      See the xlf man page for details about the -bmaxdata: option.

    2. Make sure that the memory limits set in your shell are not too small. To do this, simply put the unlimit command in your .cshrc file.

  18. Examples. A few "getting started" examples are available in the LoadLeveler Exercise. Be sure to modify the job command files for your own use before attempting to run them in LoadLeveler. In particular:

  19. Number of Jobs Running and Queued: The MHPCC scheduler has parameters which control the number of jobs a user may have running and/or scheduled to be run. Be reasonable about the number of jobs you queue up - no more than 20 please. There is no advantage to queuing more jobs, as the MHPCC scheduler will not permit "queue stuffing".

  20. Pathnames: When specifying your home directory in a job command file, do not use pathnames that begin with "/a" such as /a/raid2fr2sw/u8/jsmith. Instead, use something like /u/jsmith. Use of the "/a" paths will cause the command file to fail.

  21. xloadl: If you plan on using LoadLeveler's GUI, xloadl, add the xloadl X resource specifications to your .Xdefaults file. This step is optional, but it makes the xloadl interface look a lot nicer.

    1. Copy the xloadl X resource specification file to your own directory:
      
          cp $WORKSHOP/samples/loadl/Xdefaults.xloadl  .
        

    2. Edit your .Xdefaults file to include the Xdefaults.xloadl file.

    3. Make sure your DISPLAY variable and xhost permissions are set correctly.

    4. Start xloadl with the command "xloadl &"
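
As an illustration of the backfilling described in item 14, the following Python sketch contrasts FIRSTFIT and BESTFIT for a single backfill window. It is a toy model, not the MHPCC scheduler's code: the Job fields, the way the window is represented (idle nodes and hours until they are needed again), and the "best fit" measure (most node-hours consumed) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    nodes: int      # nodes requested
    hours: float    # wall_clock_limit, in hours


def fits(job: Job, free_nodes: int, window_hours: float) -> bool:
    """A job can be backfilled only if it fits the window in both dimensions."""
    return job.nodes <= free_nodes and job.hours <= window_hours


def first_fit(queue: List[Job], free_nodes: int, window_hours: float) -> Optional[Job]:
    """FIRSTFIT: take the first queued job that fits, in queue order."""
    for job in queue:
        if fits(job, free_nodes, window_hours):
            return job
    return None


def best_fit(queue: List[Job], free_nodes: int, window_hours: float) -> Optional[Job]:
    """BESTFIT (assumed measure): the fitting job that uses the most node-hours."""
    candidates = [j for j in queue if fits(j, free_nodes, window_hours)]
    return max(candidates, key=lambda j: j.nodes * j.hours, default=None)


queue = [
    Job("big", nodes=64, hours=16),
    Job("small", nodes=4, hours=1),
    Job("medium", nodes=16, hours=3),
]

# Window: 16 idle nodes for the next 4 hours.
print(first_fit(queue, 16, 4).name)   # small
print(best_fit(queue, 16, 4).name)    # medium
```

In this model a job is only eligible if its wall_clock_limit fits inside the window, so a job that requests far more time than it needs is never picked up for backfill. That is the reason for keeping wall_clock_limit as short as the job allows.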

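The time limits in item 9 can also be read as a simple lookup. The sketch below encodes the table as of 8/8/97 in Python; the function names are hypothetical, and the real checks are made by the MHPCC scheduler and the llsubmit front end, not by user code.

```python
# (largest processor count in the band, time limit in hours), from the 8/8/97 table.
TIME_LIMITS = [(4, 36), (32, 24), (64, 16), (128, 8)]


def max_hours(min_processors: int) -> int:
    """Return the maximum run time in hours for a job's minimum processor count."""
    if min_processors < 1:
        raise ValueError("a job needs at least 1 processor")
    for upper, hours in TIME_LIMITS:
        if min_processors <= upper:
            return hours
    raise ValueError("more than 128 processors requires a Special Request")


def check_request(min_processors: int, wall_clock_hours: float) -> bool:
    """True if the requested wall clock time is within the limit for that size."""
    return wall_clock_hours <= max_hours(min_processors)


print(max_hours(4))             # 36
print(max_hours(33))            # 16
print(check_request(128, 12))   # False: 65 - 128 processors are limited to 8 hours
```

Because the table is subject to change, treat the numbers as a snapshot and consult the MHPCC Configuration Summary for current values.
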
Other helpful hints

  1. If you have multiple #@ environment statements, only the last will have effect. If you need to specify multiple environment variables, separate them with semicolons within a single #@ environment statement. For example:
    
    #@ environment  = MP_SHARED_MEMORY=yes;MP_INFOLEVEL=3;MP_LABELIO=yes
    

  2. Don't forget to set up the network adapter and communications library for optimum performance:
    
    #@ network.MPI  = css0,not_shared,US 
    

  3. Communications throughput - rough estimates. You should expect something close to these for jobs with message sizes greater than 500,000 bytes in length.

  4. LoadLeveler will "source" both your .cshrc and .login files. C Shell users may wish to prevent LoadLeveler from running interactive-only commands in these files by doing something like:
    
    if ($?prompt) then
      setenv TERM vt100
      set filec
      set prompt =  "`hostname -s`% "
      setenv MP_EUILIB us
         :
         :
    endif
    
    

  5. Do not use the #@ executable statement if you are running parallel jobs. Parallel jobs use the job command file as the executable.

  6. Use your FULL email address if you specify "notify_user" in your command script.

  7. Do not try to use LoadLeveler macros, such as $(job_name), $(cluster) or $(process) as script variables. They will not be recognized by the shell.
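
Hint 1 can be illustrated with a short Python sketch that collapses several variable assignments into a single #@ environment statement, so that none of them is lost to a later statement. The helper name is hypothetical; the two variables are taken from the example in hint 1.

```python
def environment_statement(variables: dict) -> str:
    """Join NAME=value pairs with semicolons into one '#@ environment' line."""
    pairs = ";".join(f"{name}={value}" for name, value in variables.items())
    return f"#@ environment = {pairs}"


line = environment_statement({"MP_INFOLEVEL": "3", "MP_LABELIO": "yes"})
print(line)  # #@ environment = MP_INFOLEVEL=3;MP_LABELIO=yes
```

Generating the line from one dictionary makes it harder to end up with two separate #@ environment statements in the same command file.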


 
LoadLeveler Job Command File Keywords Reference

An alphabetical list of the keywords you can use in a LoadLeveler job command file is provided below. All of these keywords are linked to their descriptions from the llsubmit man page.



 
LoadLeveler Commands Reference

The following commands permit you to perform LoadLeveler related activities. Each is linked to its LoadLeveler man page.



 
References and More Information