SP Parallel Programming Workshop
High Performance Fortran (HPF)



  Table of Contents
  1. History
  2. Why Fortran / HPF?
  3. What is HPF
  4. Steps for porting to HPF
  5. Data distribution
    1. Distribute
    2. Align
    3. Processors
    4. Template
    5. Realign and Redistribute
  6. INDEPENDENT do loops
  7. Data Parallel Constructs and Attributes
    1. Array Processing
    2. Masked array assignments - WHERE
    3. Non-conformable array assignments - FORALL
    4. "PURE" Procedures
    5. Intrinsics
    6. Extrinsic
  8. References and More Information
  9. Exercise


 
History of Fortran
More History of Fortran
Fortran 90 (1992)
HPF History

 
Why High Performance Fortran?

In the late 1980s, many research groups developed data parallel Fortran compilers for distributed-memory parallel machines. These demonstrated that data parallel Fortran compilers were:

However, these languages had different syntaxes. HPF standardized data parallel Fortran so that a portable version of the language is available across a wide range of machines.

 
What is High Performance Fortran?


 
Steps for porting to HPF




It is very important to fully understand your performance and scalability requirements when porting to High Performance Fortran. Research the performance and scalability of the available HPF compilers before starting development. It is often wise to run your own benchmarks on code segments that are representative of your application before starting a large software development effort.
  1. You should understand the performance and scalability of your serial code before starting. This is important to determine:



  2. Compile and run your serial code with the HPF compiler you intend to use. This will determine if the HPF compiler supports all the Fortran features your program uses.

  3. An optional step is to try one of the various tools that analyze a code and automatically parallelize it. Such tools often report the computational hot spots and may point out code that is difficult to parallelize and thus needs to be rewritten.

  4. Add data distribution directives. HPF provides directives such as DISTRIBUTE, ALIGN, PROCESSORS, and TEMPLATE that give the compiler hints on how to partition data structures and distribute them across processors.

  5. Fortran 90 and HPF provide many constructs that improve the compiler's ability to identify which code can be parallelized.
  6. Try a different data distribution scheme.

  7. Try a parallel algorithm that is better suited to your machine.

  8. If the performance or scalability does not meet your needs, you have two choices. You can either wait for the compiler to improve or write your code using message passing.
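
As a minimal sketch of steps 4 and 5 above, the program below adds distribution directives to a small serial code and then expresses the computation as a whole-array assignment. The program name, the array names x and y, the template t, the processor arrangement p, and the size n are all assumptions made for illustration. The !HPF$ lines are ordinary comments to a non-HPF compiler, so the same source should still compile and run serially, as in step 2.

```fortran
program distribute_sketch
  implicit none
  integer, parameter :: n = 8      ! hypothetical problem size
  real    :: x(n), y(n)
  integer :: i

  ! Step 4: distribution directives (hints to an HPF compiler,
  ! plain comments to any other Fortran compiler).
  !HPF$ PROCESSORS p(NUMBER_OF_PROCESSORS())
  !HPF$ TEMPLATE t(n)
  !HPF$ DISTRIBUTE t(BLOCK) ONTO p
  !HPF$ ALIGN x(:) WITH t(:)
  !HPF$ ALIGN y(:) WITH t(:)

  x = (/ (real(i), i = 1, n) /)

  ! Step 5: a whole-array assignment states the parallelism directly;
  ! because x and y are aligned, each element pair lives on one processor.
  y = 2.0 * x + 1.0

  print *, y
end program distribute_sketch
```

Aligning both arrays to one template means a later change of distribution (step 6) only requires editing the single DISTRIBUTE line.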

     
    Data distribution with High Performance Fortran


    !HPF$ DISTRIBUTE


    Distribution Examples




    The slides for this section (not included in the original hardcopy printout) illustrate several good and bad distribution choices for a simple loop:

      REAL A(N, N), B(N, N), C(N, N), D(N), E(N)

      DO I = 1, N
        DO J = 1, N
           A(I, J) = B(I, J) + C(J, I) + D(J) + func(E(J))
        END DO
      END DO
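
The slide contents are not reproduced here, so the following is only one plausible "good" choice for the loop above, not necessarily the one the slides showed. It distributes the columns of A and aligns every operand so that each right-hand-side element sits on the processor that owns A(I, J). The value n = 4, the initial values, and the body of func are assumptions for illustration.

```fortran
program distribution_choice
  implicit none
  integer, parameter :: n = 4      ! hypothetical size
  real    :: a(n, n), b(n, n), c(n, n), d(n), e(n)
  integer :: i, j

  ! Columns of a are divided into contiguous blocks.
  !HPF$ DISTRIBUTE a(*, BLOCK)
  ! b(i,j) is stored with a(i,j).
  !HPF$ ALIGN b(i, j) WITH a(i, j)
  ! c is accessed transposed, so c(j,i) is stored with a(i,j).
  !HPF$ ALIGN c(j, i) WITH a(i, j)
  ! d(j) and e(j) are stored with column j of a.
  !HPF$ ALIGN d(j) WITH a(*, j)
  !HPF$ ALIGN e(j) WITH a(*, j)

  b = 1.0
  c = 2.0
  d = 3.0
  e = 4.0

  do i = 1, n
    do j = 1, n
      a(i, j) = b(i, j) + c(j, i) + d(j) + func(e(j))
    end do
  end do

  print *, a(1, 1)

contains

  ! Stand-in for the unspecified func of the original loop (assumption).
  pure function func(x) result(y)
    real, intent(in) :: x
    real :: y
    y = 0.5 * x
  end function func

end program distribution_choice
```

A poorer choice, by contrast, would give C the same distribution as A without the transposed alignment: every C(J, I) reference could then live on a different processor from A(I, J) and require communication.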
    



    !HPF$ ALIGN


    Examples of ALIGN


    !HPF$ PROCESSORS


    !HPF$ TEMPLATE


    REALIGN and REDISTRIBUTE

     
    INDEPENDENT DO Loops


     
    Data Parallel Constructs and Attributes


    Array Processing

    Array processing is one of the most attractive features of Fortran 90 and HPF. It is particularly important for numerically intensive, high-performance scientific computation.

    A whole array is now an object. Operations can be performed on a whole array rather than one element at a time.
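
To make the contrast concrete, here is a small sketch that computes the same result first with an element-by-element loop and then with whole-array expressions. The array names and the size n are assumptions for illustration.

```fortran
program whole_array
  implicit none
  integer, parameter :: n = 5      ! hypothetical size
  real    :: x(n), y(n), z(n)
  integer :: i

  x = (/ (real(i), i = 1, n) /)
  y = 10.0                         ! a scalar is broadcast to every element

  ! Element at a time, Fortran 77 style
  do i = 1, n
    z(i) = x(i) + y(i)
  end do
  print *, z

  ! Whole array at once, Fortran 90 style: same result, no loop or index
  z = x + y
  print *, z

  ! Elemental intrinsics such as sqrt also apply to whole arrays
  z = 2.0 * sqrt(z)
  print *, z
end program whole_array
```

The array form leaves the order of element operations unspecified, which is what gives a data parallel compiler freedom to spread the work over processors.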


    Processing with arrays - definitions


    Array Specifications
    
    type [[, DIMENSION (extent-list)] [, attribute] ... ::] entity-list
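
The general form above covers several common spellings. The sketch below shows a few of them together with the inquiry intrinsics that report the resulting shape and bounds. All names and sizes are assumptions for illustration.

```fortran
program array_specs
  implicit none
  real, dimension(10)                :: v      ! rank 1, bounds 1:10
  real, dimension(-1:1, 3)           :: m      ! rank 2, first lower bound is -1
  integer, dimension(:), allocatable :: idx    ! deferred shape, sized at run time
  real                               :: w(0:6) ! bounds given on the entity itself

  allocate(idx(4))
  idx = (/ 1, 3, 5, 7 /)
  v = 0.0
  m = 1.0
  w = 2.0

  print *, "shape of m:         ", shape(m)      ! 3 3
  print *, "lower bound of m:   ", lbound(m, 1)  ! -1
  print *, "size of w:          ", size(w)       ! 7
  print *, "v at positions idx: ", v(idx)        ! vector-subscripted section

  deallocate(idx)
end program array_specs
```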
    
    Array Operations

    Array Sections


    Example of array constructor and operations
    
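    program Arrays
       implicit none
       integer :: i
       real :: a(0:6), b(3,3), c(-1:1)

       ! Array constructor with an implied DO: sqrt(1.0) ... sqrt(7.0)
       a = (/ (sqrt(real(i)), i=1,7) /)
       ! Fill the 3x3 array b in column order from a plus two more values
       b = reshape( source = (/a, 2.5, 2.6/), &
                 shape = (/3,3/) )
       c = 10.0                   ! scalar assigned to every element
       c = b(1,:) + b(:,3) + c    ! row 1 + column 3 + c, element by element

       print *, "This is the 2nd element of array c:", c(0)
       print *, "This is array c:", c

    end program Arrays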
    


    Masked array assignments - WHERE
    The WHERE construct allows for array assignments and calculations based on a conditional mask array. All arrays must be conformable.

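    There are two forms of WHERE: the WHERE statement, which masks a single assignment, and the WHERE construct, which masks a block of assignments. In Fortran 90 the WHERE construct may not be nested.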

    An example of a WHERE construct:
    
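            subroutine example (a,b,c,above,below,n)
            implicit none
            ! n is added as a dummy argument here so the declarations are valid
            integer, intent(in) :: n
            real, dimension (n,n), intent(in)    :: a
            real, dimension (n,n), intent(out)   :: b,c
            real, dimension (n,n), intent(inout) :: above,below

            b = 0
            c = 0

            ! Elements of a above 0.50 are copied to b, all others to c
            where (a > 0.50)
               above = 1
               b = a
            elsewhere
               below = 1
               c = a
            end where
            end subroutine example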
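
For comparison with the construct, the single-statement form of WHERE masks exactly one assignment. The sketch below uses it to take square roots only where the argument is positive. The array names and values are assumptions for illustration.

```fortran
program where_statement
  implicit none
  real :: a(5), r(5)

  a = (/ 4.0, 0.0, 1.0, -9.0, 16.0 /)
  r = 0.0

  ! Only elements where the mask is true are evaluated and assigned,
  ! so sqrt is never applied to the negative entry.
  where (a > 0.0) r = sqrt(a)

  print *, r    ! 2.0 0.0 1.0 0.0 4.0
end program where_statement
```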
    
    
    


    Non-conformable array assignments - FORALL


    There are two forms of FORALL: the FORALL statement and the FORALL construct.



    Examples of the FORALL construct
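
The original slide examples are not reproduced here. As a stand-in, the sketch below shows the statement form, a masked statement that calls a PURE function, and the construct form. The array names, the size n, and the function half are assumptions for illustration.

```fortran
program forall_forms
  implicit none
  integer, parameter :: n = 4      ! hypothetical size
  real    :: a(n, n), d(n)
  integer :: i, j

  a = 0.0

  ! FORALL statement: assign the diagonal, which is not an array section
  forall (i = 1:n) a(i, i) = real(i)

  ! FORALL statement with a mask: strict upper triangle only.
  ! Any function referenced inside a FORALL must be PURE.
  forall (i = 1:n, j = 1:n, j > i) a(i, j) = half(real(i + j))

  ! FORALL construct: each statement completes for every i
  ! before the next statement begins.
  forall (i = 1:n)
    d(i)    = a(i, i)
    a(n, i) = d(i) + 1.0
  end forall

  do i = 1, n
    print *, a(i, :)
  end do

contains

  ! A side-effect-free function, as FORALL requires (hypothetical helper).
  pure function half(x) result(y)
    real, intent(in) :: x
    real :: y
    y = 0.5 * x
  end function half

end program forall_forms
```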


    Pure Procedures


    The INDEPENDENT directive affects FORALL

    The !HPF$ INDEPENDENT directive
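
A minimal sketch of the directive applied to a DO loop and to a FORALL follows. A compiler cannot usually prove that an indirectly indexed assignment such as y(perm(i)) never writes the same element twice, so the programmer asserts it. The names perm, x, and y and the permutation values are assumptions for illustration. The assertion is only valid because perm contains no repeated index.

```fortran
program independent_sketch
  implicit none
  integer, parameter :: n = 6      ! hypothetical size
  integer :: perm(n), i
  real    :: x(n), y(n)

  perm = (/ 3, 1, 6, 2, 5, 4 /)    ! a permutation: every index appears once
  x = (/ (real(i), i = 1, n) /)
  y = 0.0

  ! Asserts that no iteration writes a location that another
  ! iteration reads or writes, so iterations may run in any order.
  !HPF$ INDEPENDENT
  do i = 1, n
    y(perm(i)) = 2.0 * x(i)
  end do

  ! On a FORALL, the directive asserts that the assignment for one index
  ! does not interfere with the evaluation for any other index.
  !HPF$ INDEPENDENT
  forall (i = 1:n) x(perm(i)) = y(i) + 1.0

  print *, y
  print *, x
end program independent_sketch
```

The directive is a promise, not a request: if perm did repeat an index, an HPF compiler would be free to produce a different result from the serial run.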


    Intrinsic Array Functions

    There are a total of 17 new intrinsic array functions defined in Fortran 90



    ALL (MASK,DIM) - Determine whether all values are true in MASK along dimension DIM.

    ANY (MASK, DIM) - Determine whether any value is true in MASK along dimension DIM.

    COUNT (MASK,DIM) - Count the number of true elements of MASK along dimension DIM.

    CSHIFT (ARRAY,SHIFT,DIM) - Perform a circular shift on an array expression of rank one or perform circular shifts on all the complete rank one sections along a given dimension of an array expression of rank two or greater. Elements shifted out at one end of a section are shifted in at the other end.

    EOSHIFT (ARRAY,SHIFT,BOUNDARY,DIM) - Perform an end-off shift on an array expression of rank one or perform end-off shifts on all the complete rank-one sections along a given dimension of an array expression of rank two or greater. Elements are shifted off at one end of a section and copies of a boundary value are shifted in at the other end. Different sections may have different boundary values and may be shifted by different amounts and in different directions.

    MAXVAL (ARRAY,DIM,MASK) - Maximum value of the elements of ARRAY along dimension DIM corresponding to the true elements of MASK.

    MERGE (TSOURCE,FSOURCE,MASK) - Choose alternative value according to the value of a mask.

    MINVAL(ARRAY,DIM,MASK) - Minimum value of all the elements of ARRAY along dimension DIM corresponding to true elements of MASK.

    PACK(ARRAY,MASK,VECTOR) - Pack an array into an array of rank one under the control of a mask.

    PRODUCT (ARRAY,DIM,MASK) - Multiply all of the elements of ARRAY along dimension DIM corresponding to the true elements of MASK.

    RESHAPE (SOURCE,SHAPE,PAD,ORDER) - Construct an array of a specified shape from the elements of a given array.

    SPREAD (SOURCE,DIM,NCOPIES) - Replicate an array by adding a dimension. Broadcasts several copies of SOURCE along a specified dimension and thus forms an array of rank one greater than that of SOURCE.

    SUM (ARRAY,DIM,MASK) - Add all the elements of ARRAY along dimension DIM corresponding to the true elements of MASK.

    TRANSPOSE (MATRIX) - Transpose an array of rank two.

    UNPACK (VECTOR,MASK,FIELD) - Unpack an array of rank one into an array of shape MASK under control of MASK.
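
The sketch below exercises several of the intrinsics listed above on a small integer array. The array m and its values are assumptions chosen so the results are easy to check by hand. The comments give the expected values.

```fortran
program intrinsics_demo
  implicit none
  integer :: m(2, 3)

  ! Column-major fill: columns are (1,2), (3,4), (5,6)
  m = reshape((/ 1, 2, 3, 4, 5, 6 /), (/ 2, 3 /))

  print *, sum(m)                      ! 21
  print *, sum(m, dim=1)               ! 3 7 11 (one sum per column)
  print *, product(m, mask = m < 4)    ! 6 (1*2*3)
  print *, maxval(m, mask = m < 5)     ! 4
  print *, minval(m)                   ! 1
  print *, count(m > 2)                ! 4
  print *, any(m > 5), all(m > 0)      ! T T

  print *, cshift((/ 1, 2, 3, 4 /), 1)               ! 2 3 4 1
  print *, eoshift((/ 1, 2, 3, 4 /), 1, boundary=0)  ! 2 3 4 0

  print *, pack(m, m > 3)              ! 4 5 6
  print *, merge(m, 0, m > 3)          ! 0 0 0 4 5 6
  print *, shape(transpose(m))         ! 3 2
  print *, shape(spread((/ 1, 2 /), dim=1, ncopies=3))  ! 3 2
end program intrinsics_demo
```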


    Extrinsics

    HPF provides the ability to call routines written in other programming paradigms and languages. Since these procedures are outside of HPF, they are called extrinsic.



     
    References and More Information