The GO_BOARD array will have its rows distributed cyclically.
The "*" specifies that GO_BOARD is not to be distributed along its
second dimension; thus an entire row is to be distributed as one object.
Distribution Examples
Several good and bad distribution choices can be illustrated with a simple
loop such as the following:
REAL A(N,N), B(N,N), C(N,N), D(N), E(N)
DO I = 1, N
  DO J = 1, N
    A(I,J) = B(I,J) + C(J,I) + D(J) + func(E(J))
  END DO
END DO
!HPF$ ALIGN
- The ALIGN directive is used to specify that certain data objects
are mapped in the same way as certain other data objects.
- Elements of two different objects that are aligned will be stored on the
same processor.
- Certain operations between aligned data objects will be more efficient
than operations between non-aligned data objects.
- Objects can be aligned by matching DISTRIBUTE statements; however, ALIGN
can implement more general alignments.
- Common implementations of the ALIGN directive for non-conformable arrays
actually create two variables of the same size: the smaller array
takes up as much memory as the larger array.
Examples of ALIGN
- ALIGNing a smaller array inside a larger array:
DIMENSION A(10,10), B(8,8)
!HPF$ ALIGN B(I,J) WITH A(I+1,J+1)
- ALIGNing a two-dimensional array with a one-dimensional array. The ":"
marks the dimension that is aligned and the "*" marks a dimension that
takes no part in the alignment.
INTEGER Y (N)
REAL, DIMENSION (N,N) :: X
!HPF$ ALIGN X(:,*) WITH Y (:)
- Example of transposing two axes:
!HPF$ ALIGN X(J,K) WITH Y(K,J)
- Example of reversing both axes:
!HPF$ ALIGN X(J,K) WITH Y (M-J+1,N-K+1)
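As a rough illustration, the ALIGN forms above could be applied to the loop from the Distribution Examples section as sketched below. The array size, the initial values, and the body of func are assumptions made only so the program is complete; the directive choices are one plausible mapping, not a recommendation from the original slides. A compiler without HPF support treats the !HPF$ lines as comments.

```fortran
program align_loop
  implicit none
  integer, parameter :: n = 4     ! small size assumed for illustration
  real :: a(n,n), b(n,n), c(n,n), d(n), e(n)
  integer :: i, j

!HPF$ DISTRIBUTE a(BLOCK,BLOCK)
  ! b(i,j) is always used together with a(i,j)
!HPF$ ALIGN b(i,j) WITH a(i,j)
  ! c is referenced transposed, so c(j,i) is placed with a(i,j)
!HPF$ ALIGN c(j,i) WITH a(i,j)
  ! d(j) and e(j) are needed for every i: replicate along dimension 1
!HPF$ ALIGN d(j) WITH a(*,j)
!HPF$ ALIGN e(j) WITH a(*,j)

  b = 1.0
  c = 2.0
  d = (/ (real(j), j = 1, n) /)
  e = 0.5

  do i = 1, n
    do j = 1, n
      a(i,j) = b(i,j) + c(j,i) + d(j) + func(e(j))
    end do
  end do

  print *, a(1,:)

contains

  ! Hypothetical stand-in for the tutorial's unspecified "func"
  real function func(x)
    real, intent(in) :: x
    func = 2.0 * x
  end function func

end program align_loop
```

With these alignments every operand of the assignment is intended to reside with the element of A being written, so the loop body should need no communication.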
!HPF$ PROCESSORS
- Optional directive used to provide additional information
useful in distributing data onto a specific processor geometry
- Useful for machines with non-uniform communication
capabilities.
- Used to define an array of Abstract Processors
- Can define, for example, a linear array or a matrix of processors
- The compiler will try to map these to Actual Processors
- Computer need not have this geometry
- Computer may not have this many processors
- Processor arrays can have a rank up to 7
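A minimal sketch of PROCESSORS declarations follows, defining one linear arrangement and two matrices of abstract processors. The arrangement names (line, grid, slab), their sizes, and the arrays mapped onto them are hypothetical; a real program would choose sizes that suit the target machine.

```fortran
program processors_demo
  implicit none
  real :: v(64), m(16,16), w(16,16)

  ! Abstract processor arrangements (names and sizes are hypothetical)
!HPF$ PROCESSORS line(16)
!HPF$ PROCESSORS grid(4,4), slab(2,8)
  ! Map each array onto one of the arrangements
!HPF$ DISTRIBUTE v(BLOCK) ONTO line
!HPF$ DISTRIBUTE m(BLOCK,BLOCK) ONTO grid
!HPF$ DISTRIBUTE w(BLOCK,CYCLIC) ONTO slab

  v = 1.0
  m = 2.0
  w = m + 1.0
  print *, sum(v), sum(m), sum(w)
end program processors_demo
```

The program computes the same values whatever the mapping; the directives only advise the compiler on where the data should live.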
!HPF$ TEMPLATE
- An abstract space of indexed positions
- Useful in aligning arrays.
- Specifically useful when aligning partially overlapping arrays.
!HPF$ TEMPLATE, DISTRIBUTE (BLOCK,BLOCK) :: overlap (30,30)
real, dimension (20,20) :: a,b
!HPF$ ALIGN a(i,j) with overlap (i,j)
!HPF$ ALIGN b(i,j) with overlap (i+10,j+10)
REALIGN and REDISTRIBUTE
- Similar to ALIGN and DISTRIBUTE
- An array can be moved in two ways
- Realigning or redistributing itself
- Redistributing the array to which it is aligned
- Can cause very heavy data communication
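The sketch below shows one way REDISTRIBUTE might be used between two phases of a computation. It assumes the HPF rule that an array must be declared DYNAMIC before it can be remapped at run time; the array name, its size, and the two phases are invented for illustration.

```fortran
program redistribute_demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n,n)
  integer :: i, j

  ! DYNAMIC marks a as an array that may be remapped during execution
!HPF$ DYNAMIC a
!HPF$ DISTRIBUTE a(BLOCK,*)

  a = 1.0

  ! Phase 1: each row is processed as a whole (whole rows are local)
  do i = 1, n
    a(i,:) = a(i,:) * real(i)
  end do

  ! Remap so that whole columns are local; most elements change processor
!HPF$ REDISTRIBUTE a(*,BLOCK)

  ! Phase 2: each column is processed as a whole
  do j = 1, n
    a(:,j) = a(:,j) + sum(a(:,j))
  end do

  print *, a(1,1), a(n,n)
end program redistribute_demo
```

Because nearly every element moves when the mapping changes, a remapping like this pays off only if the following phase does enough work to amortize the communication.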
!HPF$ INDEPENDENT
- The !HPF$ INDEPENDENT directive can precede an indexed DO loop
- It asserts to the compiler that the operations in a DO loop may be
executed independently - that is, in any order, or interleaved, or
concurrently - without changing the semantics of the program.
- This directive is useful since it is often the case that the
compiler cannot detect parallelism.
- The INDEPENDENT directive allows the compiler to parallelize the loop
without analyzing it for dependencies; the programmer is responsible
for the assertion being true.
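A small sketch of a loop where INDEPENDENT helps: the store goes through an index array, so the compiler cannot tell on its own that no two iterations write the same element. The array names and the permutation values are assumptions chosen for the example.

```fortran
program independent_do
  implicit none
  integer, parameter :: n = 6
  integer :: p(n), i
  real :: a(n), b(n)

  p = (/ 3, 1, 6, 2, 5, 4 /)   ! a permutation: no index appears twice
  b = (/ (real(i), i = 1, n) /)
  a = 0.0

  ! The programmer asserts that the iterations do not interfere
!HPF$ INDEPENDENT
  do i = 1, n
    a(p(i)) = 10.0 * b(i)
  end do

  print *, a
end program independent_do
```

If p contained a repeated value the assertion would be false, and the result of a parallel execution would be unpredictable.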
Data Parallel Constructs and Attributes
Array Processing
Array processing is one of the most attractive features of Fortran 90 and HPF.
It is particularly important to numerically intensive high performance
scientific computation.
A whole array is now an object. Operations can be performed on
a whole array rather than one element at a time.
Processing with arrays - definitions
- rank - The rank of an array is the number of dimensions
- extent - The extent of an array dimension is the number of elements
in a dimension
- shape - The shape of an array is a vector of its extents
- size - The size of an array is the product of its extents
- conformance - Arrays are said to be conformable if they have the same
shape
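The definitions above can be checked with the standard inquiry intrinsics SHAPE and SIZE. The sketch below uses two arrays with arbitrary, made-up bounds; the rank is obtained as the size of the shape vector.

```fortran
program array_terms
  implicit none
  real :: a(3,4), b(0:2,-1:2)

  print *, "rank of a:   ", size(shape(a))          ! number of dimensions
  print *, "extents of a:", size(a,1), size(a,2)    ! elements per dimension
  print *, "shape of a:  ", shape(a)                ! vector of the extents
  print *, "size of a:   ", size(a)                 ! product of the extents
  ! b has different bounds but the same extents, so a and b are conformable
  print *, "conformable: ", all(shape(a) == shape(b))
end program array_terms
```

Note that conformance depends only on shape: the lower and upper bounds of a and b differ, yet the two arrays can be combined in one array expression.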
Array Specifications
type [[, DIMENSION (extent-list)] [, attribute] ... ::] entity-list
- type - can be an intrinsic (integer, real, complex ...) or derived type
- dimension - used to define the extents of each dimension
- (extent-list) - for an explicit-shape array, defines the lower and upper
bounds in each dimension; the values are provided by integer expressions
that can be evaluated at compile time
- attribute - provides additional information (allocatable, dimension, intrinsic ...)
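A few declarations that follow this form are sketched below. All names (point, grid, offsets, work, corners) and sizes are hypothetical and serve only to show the variations.

```fortran
program array_specs
  implicit none
  integer, parameter :: n = 5

  type point
    real :: x, y
  end type point

  real, dimension(n,n)            :: grid      ! explicit shape, bounds 1:n
  integer, dimension(-2:2)        :: offsets   ! explicit lower and upper bound
  real, dimension(:), allocatable :: work      ! deferred shape, sized at run time
  type(point), dimension(3)       :: corners   ! array of a derived type
  integer :: i

  grid = 0.0
  offsets = (/ (i, i = -2, 2) /)
  allocate (work(2*n))
  work = 1.0
  corners = point(0.0, 0.0)

  print *, shape(grid), lbound(offsets), size(work), size(corners)
  deallocate (work)
end program array_specs
```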
Array Operations
- Calculations can be performed on whole arrays or sections of arrays as
long as they are conformable
Array Sections
- A subset of an array may be specified by referencing a range
- As a subscript (a single element) - a(1,2,3)
- As a subscript triplet [lower bound]:[upper bound][:stride] - a(2:6:2)
- As a vector subscript - a((/2,4,6/))
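The three ways of selecting elements can be compared on a small rank-one array, as in the sketch below. The array contents (the squares of 1 to 10) are an arbitrary choice for the example.

```fortran
program sections
  implicit none
  integer :: a(10), i

  a = (/ (i*i, i = 1, 10) /)      ! 1, 4, 9, ..., 100

  print *, a(3)                   ! single subscript: 9
  print *, a(2:6:2)               ! triplet: elements 2, 4, 6 -> 4 16 36
  print *, a((/ 2, 4, 6 /))       ! vector subscript: the same three elements
  print *, a(10:1:-3)             ! negative stride: 100 49 16 1
  ! Two sections of the same shape are conformable and can be combined
  print *, a(1:5) + a(6:10)       ! 37 53 73 97 125
end program sections
```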
Example of array constructor and operations
program Arrays
  implicit none
  integer :: i
  real :: a(0:6), b(3,3), c(-1:1)
  a = (/ (sqrt(real(i)), i=1,7) /)           ! implied-DO constructor, 7 values
  b = reshape( source = (/a, 2.5, 2.6/), &   ! 7 + 2 = 9 values fill b
               shape = (/3,3/) )
  c = 10.0
  c = b(1,:) + b(:,3) + c                    ! row 1 + column 3 + c
  print *, "This is the 2nd element of array c:", c(0)
  print *, "This is array c:", c
end program Arrays
Masked array assignments - WHERE
The WHERE construct allows for array assignments and calculations based on a
conditional mask array. All arrays must be conformable.
There are two types of WHERE.
- WHERE statement
    WHERE ( logical-expression ) array-intrinsic-assignment-statement
- WHERE construct
    WHERE ( logical-expression )
      [ array-intrinsic-assignment-statement ] ...
    ELSEWHERE
      [ array-intrinsic-assignment-statement ] ...
    END WHERE
In Fortran 90, the WHERE construct may not be nested.
An example of a WHERE construct:
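subroutine example (a, b, c, above, below)
  ! Assumed-shape arguments (the caller needs an explicit interface);
  ! all five arrays must be conformable.
  real, dimension (:,:) :: a, b, c, above, below
  b = 0
  c = 0
  where (a > 0.50)
    above = 1
    b = a
  elsewhere
    below = 1
    c = a
  end where
end subroutine example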
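For comparison, the sketch below uses both the WHERE statement and the WHERE construct in a complete program. The data values are made up; the first WHERE guards a division and the second treats negative and non-negative elements differently.

```fortran
program where_demo
  implicit none
  real :: a(5), r(5)

  a = (/ 2.0, 0.0, -4.0, 0.5, 0.0 /)
  r = 0.0

  ! WHERE statement: divide only where the divisor is non-zero
  where (a /= 0.0) r = 1.0 / a

  ! WHERE construct: one assignment per branch, chosen element by element
  where (a < 0.0)
    a = -a
  elsewhere
    a = a + 1.0
  end where

  print *, "r =", r
  print *, "a =", a
end program where_demo
```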
Non-conformable array assignments - FORALL
- The purpose of the FORALL statement and construct is to provide a
convenient syntax for simultaneous assignments to large groups of array
elements.
- A FORALL statement appears like a loop, but it is not.
- The "iterations" of the FORALL statement are executed simultaneously.
- The right-hand sides of the statement are calculated for the entire
index set of the FORALL statement before the left-hand sides are
modified.
- The FORALL construct allows for array assignments and calculations on
non-conformable arrays.
- An optional conditional mask array is supported.
- Function calls are supported within FORALLs.
There are two forms of FORALL
- The FORALL statement
    FORALL (forall-triplet-spec-list [, scalar-mask-expr]) forall-assignment
- The FORALL construct
    FORALL (forall-triplet-spec-list [, scalar-mask-expr])
      forall-body-statement
      [forall-body-statement] ...
    END FORALL
- Multiple statements in the FORALL construct are executed
sequentially.
- Statement i is executed for the entire index set of the FORALL construct
before statement i+1 is begun.
- The right-hand sides of each statement are calculated for the entire
index set of the FORALL statement before the left-hand sides of that
same statement are modified.
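The difference between FORALL and a DO loop shows up when an element depends on its neighbours. In the sketch below (array names and values are arbitrary), the FORALL reads only the original values, while the DO loop picks up values written by earlier iterations.

```fortran
program forall_semantics
  implicit none
  integer :: v(5), w(5), i

  v = (/ 1, 2, 3, 4, 5 /)
  w = v

  ! FORALL: every right-hand side is evaluated with the original v
  forall (i = 2:4) v(i) = v(i-1) + v(i+1)

  ! DO: each iteration sees the updates made by earlier iterations
  do i = 2, 4
    w(i) = w(i-1) + w(i+1)
  end do

  print *, v   ! 1 4 6 8 5
  print *, w   ! 1 4 8 13 5
end program forall_semantics
```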
Examples of the FORALL construct
- Compute x(i,j) = 1/y(i) for i = 1 to n, j = 1 to m, wherever y(i) /= 0.0
- Test prevents divide by zero
- Test may prevent/cause communication?
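The masked assignment described in this example might be written as follows. The sizes n and m and the contents of y are assumptions; note that x and y are not conformable, which is why FORALL rather than WHERE is used.

```fortran
program forall_demo
  implicit none
  integer, parameter :: n = 4, m = 3
  real :: x(n,m), y(n)
  integer :: i, j

  y = (/ 1.0, 0.0, 4.0, 0.5 /)
  x = 0.0

  ! The scalar mask y(i) /= 0.0 skips the rows that would divide by zero
  forall (i = 1:n, j = 1:m, y(i) /= 0.0) x(i,j) = 1.0 / y(i)

  do i = 1, n
    print *, x(i,:)
  end do
end program forall_demo
```

Row 2 of x keeps its initial value of zero because y(2) is zero and the mask excludes it.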
Pure Procedures
- A Pure Procedure is either a function that contains no statements that
could cause side effects and changes none of its arguments, or a subroutine
that has no side effects except through its arguments
- A procedure with:
- No SAVE attribute or statement
- No DATA initialization
- No modification of variables in COMMON or in modules
- No reference to nonPURE procedures
- No input/output
- No STOP statement
- Only Pure Procedures can be used in a FORALL assignment statement.
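A minimal sketch of a pure function called from a FORALL is given below. The function name scaled and its arguments are hypothetical. The PURE prefix used here is the one defined by HPF and later adopted by Fortran 95, so a Fortran 95 compiler should accept it.

```fortran
module pure_mod
  implicit none
contains
  ! No SAVE, no I/O, no STOP; the arguments are only read
  pure function scaled(v, factor) result(r)
    real, intent(in) :: v, factor
    real :: r
    r = factor * v
  end function scaled
end module pure_mod

program pure_demo
  use pure_mod
  implicit none
  integer, parameter :: n = 4
  real :: a(n), b(n)
  integer :: i

  b = (/ 1.0, 2.0, 3.0, 4.0 /)
  ! Legal only because scaled is declared PURE
  forall (i = 1:n) a(i) = scaled(b(i), 2.0)
  print *, a
end program pure_demo
```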
The INDEPENDENT directive affects FORALL
The !HPF$ INDEPENDENT directive
- Gives the compiler extra information used for optimization.
- Asserts that various active index values of the forall do not
interfere with each other.
- The result will not vary if the order is changed.
Note: Some compilers do this automatically.
The INDEPENDENT directive:
- Allows statements to be executed independently
  - In any order
  - Interleaved
  - Concurrently
- Affects
  - Precedence
  - Communication patterns
  - Temporary storage requirements
- INDEPENDENT and precedence
  - Data transfer can be delayed until necessary for continuation
- Effect of INDEPENDENT on communication and storage
  - Without INDEPENDENT
    - All right-hand sides (RHS) must be calculated before proceeding
    - Requires temporary storage
    - One large broadcast can be done to send all RHS values to the
      processors holding the left-hand sides (LHS)
    - Requires a barrier for synchronization at each step of the
      evaluation (after all LHS computations, all RHS computations, ...)
  - With INDEPENDENT
    - No temporary storage
    - Only one synchronization point, at the end of the calculations
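The sketch below applies INDEPENDENT to a two-statement FORALL construct. The arrays and their values are invented for the example. Each index value reads only the element of a that the same index value wrote, so the assertion holds and a compiler is free to run both statements for one index before starting the next, without a barrier in between.

```fortran
program independent_forall
  implicit none
  integer, parameter :: n = 5
  real :: a(n), b(n), c(n)
  integer :: i

  b = (/ (real(i), i = 1, n) /)

  ! Asserts that no index value interferes with any other
!HPF$ INDEPENDENT
  forall (i = 1:n)
    a(i) = 2.0 * b(i)
    c(i) = a(i) + 1.0
  end forall

  print *, a
  print *, c
end program independent_forall
```

If the second statement read a(i-1) instead of a(i), the index values would depend on one another and the directive would no longer be a true assertion.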
Intrinsic Array Functions
Fortran 90 defines many new intrinsic array functions, including the following 17:
- ALL
- ANY
- COUNT
- CSHIFT
- EOSHIFT
- MAXLOC
- MAXVAL
- MERGE
- MINLOC
- MINVAL
- PACK
- PRODUCT
- RESHAPE
- SPREAD
- SUM
- TRANSPOSE
- UNPACK
ALL (MASK,DIM) - Determine whether all values are true in MASK along
dimension DIM.
ANY (MASK, DIM) - Determine whether any value is true in MASK along
dimension DIM.
COUNT (MASK,DIM) - Count the number of true elements of MASK along dimension
DIM.
CSHIFT (ARRAY,SHIFT,DIM) - Perform a circular shift on an array expression of
rank one or perform circular shifts on all the complete rank one sections along
a given dimension of an array expression of rank two or greater. Elements
shifted out at one end of a section are shifted in at the other end.
EOSHIFT (ARRAY,SHIFT,BOUNDARY,DIM) - Perform an end-off shift on an array
expression of rank one or perform end-off shifts on all complete rank-one
sections along a given dimension of an array expression of rank two or greater.
Elements are shifted off at one end of a section and copies of a boundary
value are shifted in at the other end. Different sections may have different
boundary values and may be shifted by different amounts and in different
directions.
MAXVAL (ARRAY,DIM,MASK) - Maximum value of the elements of ARRAY along
dimension DIM corresponding to the true elements of MASK.
MERGE (TSOURCE,FSOURCE,MASK) - Choose alternative value according to the value
of a mask.
MINVAL(ARRAY,DIM,MASK) - Minimum value of all the elements of ARRAY along
dimension DIM corresponding to true elements of MASK.
PACK(ARRAY,MASK,VECTOR) - Pack an array into an array of rank one under the
control of a mask.
PRODUCT (ARRAY,DIM,MASK) - Multiply all of the elements of ARRAY along
dimension DIM corresponding to the true elements of MASK.
RESHAPE (SOURCE,SHAPE,PAD,ORDER) - Construct an array of a specified
shape from the elements of a given array.
SPREAD (SOURCE,DIM,NCOPIES) - Replicate an array by adding a dimension.
Broadcasts several copies of SOURCE along a specified dimension, thus
forming an array of rank one greater than that of SOURCE.
SUM (ARRAY,DIM,MASK) - Add all the elements of ARRAY along dimension DIM
corresponding to the true elements of MASK.
TRANSPOSE (MATRIX) - Transpose an array of rank two.
UNPACK (VECTOR,MASK,FIELD) - Unpack an array of rank one into an array of shape
MASK under control of MASK.
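Several of these functions are exercised on a small integer matrix in the sketch below. The matrix contents are an arbitrary choice; the comments give the values each call should produce for this particular input.

```fortran
program intrinsics_demo
  implicit none
  integer :: m(2,3)

  ! Column-major fill: row 1 is (1,3,5), row 2 is (2,4,6)
  m = reshape((/ 1, 2, 3, 4, 5, 6 /), (/ 2, 3 /))

  print *, sum(m)                      ! 21
  print *, sum(m, dim=1)               ! column sums: 3 7 11
  print *, maxval(m, dim=2)            ! row maxima: 5 6
  print *, count(m > 2)                ! 4
  print *, any(m > 5), all(m > 0)      ! T T
  print *, cshift((/ 1, 2, 3, 4 /), 1)   ! 2 3 4 1
  print *, eoshift((/ 1, 2, 3, 4 /), 1)  ! 2 3 4 0
  print *, pack(m, m > 3)              ! 4 5 6
  print *, merge(m, 0, m > 3)          ! 0 0 0 4 5 6
  print *, shape(transpose(m))         ! 3 2
  print *, shape(spread((/ 1, 2 /), dim=1, ncopies=3))   ! 3 2
end program intrinsics_demo
```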
Extrinsics
HPF provides the ability to call routines written in other programming
paradigms and languages. Because these procedures are outside of
HPF, they are called extrinsic.
- Called like a normal procedure
- Declared extrinsic in an INTERFACE block
- When called, a copy is started on each processor
- Each "sees" only a portion of the array passed
- Can use any locally defined libraries
- Can be written in any language and style
- Need not be "PURE", can have side effects
References and More Information
- Maui High Performance Computing Center's HPF Home Page is unavailable at this
time
- Written: 01/10/96 Frank Pietryka
- Portions adapted from The Albuquerque Resource Center,
University of New Mexico from tutorials by
Dr. Brian T. Smith, Tim Kaiser, Jim Warsa, Ward Deng and Amy Stevenson
- Material in this tutorial references the High Performance Fortran
Language Specification by the High Performance Fortran Forum which is
copyrighted by Rice University. Permission to copy without fee all or
part of the Specification is granted by Rice University.