RS/6000 SP: Practical MPI Programming
Case 3. One process sends and receives; the other receives and sends
It is always safe to order the calls of MPI_(I)SEND and MPI_(I)RECV so that a
send subroutine call at one process and a corresponding receive subroutine
call at the other process appear in matching order.
IF (myrank==0) THEN
   CALL MPI_SEND(sendbuf, ...)
   CALL MPI_RECV(recvbuf, ...)
ELSEIF (myrank==1) THEN
   CALL MPI_RECV(recvbuf, ...)
   CALL MPI_SEND(sendbuf, ...)
ENDIF
In this case, you can use either blocking or non-blocking subroutines.
Weighing the options discussed so far for both performance and deadlock
avoidance, the following code is recommended.
IF (myrank==0) THEN
   CALL MPI_ISEND(sendbuf, ..., ireq1, ...)
   CALL MPI_IRECV(recvbuf, ..., ireq2, ...)
ELSEIF (myrank==1) THEN
   CALL MPI_ISEND(sendbuf, ..., ireq1, ...)
   CALL MPI_IRECV(recvbuf, ..., ireq2, ...)
ENDIF
CALL MPI_WAIT(ireq1, ...)
CALL MPI_WAIT(ireq2, ...)
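The recommended pattern can be filled out into a complete program. The following is a minimal sketch, not the original authors' code: the program name, the buffer length n, the message tag 1, the REAL data type, and the variable iother are all assumptions chosen for illustration, and the program assumes it is started with exactly two processes.

```fortran
PROGRAM exchange
  ! Illustrative sketch: two processes swap one small array using
  ! non-blocking calls. All sizes, tags, and names are assumptions.
  IMPLICIT NONE
  INCLUDE 'mpif.h'
  INTEGER, PARAMETER :: n = 4              ! hypothetical message length
  INTEGER :: myrank, nprocs, iother, ierr, ireq1, ireq2
  INTEGER :: istatus(MPI_STATUS_SIZE)
  REAL :: sendbuf(n), recvbuf(n)

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  IF (nprocs /= 2) THEN
     IF (myrank == 0) PRINT *, 'Run with exactly 2 processes'
     CALL MPI_ABORT(MPI_COMM_WORLD, 1, ierr)
  ENDIF

  iother = 1 - myrank                      ! rank of the partner process
  sendbuf = REAL(myrank + 1)               ! data that identifies the sender
  recvbuf = 0.0

  ! Both calls return immediately, so neither process can block the other.
  CALL MPI_ISEND(sendbuf, n, MPI_REAL, iother, 1, MPI_COMM_WORLD, &
                 ireq1, ierr)
  CALL MPI_IRECV(recvbuf, n, MPI_REAL, iother, 1, MPI_COMM_WORLD, &
                 ireq2, ierr)

  ! The buffers must not be reused or read until the requests complete.
  CALL MPI_WAIT(ireq1, istatus, ierr)
  CALL MPI_WAIT(ireq2, istatus, ierr)

  PRINT *, 'rank', myrank, 'received', recvbuf(1)
  CALL MPI_FINALIZE(ierr)
END PROGRAM exchange
```

Because both ranks issue the same sequence of calls, the IF/ELSEIF on myrank is not needed in this sketch; the partner rank is computed instead.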
2.5 Derived Data Types
As will be discussed in 3.1, What is Parallelization? on page 41, if the total
amount of data transmitted between two processes is the same, you should
transmit it in as few messages as possible. Suppose you want to send
non-contiguous data to another process. To reduce the number of
transmissions, you can first copy the non-contiguous data to a contiguous buffer
and then send it all at once. On the receiving process, you may have to unpack
the data and copy it to the proper locations. This procedure may look cumbersome,
but MPI provides a mechanism, called derived data types, for describing more
general, mixed, and non-contiguous data. Although derived data types are
convenient, transmissions that use them might perform worse than manually
packing the data, transmitting it, and unpacking it. For this reason, be aware
of the performance impact when you use derived data types.
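The manual approach described above (copy to a contiguous buffer, send once, unpack on the receiver) might look like the following sketch. The index list idx, the array sizes, the message tag, and the program name are hypothetical and were chosen only to make the example self-contained; the program assumes exactly two processes.

```fortran
PROGRAM packsend
  ! Illustrative sketch of manual packing and unpacking.
  ! The index list and all sizes are assumptions, not from the text.
  IMPLICIT NONE
  INCLUDE 'mpif.h'
  INTEGER, PARAMETER :: n = 12, m = 4
  INTEGER, PARAMETER :: idx(m) = (/ 2, 5, 8, 11 /)  ! elements to transfer
  INTEGER :: myrank, nprocs, ierr, i
  INTEGER :: istatus(MPI_STATUS_SIZE)
  REAL :: a(n), buf(m)

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  IF (nprocs /= 2) THEN
     IF (myrank == 0) PRINT *, 'Run with exactly 2 processes'
     CALL MPI_ABORT(MPI_COMM_WORLD, 1, ierr)
  ENDIF

  IF (myrank == 0) THEN
     DO i = 1, n
        a(i) = REAL(i)
     ENDDO
     ! Pack: gather the scattered elements into a contiguous buffer.
     DO i = 1, m
        buf(i) = a(idx(i))
     ENDDO
     ! One message carries all m elements.
     CALL MPI_SEND(buf, m, MPI_REAL, 1, 1, MPI_COMM_WORLD, ierr)
  ELSE
     a = 0.0
     CALL MPI_RECV(buf, m, MPI_REAL, 0, 1, MPI_COMM_WORLD, istatus, ierr)
     ! Unpack: scatter the buffer back to the proper locations.
     DO i = 1, m
        a(idx(i)) = buf(i)
     ENDDO
     PRINT *, a
  ENDIF

  CALL MPI_FINALIZE(ierr)
END PROGRAM packsend
```

The two copy loops are the overhead that a derived data type is meant to hide; whether hiding them is faster or slower than this manual version depends on the MPI implementation.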
2.5.1 Basic Usage of Derived Data Types
Suppose you want to send array elements a(4), a(5), a(7), a(8), a(10), and
a(11) to another process. If you define a derived data type itype1 as shown in
Figure 19 on page 29, you can send a single element of type itype1 starting at
a(4). In the figure, empty slots are skipped during data transmission.
Alternatively, you can send three elements of type itype2 starting at a(4).
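One way to build a type with the layout of itype1 is MPI_TYPE_VECTOR. The sketch below is an assumption about how such a type could be constructed, since the figure itself is not reproduced here: it describes three blocks of two REAL elements whose starting points are three elements apart, which selects a(4), a(5), a(7), a(8), a(10), and a(11) when the buffer starts at a(4). The array size, the tag, and the program name are hypothetical, and the program assumes exactly two processes.

```fortran
PROGRAM vectype
  ! Illustrative sketch: send a(4), a(5), a(7), a(8), a(10), a(11)
  ! as a single element of a derived data type.
  IMPLICIT NONE
  INCLUDE 'mpif.h'
  INTEGER, PARAMETER :: n = 12             ! hypothetical array size
  INTEGER :: myrank, nprocs, ierr, i, itype1
  INTEGER :: istatus(MPI_STATUS_SIZE)
  REAL :: a(n)

  CALL MPI_INIT(ierr)
  CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
  CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  IF (nprocs /= 2) THEN
     IF (myrank == 0) PRINT *, 'Run with exactly 2 processes'
     CALL MPI_ABORT(MPI_COMM_WORLD, 1, ierr)
  ENDIF

  ! count = 3 blocks, blocklength = 2 elements, stride = 3 elements
  CALL MPI_TYPE_VECTOR(3, 2, 3, MPI_REAL, itype1, ierr)
  CALL MPI_TYPE_COMMIT(itype1, ierr)       ! required before use

  IF (myrank == 0) THEN
     DO i = 1, n
        a(i) = REAL(i)
     ENDDO
     ! One element of itype1, starting at a(4)
     CALL MPI_SEND(a(4), 1, itype1, 1, 1, MPI_COMM_WORLD, ierr)
  ELSE
     a = 0.0
     ! The same type on the receiver places the data without unpacking;
     ! the skipped slots a(6) and a(9) are left untouched.
     CALL MPI_RECV(a(4), 1, itype1, 0, 1, MPI_COMM_WORLD, istatus, ierr)
     PRINT *, a
  ENDIF

  CALL MPI_TYPE_FREE(itype1, ierr)
  CALL MPI_FINALIZE(ierr)
END PROGRAM vectype
```

The sketch covers only itype1. A type like itype2 needs an extent that includes the trailing empty slot, which requires adjusting the type's bounds and is not shown here.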