MPI_Irecv does not properly receive the data sent by MPI_Send
Question
I have 1D matrix data in Q_send_matrix. In each iteration, each processor updates its Q_send_matrix and sends it to the previous processor (rank-1), while it receives a newly updated matrix into Q_recv_matrix from the next processor (rank+1). For instance, in an iteration, Proc[0] updates its Q_send_matrix and sends it to Proc[3], while it receives Q_recv_matrix from Proc[1]. As you may have guessed, it is a ring communication. Please see the code below my explanation.
MPI_Request request;
MPI_Status status;

/* All the elements of the Q_send and Q_recv buffers
 * are set to 1.0 initially. Each processor updates
 * its Q_send buffer to prepare it to be sent below.
 * (That part is large, so it is not included here...) */

/*
 * Transfer Q matrix blocks among processors.
 * Each processor sends its Q matrix to the previous
 * processor while receiving the Q matrix from the
 * next processor. It is like a ring communication.
 */

/* Receive Q matrix with MPI_Irecv */
source = (my_rank + 1) % comm_size;
recv_count = no_col_per_proc[source] * input_k;
MPI_Irecv(Q_recv_matrix, recv_count, MPI_FP_TYPE, source,
          0, MPI_COMM_WORLD, &request);

/* Send Q matrix */
dest = (my_rank - 1 + comm_size) % comm_size;
send_count = no_col_per_proc[my_rank] * input_k;
MPI_Send(Q_send_matrix, send_count, MPI_FP_TYPE, dest,
         0, MPI_COMM_WORLD);

/* Wait status */
// MPI_Wait(request, status);

/* Barrier */
MPI_Barrier(MPI_COMM_WORLD);

/* Print Q send and receive matrices */
for (j = 0; j < send_count; j++)
{
    printf("P[%d] sends Q_send[%d] to P[%d] = %.2f\n",
           my_rank, j, dest, Q_send_matrix[j]);
}
for (j = 0; j < recv_count; j++)
{
    printf("P[%d] receives Q_recv[%d] from P[%d] = %.2f\n",
           my_rank, j, source, Q_recv_matrix[j]);
}
I want this communication to be synchronous. However, that is not possible with plain MPI_Send and MPI_Recv, because their blocking behavior causes a deadlock. Hence, I used MPI_Irecv and MPI_Send together with an MPI_Wait. However, it did not finish; all the processors kept waiting. So I put an MPI_Barrier instead of MPI_Wait to synchronize them, which resolved the waiting issue and let them finish. However, it does not work properly: some of the output below is wrong. Each processor sends the correct data, and there is no problem on the sending side. On the other hand, the receive buffer is not updated: on some processors the initial values of the receive buffer remain, even though data was received from another processor, as shown below.
P[0] sends Q_send[0] to P[3] = -2.12
P[0] sends Q_send[1] to P[3] = -2.12
P[0] sends Q_send[2] to P[3] = 4.12
P[0] sends Q_send[3] to P[3] = 4.12
P[0] receives Q_recv[0] from P[1] = 1.00
P[0] receives Q_recv[1] from P[1] = 1.00
P[0] receives Q_recv[2] from P[1] = 1.00
P[0] receives Q_recv[3] from P[1] = 1.00
P[1] sends Q_send[0] to P[0] = -2.12
P[1] sends Q_send[1] to P[0] = -2.12
P[1] sends Q_send[2] to P[0] = 0.38
P[1] sends Q_send[3] to P[0] = 0.38
P[1] receives Q_recv[0] from P[2] = 1.00
P[1] receives Q_recv[1] from P[2] = 1.00
P[1] receives Q_recv[2] from P[2] = 1.00
P[1] receives Q_recv[3] from P[2] = 1.00
P[2] sends Q_send[0] to P[1] = 1.00
P[2] sends Q_send[1] to P[1] = 1.00
P[2] sends Q_send[2] to P[1] = -24.03
P[2] sends Q_send[3] to P[1] = -24.03
P[2] receives Q_recv[0] from P[3] = 1.00
P[2] receives Q_recv[1] from P[3] = 1.00
P[2] receives Q_recv[2] from P[3] = 1.00
P[2] receives Q_recv[3] from P[3] = 1.00
P[3] sends Q_send[0] to P[2] = 7.95
P[3] sends Q_send[1] to P[2] = 7.95
P[3] sends Q_send[2] to P[2] = 0.38
P[3] sends Q_send[3] to P[2] = 0.38
P[3] receives Q_recv[0] from P[0] = -2.12
P[3] receives Q_recv[1] from P[0] = -2.12
P[3] receives Q_recv[2] from P[0] = 4.12
P[3] receives Q_recv[3] from P[0] = 4.12
Answer
You must complete an MPI_Wait (or a successful MPI_Test) before accessing the data from an MPI_Irecv. You cannot replace that with a barrier: the barrier only synchronizes the processes, it does not guarantee that the nonblocking receive has finished filling the buffer.
For a ring communication, consider using MPI_Sendrecv. It can be simpler than using asynchronous communication.