MPI Cartesian Topology: MPI_Neighbor_alltoall wrong data received
Question
I have an MPI Cartesian topology and want to send every node's rank to its neighbors with MPI_Neighbor_alltoall. I can't figure out where the error is. I also implemented my own MPI_Neighbor_alltoall, which doesn't work either. I reduced my code to a (hopefully) easy-to-understand snippet.
alltoall.c
#include <mpi.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    // MPI_Cart variables
    MPI_Comm cart_comm;
    MPI_Comm mycomm;
    int ndims=2;
    int periods[2]={0,0};
    int coord[2];
    int dims[2]={3,3};
    int xdim = dims[0];
    int ydim = dims[1];
    int comm_size = xdim*ydim;

    // Initialize the MPI environment and pass the arguments to all the processes.
    MPI_Init(&argc, &argv);

    // Get the rank and number of the process
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // output: dimensions
    if(rank==0){
        printf("dims: [%i] [%i]\n", xdim, ydim);
    }

    // enough nodes
    if(comm_size<=size){
        // Communicator count has to match nodecount in dims
        // so we create a new Communicator with matching nodecount
        int color;
        int graphnode;
        if(rank<comm_size){
            //printf("%d<%d\n",rank,comm_size);
            color=0;
            graphnode=1;
        } else {
            //printf("%d>=%d\n",rank,comm_size);
            // not used nodes
            color=1;
            graphnode=0;
        }
        MPI_Comm_split(MPI_COMM_WORLD, color, rank, &mycomm);
        MPI_Comm_rank(mycomm, &rank);
        MPI_Comm_size(mycomm, &size);

        // ***GRAPHNODE-SECTION***
        if(graphnode){
            // Create Dimensions
            MPI_Dims_create(size, ndims, dims);
            // Create Cartesian
            MPI_Cart_create(mycomm, ndims, dims, periods, 1, &cart_comm);
            // Get the name of the processor
            char processor_name[MPI_MAX_PROCESSOR_NAME];
            int len;
            MPI_Get_processor_name(processor_name, &len);
            // Get coordinates
            MPI_Cart_coords(cart_comm, rank, ndims, coord);
            // sending
            int *sendrank = &rank;
            int recvrank[4];
            MPI_Neighbor_alltoall(sendrank , 1, MPI_INT, recvrank, 1, MPI_INT, cart_comm);
            printf("my rank: %i, received ranks: %i %i %i %i\n", rank, recvrank[0], recvrank[1], recvrank[2], recvrank[3]);
        } else {
            // *** SPARE NODES SECTION ***
        }
    } else {
        // not enough nodes reserved
        if(rank==0)
            printf("not enough nodes\n");
    }

    // Finalize the MPI environment.
    MPI_Finalize();
}
So this code creates a 3x3 Cartesian topology. It finalizes if there are not enough nodes, and lets the spare nodes do nothing when there are too many. Even though this should be easy, I am still doing something wrong, because the output lacks some data.
Output
$ mpicc alltoall.c
$ mpirun -np 9 a.out
dims: [3] [3]
my rank: 2, received ranks: -813779952 5 0 32621
my rank: 1, received ranks: 1415889936 4 0 21
my rank: 5, received ranks: 9 8 0 32590
my rank: 3, received ranks: 9 6 -266534912 21
my rank: 7, received ranks: 9 32652 0 21
my rank: 8, received ranks: 9 32635 0 32635
my rank: 6, received ranks: 9 32520 1372057600 21
my rank: 0, received ranks: -1815116784 3 -1803923456 21
my rank: 4, received ranks: 9 7 0 21
As you can see in the output, nobody has nodes 1 and 2 as neighbors, and where does the 21 come from? Rank 4 should be the only node that has 4 neighbors, but those should be {1,3,5,7}, right? I really have no idea where my mistake is.
The coordinates should look like this:
[0,0] [1,0] [2,0]
[0,1] [1,1] [2,1]
[0,2] [1,2] [2,2]
and the ranks like this:
0 3 6
1 4 7
2 5 8
Recommended answer
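You are accessing a lot of uninitialized data, in both sendrank and recvrank. MPI_Neighbor_alltoall expects the send buffer to hold sendcount elements per neighbor slot, which is 4 ints on a 2D Cartesian communicator, but sendrank points to a single int. Slots of recvrank that belong to a missing neighbor on the non-periodic border are never written, so they keep whatever garbage they held.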
Here is a rewritten version of your test program that works for me:
#include <mpi.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    // MPI_Cart variables
    MPI_Comm cart_comm;
    MPI_Comm mycomm;
    int ndims=2;
    int periods[2]={0,0};
    int coord[2];
    int dims[2]={3,3};
    int xdim = dims[0];
    int ydim = dims[1];
    int comm_size = xdim*ydim;

    // Initialize the MPI environment and pass the arguments to all the processes.
    MPI_Init(&argc, &argv);

    // Get the rank and number of the process
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // output: dimensions
    if(rank==0){
        printf("dims: [%i] [%i]\n", xdim, ydim);
    }

    // enough nodes
    if(comm_size<=size){
        // Communicator count has to match nodecount in dims
        // so we create a new Communicator with matching nodecount
        int color;
        int graphnode;
        if(rank<comm_size){
            color=0;
            graphnode=1;
        } else {
            // not used nodes
            color=1;
            graphnode=0;
        }
        MPI_Comm_split(MPI_COMM_WORLD, color, rank, &mycomm);
        MPI_Comm_rank(mycomm, &rank);
        MPI_Comm_size(mycomm, &size);

        // ***GRAPHNODE-SECTION***
        if(graphnode){
            // Create Dimensions
            MPI_Dims_create(size, ndims, dims);
            // Create Cartesian
            MPI_Cart_create(mycomm, ndims, dims, periods, 1, &cart_comm);
            // reorder=1 allows MPI to renumber the processes, so query the
            // rank again in the Cartesian communicator
            MPI_Comm_rank(cart_comm, &rank);
            // Get the name of the processor
            char processor_name[MPI_MAX_PROCESSOR_NAME];
            int len;
            MPI_Get_processor_name(processor_name, &len);
            // Get coordinates
            MPI_Cart_coords(cart_comm, rank, ndims, coord);

            // sending: one int per neighbor slot (2*ndims = 4 slots)
            int sendrank[4];
            int recvrank[4];
            int i;
            char neighbors[4][16];
            for (i=0; i<4; i++) {
                sendrank[i] = rank;
                // -1 marks a slot that no neighbor writes to
                recvrank[i] = -1;
            }
            MPI_Neighbor_alltoall(sendrank , 1, MPI_INT, recvrank, 1, MPI_INT, cart_comm);

            // format only the slots that actually received a rank
            for (i=0; i<4; i++) {
                if (-1 != recvrank[i]) {
                    snprintf(neighbors[i], sizeof neighbors[i], "%d ", recvrank[i]);
                } else {
                    neighbors[i][0] = '\0';
                }
            }
            printf("my rank: %i, received ranks: %s%s%s%s\n", rank, neighbors[0], neighbors[1], neighbors[2], neighbors[3]);
        } else {
            // *** SPARE NODES SECTION ***
        }
    } else {
        // not enough nodes reserved
        if(rank==0)
            printf("not enough nodes\n");
    }

    // Finalize the MPI environment.
    MPI_Finalize();
}