MPI's Scatterv operation


Question

I'm not sure that I correctly understand what MPI_Scatterv is supposed to do. I have 79 items to scatter among a variable number of nodes. However, when I use the MPI_Scatterv command I get ridiculous numbers (as if the array elements of my receiving buffer were uninitialized). Here is the relevant code snippet:

MPI_Init(&argc, &argv);
int id, procs;

MPI_Comm_rank(MPI_COMM_WORLD, &id);
MPI_Comm_size(MPI_COMM_WORLD, &procs);

//Assign each file a number and figure out how many files should be
//assigned to each node
int file_numbers[files.size()];
int send_counts[nodes] = {0}; 
int displacements[nodes] = {0};

for (int i = 0; i < files.size(); i++)
{
    file_numbers[i] = i;
    send_counts[i%nodes]++;
}   

//figure out the displacements
int sum = 0;
for (int i = 0; i < nodes; i++)
{
    displacements[i] = sum;
    sum += send_counts[i];
}   

//Create a receiving buffer
int *rec_buf = new int[79];

if (id == 0)
{
    MPI_Scatterv(&file_numbers, send_counts, displacements, MPI_INT, rec_buf, 79, MPI_INT, 0, MPI_COMM_WORLD);
}   

cout << "got here " << id << " checkpoint 1" << endl;
cout << id << ": " << rec_buf[0] << endl;
cout << "got here " << id << " checkpoint 2" << endl;

MPI_Barrier(MPI_COMM_WORLD); 

free(rec_buf);

MPI_Finalize();

When I run that code I receive this output:

got here 1 checkpoint 1
1: -1168572184
got here 1 checkpoint 2
got here 2 checkpoint 1
2: 804847848
got here 2 checkpoint 2
got here 3 checkpoint 1
3: 1364787432
got here 3 checkpoint 2
got here 4 checkpoint 1
4: 903413992
got here 4 checkpoint 2
got here 0 checkpoint 1
0: 0
got here 0 checkpoint 2

I read the documentation for OpenMPI and looked through some code examples, but I'm not sure what I'm missing. Any help would be great!

Recommended answer

One of the most common MPI mistakes strikes again:

if (id == 0)    // <---- PROBLEM
{
    MPI_Scatterv(&file_numbers, send_counts, displacements, MPI_INT,
                 rec_buf, 79, MPI_INT, 0, MPI_COMM_WORLD);
}   

MPI_SCATTERV is a collective MPI operation. Collective operations must be executed by all processes in the specified communicator in order to complete successfully. You are executing it only in rank 0, and that is why only rank 0 gets the correct values.

Solution: remove the conditional if (...).

But there is another subtle mistake here. Since collective operations do not provide any status output, the MPI standard enforces strict matching between the number of elements sent to a rank and the number of elements that rank is willing to receive. In your case every receiver specifies 79 elements, which might not match the corresponding number in send_counts. You should instead use:

MPI_Scatterv(file_numbers, send_counts, displacements, MPI_INT,
             rec_buf, send_counts[id], MPI_INT,
             0, MPI_COMM_WORLD);
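
To make the matching rule concrete, here is a small plain C++ model of the data movement, with no MPI involved. It is only a sketch: scatterv_slice is a hypothetical helper name, not an MPI function, and it assumes the counts and displacements describe ranges that lie inside the send buffer. It shows that the rank with index rank receives exactly send_counts[rank] elements starting at displacements[rank], which is why the receive count should be send_counts[id] rather than 79.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ model of what a scatter with per-rank counts delivers.
// Hypothetical helper for illustration only; it does not call MPI.
// Returns the slice of the root's send buffer that `rank` would receive.
std::vector<int> scatterv_slice(const std::vector<int>& send_buf,
                                const std::vector<int>& send_counts,
                                const std::vector<int>& displacements,
                                std::size_t rank)
{
    assert(rank < send_counts.size() && rank < displacements.size());

    // The slice starts at this rank's displacement in the send buffer...
    const auto first = send_buf.begin() + displacements[rank];

    // ...and holds exactly send_counts[rank] elements, not the total.
    return std::vector<int>(first, first + send_counts[rank]);
}
```

For example, with seven items split as counts {3, 2, 2} and displacements {0, 3, 5}, the rank with index 1 would receive the two elements at positions 3 and 4.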

Also note the following discrepancy in your code, which may simply be a typo introduced when posting the question here:

MPI_Comm_size(MPI_COMM_WORLD, &procs);
                               ^^^^^
int send_counts[nodes] = {0};
                ^^^^^
int displacements[nodes] = {0};
                  ^^^^^

While you obtain the number of ranks in the procs variable, nodes is used in the rest of your code. I guess nodes should be replaced by procs.
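
As a rough illustration of sizing everything by the number of ranks, the counts and displacements could be computed along these lines. This is a sketch, not the asker's code: ScatterLayout and make_layout are hypothetical names, and it assumes procs is the value returned by MPI_Comm_size and is at least 1. It is plain C++ so it can be checked without an MPI installation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for the two arrays MPI_Scatterv expects.
struct ScatterLayout {
    std::vector<int> counts;  // number of items sent to each rank
    std::vector<int> displs;  // starting offset of each rank's items
};

// Split total_items as evenly as possible over procs ranks.
// Assumes procs >= 1 (the value obtained from MPI_Comm_size).
ScatterLayout make_layout(int total_items, int procs)
{
    assert(procs >= 1 && total_items >= 0);

    ScatterLayout layout;
    // Every rank gets the base share; the first (total_items % procs)
    // ranks get one extra item, matching the i % procs loop in the question.
    layout.counts.assign(procs, total_items / procs);
    for (int r = 0; r < total_items % procs; ++r) {
        layout.counts[r]++;
    }

    // Displacements are the running sum of the preceding counts.
    layout.displs.assign(procs, 0);
    int offset = 0;
    for (int r = 0; r < procs; ++r) {
        layout.displs[r] = offset;
        offset += layout.counts[r];
    }
    return layout;
}
```

With 79 items and 5 ranks this gives counts of 16, 16, 16, 16, 15 and displacements of 0, 16, 32, 48, 64. Every rank would then pass layout.counts.data() and layout.displs.data() as the send counts and displacements, and layout.counts[id] as its own receive count, in the unconditional MPI_Scatterv call.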
