MPI C++ matrix addition, function arguments, and function returns

Views: 153 | Posted: 2016/10/30 17:12:10 | Tags: c++ mpi

Problem description

I've been learning C++ from the internet for the past 2 years and finally the need has arisen for me to delve into MPI. I've been scouring stackoverflow and the rest of the internet (including http://people.sc.fsu.edu/~jburkardt/cpp_src/mpi/mpi.html and https://computing.llnl.gov/tutorials/mpi/#LLNL). I think I've got some of the logic down, but I'm having a hard time wrapping my head around the following:

#include <mpi.h>
#include <vector>
using namespace std;

vector<double> function(vector<double> &foo, const vector<double> &bar, int dim, int rows);

int main(int argc, char** argv)
{
    vector<double> result;//represents a regular 1D vector
    int id_proc, tot_proc, root_proc = 0;
    int dim = 4;//placeholder value only: set to number of "columns" in A and B below
    int rows = 8;//placeholder value only: set to number of "rows" of A and B below
    vector<double> A(dim*rows), B(dim*rows);//represent matrices as 1D vectors (dim and rows must be set first)

    MPI::Init(argc,argv);
    id_proc = MPI::COMM_WORLD.Get_rank();
    tot_proc = MPI::COMM_WORLD.Get_size();

    /*
    initialize A and B here on root_proc with RNG and Bcast to everyone else
    */

    //allow all processors to call function() so they can each work on a portion of A
    result = function(A,B,dim,rows);

    //all processors do stuff with A
    //root_proc does stuff with result (doesn't matter if other processors have updated result)

    MPI::Finalize();
    return 0;
}

vector<double> function(vector<double> &foo, const vector<double> &bar, int dim, int rows)
{
    /*
    purpose of function() is two-fold:
    1. update foo because all processors need the updated "matrix"
    2. get the average of the "rows" of foo and return that to main (only root processor needs this)
    */

    vector<double> output(dim,0);

    //add matrices the way I would normally do it in serial
    for (int i = 0; i < rows; i++)
    {
        for (int j = 0; j < dim; j++)
        {
            foo[i*dim + j] += bar[i*dim + j];//perform "matrix" addition (+= ON PURPOSE)
        }
    }

    //obtain average of rows in foo in serial
    for (int i = 0; i < rows; i++)
    {
        for (int j = 0; j < dim; j++)
        {
            output[j] += foo[i*dim + j];//sum rows of A
        }
    }

    for (int j = 0; j < dim; j++)
    {
        output[j] /= rows;//divide to obtain average
    }

    return output;
}

The code above is to illustrate the concept only. My main concern is to parallelize the matrix addition but what boggles my mind is this:

1) If each processor only works on a portion of that loop (naturally I'd have to modify the loop parameters per processor), what command do I use to merge all portions of A back into a single, updated A that all processors have in their memory? My guess is that I have to do some kind of Alltoall where each processor sends its portion of A to all other processors, but how do I guarantee that (for example) row 3 worked on by processor 3 overwrites row 3 of the other processors, and not row 1 by accident?
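
One possible way to approach question 1, offered as a rough sketch rather than a definitive answer: split the rows into contiguous blocks, one per rank, and record for each rank how many doubles it owns and where its block starts in the flattened A. Those two arrays are what a gather-to-all collective such as MPI_Allgatherv takes as receive counts and displacements, so the displacement is what decides where each rank's rows land. The names BlockLayout and make_row_block_layout are hypothetical and not part of the original post, and the MPI call appears only as a comment so the block compiles without an MPI installation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type (not from the original post): describes which
// slice of the flattened rows*dim matrix each rank owns.
struct BlockLayout {
    std::vector<int> counts;  // number of doubles owned by each rank
    std::vector<int> displs;  // offset of each rank's slice inside A
};

// Split `rows` rows as evenly as possible over `nprocs` ranks.
// The first (rows % nprocs) ranks receive one extra row.
BlockLayout make_row_block_layout(int rows, int dim, int nprocs)
{
    assert(rows >= 0 && dim > 0 && nprocs > 0);
    BlockLayout layout;
    layout.counts.resize(nprocs);
    layout.displs.resize(nprocs);
    const int base = rows / nprocs;
    const int extra = rows % nprocs;
    int offset = 0;
    for (int p = 0; p < nprocs; ++p) {
        const int my_rows = base + (p < extra ? 1 : 0);
        layout.counts[p] = my_rows * dim;  // whole rows only
        layout.displs[p] = offset;         // where rank p's rows start in A
        offset += my_rows * dim;
    }
    return layout;
}

// Each rank would update only rows
//   [displs[id_proc] / dim, (displs[id_proc] + counts[id_proc]) / dim)
// and then ALL ranks would call (C API, assumed available via <mpi.h>):
//   MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
//                  A.data(), layout.counts.data(), layout.displs.data(),
//                  MPI_DOUBLE, MPI_COMM_WORLD);
// so that every rank's block is placed at its own displacement in every copy of A.
```

If rows divides evenly by the number of ranks, all counts are equal and the simpler MPI_Allgather would presumably do the same job. Either way the call is collective, so every rank in the communicator has to reach it.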

2) If I use an Alltoall inside function(), do all processors have to be allowed to step into function(), or can I isolate function() using...

if (id_proc == root_proc)
{
    result = function(A,B,dim,rows);
}

… and then inside function() handle all the parallelization. As silly as it sounds, I'm trying to do a lot of the work on one processor (with broadcasts), and just parallelize the big time-consuming for loops. Just trying to keep the code conceptually simple so I can get my results and move on.

3) For the averaging part, I'm sure I can just use a reduction command if I wanted to parallelize it, correct?
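
For question 3, a minimal sketch of how a sum reduction could produce the row average, assuming each rank sums only the rows it owns. The helpers partial_column_sums and combine_to_average are hypothetical names, not from the original post; combine_to_average is a plain C++ stand-in for what MPI_Reduce with MPI_SUM would deliver on the root, so the block runs without MPI.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: column sums over rows [row_begin, row_end) of a
// flattened matrix with `dim` columns. Each rank would call this on the
// rows it owns.
std::vector<double> partial_column_sums(const std::vector<double>& foo,
                                        int dim, int row_begin, int row_end)
{
    assert(dim > 0 && 0 <= row_begin && row_begin <= row_end);
    assert(static_cast<std::size_t>(row_end) * dim <= foo.size());
    std::vector<double> sums(dim, 0.0);
    for (int i = row_begin; i < row_end; ++i)
        for (int j = 0; j < dim; ++j)
            sums[j] += foo[i * dim + j];
    return sums;
}

// Stand-in for the result of a sum reduction on the root: the element-wise
// sum of every rank's partial vector, divided by the total number of rows.
std::vector<double> combine_to_average(
    const std::vector<std::vector<double>>& partials, int rows)
{
    assert(rows > 0 && !partials.empty());
    std::vector<double> avg(partials.front().size(), 0.0);
    for (const auto& p : partials)
        for (std::size_t j = 0; j < avg.size(); ++j)
            avg[j] += p[j];
    for (double& v : avg)
        v /= rows;
    return avg;
}

// With MPI, the combine step would be replaced by a collective call made by
// ALL ranks (C API, assumed available via <mpi.h>):
//   MPI_Reduce(partial.data(), output.data(), dim, MPI_DOUBLE, MPI_SUM,
//              root_proc, MPI_COMM_WORLD);
// followed by the division by rows on root_proc only.
```

The division happens once, after the reduction, because the partial sums cover different numbers of rows and averaging them per rank first would weight the blocks incorrectly.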

Also, as an aside: is there a way to call Bcast() such that it is blocking? I'd like to use it to synchronize all my processors (boost libraries are not an option). If not then I'll just go with Barrier(). Thank you for your answer to this question, and to the community of stackoverflow for teaching me how to program over the past two years! :)