Parallelize a for loop using Boost MPI


Problem description

I am learning to use Boost.MPI to parallelize a large amount of computation; the code below is just a simple test to see whether I have the MPI logic right. However, I could not get it to work. I use world.size() = 10; there are 50 elements in total in the data array, so each process does 5 iterations. I would like to update the data array by having each process send its updated copy to the root process, which then receives the updated arrays and prints them out. But only a few elements end up updated.

Thanks for your help.

#include <boost/mpi.hpp>
#include <iostream>
#include <cstdlib>

namespace mpi = boost::mpi;
using namespace std;

#define max_rows 100
int data[max_rows];

int modifyArr(const int index, const int arr[]) {
  return arr[index]*2+1;
}

int main(int argc, char* argv[])
{
  mpi::environment env(argc, argv);
  mpi::communicator world;

  int num_rows = 50;
  int my_number;

  if (world.rank() == 0) {
    for ( int i = 0; i < num_rows; i++)
        data[i] = i + 1;
  }

  broadcast(world, data, 0);

  for (int i = world.rank(); i < num_rows; i += world.size()) {
    my_number = modifyArr(i, data);
    data[i]   = my_number;

    world.send(0, 1, data);

    //cout << "i=" << i << " my_number=" << my_number << endl;

    if (world.rank() == 0)
      for (int j = 1; j < world.size(); j++) 
        mpi::status s = world.recv(boost::mpi::any_source, 1, data);
  }

  if (world.rank() == 0) {
    for ( int i = 0; i < num_rows; i++)
      cout << "i=" << i << " results = " << data[i] << endl;
  }

  return 0;
}

Recommended answer

Your problem is probably here:

mpi::status s = world.recv(boost::mpi::any_source, 1, data);

This is the only way data can get back to the master node.

However, you do not tell the master node where in data to store the answers it is getting. Since data is the address of the array, everything should get stored in the zeroth element.
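A sketch of one possible fix (not from the original answer): keep the answer's recommendation of contiguous chunks, have each worker send only its own chunk once it is done, and have the root receive each chunk into the matching offset. This fragment assumes the variables from the original program (`world`, `data`, `num_rows`) and that `num_rows` divides evenly by `world.size()`, as in the post (50 elements, 10 ranks):

```cpp
const int chunk = num_rows / world.size();   // 5 elements per rank here

if (world.rank() != 0) {
  // Each worker sends just its own contiguous chunk, once.
  world.send(0, 1, &data[world.rank() * chunk], chunk);
} else {
  // The root receives each worker's chunk into the matching offset,
  // instead of overwriting data from element 0 on every recv.
  for (int r = 1; r < world.size(); r++)
    world.recv(r, 1, &data[r * chunk], chunk);
}
```

Boost.MPI's `send`/`recv` overloads taking a pointer and a count transfer exactly that many elements, which is what lets each chunk land at its own offset.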

Interleaving which elements of the array you are processing on each node is a pretty bad idea. You should assign blocks of the array to each node so that you can send entire chunks of the array at once. That will reduce communication overhead significantly.

Also, if your issue is simply speeding up for loops, you should consider OpenMP, which can do things like this:

#pragma omp parallel for
for(int i=0;i<100;i++)
  data[i]*=4;

Bam! I just split that for loop up between all of my threads with no further work needed.

