MPI help on how to parallelize my code

Question

I am very much a newbie in this subject and need help on how to parallelize my code. I have a large 1D array that in reality describes a 3D volume: 21x21x21 single-precision values. I have 3 computers that I want to engage in the computation. The operation performed on each cell in the grid (volume) is identical for all cells: the program takes in some data, performs some simple arithmetic on it, and assigns the return value to the grid cell.

My non-parallelized code is:

float zg, yg, xg;
float *stack_result = new float[Nz*Ny*Nx];
// StRMtrx[8] is the vertical step size, StRMtrx[6] is the vertical starting point
for (int iz=0; iz<Nz; iz++) {
  zg = iz*StRMtrx[8]+StRMtrx[6];  // find the vertical position in meters
  // StRMtrx[5] is the crossline step size, StRMtrx[3] is the crossline starting point
  for (int iy=0; iy<Ny; iy++) {
    yg = iy*StRMtrx[5]+StRMtrx[3];  // find the crossline position
    // StRMtrx[2] is the inline step size, StRMtrx[0] is the inline starting point
    for (int ix=0; ix < Nx; ix++) {
      xg = ix*StRMtrx[2]+StRMtrx[0]; // find the inline position
      // do stacking on each grid cell
      // "Geoph" is the geophone ids, "Ngeo" is the number of geophones involved,
      // "pahse_use" is the wave type, "EnvMtrx" is the input data common to all
      // cells, "Mdata" is the length of input data
      stack_result[ix+Nx*iy+Nx*Ny*iz] =
        stack_for_qds(Geoph, Ngeo, phase_use, xg, yg, zg, EnvMtrx, Mdata);  
    }        
  }
}

Now I take in 3 computers and divide the volume into 3 vertical segments, so I would then have 3 sub-volumes of 21x21x7 cells each (note that the volume is traversed in z, y, x order). The variable "stack_result" is the complete volume. My parallelized version (which utterly fails; I only get one of the sub-volumes back) is:

MPI_Status status;
int rank, numProcs, rootProcess;
int ierr, offset;
ierr = MPI_Init(&argc, &argv);
ierr = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
ierr = MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
int rowsInZ = Nz/numProcs;  // 7 cells in Z (vertical)
int chunkSize = Nx*Ny*rowsInZ;
float *stack_result = new float[Nz*Ny*Nx];
float zg, yg, xg;
rootProcess = 0;
if(rank == rootProcess) {
  offset = 0;
  for (int n = 1; n < numProcs; n++) { 
    // send rank
    MPI_Send(&n, 1, MPI_INT, n, 2, MPI_COMM_WORLD);
    // send the offset in array
    MPI_Send(&offset, 1, MPI_INT, n, 2, MPI_COMM_WORLD);
    // send volume, now only filled with zeros,
    MPI_Send(&stack_result[offset], chunkSize, MPI_FLOAT, n, 1, MPI_COMM_WORLD);
    offset = offset+chunkSize;
  }
  // receive results
  for (int n = 1; n < numProcs; n++) { 
    int source = n;
    MPI_Recv(&offset, 1, MPI_INT, source, 2, MPI_COMM_WORLD, &status);
    MPI_Recv(&stack_result[offset], chunkSize, MPI_FLOAT, source, 1, MPI_COMM_WORLD, &status);
  }
}  else {
  int rank;
  int source = 0;
  int ierr = MPI_Recv(&rank, 1, MPI_INT, source, 2, MPI_COMM_WORLD, &status);
  ierr = MPI_Recv(&offset, 1, MPI_INT, source, 2, MPI_COMM_WORLD, &status);
  ierr = MPI_Recv(&stack_result[offset], chunkSize, MPI_FLOAT, source, 1, MPI_COMM_WORLD, &status);       
  int nz = rowsInZ;  // sub-volume vertical length
  int startZ = (rank-1)*rowsInZ;
  for (int iz = startZ; iz < startZ+nz; iz++) {
    zg = iz*StRMtrx[8]+StRMtrx[6];
    for (int iy = 0; iy < Ny; iy++) {
      yg = iy*StRMtrx[5]+StRMtrx[3];
      for (int ix = 0; ix < Nx; ix++) {
        xg = ix*StRMtrx[2]+StRMtrx[0];
        stack_result[offset+ix+Nx*iy+Nx*Ny*iz]=
          stack_for_qds(Geoph, Ngeo, phase_use, xg, yg, zg, EnvMtrx, Mdata);
      }  // x-loop
    }  // y-loop
  }   // z-loop
  MPI_Send(&offset, 1, MPI_INT, source, 2, MPI_COMM_WORLD);
  MPI_Send(&stack_result[offset], chunkSize, MPI_FLOAT, source, 1, MPI_COMM_WORLD);
}  // else
write("stackresult.dat", stack_result);
delete [] stack_result;
MPI_Finalize();

Thanks in advance for your patience.

Answer

You are calling write("stackresult.dat", stack_result); in all MPI ranks. As a result, they all write to and thus overwrite the same file, and what you see is the content written by the last MPI process to execute that statement. You should move the write into the body of the if (rank == rootProcess) conditional so that only the root process writes.
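
For example, the end of the program could then look like this (just a sketch, reusing the variable names and the write call from the question):

if (rank == rootProcess) {
  // only the root process, which has already received all sub-volumes, writes the file
  write("stackresult.dat", stack_result);
}
delete [] stack_result;
MPI_Finalize();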

As a side note, sending the value of the rank is redundant - MPI already assigns each process a rank that ranges from 0 to #processes - 1. That also makes sending the offset redundant, since each MPI process can easily compute the offset on its own based on its rank.
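
For instance, each worker rank could derive these values itself instead of receiving them (a sketch assuming the same decomposition as in the question, with worker ranks starting at 1):

// no MPI_Send/MPI_Recv of the rank or the offset needed;
// chunkSize = Nx*Ny*rowsInZ and rowsInZ = Nz/numProcs as before
int offset = (rank - 1) * chunkSize;  // start of this rank's slice in the full array
int startZ = (rank - 1) * rowsInZ;    // first z-plane handled by this rank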
