MPI (Summation)


Problem Description

I am writing a program that calculates the sum of every number up to 1000. For example, 1+2+3+4+5....+100. First, I assign summation jobs to 10 processors: Processor 0 gets 1-100, Processor 1 gets 101-200 and so on. The sums are stored in an array.

After all the summations have been done in parallel, the processors send their values to Processor 0 (which receives them using nonblocking send/recv), and Processor 0 sums up all the values and displays the result.

Here is the code:

#include <mpi.h>
#include <iostream>

using namespace std;

int summation(int, int);

int main(int argc, char ** argv)
{
    int * array;
    int total_proc;
    int curr_proc;
    int limit = 0;
    int partial_sum = 0;
    int upperlimit = 0, lowerlimit = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &total_proc);
    MPI_Comm_rank(MPI_COMM_WORLD, &curr_proc);
    MPI_Request send_request, recv_request;

    /* checking if 1000 is divisible by number of procs, else quit */
    if(1000 % total_proc != 0)
    {
        MPI_Finalize();
        if(curr_proc == 0)
            cout << "**** 1000 is not divisible by " << total_proc << " ...quitting..."<< endl;
        return 0;
    }

    /* number of partial summations */
    limit = 1000/total_proc;

    array = new int [total_proc];

    /* assigning jobs to processors */
    for(int i = 0; i < total_proc; i++)
    {
        if(curr_proc == i)
        {
            upperlimit = upperlimit + limit;
            lowerlimit = (upperlimit - limit) + 1;
            partial_sum = summation(upperlimit, lowerlimit);
            array[i] = partial_sum;
        }
        else
        {
            upperlimit = upperlimit + limit;
            lowerlimit = (upperlimit - limit) + 1;
        }
    }

    cout << "** Partial Sum From Process " << curr_proc << " is " << array[curr_proc] << endl;

    /* send and receive - non blocking */
    for(int i = 1; i < total_proc; i++)
    {
        if(curr_proc == i) /* (i = current processor) */
        {
            MPI_Isend(&array[i], 1, MPI_INT, 0, i, MPI_COMM_WORLD, &send_request);
            cout << "-> Process " << i << " sent " << array[i] << " to Process 0" << endl;

            MPI_Irecv(&array[i], 1, MPI_INT, i, i, MPI_COMM_WORLD, &recv_request);
            //cout << "<- Process 0 received " << array[i] << " from Process " << i << endl;
        }
    }

    MPI_Finalize();

    if(curr_proc == 0)
    {
        for(int i = 1; i < total_proc; i++)
            array[0] = array[0] + array[i];
        cout << "Sum is " << array[0] << endl;
    }

    return 0;
}

int summation(int u, int l)
{
    int result = 0; 
    for(int i = l; i <= u; i++)
        result = result + i;
    return result;
}

Output:

** Partial Sum From Process 0 is 5050
** Partial Sum From Process 3 is 35050
-> Process 3 sent 35050 to Process 0
<- Process 0 received 35050 from Process 3
** Partial Sum From Process 4 is 45050
-> Process 4 sent 45050 to Process 0
<- Process 0 received 45050 from Process 4
** Partial Sum From Process 5 is 55050
-> Process 5 sent 55050 to Process 0
<- Process 0 received 55050 from Process 5
** Partial Sum From Process 6 is 65050
** Partial Sum From Process 8 is 85050
-> Process 8 sent 85050 to Process 0
<- Process 0 received 85050 from Process 8
-> Process 6 sent 65050 to Process 0
** Partial Sum From Process 1 is 15050
** Partial Sum From Process 2 is 25050
-> Process 2 sent 25050 to Process 0
<- Process 0 received 25050 from Process 2
<- Process 0 received 65050 from Process 6
** Partial Sum From Process 7 is 75050
-> Process 1 sent 15050 to Process 0
<- Process 0 received 15050 from Process 1
-> Process 7 sent 75050 to Process 0
<- Process 0 received 75050 from Process 7
** Partial Sum From Process 9 is 95050
-> Process 9 sent 95050 to Process 0
<- Process 0 received 95050 from Process 9
Sum is -1544080023

Printing the contents of the array gives:

5050
536870912
-1579286148
-268433415
501219332
32666
501222192
32666
1
0

I'd like to know what is causing this.

If I print the array before MPI_Finalize is invoked it works fine.

Recommended Answer

The most important flaw in your program is how you divide the work. In MPI, every process executes the main function. Therefore, you must ensure that all the processes execute your summation function if you want them to collaborate on building the result.

You don't need the for loop. Every process executes main separately; they just have different curr_proc values, and you can compute which portion of the job each one has to perform based on that:

/* assigning jobs to processors */
int chunk_size = 1000 / total_proc;
lowerlimit = curr_proc * chunk_size + 1;       /* summation() adds up the inclusive range [lowerlimit, upperlimit] */
upperlimit = (curr_proc + 1) * chunk_size;
partial_sum = summation(upperlimit, lowerlimit);

Then, the way the master process receives all the other processes' partial sums is not correct:

  • MPI rank values (curr_proc) range from 0 up to the MPI_Comm_size output value minus one (total_proc - 1).
  • Only the non-master processes (ranks 1 and above) post sends and receives in your loop; process 0 never posts a matching receive.
  • You are using the immediate (nonblocking) versions of send and receive, MPI_Isend and MPI_Irecv, but you never wait until those requests have completed. You should use MPI_Waitall for that purpose (see the sketch below).
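
If you prefer to keep the nonblocking calls, here is a minimal sketch of how they could be completed with MPI_Waitall; it reuses the question's variable names (array, partial_sum, curr_proc, total_proc) and is meant to replace the send/receive loop inside main(), not to be a full program:

if(curr_proc == 0)
{
    array[0] = partial_sum;
    /* rank 0 posts one nonblocking receive per worker rank */
    MPI_Request * requests = new MPI_Request[total_proc - 1];
    for(int i = 1; i < total_proc; i++)
        MPI_Irecv(&array[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &requests[i - 1]);
    /* block until every partial sum has actually arrived */
    MPI_Waitall(total_proc - 1, requests, MPI_STATUSES_IGNORE);
    delete [] requests;
}
else
{
    MPI_Request send_request;
    MPI_Isend(&partial_sum, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &send_request);
    /* the send buffer must not be reused before the request completes */
    MPI_Wait(&send_request, MPI_STATUS_IGNORE);
}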

The correct version would be something like the following:

if( curr_proc == 0 ) {
   // master process receives all data
   for( int i = 1; i < total_proc; i++ )
      MPI_Recv( &array[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
} else {
   // other processes send data to the master
   MPI_Send( &partial_sum, 1, MPI_INT, 0, 0, MPI_COMM_WORLD );
}

This all-to-one communication pattern is known as a gather. In MPI there is already a function that performs exactly this: MPI_Gather.
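
As a rough sketch under the same assumptions as above (the question's partial_sum and array variables, with array holding at least total_proc ints on rank 0), the whole exchange collapses into a single call:

/* every rank contributes its partial_sum; rank 0 collects all of them into array[0..total_proc-1] */
MPI_Gather(&partial_sum, 1, MPI_INT, array, 1, MPI_INT, 0, MPI_COMM_WORLD);

The receive buffer (array) only has to be a valid allocation on the root rank; the other ranks do not use it.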

Finally, what you intend to perform is known as a reduction: take a given number of numeric values and generate a single output value by repeatedly applying a single operation (a sum, in your case). In MPI there is a function that does exactly that, too: MPI_Reduce.
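
A minimal sketch, again assuming the question's partial_sum and curr_proc variables:

int total = 0;
/* add up every rank's partial_sum with MPI_SUM; only rank 0 receives the result */
MPI_Reduce(&partial_sum, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
if(curr_proc == 0)
    cout << "Sum is " << total << endl;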

I strongly suggest you work through some basic guided exercises before trying to write your own programs. MPI is difficult to understand at the beginning, and building a good base is vital for being able to add complexity later on. A hands-on tutorial is also a good way of getting started with MPI.

I forgot to mention that you don't need to enforce an even division of the problem size (1000 in this case) by the number of resources (total_proc). Depending on the case, you can either assign the remainder to a single process:

chunk_size = 1000 / total_proc;
if( curr_proc == 0 )
    chunk_size += 1000 % total_proc;

Or keep the load as balanced as possible:

int extra     = 1000 % total_proc;                        /* elements left over after the even split */
int remainder = (curr_proc < extra) ? 1 : 0;              /* does this rank take one extra element?  */
lowerlimit = curr_proc * chunk_size                       /* as usual                 */
           + (curr_proc < extra ? curr_proc : extra) + 1; /* cumulative remainder     */
upperlimit = lowerlimit + chunk_size - 1                  /* as usual                 */
           + remainder;                                   /* curr_proc remainder      */

In the second case the load imbalance will be at most 1, while in the first case it can reach total_proc - 1 in the worst case.
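
For instance (the figures here are purely illustrative), with 1000 elements and 7 processes, chunk_size is 142 and the remainder is 6: the first scheme gives rank 0 a block of 148 elements while every other rank gets 142, whereas the second scheme gives six ranks 143 elements each and the last rank 142.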
