How to use MPI_Reduce to Sum different values from Different groups of processors independently


Problem description

I am trying to divide my processors into groups and then add up the sum of each group independently... but so far I cannot get the correct result. A simple example is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int size, rank,i=0,localsum1=0,globalsum1=0,localsum2=0,globalsum2=0;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);

    if(rank==0)
    {
    }
    else if(rank==1)
    {
        localsum1 += 5;
        MPI_Reduce(&localsum1,&globalsum1,2,MPI_INT,MPI_SUM,0,MPI_COMM_WORLD);
    }
    else if(rank==2)
    {
        localsum2 += 10;
        MPI_Reduce(&localsum2,&globalsum2,2,MPI_INT,MPI_SUM,0,MPI_COMM_WORLD);
    }

    if(rank==0)
    {
        printf("globalsum1 = %d \n",globalsum1);
        printf("globalsum2 = %d \n",globalsum2);
    }
    MPI_Finalize();

    return (EXIT_SUCCESS);
}

I can't figure out what is missing here... can anyone help?

Answer

MPI_Reduce is a collective operation. That means all tasks in the participating communicator must make the MPI_Reduce() call. In the code above, rank 0 never calls MPI_Reduce(), so this program will hang as some of the other processors wait for participation from rank 0 that will never come.
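Before partitioning anything, it may help to see the most direct repair (a sketch, not part of the original answer): keep two separate reductions, but have every rank, including rank 0, make both calls, contributing 0 where it has nothing to add. The count argument is also corrected to 1, since each call sends a single int:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int size, rank, localsum1 = 0, globalsum1 = 0, localsum2 = 0, globalsum2 = 0;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);

    if (rank == 1) localsum1 += 5;   /* only rank 1 contributes to the first sum */
    if (rank == 2) localsum2 += 10;  /* only rank 2 contributes to the second sum */

    /* Every rank participates in both collective calls. */
    MPI_Reduce(&localsum1, &globalsum1, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Reduce(&localsum2, &globalsum2, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
    {
        printf("globalsum1 = %d \n", globalsum1);
        printf("globalsum2 = %d \n", globalsum2);
    }

    MPI_Finalize();
    return (EXIT_SUCCESS);
}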

Also, because the reduction is a collective operation over the entire communicator, you need to do some work to partition it. One way is simply to reduce an array of ints, and have each processor contribute only to its own element of the array:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int size, rank;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);

    int localsum[2] = {0,0};
    int globalsum[2] = {0,0};

    if(rank % 2 == 1)
    {
        localsum[0] += 5;
    }
    else if( rank > 0 && (rank % 2 == 0))
    {
        localsum[1] += 10;
    }

    MPI_Reduce(localsum,globalsum,2,MPI_INT,MPI_SUM,0,MPI_COMM_WORLD);

    if(rank==0)
    {
        printf("globalsum1 = %d \n",globalsum[0]);
        printf("globalsum2 = %d \n",globalsum[1]);
    }

    MPI_Finalize();

    return (EXIT_SUCCESS);
}

Now running this gives:

$ mpicc -o reduce reduce.c
$ mpirun -np 3 ./reduce
globalsum1 = 5 
globalsum2 = 10 
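Note that with this approach every rank makes the same single MPI_Reduce() call, so the collective requirement is satisfied automatically; to handle k groups instead of two, you would simply grow the arrays to length k and have each rank add into the element for its own group.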

Alternatively, you can create communicators that connect only the processors you want involved in each sum, and do the reductions within each communicator. Below is a not-very-pretty way to do this. It is quite powerful in general, but more complicated than the first solution:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int size, rank;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);

    int localsum = 0;
    int globalsum = 0;

    MPI_Comm  comm_evens_plus_root, comm_odds_plus_root;
    MPI_Group grp_evens_plus_root, grp_odds_plus_root, grp_world;

    MPI_Comm_group(MPI_COMM_WORLD, &grp_world);
    int *ranks = malloc((size/2 + 1) * sizeof(rank));
    int i,j;

    /* Exclude the odd ranks from the world group, leaving the even ranks
       (which include rank 0): */
    for (i=1, j=0; i<size; i+=2, j+=1)
        ranks[j] = i;
    MPI_Group_excl(grp_world, j, ranks, &grp_evens_plus_root);
    MPI_Comm_create(MPI_COMM_WORLD, grp_evens_plus_root, &comm_evens_plus_root);

    /* Exclude the even ranks above 0, leaving the odd ranks plus rank 0: */
    for (i=2, j=0; i<size; i+=2, j+=1)
        ranks[j] = i;
    MPI_Group_excl(grp_world, j, ranks, &grp_odds_plus_root);
    MPI_Comm_create(MPI_COMM_WORLD, grp_odds_plus_root, &comm_odds_plus_root);

    free(ranks);

    if(rank % 2 == 1)
    {
        localsum += 5;
        MPI_Reduce(&localsum,&globalsum,1,MPI_INT,MPI_SUM,0,comm_odds_plus_root);
    }
    else if( rank > 0 && (rank % 2 == 0))
    {
        localsum += 10;
        MPI_Reduce(&localsum,&globalsum,1,MPI_INT,MPI_SUM,0,comm_evens_plus_root);
    }

    if(rank==0)
    {
        MPI_Reduce(&localsum,&globalsum,1,MPI_INT,MPI_SUM,0,comm_odds_plus_root);
        printf("globalsum1 = %d \n",globalsum);
        MPI_Reduce(&localsum,&globalsum,1,MPI_INT,MPI_SUM,0,comm_evens_plus_root);
        printf("globalsum2 = %d \n",globalsum);
    }

    MPI_Comm_free(&comm_odds_plus_root);
    MPI_Comm_free(&comm_evens_plus_root);
    MPI_Group_free(&grp_odds_plus_root);
    MPI_Group_free(&grp_evens_plus_root);
    MPI_Finalize();

    return (EXIT_SUCCESS);
}

Running gives:

$ mpicc -o reduce2 reduce2.c 
$ mpirun -np 3 ./reduce2
globalsum1 = 5 
globalsum2 = 10 
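For completeness, the same partitioning is usually done more cleanly with MPI_Comm_split, which assigns each rank a "color" and builds one subcommunicator per color in a single call (this variant is not from the original answer; it is a sketch). Because a rank can belong to only one color, rank 0 sits out of the groups here and collects each group's total with a point-to-point message instead of belonging to both groups:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int size, rank;

    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&size);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);

    /* Color 1 = odd ranks, color 2 = even ranks above 0; rank 0 joins no group. */
    int color = (rank == 0) ? MPI_UNDEFINED : ((rank % 2 == 1) ? 1 : 2);

    MPI_Comm subcomm;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &subcomm);

    int localsum = (color == 1) ? 5 : (color == 2) ? 10 : 0;
    int groupsum = 0;

    if (subcomm != MPI_COMM_NULL)
    {
        int subrank;
        MPI_Comm_rank(subcomm, &subrank);
        /* Reduce within the group; the group's rank 0 is its lowest world rank. */
        MPI_Reduce(&localsum, &groupsum, 1, MPI_INT, MPI_SUM, 0, subcomm);
        if (subrank == 0)   /* group leader forwards its total to world rank 0 */
            MPI_Send(&groupsum, 1, MPI_INT, 0, color, MPI_COMM_WORLD);
        MPI_Comm_free(&subcomm);
    }

    if (rank == 0)
    {
        int sum1 = 0, sum2 = 0;
        if (size > 1)   /* at least one odd rank exists */
            MPI_Recv(&sum1, 1, MPI_INT, MPI_ANY_SOURCE, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        if (size > 2)   /* at least one even rank above 0 exists */
            MPI_Recv(&sum2, 1, MPI_INT, MPI_ANY_SOURCE, 2,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("globalsum1 = %d \n", sum1);
        printf("globalsum2 = %d \n", sum2);
    }

    MPI_Finalize();
    return (EXIT_SUCCESS);
}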

