MPI Bcast or Scatter to specific ranks

Problem Description

I have an array of data. What I am trying to do is the following:

Use rank 0 to bcast the data to 50 nodes. Each node runs one MPI process with 16 cores available to it. Each MPI process then calls Python multiprocessing, performs some calculations, and saves the results computed by multiprocessing. The MPI process then changes some variables and runs multiprocessing again, and so on.

So the nodes do not need to communicate with each other beyond the initial startup, in which they all receive some data.

The multiprocessing approach is not working out so well, so now I want to use MPI for everything.

How can I (or is it even possible to) use an array of integers referring to MPI ranks for bcast or scatter? For example, suppose there are ranks 1-1000 and each node has 12 cores. I would like to bcast the data to every 12th rank, and then have each of those ranks scatter the data to ranks 12k+1 through 12k+12 on its own node.

This requires the first bcast to communicate with totalrank/12 processes; each of those ranks is then responsible for sending data to the ranks on the same node, gathering the results, saving them, and then sending more data to the ranks on the same node.

Recommended Answer

I don't know enough of mpi4py to give you a code sample with it, but here is what could be a solution in C++. I'm sure you can easily infer the corresponding Python code from it.

#include <mpi.h>
#include <iostream>
#include <cstdlib> /// for abs
#include <zlib.h>  /// for crc32

using namespace std;

int main( int argc, char *argv[] ) {

    MPI_Init( &argc, &argv );
    // get size and rank
    int rank, size;
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    // get the compute node name
    char name[MPI_MAX_PROCESSOR_NAME];
    int len;
    MPI_Get_processor_name( name, &len );

    // get a unique positive int from each node name
    // using crc32 from zlib (just a possible solution)
    uLong crc = crc32( 0L, Z_NULL, 0 );
    int color = crc32( crc, ( const unsigned char* )name, len );
    color = abs( color );

    // split the communicator into processes of the same node
    MPI_Comm nodeComm;
    MPI_Comm_split( MPI_COMM_WORLD, color, rank, &nodeComm );

    // get the rank on the node
    int nodeRank;
    MPI_Comm_rank( nodeComm, &nodeRank );

    // create comms of processes of the same local ranks
    MPI_Comm peersComm;
    MPI_Comm_split( MPI_COMM_WORLD, nodeRank, rank, &peersComm );

    // now, the masters are all the processes with nodeRank 0;
    // they can communicate among themselves with peersComm
    // and with the other processes on their node with nodeComm
    int worktoDo = 0;
    if ( rank == 0 ) worktoDo = 1000;
    cout << "Initially [" << rank << "] on node "
         << name << " has " << worktoDo << endl;
    MPI_Bcast( &worktoDo, 1, MPI_INT, 0, peersComm );
    cout << "After first Bcast [" << rank << "] on node "
         << name << " has " << worktoDo << endl;
    if ( nodeRank == 0 ) worktoDo += rank;
    MPI_Bcast( &worktoDo, 1, MPI_INT, 0, nodeComm );
    cout << "After second Bcast [" << rank << "] on node "
         << name << " has " << worktoDo << endl;

    // cleaning up
    MPI_Comm_free( &peersComm );
    MPI_Comm_free( &nodeComm );

    MPI_Finalize();
    return 0;
}
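
To try this out, compile it with your MPI compiler wrapper and link against zlib (for example: mpic++ example.cpp -lz, where example.cpp is whatever you named the file), then launch it with mpiexec, one process per core across your nodes.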

As you can see, you first create communicators containing the processes running on the same node. Then you create peer communicators containing all the processes that have the same local rank, one per node. From there, your master process of global rank 0 sends data to the local masters, and they distribute the work across the node they are responsible for.
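
Since the question mentions mpi4py, here is a rough Python transcription of the same idea. It is only a sketch: the variable names are illustrative, the color is derived from the node name with zlib.crc32 as in the C++ version, and it assumes mpi4py's pickle-based bcast is acceptable for your data.

from mpi4py import MPI  # mpi4py initializes MPI on import
import zlib

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# derive a non-negative color from the compute node name
name = MPI.Get_processor_name()
color = zlib.crc32(name.encode()) & 0x7FFFFFFF

# split the world into communicators of processes on the same node
node_comm = comm.Split(color, rank)
node_rank = node_comm.Get_rank()

# split again into communicators of processes with the same local rank
peers_comm = comm.Split(node_rank, rank)

work_to_do = 1000 if rank == 0 else 0
# global rank 0 broadcasts to the local masters (node_rank 0) ...
work_to_do = peers_comm.bcast(work_to_do, root=0)
if node_rank == 0:
    work_to_do += rank
# ... and each local master broadcasts to the processes on its node
work_to_do = node_comm.bcast(work_to_do, root=0)
print("[{}] on node {} has {}".format(rank, name, work_to_do))

# cleaning up
peers_comm.Free()
node_comm.Free()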
