How is barrier implemented in message passing systems?

Question

What I understand is that one master process sends a message to all other processes. All the other processes in return send a message to the master process. Would this be enough for a barrier to work? If not, then what more is needed?

Answer

Let's have a look at OpenMPI's implementation of barrier. While other implementations may differ slightly, the general communication pattern should be identical.

First thing to note is that MPI's barrier has no setup costs: A process reaching an MPI_Barrier call will block until all other members of the group have also called MPI_Barrier. Note that MPI does not require them to reach the same call, just any call to MPI_Barrier. Hence, since the total number of nodes in the group is already known to each process, no additional state needs to be distributed for initializing the call.
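
To see this behaviour from the user's side, here is a minimal usage sketch (not part of the original answer): every rank blocks in MPI_Barrier until all ranks in MPI_COMM_WORLD have entered the call.

/* Minimal usage sketch: each rank prints, enters the barrier, and only
 * continues once every rank in MPI_COMM_WORLD has called MPI_Barrier. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("rank %d of %d reached the barrier\n", rank, size);

    /* No rank passes this point until all ranks have entered the call. */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("rank %d passed the barrier\n", rank);

    MPI_Finalize();
    return 0;
}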

Now, let's look at some code:

/*
 * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
 *                         University Research and Technology
 *                         Corporation.  All rights reserved.
 * Copyright (c) 2004-2005 The University of Tennessee and The University
 *                         of Tennessee Research Foundation.  All rights
 *                         reserved.
 * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart, 
 *                         University of Stuttgart.  All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 *                         All rights reserved.
 * Copyright (c) 2012      Oak Ridge National Labs.  All rights reserved.
 * [...]
 */

[...]

/*
 *  barrier_intra_lin
 *
 *  Function:   - barrier using O(N) algorithm
 *  Accepts:    - same as MPI_Barrier()
 *  Returns:    - MPI_SUCCESS or error code
 */
int
mca_coll_basic_barrier_intra_lin(struct ompi_communicator_t *comm,
                                 mca_coll_base_module_t *module)
{
    int i;
    int err;
    int size = ompi_comm_size(comm);
    int rank = ompi_comm_rank(comm);

First, all nodes except the one with rank 0 (the root node) send the root a notification that they have reached the barrier:

    /* All non-root send & receive zero-length message. */

    if (rank > 0) {
        err =
            MCA_PML_CALL(send
                         (NULL, 0, MPI_BYTE, 0, MCA_COLL_BASE_TAG_BARRIER,
                          MCA_PML_BASE_SEND_STANDARD, comm));
        if (MPI_SUCCESS != err) {
            return err;
        }

After that they block awaiting notification from the root:

        err =
            MCA_PML_CALL(recv
                         (NULL, 0, MPI_BYTE, 0, MCA_COLL_BASE_TAG_BARRIER,
                          comm, MPI_STATUS_IGNORE));
        if (MPI_SUCCESS != err) {
            return err;
        }
    }

The root node implements the other side of the communication. First it blocks until it has received n-1 notifications (one from every node in the group except itself, since it is already inside the barrier call):

    else {
        for (i = 1; i < size; ++i) {
            err = MCA_PML_CALL(recv(NULL, 0, MPI_BYTE, MPI_ANY_SOURCE,
                                    MCA_COLL_BASE_TAG_BARRIER,
                                    comm, MPI_STATUS_IGNORE));
            if (MPI_SUCCESS != err) {
                return err;
            }
        }

Once all notifications have arrived, it sends out the messages that every node is waiting for, signalling that everyone has reached the barrier, after which it leaves the barrier call itself:

        for (i = 1; i < size; ++i) {
            err =
                MCA_PML_CALL(send
                             (NULL, 0, MPI_BYTE, i,
                              MCA_COLL_BASE_TAG_BARRIER,
                              MCA_PML_BASE_SEND_STANDARD, comm));
            if (MPI_SUCCESS != err) {
                return err;
            }
        }
    }

    /* All done */

    return MPI_SUCCESS;
}

So the communication pattern is first an n:1 from all nodes to the root and then a 1:n from the root back to all nodes. To avoid overloading the root node with requests, OpenMPI allows the use of a tree-based communication pattern, but the basic idea is the same: all nodes notify the root when entering the barrier, while the root aggregates the results and informs everyone once they are ready to continue.
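
To tie this back to the question: yes, that notify-then-release exchange is essentially enough. Below is an illustrative sketch (my own, not taken from OpenMPI) of the same linear pattern written with plain MPI_Send/MPI_Recv; the helper name linear_barrier and the tag BARRIER_TAG are arbitrary choices for this example.

/* Sketch of a linear barrier built from point-to-point messages
 * (illustrative only; real implementations use optimized algorithms). */
#include <mpi.h>

#define BARRIER_TAG 99  /* arbitrary tag reserved for this barrier */

static void linear_barrier(MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    if (rank != 0) {
        /* Notify the root that we have reached the barrier... */
        MPI_Send(NULL, 0, MPI_BYTE, 0, BARRIER_TAG, comm);
        /* ...then block until the root releases us. */
        MPI_Recv(NULL, 0, MPI_BYTE, 0, BARRIER_TAG, comm, MPI_STATUS_IGNORE);
    } else {
        /* Root: collect one notification from every other rank... */
        for (int i = 1; i < size; ++i) {
            MPI_Recv(NULL, 0, MPI_BYTE, MPI_ANY_SOURCE, BARRIER_TAG,
                     comm, MPI_STATUS_IGNORE);
        }
        /* ...then release everyone. */
        for (int i = 1; i < size; ++i) {
            MPI_Send(NULL, 0, MPI_BYTE, i, BARRIER_TAG, comm);
        }
    }
}

A tree-based variant spreads that load: with a binomial tree, for example, the root handles on the order of log N messages instead of N-1.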
