Process manager in MPI


Question

I am new to MPI and have some doubts about job creation and launching. I tried to figure it out myself, but things are still confusing to me. The cluster I am working on looks like this: there are four interconnected nodes (A, B, C, D), and MPICH2 is installed on each node. mpiexec -info gives...

.....Configure options: '--prefix=/usr/local/mpich2-1.4.1-install/' '--with-pm=hydra' ....

    Process Manager:                        pmi
    Launchers available:                    ssh rsh fork slurm ll lsf sge manual persist
    Topology libraries available:           hwloc plpa
    Resource management kernels available:  user slurm ll lsf sge pbs

As far as I know (please correct me if I am wrong), PMI is the Process Management Interface, while hydra, mpirun and mpiexec are process managers. PMI gives the process manager a way to interact with the processes, whichever PM is in use. So my doubts are:

1. Why does it show pmi as the Process Manager?

2. Does pbs play any role?

3. Who is responsible for creating the copies of the executable on the different nodes? (I am launching the job from node A.)

I know the question is lengthy. I would be thankful for suggestions of some good resources.

Recommended answer

There are two types of clusters: those that are under the control of some distributed resource manager (DRM) like PBS, LSF, S/OGE, etc., and those that are not. A typical DRM provides mechanisms to launch remote processes within the granted allocation and to control those processes, e.g. to send them signals and to get back information about their launch and termination statuses. When the cluster is not under the control of a DRM, the MPI runtime has to implement its own process management. Different MPI libraries take different approaches, but almost all of them boil down to using rsh or ssh to start a daemon on the remote nodes that takes care of the remote processes. Even when a DRM is in use, the library might still put its own process manager in between in order to provide portability.

MPICH comes with two process managers: MPD and Hydra. MPD stands for Multi-Purpose Daemon and is now considered legacy. Hydra is newer and better, as it provides topology-aware process binding and other goodies. No matter which process manager is in use, the library has to talk to it somehow, e.g. to obtain launch information or to request that new processes be launched during MPI_COMM_SPAWN. This is done through the PMI interface.

That being said, the mpiexec in your case is the Hydra process manager. The information you list describes the capabilities of Hydra itself. Since MPICH and its derivatives (e.g. Intel MPI) are probably the only MPI implementations that use Hydra, the latter doesn't need to provide any process management interface other than the one that is native to MPICH, namely PMI. The launchers are the mechanisms that Hydra can use to launch remote processes. ssh and rsh are the obvious choices when no DRM is in use. fork is for starting processes on the local node. Resource management kernels are the mechanisms through which Hydra interacts with DRMs in order to determine things like granted allocations. Some of those can also launch processes, e.g. pbs uses the tm interface of PBS or Torque.
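To check what a particular Hydra build offers, the fields from the question can be read programmatically. The following is only a minimal sketch: it parses the sample text quoted in the question rather than running mpiexec, and the helper name parse_hydra_info is made up for illustration.

```python
# Sample lines copied from the `mpiexec -info` output in the question.
SAMPLE_INFO = """\
    Process Manager:                        pmi
    Launchers available:                    ssh rsh fork slurm ll lsf sge manual persist
    Topology libraries available:           hwloc plpa
    Resource management kernels available:  user slurm ll lsf sge pbs
"""


def parse_hydra_info(text):
    """Map each 'Label: a b c' line to a list of whitespace-separated values.

    Hypothetical helper for illustration; it is not part of MPICH or Hydra.
    """
    fields = {}
    for line in text.splitlines():
        label, sep, values = line.partition(":")
        if sep and values.strip():
            fields[label.strip()] = values.split()
    return fields


info = parse_hydra_info(SAMPLE_INFO)
print("ssh launcher available:", "ssh" in info["Launchers available"])
print("pbs kernel available:", "pbs" in info["Resource management kernels available"])
```

With the output from the question, both checks succeed: ssh is available as a launcher and pbs as a resource management kernel.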

To sum it up:

1) Hydra implements the PMI interface in order to be able to talk to MPICH. It doesn't understand other interfaces, e.g. it cannot launch MPI executables compiled against Open MPI.

2) Hydra integrates with PBS-like DRMs (PBSPro, Torque). The integration means that, for example, you don't have to provide a list of hosts to mpiexec, since the list of granted nodes is obtained automatically. It also uses the native tm interface of PBS to launch and monitor remote processes.

3) On a higher level, it is Hydra that launches the remote copies. Ultimately, the launch is carried out either by the DRM or via rsh/ssh.
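As a rough illustration of the rsh/ssh route, the sketch below only builds the command lines that an ssh-based launcher might issue from node A. It is a simplification under stated assumptions: a real process manager starts its own daemon on each node rather than the executable directly, the function name build_ssh_launch_commands and the path /home/user/a.out are hypothetical, and the executable is assumed to already exist at the same path on every node (nothing in this sketch copies it).

```python
import shlex


def build_ssh_launch_commands(hosts, executable, args=()):
    """Return one ssh argv list per remote host.

    Hypothetical helper: it assumes the executable is already present at
    the same path on every host and does not copy anything.
    """
    # Quote each part so the remote shell sees it as a single word.
    remote = " ".join(shlex.quote(part) for part in (executable, *args))
    return [["ssh", host, remote] for host in hosts]


# Node names B, C and D come from the question; the path is made up.
commands = build_ssh_launch_commands(["B", "C", "D"], "/home/user/a.out")
for argv in commands:
    print(" ".join(argv))
```

Nothing is executed here; the commands are only printed, so the sketch can be run without a cluster.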
