What's the biggest advance of Hydra MPI?
Question
I'm studying the new process manager that ships with MPICH2, but so far I can't figure out what the big advance of this implementation is. Does anyone know of a good tutorial, or have some experience with it?
The Argonne wiki is rather too simple: http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager
Recommended answer
From the point of view of where I work, the biggest single advance is scalability of process launching. Launching 8000+ task jobs with the previous process launchers in MPICH2-based MPI implementations was unusably slow and would frequently fail due to timeouts or other network problems, which all but ruled out MPICH2-based MPIs for our largest jobs. But Hydra has a good hierarchical launch model which can also take advantage of your resource manager.
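To see why a hierarchical launch helps at this scale, here is a rough back-of-the-envelope sketch. It is not Hydra's actual algorithm: the functions `flat_launch_steps` and `tree_launch_depth` and the `fanout` parameter are hypothetical names for a simplified model, in which a flat launcher contacts every node one after another, while a tree launcher lets each started node start up to `fanout` children of its own.

```python
def flat_launch_steps(num_nodes: int) -> int:
    """Simplified flat model: a single launcher contacts each node in turn,
    so the number of sequential steps equals the number of nodes."""
    if num_nodes < 0:
        raise ValueError("num_nodes must be non-negative")
    return num_nodes


def tree_launch_depth(num_nodes: int, fanout: int) -> int:
    """Simplified tree model (an assumption, not Hydra's real scheme):
    every node at one level starts up to `fanout` nodes at the next level.
    Returns how many levels are needed to reach `num_nodes` nodes."""
    if num_nodes < 0:
        raise ValueError("num_nodes must be non-negative")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    depth = 0
    level_width = 1          # the initial launcher itself
    remaining = num_nodes    # nodes that still have to be started
    while remaining > 0:
        level_width *= fanout      # capacity of the next level
        remaining -= level_width   # that level is filled in parallel
        depth += 1
    return depth
```

Under these assumptions, `flat_launch_steps(8000)` is 8000 sequential contacts, whereas `tree_launch_depth(8000, 32)` is 3 levels. Real launch time also depends on the network, the resource manager, and timeouts, none of which this model captures.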
The topology-aware allocation strategies are good, too, but compared to the difference between job startup failing (or taking hours) and jobs succeeding, that's a second-order effect.