Calling mpi binary in serial as subprocess of mpi application
Question
I have a large parallel (using MPI) simulation application which produces large amounts of data. In order to evaluate this data I use a python script.
What I now need to do is to run this application a large number of times (>1000) and calculate statistical properties from the resulting data.
My approach up until now has been to have a python script running in parallel (using mpi4py, e.g. on 48 nodes), calling the simulation code using subprocess.check_call.
I need this call to run my mpi simulation application in serial.
I do not need the simulation to also run in parallel in this case.
The python script can then analyze the data in parallel and, after finishing, start a new simulation run, until a large number of runs is accumulated.
The goals are:
- not to save the entire data set from all 2000 runs
- to keep the intermediate data in memory
Stub MWE:
from mpi4py import MPI
import subprocess
print "Master hello"
call_string = 'python multi_call_slave.py'
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print "rank %d of size %d in master calling: %s" % (rank, size, call_string)
std_outfile = "./sm_test.out"
nr_samples = 1
for samples in range(0, nr_samples):
    with open(std_outfile, 'w') as out:
        subprocess.check_call(call_string, shell=True, stdout=out)
    # analyze_data()
    # communicate_results()
file multi_call_slave.py (this would be the C simulation code):
from mpi4py import MPI
print "Slave hello"
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print "rank %d of size %d in slave" % (rank, size)
This will not work. Resulting output in stdout:
Master hello
rank 1 of size 2 in master calling: python multi_call_slave_so.py
Master hello
rank 0 of size 2 in master calling: python multi_call_slave_so.py
[cli_0]: write_line error; fd=7 buf=:cmd=finalize
:
system msg for write_line failure : Broken pipe
Fatal error in MPI_Finalize: Other MPI error, error stack:
MPI_Finalize(311).....: MPI_Finalize failed
MPI_Finalize(229).....:
MPID_Finalize(150)....:
MPIDI_PG_Finalize(126): PMI_Finalize failed, error -1
[cli_1]: write_line error; fd=8 buf=:cmd=finalize
:
system msg for write_line failure : Broken pipe
Fatal error in MPI_Finalize: Other MPI error, error stack:
MPI_Finalize(311).....: MPI_Finalize failed
MPI_Finalize(229).....:
MPID_Finalize(150)....:
MPIDI_PG_Finalize(126): PMI_Finalize failed, error -1
Resulting output in sm_test.out:
Slave hello
rank 0 of size 2 in slave
The reason is that the subprocess assumes it is run as a parallel application, whereas I intend to run it as a serial application. As a very "hacky" workaround I did the following:
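The inheritance is visible directly in the environment: the process manager hands session information to every child through environment variables, which subprocess passes on unchanged. A small sketch to inspect what a child would inherit (the prefix list is an assumption covering common MPI implementations):

```python
import os

# Assumed launcher/session variable prefixes used by common MPI stacks
# (Open MPI, PMIx/PMI, Intel MPI, MPICH, Hydra).
MPI_PREFIXES = ("OMPI_", "PMIX_", "PMI_", "I_MPI_", "MPICH_", "HYDRA_")

def inherited_mpi_vars(env=None):
    """Return the MPI session variables a subprocess would inherit."""
    env = os.environ if env is None else env
    return {k: v for k, v in env.items() if k.startswith(MPI_PREFIXES)}

# Any hits here explain why the child believes it is part of our MPI job.
print(sorted(inherited_mpi_vars()))
```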
- compile all needed MPI-aware libraries with a particular MPI distribution (e.g. Intel MPI)
- compile the simulation code with a different MPI library (e.g. Open MPI)
If I now start my parallel python script using Intel MPI, the underlying simulation is not aware of the surrounding parallel environment, since it uses a different library.
This worked fine for a while, but unfortunately it is not very portable and is difficult to maintain on different clusters for various reasons.
I could:
- put the subprocess calling loop into a shell script using srun
  - would mandate buffering results on HD
  - not meant to be used like that
  - difficult to determine whether the subprocess has finished
- change the necessary C code appropriately
- tried manipulating the environment variables, to no avail
  - also not meant to be used like that
- using mpirun -n 1 or srun for the subprocess call does not help
Is there any elegant, official way of doing this? I am really out of ideas and appreciate any input!
Answer
No, there is neither an elegant nor an official way to do this. The only officially supported way to execute other programs from within an MPI application is to use MPI_Comm_spawn. Spawning child MPI processes via simple OS mechanisms like the one provided by subprocess is dangerous and could even have catastrophic consequences in certain cases.
While MPI_Comm_spawn does not provide a mechanism to find out when the child process has exited, you could kind of simulate it with an intercomm barrier. You will still face problems, since the MPI_Comm_spawn call does not allow the standard I/O to be redirected arbitrarily; instead it gets redirected to mpiexec/mpirun.
What you could do is to write a wrapper script that deletes all possible pathways that the MPI library might use in order to pass session information around. For Open MPI that would be any environment variable that starts with OMPI_. For Intel MPI that would be variables that start with I_. And so on. Some libraries might use files, shared memory blocks, or some other OS mechanisms, and you'll have to take care of those too. Once every possible mechanism to communicate MPI session information has been eradicated, you can simply start the executable and it should form a singleton MPI job (that is, behave as if run with mpiexec -n 1).