Assign two MPI processes per core
Question

How do I assign 2 MPI processes per core?
For example, if I do `mpirun -np 4 ./application`, then it should use 2 physical cores to run 4 MPI processes (2 processes per core). I am using Open MPI 1.6. I did `mpirun -np 4 -nc 2 ./application` but wasn't able to run it.
It complains: `mpirun was unable to launch the specified application as it could not find an executable:`
Answer
`orterun` (the Open MPI SPMD/MPMD launcher; `mpirun` and `mpiexec` are just symlinks to it) has some support for process binding, but it is not flexible enough to allow you to bind two processes per core. You can try `-bycore -bind-to-core`, but it will err when all cores already have one process assigned to them.
But there is a workaround - you can use a rankfile where you explicitly specify which slot to bind each rank to. Here is an example: in order to run 4 processes on a dual-core CPU with 2 processes per core, you would do the following:
mpiexec -np 4 -H localhost -rf rankfile ./application
where `rankfile` is a text file with the following content:
rank 0=localhost slot=0:0
rank 1=localhost slot=0:0
rank 2=localhost slot=0:1
rank 3=localhost slot=0:1
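A rankfile like the one above can also be generated programmatically, which is handier for larger runs. Here is a minimal sketch; the helper name `make_rankfile` and its parameters are my own, and it assumes the documented Open MPI `slot=socket:core` syntax with everything on socket 0:

```python
def make_rankfile(host, n_ranks, n_cores, procs_per_core=2):
    """Build rankfile lines that bind procs_per_core consecutive ranks
    to each core of socket 0, wrapping around if ranks exceed capacity."""
    lines = []
    for rank in range(n_ranks):
        core = (rank // procs_per_core) % n_cores  # fill each core before moving on
        lines.append(f"rank {rank}={host} slot=0:{core}")
    return lines

print("\n".join(make_rankfile("localhost", 4, 2)))
```

With 4 ranks on 2 cores this reproduces exactly the four lines shown above.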
This will place ranks 0 and 1 on core 0 of processor 0 and ranks 2 and 3 on core 1 of processor 0. Ugly but works:
$ mpiexec -np 4 -H localhost -rf rankfile -tag-output cat /proc/self/status | grep Cpus_allowed_list
[1,0]<stdout>:Cpus_allowed_list: 0
[1,1]<stdout>:Cpus_allowed_list: 0
[1,2]<stdout>:Cpus_allowed_list: 1
[1,3]<stdout>:Cpus_allowed_list: 1
Edit: From your other question it becomes clear that you are actually running on a hyperthreaded CPU. Then you would have to figure out the physical numbering of your logical processors (it's a bit confusing, but the physical numbering corresponds to the value of `processor:` as reported in `/proc/cpuinfo`). The easiest way to obtain it is to install the `hwloc` library. It provides the `hwloc-ls` tool, which you can use like this:
$ hwloc-ls --of console
...
NUMANode L#0 (P#0 48GB) + Socket L#0 + L3 L#0 (12MB)
L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
PU L#0 (P#0) <-- Physical ID 0
PU L#1 (P#12) <-- Physical ID 12
...
Physical IDs are listed after `P#` in the brackets. In your 8-core case the second hyperthread of the first core (core 0) would most likely have ID `8`, and hence your rankfile would look something like:
rank 0=localhost slot=p0
rank 1=localhost slot=p8
rank 2=localhost slot=p1
rank 3=localhost slot=p9
(note the `p` prefix - don't omit it)
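The hyperthreaded rankfile above follows a simple pattern: consecutive ranks cycle through the hardware threads of one core before moving to the next. A small sketch of that generation step, assuming you have already read off the `P#` values per core from `hwloc-ls` (the function name and input shape are my own):

```python
def make_ht_rankfile(host, cores):
    """cores: list of lists, each inner list holding the physical PU IDs
    (the P# values reported by hwloc-ls) of one core's hardware threads.
    Emits rankfile lines using Open MPI's p-prefixed physical-ID syntax."""
    lines = []
    rank = 0
    for core_pus in cores:
        for pu in core_pus:  # bind consecutive ranks to siblings of one core
            lines.append(f"rank {rank}={host} slot=p{pu}")
            rank += 1
    return lines

# Core 0 has PUs 0 and 8; core 1 has PUs 1 and 9 (as in the example above).
print("\n".join(make_ht_rankfile("localhost", [[0, 8], [1, 9]])))
```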
If you don't have `hwloc` or you cannot install it, then you would have to parse `/proc/cpuinfo` on your own. Hyperthreads would have the same values of `physical id` and `core id` but different `processor` and `apicid`. The physical ID is equal to the value of `processor`.
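Parsing `/proc/cpuinfo` by hand boils down to grouping `processor` numbers by their `(physical id, core id)` pair; entries that land in the same group are hyperthread siblings of one core. A minimal sketch (the function name is my own, and real `/proc/cpuinfo` blocks contain many more fields, which this parser simply ignores):

```python
from collections import defaultdict

def sibling_threads(cpuinfo_text):
    """Group logical 'processor' numbers by (physical id, core id).
    Processors in the same group are hyperthread siblings of one core."""
    groups = defaultdict(list)
    cpu = {}
    for line in cpuinfo_text.splitlines():
        if ":" in line:
            key, _, val = line.partition(":")
            cpu[key.strip()] = val.strip()
        elif not line.strip() and cpu:  # blank line ends one processor block
            groups[(cpu.get("physical id"), cpu.get("core id"))].append(int(cpu["processor"]))
            cpu = {}
    if cpu:  # the last block may not be followed by a blank line
        groups[(cpu.get("physical id"), cpu.get("core id"))].append(int(cpu["processor"]))
    return dict(groups)
```

On the 8-core hyperthreaded machine described above, processors 0 and 8 would come back in the same group, confirming they share core 0 and can be paired in the rankfile.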