Assign two MPI processes per core


Problem description

How do I assign 2 MPI processes per core?

For example, if I do mpirun -np 4 ./application, then it should use 2 physical cores to run 4 MPI processes (2 processes per core). I am using Open MPI 1.6. I tried mpirun -np 4 -nc 2 ./application but wasn't able to run it.

It complains that mpirun was unable to launch the specified application as it could not find an executable.

Answer

orterun (the Open MPI SPMD/MPMD launcher; mpirun/mpiexec are just symlinks to it) has some support for process binding, but it is not flexible enough to allow you to bind two processes per core. You can try -bycore -bind-to-core, but it will err when all cores already have one process assigned to them.
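
For reference, the attempt that hits this limitation would look like the sketch below (the exact error text depends on your Open MPI build, so it is not reproduced here):

$ mpirun -np 4 -bycore -bind-to-core ./application   # errs once every core already has one bound process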

But there is a workaround: you can use a rankfile where you explicitly specify which slot to bind each rank to. Here is an example: in order to run 4 processes on a dual-core CPU with 2 processes per core, you would do the following:

mpiexec -np 4 -H localhost -rf rankfile ./application

where rankfile is a text file with the following content:

rank 0=localhost slot=0:0
rank 1=localhost slot=0:0
rank 2=localhost slot=0:1
rank 3=localhost slot=0:1

This will place ranks 0 and 1 on core 0 of processor 0, and ranks 2 and 3 on core 1 of processor 0. Ugly, but it works:

$ mpiexec -np 4 -H localhost -rf rankfile -tag-output cat /proc/self/status | grep Cpus_allowed_list
[1,0]<stdout>:Cpus_allowed_list:     0
[1,1]<stdout>:Cpus_allowed_list:     0
[1,2]<stdout>:Cpus_allowed_list:     1
[1,3]<stdout>:Cpus_allowed_list:     1

EDIT: From your other question it becomes clear that you are actually running on a hyperthreaded CPU. Then you would have to figure out the physical numbering of your logical processors (it's a bit confusing, but the physical numbering corresponds to the value of processor: as reported in /proc/cpuinfo). The easiest way to obtain it is to install the hwloc library. It provides the hwloc-ls tool that you can use like this:

$ hwloc-ls --of console
...
  NUMANode L#0 (P#0 48GB) + Socket L#0 + L3 L#0 (12MB)
    L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
      PU L#0 (P#0)    <-- Physical ID 0
      PU L#1 (P#12)   <-- Physical ID 12
...

Physical IDs are listed after P# in the brackets. In your 8-core case the second hyperthread of the first core (core 0) would most likely have ID 8 and hence your rankfile would look something like:

rank 0=localhost slot=p0
rank 1=localhost slot=p8
rank 2=localhost slot=p1
rank 3=localhost slot=p9

(note the p prefix - don't omit it)

If you don't have hwloc or you cannot install it, then you would have to parse /proc/cpuinfo on your own. Hyperthreads would have the same values of physical id and core id but different processor and apicid. The physical ID is equal to the value of processor.
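
If it helps, here is a rough sketch of that parsing (this one-liner is not from the original answer): it prints each logical processor together with its physical id and core id, so entries that share the same physical id/core id pair are hyperthread siblings on one core, and their processor values are what go after the p prefix in the rankfile:

$ awk -F': ' '/^processor/   { cpu  = $2 }
              /^physical id/ { sock = $2 }
              /^core id/     { print "processor " cpu " -> physical id " sock ", core id " $2 }' /proc/cpuinfo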

