What does it mean to configure MPI for shared memory?


Question

I have a bit of a research-related question.

Currently I have finished the implementation of a structured skeleton framework based on MPI (specifically using openmpi 6.3). The framework is supposed to be used on a single machine. Now I am comparing it with other previous skeleton implementations (such as scandium, fast-flow, ...).

One thing I have noticed is that the performance of my implementation is not as good as that of the other implementations. I think this is because my implementation is based on MPI (and thus on two-sided communication that requires matching send and receive operations), while the other implementations I am comparing with are based on shared memory. (... but I still have no good explanation for this, and it is part of my question.)

There is a big difference in completion time between the two categories.

Today I was also introduced to the configuration of Open MPI for shared memory here => openmpi-sm

Here come my questions.

1st, what does it mean to configure MPI for shared memory? I mean, MPI processes live in their own virtual memory; what does the flag in the following command actually do? (I thought that in MPI every communication is done by explicitly passing a message and that no memory is shared between processes.)

    shell$ mpirun --mca btl self,sm,tcp -np 16 ./a.out
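
For illustration, the kind of explicitly matched send/receive meant above looks like this in plain MPI (a toy sketch, not the actual framework code; with -np 16 only ranks 0 and 1 take part in the exchange):

    /* toy_sendrecv.c - compile with: mpicc toy_sendrecv.c -o a.out */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            /* explicit message passing: the send below only completes
             * because rank 1 posts a matching receive */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }
        /* all other ranks do nothing */

        MPI_Finalize();
        return 0;
    }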

2nd, why is the performance of MPI so much worse compared to the other skeleton implementations developed for shared memory? At least I am also running it on a single multi-core machine. (I suppose it is because the other implementations use thread-based parallel programming, but I have no convincing explanation for that.)

Any suggestion or further discussion is very welcome.

Please let me know if I need to clarify my question further.

Thank you for your time!

Answer

Open MPI is very modular. It has its own component model, called the Modular Component Architecture (MCA). This is where the name of the --mca parameter comes from: it is used to provide runtime values for MCA parameters, exported by the different components in the MCA.

Whenever two processes in a given communicator want to talk to each other, MCA finds suitable components that are able to transmit messages from one process to the other. If both processes reside on the same node, Open MPI usually picks the shared-memory BTL component, known as sm. If the two processes reside on different nodes, Open MPI walks the available network interfaces and chooses the fastest one that can connect to the other node. It prefers fast networks like InfiniBand (via the openib BTL component), but if your cluster doesn't have InfiniBand, TCP/IP is used as a fallback, provided the tcp BTL component is in the list of allowed BTLs.
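
For example, one way to see the effect of BTL selection on a single node is to deliberately leave out the sm component and force everything over TCP (assuming the same program as above), and then compare the timings:

    shell$ mpirun --mca btl self,tcp -np 16 ./a.out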

By default you do not need to do anything special in order to enable shared-memory communication. Just launch your program with mpiexec -np 16 ./a.out. What you have linked to is the shared-memory section of the Open MPI FAQ, which gives hints on which parameters of the sm BTL can be tweaked to get better performance. My experience with Open MPI shows that the default parameters are nearly optimal and work very well, even on exotic hardware like multilevel NUMA systems. Note that the default shared-memory communication implementation copies the data twice - once from the send buffer to shared memory and once from shared memory to the receive buffer. A shortcut exists in the form of the KNEM kernel device, but you have to download and compile it separately, as it is not part of the standard Linux kernel. With KNEM support, Open MPI is able to perform "zero-copy" transfers between processes on the same node - the copy is done by the kernel device and goes directly from the memory of the first process to the memory of the second process. This dramatically improves the transfer of large messages between processes that reside on the same node.
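
The tunable parameters that the FAQ refers to can be listed with ompi_info (assuming a standard Open MPI installation):

    shell$ ompi_info --param btl sm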

Another option is to forget about MPI entirely and use shared memory directly. You can use the POSIX memory management interface (see here) to create a shared memory block and have all processes operate on it directly. If the data is stored in shared memory, this could be beneficial, as no copies would be made. But watch out for NUMA issues on modern multi-socket systems, where each socket has its own memory controller and accessing memory from a remote socket on the same board is slower. Process pinning/binding is also important - pass --bind-to-socket to mpiexec to have it pin each MPI process to a separate processor socket.
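
A minimal sketch of that POSIX interface (the region name /my_region and the size are arbitrary placeholders, and error handling is kept short):

    /* shm_sketch.c - create a named shared memory region and map it.
     * Link with -lrt on older Linux systems. Other processes open the
     * same name with shm_open(name, O_RDWR, 0) and mmap it to share
     * the data without any copies. */
    #include <fcntl.h>      /* O_CREAT, O_RDWR */
    #include <sys/mman.h>   /* shm_open, mmap */
    #include <unistd.h>     /* ftruncate, close */
    #include <stdio.h>

    int main(void)
    {
        const char *name = "/my_region";   /* placeholder name */
        const size_t size = 1 << 20;       /* 1 MiB, placeholder size */

        int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
        if (fd == -1) { perror("shm_open"); return 1; }
        if (ftruncate(fd, size) == -1) { perror("ftruncate"); return 1; }

        /* every process that maps the same name sees the same memory */
        double *data = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        data[0] = 3.14;   /* visible to all processes mapping /my_region */

        munmap(data, size);
        close(fd);
        /* shm_unlink(name); -- call once, when the region is no longer needed */
        return 0;
    }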
