Thread safety of MPI send using threads created with std::async

Problem Description

According to this website, the usage of MPI::COMM_WORLD.Send(...) is thread safe. However, in my application I often (though not always) run into deadlocks or get segmentation faults. Wrapping each call to a MPI::COMM_WORLD method in mutex.lock() and mutex.unlock() consistently removes both the deadlocks and the segfaults.

This is how I create threads:

#include <future>   // std::async, std::future
#include <memory>   // std::shared_ptr, std::make_shared
#include <vector>

const auto communicator = std::make_shared<Communicator>();
std::vector<std::future<size_t>> handles;
for ( size_t i = 0; i < n; ++i )
{
   handles.push_back(std::async(std::launch::async, foo, communicator));
}
for ( size_t i = 0; i < n; ++i )
{
   handles[i].get();
}

Communicator is a class which has a std::mutex member and exclusively calls methods such as MPI::COMM_WORLD.Send() and MPI::COMM_WORLD.Recv(). I do not use any other MPI methods for sending/receiving. foo takes a const std::shared_ptr<Communicator> & as an argument.
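
In outline, the mutex-guarded wrapper looks roughly like this (a simplified sketch of the idea, not the actual class, using the same C++ bindings as above):

#include <mpi.h>
#include <mutex>

// Simplified sketch: every MPI call goes through the same mutex,
// which is exactly the workaround that makes the deadlocks and
// segfaults disappear.
class Communicator
{
public:
    void send(const void* buf, int count, const MPI::Datatype& type,
              int dest, int tag)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        MPI::COMM_WORLD.Send(buf, count, type, dest, tag);
    }

    void recv(void* buf, int count, const MPI::Datatype& type,
              int source, int tag)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        MPI::COMM_WORLD.Recv(buf, count, type, source, tag);
    }

private:
    std::mutex mutex_;
};

A call like communicator->send(&x, 1, MPI::INT, dest, tag) then never overlaps with another thread's MPI call.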

My question: Is the thread safety promised by MPI not compatible with threads created by std::async?

Solution

Thread-safety in MPI doesn't work out of the box. First, you have to ensure that your implementation actually supports multiple threads making MPI calls at once. With some MPI implementations, for example Open MPI, this requires the library to be configured with special options at build time. Then you have to tell MPI to initialise at the appropriate thread support level. Currently the MPI standard defines four levels of thread support:

  • MPI_THREAD_SINGLE - means that the user code is single threaded. This is the default level at which MPI is initialised if MPI_Init() is used;
  • MPI_THREAD_FUNNELED - means that the user code is multithreaded, but only the main thread makes MPI calls. The main thread is the one which initialises the MPI library (see the sketch after this list);
  • MPI_THREAD_SERIALIZED - means that the user code is multithreaded, but calls to the MPI library are serialised;
  • MPI_THREAD_MULTIPLE - means that the user code is multithreaded and all threads can make MPI calls at any time with no synchronisation whatsoever.
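
For example, at MPI_THREAD_FUNNELED a thread can use MPI_Is_thread_main() to find out whether it is the one thread allowed to call into MPI. A minimal sketch (initialisation with MPI_Init_thread() is explained right below):

#include <mpi.h>
#include <cstdio>

int main(int argc, char* argv[])
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    // At MPI_THREAD_FUNNELED only the thread that initialised MPI may
    // call into the library; MPI_Is_thread_main() tells the calling
    // thread whether it is that thread.
    int is_main;
    MPI_Is_thread_main(&is_main);
    if (is_main)
        std::printf("This thread is allowed to make MPI calls\n");

    MPI_Finalize();
    return 0;
}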

In order to initialise MPI with thread support, one has to use MPI_Init_thread() instead of MPI_Init():

int provided;

MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
if (provided < MPI_THREAD_MULTIPLE)
{
    printf("ERROR: The MPI library does not have full thread support\n");
    MPI_Abort(MPI_COMM_WORLD, 1);
}

Equivalent code with the deprecated (and removed in MPI-3) C++ bindings:

int provided = MPI::Init_thread(argc, argv, MPI::THREAD_MULTIPLE);
if (provided < MPI::THREAD_MULTIPLE)
{
    printf("ERROR: The MPI library does not have full thread support\n");
    MPI::COMM_WORLD.Abort(1);
}

Thread support levels are ordered like this: MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED < MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE, so any provided level other than MPI_THREAD_MULTIPLE has a lower numerical value - that is why the if (...) check above is written the way it is.

MPI_Init(&argc, &argv) is equivalent to MPI_Init_thread(&argc, &argv, MPI_THREAD_SINGLE, &provided). Implementations are not required to initialise exactly at the requested level - rather they could initialise at any other level (higher or lower), which is returned in the provided output argument.
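
If some other part of the program, for example a library, later needs to know which level the MPI library is actually running at, MPI_Query_thread() returns it. A minimal sketch:

#include <mpi.h>
#include <cstdio>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);  // requests MPI_THREAD_SINGLE by default

    // MPI_Query_thread() retrieves the thread support level that the
    // library was actually initialised with - handy in code that did
    // not perform the initialisation itself.
    int level;
    MPI_Query_thread(&level);
    if (level < MPI_THREAD_SERIALIZED)
        std::printf("Serialised MPI calls from threads are not guaranteed\n");

    MPI_Finalize();
    return 0;
}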

For more information - see §12.4 of the MPI standard, freely available here.

With most MPI implementations, the thread support at level MPI_THREAD_SINGLE is actually equivalent to that provided at level MPI_THREAD_SERIALIZED - exactly what you observe in your case.

Since you've not specified which MPI implementation you use, here is a handy list of the common ones.

I've already said that Open MPI has to be built with the proper options in order to support MPI_THREAD_MULTIPLE (the --enable-mpi-thread-multiple configure flag in the Open MPI versions current at the time of writing). But there is another catch - its InfiniBand component is not thread-safe, and hence Open MPI will not use native InfiniBand communication when initialised at the full thread support level.

Intel MPI comes in two different flavours - one with and one without support for full multithreading. Multithreaded support is enabled by passing the -mt_mpi option to the MPI compiler wrapper, which links against the MT version of the library. This option is also implied if OpenMP support or the autoparalleliser is enabled. I am not aware of how the InfiniBand driver in IMPI behaves when full thread support is enabled.

MPICH(2) does not support InfiniBand, hence it is thread-safe, and the most recent versions probably provide MPI_THREAD_MULTIPLE support out of the box.

MVAPICH is the basis on which Intel MPI is built and it supports InfiniBand. I have no idea how it behaves at full thread support level when used on a machine with InfiniBand.

The note about multithreaded InfiniBand support is important, since a lot of compute clusters nowadays use InfiniBand fabrics. With the IB component (the openib BTL in Open MPI) disabled, most MPI implementations fall back to another protocol, for example TCP/IP (the tcp BTL in Open MPI), which results in much slower and higher-latency communication.
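
To tie this back to the question: threads created by std::async are in no way special to MPI; all that matters is the thread support level the library was initialised at. A minimal sketch of the pattern from the question, assuming initialisation at MPI_THREAD_MULTIPLE succeeds (the task layout - rank 1 sends, rank 0 receives, at least two processes - is made up for illustration):

#include <mpi.h>
#include <cstddef>
#include <future>
#include <vector>

int main(int argc, char* argv[])
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        MPI_Abort(MPI_COMM_WORLD, 1);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2)
        MPI_Abort(MPI_COMM_WORLD, 1);  // this sketch needs two ranks

    const std::size_t n = 4;  // concurrent tasks per rank
    std::vector<std::future<void>> handles;

    for (std::size_t i = 0; i < n; ++i)
    {
        handles.push_back(std::async(std::launch::async, [rank, i]
        {
            int value = static_cast<int>(i);
            if (rank == 1)
            {
                // Several threads send concurrently without any lock -
                // legal only at MPI_THREAD_MULTIPLE.
                MPI_Send(&value, 1, MPI_INT, 0, static_cast<int>(i),
                         MPI_COMM_WORLD);
            }
            else if (rank == 0)
            {
                // Matching concurrent receives on rank 0.
                MPI_Recv(&value, 1, MPI_INT, 1, MPI_ANY_TAG,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            }
        }));
    }

    for (auto& h : handles)
        h.get();

    MPI_Finalize();
    return 0;
}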
