How to Reuse an OMP Thread Pool, Created by the Main Thread, in a Worker Thread?


Question

Near the start of my C++ application, my main thread uses OMP to parallelize several for loops. After the first parallelized for loop, the threads it used remain in existence for the duration of the application and are reused for subsequent OMP for loops executed from the main thread. I checked this with the following command (on CentOS 7):

for i in $(pgrep myApplication); do ps -mo pid,tid,fname,user,psr -p $i;done

Later in my program, I launch a boost thread from the main thread, and inside it I parallelize a for loop using OMP. At this point an entirely new set of threads is created, which adds a fair amount of overhead.

Is it possible to make the OMP parallel for loop within the boost thread reuse the original OMP thread pool created by the main thread?

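Simplified code (the element type and the value of N are placeholders):

#include <boost/ref.hpp>
#include <boost/thread.hpp>
#include <vector>

const int N = 1000;  // placeholder size

void myFun(std::vector<double>& data)
{
    // Want to reuse OMP thread pool from main here.
    #pragma omp parallel for
    for (int i = 0; i < N; ++i)
    {
        // Work on data[i]
    }
}

int main()
{
    std::vector<double> data(N);  // placeholder data

    // Thread pool created here.
    #pragma omp parallel for
    for (int i = 0; i < N; ++i)
    {
        // do stuff
    }

    boost::thread myThread(myFun, boost::ref(data));  // Constructor starts thread.

    // Do some serial stuff, no OMP.

    myThread.join();
}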

Recommended answer

The interaction of OpenMP with other threading mechanisms is deliberately left out of the specification and therefore depends heavily on the implementation. The GNU OpenMP runtime keeps a pointer to the thread pool in TLS and propagates it down the (nested) teams. Threads started via pthread_create (or boost::thread or std::thread) do not inherit the pointer and therefore spawn a fresh pool. This is probably the case with other OpenMP runtimes too.

There is a requirement in the standard that basically forces such behaviour in most implementations. It concerns the semantics of threadprivate variables and how their values are retained across the different parallel regions forked from the same thread (OpenMP standard, 2.15.2 threadprivate Directive):

The values of data in the threadprivate variables of non-initial threads are guaranteed to persist between two consecutive active parallel regions only if all of the following conditions hold:

  • Neither parallel region is nested inside another explicit parallel region.
  • The number of threads used to execute both parallel regions is the same.
  • The thread affinity policies used to execute both parallel regions are the same.
  • The value of the dyn-var internal control variable in the enclosing task region is false at entry to both parallel regions.

If these conditions all hold, and if a threadprivate variable is referenced in both regions, then threads with the same thread number in their respective regions will reference the same copy of that variable.

This, besides performance, is probably the main reason for using thread pools in OpenMP runtimes.

Now, imagine that two parallel regions forked by two separate threads share the same worker thread pool. The first thread forks a parallel region in which some threadprivate variables are set. Later the same thread forks a second parallel region in which those threadprivate variables are used. But somewhere between the two, the second thread forks a parallel region that uses worker threads from the same pool. Since most implementations keep threadprivate variables in TLS, the above semantics can no longer be guaranteed. A possible solution would be to add new worker threads to the pool for each separate thread, which is not much different from creating new thread pools.

I'm not aware of any workaround that makes the worker thread pool shared. And even if one exists, it will not be portable, so the main benefit of OpenMP will be lost.
