OpenMP recursive tasks

Question

Consider the following program, which calculates Fibonacci numbers.
It uses OpenMP tasks for parallelisation.

#include <iostream> 
#include <omp.h>

using namespace std;

int fib(int n)
{
    if(n == 0 || n == 1)
        return n;

    int res, a, b;
    #pragma omp parallel
    {
        #pragma omp single 
        {
            #pragma omp task shared(a)
            a = fib(n-1);
            #pragma omp task shared(b)
            b = fib(n-2);
            #pragma omp taskwait
            res = a+b;
        } 

    }
    return res;
}

int main()
{  
    cout << fib(40);    
}

I use gcc version 4.8.2 on Fedora 20.
When I compile the above program with g++ -fopenmp name_of_program.cpp -Wall and run it, htop shows only two (sometimes three) threads running. The machine I'm running this program on has 8 logical CPUs. My question is: what do I need to do to spread the work across 8 threads? I tried export OMP_NESTED=TRUE, but this leads to the following error when running the program:
libgomp: Thread creation failed: Resource temporarily unavailable
The point of my program is not to compute Fibonacci numbers efficiently, but to use tasks or something similar in OpenMP.

Recommended answer

With OMP_NESTED=FALSE, a team of threads is assigned only to the top-level parallel region; each nested parallel region runs with a team consisting of just the thread that encountered it. The only tasks that different threads can pick up are therefore the two created at the top level, so at most two threads will be doing useful work.

With OMP_NESTED=TRUE, a team of threads is assigned at each level. On your system there are 8 logical CPUs, so the team size is likely 8. The team includes the thread that encountered the region, so only 7 new threads are launched per region. The recursion tree for fib(n) has about fib(n) nodes (a nice self-referential property of fib!). Thus the code might create about 7*fib(n) threads, which can quickly exhaust resources.

The fix is to use a single parallel region around the entire task tree. Move the omp parallel and omp single logic to main, outside of fib. That way a single thread team will work on the entire task tree.

The general point is to distinguish potential parallelism from actual parallelism. The task directives specify potential parallelism, which might or might not actually be used during an execution. An omp parallel (for all practical purposes) specifies actual parallelism. Usually you want the actual parallelism to match the available hardware, so as not to swamp the machine, but the potential parallelism to be much larger, so that the runtime can balance load.
