OpenMP: check if nested parallelism


Question

Assume I have a method that multiplies two std::vector objects:

double multiply(std::vector<double> const& a, std::vector<double> const& b){
    double tmp(0);
    /*here I could easily do a parallelization with*/
    /*#pragma omp parallel for reduction(+:tmp)*/
    for(unsigned int i=0;i<a.size();i++){
        tmp += a[i]*b[i];
    }
    return tmp;
}

If I put the pragma in this function, a call to multiply(...) will run on all threads.

Now assume that somewhere else I want to do many vector multiplications:

void many_multiplication(std::vector<double>* a, std::vector<double>* b, unsigned int N){
    /*here I could easily do a parallelization with*/
    /*#pragma omp parallel for*/
    for(unsigned int i=0;i<N;i++){
        for(unsigned int j=0;j<N;j++){
            multiply(a[i],b[j]);
        }
    }
}

I could parallelize this the same way, but that would lead to unwanted nested parallelism.

How can I check whether multiply(...) is called from within a parallel region, so that its pragma is "turned off" in that case and "turned on" when it is called from a non-parallel region?

Recommended answer

Nested parallelism is disabled by default unless it is enabled explicitly by setting OMP_NESTED to true or by calling omp_set_nested(1); (see §2.3.2 of the OpenMP specification). Explicitly modifying the nesting settings, as suggested by Avi Ginsburg, is a bad idea. Instead, use conditional parallel execution based on the level of nesting:

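#include <omp.h>
#include <vector>

double multiply(std::vector<double> const& a, std::vector<double> const& b){
    double tmp(0);
    // Number of active parallel regions enclosing this call (0 if none).
    int active_level = omp_get_active_level();
    // Parallelize only when not already inside an active parallel region.
    #pragma omp parallel for reduction(+:tmp) if(active_level < 1)
    for(unsigned int i=0;i<a.size();i++){
        tmp += a[i]*b[i];
    }
    return tmp;
}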

omp_get_active_level() returns the number of active parallel regions enclosing the thread at the moment of the call. It returns 0 if called from outside any parallel region or if all enclosing regions are inactive. Thanks to the if(active_level < 1) clause, the parallel region is activated, i.e. runs in parallel, only if it is not enclosed in an active region, regardless of the nesting setting.
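To make the effect of the if clause concrete, here is a minimal sketch that calls the conditionally parallel dot product from inside an outer parallel loop. It assumes an OpenMP 3.0 or later compiler and vectors of equal length. The names dot and all_pairs are hypothetical and not from the question, and the fallback stub exists only so the sketch still builds when OpenMP is not enabled.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback (assumption for this sketch): without OpenMP there are no regions.
static int omp_get_active_level() { return 0; }
#endif

// Dot product that opens a parallel region only when no enclosing one is active.
double dot(std::vector<double> const& a, std::vector<double> const& b) {
    double tmp = 0.0;
    int active_level = omp_get_active_level();
    long n = static_cast<long>(a.size());
    #pragma omp parallel for reduction(+:tmp) if(active_level < 1)
    for (long i = 0; i < n; i++) {
        tmp += a[i] * b[i];
    }
    return tmp;
}

// Hypothetical caller: all pairwise dot products, parallelized over the rows of a.
// Inside this region dot() sees active_level >= 1 (when the region is active)
// and therefore runs its own loop serially.
std::vector<double> all_pairs(std::vector<std::vector<double>> const& a,
                              std::vector<std::vector<double>> const& b) {
    std::vector<double> out(a.size() * b.size());
    long rows = static_cast<long>(a.size());
    #pragma omp parallel for
    for (long i = 0; i < rows; i++) {
        for (std::size_t j = 0; j < b.size(); j++) {
            out[i * b.size() + j] = dot(a[i], b[j]);
        }
    }
    return out;
}
```

Calling dot directly from serial code parallelizes the inner loop, while calling it through all_pairs leaves the parallelism at the outer level only. The numerical result is the same in both cases.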

If your compiler does not support OpenMP 3.0 or higher (e.g. the MS Visual C/C++ compiler, whose standard /openmp mode implements only OpenMP 2.0), then the omp_in_parallel() call can be used instead:

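#include <omp.h>
#include <vector>

double multiply(std::vector<double> const& a, std::vector<double> const& b){
    double tmp(0);
    // Non-zero if at least one enclosing parallel region is active.
    int in_parallel = omp_in_parallel();
    int n = static_cast<int>(a.size());
    // OpenMP 2.0 requires a signed integer loop index.
    #pragma omp parallel for reduction(+:tmp) if(in_parallel == 0)
    for(int i=0;i<n;i++){
        tmp += a[i]*b[i];
    }
    return tmp;
}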

omp_in_parallel() returns non-zero if at least one enclosing parallel region is active, but it does not provide information about the depth of nesting, so it is a bit less flexible.
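The difference between the two queries can be observed with a small probe. The sketch below assumes OpenMP 3.0 or later. NestingInfo, query_nesting and query_inside are hypothetical names introduced here, and the stubs only keep the code buildable without OpenMP. It records what both functions report outside any region, inside an inactive region (forced serial with if(false)), and inside a region that requests two threads.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Fallbacks (assumption for this sketch): no OpenMP means no parallel regions.
static int omp_in_parallel() { return 0; }
static int omp_get_active_level() { return 0; }
#endif

struct NestingInfo {
    bool in_parallel;   // yes/no answer from omp_in_parallel()
    int active_level;   // depth answer from omp_get_active_level()
};

// Snapshot of both queries at the point of the call.
NestingInfo query_nesting() {
    return { omp_in_parallel() != 0, omp_get_active_level() };
}

// Takes the snapshot from inside a parallel region. With make_active == false
// the region runs on a single thread and therefore counts as inactive.
NestingInfo query_inside(bool make_active) {
    NestingInfo info = { false, 0 };
    #pragma omp parallel num_threads(2) if(make_active)
    {
        #pragma omp single
        info = query_nesting();
    }
    return info;
}
```

omp_in_parallel() is true exactly when the active level is greater than zero, so it can answer "am I nested?" but not "how deep?". Whether the region in query_inside(true) is actually active depends on the runtime granting more than one thread.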

In any case, writing such code is bad practice. You should simply leave the parallel regions as they are and allow the end user to choose whether nested parallelism should be enabled or not.
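One way to follow this advice, sketched under the assumption of OpenMP 3.0 or later, is to keep the pragma unconditional and let the application or the environment decide how many nested levels may be active. dot_plain and configure_nesting are hypothetical names. omp_set_max_active_levels() and the OMP_MAX_ACTIVE_LEVELS environment variable are the standard controls that complement OMP_NESTED.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Unconditional parallel loop: no nesting checks inside the library function.
double dot_plain(std::vector<double> const& a, std::vector<double> const& b) {
    double tmp = 0.0;
    long n = static_cast<long>(a.size());
    #pragma omp parallel for reduction(+:tmp)
    for (long i = 0; i < n; i++) {
        tmp += a[i] * b[i];
    }
    return tmp;
}

// Hypothetical helper: the end user decides how many nested parallel levels
// may be active. Passing 1 keeps any inner regions serial.
void configure_nesting(int max_levels) {
#ifdef _OPENMP
    omp_set_max_active_levels(max_levels);
#else
    (void)max_levels;  // nothing to configure without OpenMP
#endif
}
```

The same choice can be made without recompiling by setting OMP_MAX_ACTIVE_LEVELS in the environment before the program starts.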
