OpenMP: check for nested parallelism
Question
Assume I have a method that multiplies two std::vector<double> element by element and sums the products:
double multiply(std::vector<double> const& a, std::vector<double> const& b){
    double tmp(0);
    /* here I could easily parallelize with */
    /* #pragma omp parallel for reduction(+:tmp) */
    for(unsigned int i=0;i<a.size();i++){
        tmp += a[i]*b[i];
    }
    return tmp;
}
If I enable the pragma in this function, a call to multiply(...) will run on all threads.
Now assume that somewhere else I want to do many vector multiplications:
void many_multiplication(std::vector<double>* a, std::vector<double>* b, unsigned int N){
    /* here I could easily parallelize with */
    /* #pragma omp parallel for */
    for(unsigned int i=0;i<N;i++){
        for(unsigned int j=0;j<N;j++){
            multiply(a[i],b[j]);
        }
    }
}
I could parallelize this the same way, but that would lead to unwanted nested parallelism.
How can I check whether multiply(...) is called from within a parallel region, so that its pragma is "turned off" in that case, and "turned on" when it is called from a non-parallel region?
Recommended answer
Nested parallelism is disabled by default, unless enabled specifically by setting OMP_NESTED to true or by calling omp_set_nested(1); (§2.3.2 of the OpenMP specification). Explicitly modifying the settings for nesting, as suggested by Avi Ginsburg, is a bad idea. Instead, you should use conditional parallel execution based on the level of nesting:
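#include <omp.h>
#include <vector>

double multiply(std::vector<double> const& a, std::vector<double> const& b){
    double tmp(0);
    // number of active parallel regions enclosing this call
    int active_level = omp_get_active_level();
    // go parallel only when not already inside an active parallel region
    #pragma omp parallel for reduction(+:tmp) if(active_level < 1)
    for(unsigned int i=0;i<a.size();i++){
        tmp += a[i]*b[i];
    }
    return tmp;
}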
omp_get_active_level() returns the number of active parallel regions that enclose the thread at the moment the call is made. It returns 0 if called from outside a parallel region or when all enclosing regions are inactive. Thanks to the if(active_level < 1) clause, the parallel region will only be activated, i.e. run in parallel, if it is not enclosed in an active region, regardless of the setting for nesting.
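To see the guard at work from an outer parallel loop, here is a minimal sketch. The names guarded_dot and all_pairs are hypothetical and not part of the original answer, and the fallback stub is an assumption that only serves to let the file build without OpenMP support (in which case everything runs serially).

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback stub (assumption): without OpenMP there is never an active region.
static int omp_get_active_level() { return 0; }
#endif

// Dot product that goes parallel only when no enclosing region is active.
double guarded_dot(std::vector<double> const& a, std::vector<double> const& b) {
    double sum = 0.0;
    int const active_level = omp_get_active_level();
    #pragma omp parallel for reduction(+:sum) if(active_level < 1)
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Outer loop parallelized over i; out[i * m + j] holds dot(a[i], b[j]).
// When this region is active, guarded_dot sees active_level >= 1 and
// its own region is executed by a single thread.
std::vector<double> all_pairs(std::vector<std::vector<double>> const& a,
                              std::vector<std::vector<double>> const& b) {
    std::size_t const n = a.size();
    std::size_t const m = b.size();
    std::vector<double> out(n * m, 0.0);
    #pragma omp parallel for
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            out[i * m + j] = guarded_dot(a[i], b[j]);
        }
    }
    return out;
}
```

Called directly from serial code, guarded_dot uses the whole thread team; called through all_pairs, the outer loop owns the threads and each inner call stays serial. If the outer region itself ends up inactive (for example with a single thread), the inner region is allowed to go parallel again.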
If your compiler does not support OpenMP 3.0 or higher (e.g. with any version of MS Visual C/C++ Compiler), then the omp_in_parallel() call can be used instead:
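#include <omp.h>
#include <vector>

double multiply(std::vector<double> const& a, std::vector<double> const& b){
    double tmp(0);
    // non-zero when called from inside an active parallel region
    int in_parallel = omp_in_parallel();
    // signed loop index: OpenMP 2.0 compilers do not accept an unsigned one
    #pragma omp parallel for reduction(+:tmp) if(in_parallel == 0)
    for(int i=0;i<static_cast<int>(a.size());i++){
        tmp += a[i]*b[i];
    }
    return tmp;
}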
omp_in_parallel() returns non-zero if at least one enclosing parallel region is active, but it does not provide information about the depth of nesting, i.e. it is a bit less flexible.
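The relation between the two routines can be checked with a small sketch like the one below, assuming an OpenMP 3.0 or newer compiler. NestingInfo, query_nesting and flags_agree_inside_region are hypothetical names introduced for illustration, and the stubs are an assumption that only lets the file build without OpenMP.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback stubs (assumption): serial build, never inside a parallel region.
static int omp_get_active_level() { return 0; }
static int omp_in_parallel() { return 0; }
#endif

// Snapshot of what the two runtime routines report at the call site.
struct NestingInfo {
    int active_level;   // depth of active enclosing parallel regions
    bool in_parallel;   // true when that depth is greater than zero
};

NestingInfo query_nesting() {
    NestingInfo info;
    info.active_level = omp_get_active_level();
    info.in_parallel = (omp_in_parallel() != 0);
    return info;
}

// Opens a parallel region and counts the threads on which the two
// routines disagree; returns true when there are none.
bool flags_agree_inside_region() {
    int mismatches = 0;
    #pragma omp parallel reduction(+:mismatches)
    {
        NestingInfo const info = query_nesting();
        if (info.in_parallel != (info.active_level > 0)) {
            ++mismatches;
        }
    }
    return mismatches == 0;
}
```

omp_in_parallel() carries the same information as the test omp_get_active_level() > 0, so the guard if(in_parallel == 0) is equivalent to if(active_level < 1). Only a guard that depends on a depth greater than one, such as if(active_level < 2), needs the OpenMP 3.0 routine.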
In any case, writing such code is bad practice. You should simply leave the parallel regions as they are and allow the end user to choose whether nested parallelism should be enabled or not.
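A minimal sketch of that division of labour, assuming an OpenMP 3.0 or newer runtime: the library function keeps an unconditional pragma, and the application decides how many nested levels may be active. plain_dot and limit_active_levels are hypothetical names, and the stubs are an assumption for builds without OpenMP.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback stubs (assumption): remember the requested limit, run serially.
static int g_max_active_levels = 1;
static void omp_set_max_active_levels(int n) { if (n >= 0) g_max_active_levels = n; }
static int omp_get_max_active_levels() { return g_max_active_levels; }
#endif

// Library side: the parallel region is left exactly as it is.
double plain_dot(std::vector<double> const& a, std::vector<double> const& b) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Application side: cap the number of nested active levels and
// return the limit the runtime actually reports afterwards.
int limit_active_levels(int levels) {
    omp_set_max_active_levels(levels);
    return omp_get_max_active_levels();
}
```

With the limit set to 1, a plain_dot call made from inside an active parallel region runs on a single thread without any guard in the library code. The same choice can be made without recompiling, through the OMP_NESTED or OMP_MAX_ACTIVE_LEVELS environment variables.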