pragma omp for inside pragma omp master or single
Question
I'm sitting here with some code, trying to make orphaning work and to reduce the overhead by reducing the number of calls to #pragma omp parallel.

What I'm trying is something like:
#pragma omp parallel default(none) shared(mat,mat2,f,max_iter,tol,N,conv) private(diff,k)
{
    #pragma omp master // I'm not against using #pragma omp single or whatever will work
    {
        while (diff > tol) {
            do_work(mat, mat2, f, N);
            swap(mat, mat2);
            if (!(k % 100)) // Only test the stop criterion every 100 iterations
                diff = conv[k] = do_more_work(mat, mat2);
            k++;
        } // end while
    } // end master
} // end parallel
do_work depends on the previous iteration, so the while loop has to run sequentially. But I would like to be able to run do_work in parallel, so it would look something like:
void do_work(double *mat, double *mat2, double *f, int N)
{
    int i, j;
    double scale = 1/4.0;
    #pragma omp for schedule(runtime) // Just so I can test different settings without having to recompile
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            mat[i*N+j] = scale*(mat2[(i+1)*N+j] + mat2[(i-1)*N+j] + ... + f[i*N+j]);
}
I hope this can be accomplished some way, I'm just not sure how. So any help I can get is greatly appreciated (also if you're telling me this isn't possible). Btw, I'm working with OpenMP 3.0, the GCC compiler, and the Sun Studio compiler.
Answer
The outer parallel region in your original code contains only a serial piece (#pragma omp master), which makes no sense and effectively results in purely serial execution (no parallelism). Since do_work() depends on the previous iteration, but you want to run it in parallel, you must use synchronisation. The OpenMP tool for that is an (explicit or implicit) synchronisation barrier.
For example (code similar to yours):
#pragma omp parallel
for (int j = 0; diff > tol; ++j)   // must be the same condition for each thread!
    #pragma omp for                // note: implicit synchronisation after for loop
    for (int i = 0; i < N; ++i)
        work(j, i);
Note that the implicit synchronisation ensures that no thread enters the next j if any thread is still working on the current j.
The alternative
for (int j = 0; diff > tol; ++j)
    #pragma omp parallel for
    for (int i = 0; i < N; ++i)
        work(j, i);
should be less efficient, as it creates a new team of threads at each iteration, instead of merely synchronising.