Why does the following OpenMP program fail to reduce my variable?
Question
Consider the following minimal C example. When I compile and run it with export OMP_NUM_THREADS=4 && gcc -fopenmp minimal.c && ./a.out (GCC 4.9.2 on Debian 8), it prints five lines with rho=100 (sometimes 200 or 400) on my machine. The expected output is of course rho=400 on all five lines.
The program is more likely to produce the correct result if I insert more code at // MARKER or place a barrier there. But even with that extra barrier it sometimes fails, and so does my actual program. So the problem seems to be that a is not properly initialized when execution enters the reduction loop.
The OpenMP 4.0.0 manual even states on page 55 that there is an implicit barrier at the end of a loop construct unless a nowait clause is specified, so a should be fully set up at this point. What is going wrong here? Am I missing something?
    #include <stdio.h>
    #ifdef _OPENMP
    #include <omp.h>
    #define ID omp_get_thread_num()
    #else
    #define ID 0
    #endif

    double a[100];

    int main(int argc, char *argv[]) {
        int i;
        double rho;

        #pragma omp parallel
        {
            #pragma omp for
            for (i = 0; i < 100; i++) {
                a[i] = 2;
            }
            // MARKER
            rho = 0.0;
            #pragma omp for reduction(+: rho)
            for (i = 0; i < 100; i++) {
                rho += ((a[i])*(a[i]));
            }
            fprintf(stderr, "[%d] rho=%f\n", ID, rho);
        }
        fprintf(stderr, "[%d] rho=%f\n", ID, rho);
        return 0;
    }
Recommended answer
OK, I've got the answer, but I had to sweat to get it...
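This is a race condition: rho is shared, and you initialise it inside the parallel region, where every thread executes rho = 0.0; on its own. Nothing orders one thread's assignment against another thread's reduction, so a thread that arrives late can reset rho after faster threads have already added their partial sums to it. A barrier at // MARKER does not help, because it sits before the assignment rather than after it.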
Either initialising rho outside the parallel region, or putting a #pragma omp single right before the assignment, will fix the code...
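As a minimal sketch of the second fix, the body of the question's program could be rearranged roughly as follows. The function name sum_of_squares and the macro N are introduced here purely for illustration and are not part of the original code; the loop counter is declared inside the parallel block so that each thread has its own.

```c
#include <stdio.h>

#define N 100   /* illustrative name for the array length used in the question */

static double a[N];

/* Hypothetical helper: fills a[] with 2 and returns the sum of squares. */
double sum_of_squares(void)
{
    double rho;   /* shared by all threads of the parallel region */

    #pragma omp parallel
    {
        int i;    /* declared inside the region, so private to each thread */

        #pragma omp for
        for (i = 0; i < N; i++) {
            a[i] = 2;
        }

        /* Exactly one thread zeroes rho. The implicit barrier at the end
           of the single construct holds the other threads back until the
           assignment is done, so no thread can reset rho later on. */
        #pragma omp single
        rho = 0.0;

        #pragma omp for reduction(+: rho)
        for (i = 0; i < N; i++) {
            rho += a[i] * a[i];
        }
        /* Implicit barrier at the end of the loop: the reduction is
           complete and every thread now sees the full sum in rho. */
    }
    return rho;
}
```

Calling sum_of_squares() from main and printing the result with the question's fprintf line should then give rho=400 however the threads are scheduled. The first fix is simpler still: move rho = 0.0; above #pragma omp parallel and drop the single construct.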