Why does the following OpenMP program fail to reduce my variable?


Question

Consider the following minimal C code example. When compiled and run with export OMP_NUM_THREADS=4 && gcc -fopenmp minimal.c && ./a.out (GCC 4.9.2 on Debian 8), it prints five lines with rho=100 (sometimes 200 or 400) on my machine. The expected output is of course rho=400 on all five lines.

The program is more likely to produce the correct result if I insert more code at // MARKER or place a barrier there. But even with that extra barrier it sometimes fails, and so does my real program. So the problem seems to be that a is not properly initialized when entering the reduction loop.

The OpenMP 4.0.0 manual even states on page 55 that there is an implicit barrier at the end of a loop construct unless a nowait clause is specified. So a should be fully set up at this point. What is going wrong here? Am I missing something?

#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#define ID omp_get_thread_num()
#else
#define ID 0
#endif

double a[100];

int main(int argc, char *argv[]) {
    int i;
    double rho;
    #pragma omp parallel
    {
        #pragma omp for
        for (i = 0; i < 100; i++) {
            a[i] = 2;
        }
        // MARKER
        rho = 0.0;
        #pragma omp for reduction(+: rho)
        for (i = 0; i < 100; i++) {
            rho += ((a[i])*(a[i]));
        }
        fprintf(stderr, "[%d] rho=%f\n", ID, rho);
    }
    fprintf(stderr, "[%d] rho=%f\n", ID, rho);
    return 0;
}

Recommended answer

OK, I've got the answer, but I sweated to get it...

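This is a race condition: rho is shared, and you initialise it inside the parallel region with rho = 0.0; Every thread executes that assignment, and nothing synchronizes it with the reduction loop that follows. A thread that arrives late can therefore reset rho to zero after another thread has already added its partial sum to it, which is why the result is sometimes 100 or 200 instead of 400. The array a is not the problem, and a barrier at // MARKER does not help because it comes before the assignment.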

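Either initialising rho outside the parallel region, or putting the initialisation in a #pragma omp single right before the reduction loop, will fix the code. The single construct ends with an implicit barrier, so rho is set exactly once before any thread starts the reduction.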
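The two fixes can be sketched as follows. This is a minimal sketch rather than the answerer's own code: the function names rho_init_outside and rho_init_single and the constant N are hypothetical, introduced only so that both variants fit in one listing. The loops themselves are taken from the question.

```c
#define N 100   /* hypothetical constant; the question hard-codes 100 */

static double a[N];

/* Variant 1: initialise rho before the parallel region starts. */
double rho_init_outside(void) {
    int i;
    double rho = 0.0;               /* written once, by the initial thread only */
    #pragma omp parallel
    {
        #pragma omp for
        for (i = 0; i < N; i++) {
            a[i] = 2;
        }
        /* implicit barrier at the end of the loop above: a[] is complete */
        #pragma omp for reduction(+: rho)
        for (i = 0; i < N; i++) {
            rho += a[i] * a[i];
        }
    }
    return rho;
}

/* Variant 2: keep the initialisation inside the region, but let exactly
 * one thread perform it. */
double rho_init_single(void) {
    int i;
    double rho;
    #pragma omp parallel
    {
        #pragma omp for
        for (i = 0; i < N; i++) {
            a[i] = 2;
        }
        #pragma omp single
        rho = 0.0;
        /* implicit barrier at the end of single: every thread waits here
         * until rho has been zeroed, so no late write can follow */
        #pragma omp for reduction(+: rho)
        for (i = 0; i < N; i++) {
            rho += a[i] * a[i];
        }
    }
    return rho;
}
```

Called from a main function and compiled with gcc -fopenmp, both variants should return 400.0 for any thread count, since each of the 100 elements contributes exactly 4.0. Variant 1 is the simpler choice when the reduction runs once; variant 2 may be more convenient when rho has to be reset repeatedly inside a long-lived parallel region.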
