Why does OpenMP fail to sum these numbers?
Problem description
Consider the following minimal C code example. When compiled and run with
export OMP_NUM_THREADS=4 && gcc -fopenmp minimal2.c && ./a.out
(Homebrew GCC 5.2.0 on OS X 10.11), it usually produces the correct behavior, i.e. seven lines with the same number. But sometimes this happens:
[ ] bsum=1.893293142303100e+03
[1] asum=1.893293142303100e+03
[2] asum=1.893293142303100e+03
[0] asum=1.893293142303100e+03
[3] asum=3.786586284606200e+03
[ ] bsum=1.893293142303100e+03
[ ] asum=3.786586284606200e+03
equal: 0
It looks like a race condition, but my code seems fine to me. What am I doing wrong?
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#ifdef _OPENMP
#include <omp.h>
#define ID omp_get_thread_num()
#else
#define ID 0
#endif

#define N 1400

double a[N];

double verify() {
    int i;
    double bsum = 0.0;
    for (i = 0; i < N; i++) {
        bsum += a[i] * a[i];
    }
    fprintf(stderr, "[ ] bsum=%.15e\n", bsum);
    return bsum;
}

int main(int argc, char *argv[]) {
    int i;
    double asum = 0.0, bsum;

    srand((unsigned int)time(NULL));
    //srand(1445167001); // fails on my machine
    for (i = 0; i < N; i++) {
        a[i] = 2 * (double)rand()/(double)RAND_MAX;
    }

    bsum = verify();

    #pragma omp parallel shared(asum)
    {
        #pragma omp for reduction(+: asum)
        for (i = 0; i < N; i++) {
            asum += a[i] * a[i];
        }
        fprintf(stderr, "[%d] asum=%.15e\n", ID, asum);
    }

    bsum = verify();
    fprintf(stderr, "[ ] asum=%.15e\n", asum);

    return 0;
}
Edit: Gilles pointed out that the errors beginning at the 15th significant digit are normal, as I overestimated the precision of a double. I also cannot reproduce the faulty behavior (a result of 2x the correct number) on a Debian machine, so this might be related to Homebrew GCC or to the Mac.
I asked about a similar issue here, but the two do not seem to be related (at least in my eyes), so I started this as a separate question.
Recommended answer
I strongly suspect that this is because floating-point addition is not associative. As a result, OpenMP may add up the products in a different order than the serial loop does, yielding slightly different results.
For example, a serial addition reduction may have a different pattern of addition associations than a parallel reduction. These different associations may change the results of floating-point addition.
See "OpenMP parallel for reduction delivers wrong results" for a suggested solution.