When is the reduction needed?


Question

I've written this code, which reads a matrix and sums its values. While trying the pragma in different ways, I found that reduction (+:sum) doesn't seem to be necessary, but I don't know why; I may have missed the actual purpose of the reduction clause in this case. This would be the alternative: #pragma omp parallel for private(i, j) reduction (+:sum)

This would be the code:

#include <stdio.h>
#include <math.h>
#include <omp.h>
#include <unistd.h>


int main ()
{

    printf("===MATRIX SUM===\n");
    printf("N ROWS: ");
    int i1; scanf("%d",&i1);
    printf("M COLUMNS: ");
    int j1; scanf("%d",&j1);
    int matrixA[i1][j1];

    int i, j;

    for(i = 0; i < i1; i++){
        for (j = 0; j < j1; j++){
            scanf("%d",&matrixA[i][j]);
        }
    }

    printf("\nMATRIX A: \n");
    for (i = 0; i < i1; i++){
        for (j = 0; j < j1; j++){
            printf("%d ", matrixA[i][j]);
        }
        printf("\n");
    }
    int sum = 0;
    #pragma omp parallel for private(i, j)
        for (i = 0; i < i1; i++)
            for (j = 0; j < j1; j++){
                sum += matrixA[i][j];
           }


    printf("\nTHE RESULT IS: %d", sum);

    return 0;
}

I would also like to ask whether there is a better solution than the reduction clause, since I read that it is the most efficient way.

Recommended answer

The code you posted is not correct without the reduction clause. The statement

sum += matrixA[i][j];

will cause a classic race condition when executed by multiple threads in parallel. sum is a shared variable, but sum += ... is not an atomic operation: each thread reads sum, adds a matrix element, and writes the result back, and these steps can interleave between threads so that an update is lost.

(sum is initially 0, all matrix elements are 1)
Thread 1                      | Thread 2
------------------------------------------------------------
tmp = sum + matrixA[0][0] = 1 |
                              | tmp = sum + matrixA[1][0] = 1
sum = tmp = 1                 |
                              | sum = tmp = 1 (instead of 2)

The reduction fixes exactly this. With the reduction clause, each thread works on its own implicit thread-local copy of the sum variable inside the loop. At the end of the region, the original sum variable is set to the sum of all thread-local copies, combined in a correct way without race conditions.
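As a rough sketch of what this means in practice, the two C functions below compute the same sum. The first uses the reduction clause from the question; the second spells out by hand approximately what the reduction does, namely a private partial sum per thread that is combined into the shared variable once at the end. The function names and the flat row-major array parameter are assumptions made for this illustration and are not part of the original code, and a real OpenMP implementation may combine the partial results differently.

```c
#include <assert.h>

/* Hypothetical helper: sums a rows x cols matrix stored row-major in m,
 * using the reduction clause from the question. */
int sum_with_reduction(int rows, int cols, const int *m)
{
    assert(rows >= 0 && cols >= 0);
    int sum = 0;
    int i, j;

    /* Each thread accumulates into its own private copy of sum;
     * the copies are added to the original sum when the loop ends. */
    #pragma omp parallel for private(i, j) reduction(+:sum)
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            sum += m[i * cols + j];

    return sum;
}

/* Hypothetical helper: a hand-written approximation of the same idea. */
int sum_with_manual_partials(int rows, int cols, const int *m)
{
    assert(rows >= 0 && cols >= 0);
    int sum = 0;

    #pragma omp parallel
    {
        int local = 0;   /* declared inside the region, so private per thread */
        int i, j;

        #pragma omp for
        for (i = 0; i < rows; i++)
            for (j = 0; j < cols; j++)
                local += m[i * cols + j];

        /* One synchronized update per thread instead of one per element. */
        #pragma omp atomic
        sum += local;
    }

    return sum;
}
```

Compiled with OpenMP enabled (for example with -fopenmp in GCC), both functions should return the same result regardless of the number of threads; for a 2 x 3 matrix holding the values 1 to 6, that result is 21.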

Another solution would be to mark sum += ... as an atomic operation or to put it in a critical section. That, however, carries a significant performance penalty, because every single addition then has to be synchronized between the threads.
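For comparison, here is a minimal sketch of those two alternatives, using the standard omp atomic and omp critical directives. As before, the function names and the flat row-major array parameter are assumptions introduced for illustration. Both variants give the correct result, but they synchronize on every element, which is why they are usually much slower than the reduction.

```c
#include <assert.h>

/* Hypothetical helper: protects each update with an atomic operation. */
int sum_with_atomic(int rows, int cols, const int *m)
{
    assert(rows >= 0 && cols >= 0);
    int sum = 0;
    int i, j;

    #pragma omp parallel for private(i, j)
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++) {
            /* The read-modify-write of sum is performed atomically. */
            #pragma omp atomic
            sum += m[i * cols + j];
        }

    return sum;
}

/* Hypothetical helper: protects each update with a critical section. */
int sum_with_critical(int rows, int cols, const int *m)
{
    assert(rows >= 0 && cols >= 0);
    int sum = 0;
    int i, j;

    #pragma omp parallel for private(i, j)
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++) {
            /* Only one thread at a time may execute this block. */
            #pragma omp critical
            {
                sum += m[i * cols + j];
            }
        }

    return sum;
}
```

For a plain sum like this one, the reduction clause remains the simpler and generally faster choice; atomic or critical tends to make sense only when the shared update cannot be expressed as a reduction.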
