Reducing on an array in OpenMP

Question

I am trying to parallelize the following program, but I don't know how to reduce on an array. I know it is not possible to do so, but is there an alternative? Thanks. (I added a reduction on m, which is wrong, but I would like advice on how to do it.)

#include <iostream>
#include <stdio.h>
#include <time.h>
#include <omp.h>
using namespace std;

int A [] = {84, 30, 95, 94, 36, 73, 52, 23, 2, 13};
int S [10];
int n,m=0;
time_t start_time, end_time;

int main ()
{
start_time = time(NULL);
#pragma omp parallel for private (m)reduction(+:m)
for ( n=0 ; n<10 ; ++n )
{
    for (m=0; m<=n; ++m){
    S[n] += A[m];
    }
}
end_time = time(NULL);
cout << end_time-start_time;
}


Recommended answer

Yes, it is possible to do an array reduction with OpenMP. Fortran even has a construct for this. In C/C++, before OpenMP 4.5 added array sections to the reduction clause, you have to do it yourself. Here are two ways to do it.

The first method makes a private version of S for each thread, fills them in parallel, and then merges them into S in a critical section (see the code below). The second method makes an array with dimensions 10*nthreads, fills this array in parallel, and then merges it into S without using a critical section. The second method is much more complicated and can have cache issues, especially on multi-socket systems, if you are not careful. For more details, see "Fill histograms (array reduction) in parallel with OpenMP without using a critical section".

First method

int A [] = {84, 30, 95, 94, 36, 73, 52, 23, 2, 13};
int S [10] = {0};
#pragma omp parallel
{
    // Each thread accumulates into its own zero-initialized copy.
    int S_private[10] = {0};
    #pragma omp for
    for (int n=0 ; n<10 ; ++n ) {
        for (int m=0; m<=n; ++m){
            S_private[n] += A[m];
        }
    }
    // Merge the private copies into S, one thread at a time.
    #pragma omp critical
    {
        for(int n=0; n<10; ++n) {
            S[n] += S_private[n];
        }
    }
}

Second method

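// Requires #include <omp.h> for omp_get_num_threads() and omp_get_thread_num().
int A [] = {84, 30, 95, 94, 36, 73, 52, 23, 2, 13};
int S [10] = {0};
int *S_private;
#pragma omp parallel
{
    const int nthreads = omp_get_num_threads();
    const int ithread = omp_get_thread_num();

    // One thread allocates and zeroes a row of 10 ints per thread.
    // The implicit barrier at the end of single makes it ready for all threads.
    #pragma omp single
    {
        S_private = new int[10*nthreads];
        for(int i=0; i<(10*nthreads); i++) S_private[i] = 0;
    }
    // Each thread fills only its own row.
    #pragma omp for
    for (int n=0 ; n<10 ; ++n )
    {
        for (int m=0; m<=n; ++m){
            S_private[ithread*10+n] += A[m];
        }
    }
    // Merge: each S[i] is summed by exactly one thread,
    // so no critical section is needed.
    #pragma omp for
    for(int i=0; i<10; i++) {
        for(int t=0; t<nthreads; t++) {
            S[i] += S_private[10*t + i];
        }
    }
}
delete[] S_private;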
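
As a further option, assuming a compiler that implements OpenMP 4.5 or later, the reduction clause accepts an array section in C/C++, so the per-thread copies and the final merge are generated by the compiler instead of being written by hand. The sketch below reuses the loops from the two methods above; the function name prefix_sums_reduction is made up for illustration and is not part of the original answer.

```cpp
#include <vector>

// Sketch only: same computation as the two methods above, using an
// OpenMP 4.5 array-section reduction. Assumes the compiler supports
// OpenMP 4.5; if OpenMP is disabled, the pragma is ignored and the
// loop simply runs serially with the same result.
// prefix_sums_reduction is a hypothetical name chosen for this example.
std::vector<int> prefix_sums_reduction() {
    const int A[10] = {84, 30, 95, 94, 36, 73, 52, 23, 2, 13};
    int S[10] = {0};

    // Each thread gets a zero-initialized private copy of S[0..9];
    // the copies are added into the shared S when the loop ends.
    #pragma omp parallel for reduction(+:S[:10])
    for (int n = 0; n < 10; ++n) {
        for (int m = 0; m <= n; ++m) {
            S[n] += A[m];
        }
    }
    return std::vector<int>(S, S + 10);
}
```

Calling prefix_sums_reduction() should give the same S as the first and second methods. The trade-off is that the private copies are managed by the OpenMP runtime, so there is less control over how the merge is performed than in the hand-written versions.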
