OpenMP in C array reduction / parallelize the code


Question

I have a problem with my code. It should print the number of appearances of a certain number.

I want to parallelize this code with OpenMP, and I tried to use a reduction on the array, but it obviously didn't work as I wanted.

The error is: "segmentation fault". Should some variables be private? Or is the problem in the way I'm trying to use the reduction?

I think each thread should count some part of the array, and then the partial counts should be merged somehow.

#pragma omp parallel for reduction (+: reasult[:i])
for (i = 0; i < M; i++) {
    for (j = 0; j < N; j++) {
        if (numbers[j] == i) {
            result[i]++;
        }
    }
}

Here N is a large number giving how many numbers I have, numbers is the array of all the numbers, and result is the array holding the count of each number.

Recommended answer

First, you have a typo in the name:

#pragma omp parallel for reduction (+: reasult[:i])

It should be "result", not "reasult".

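Nonetheless, why are you sectioning the array with result[:i]? The section length i is the loop variable, which holds no meaningful value at the point where the directive is evaluated, so this may well be what triggers the segmentation fault. Based on your code, it seems that you want to reduce the entire array, namely: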

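/* result must be an array of known size here; for a pointer, write result[:M]. */
/* j is declared outside the loop, so it has to be made private explicitly. */
#pragma omp parallel for reduction (+: result) private(j)
for (i = 0; i < M; i++)
    for (j = 0; j < N; j++)
        if (numbers[j] == i)
            result[i]++;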

When your compiler does not support the OpenMP 4.5 array reduction feature, you can alternatively implement the reduction explicitly (check this SO thread to see how).
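An explicit reduction could look roughly like the sketch below. It is not taken from the linked thread: the function name count_occurrences, its signature, and the -1 error return are hypothetical choices for illustration. It also swaps the nested search for direct indexing by value and splits the N numbers among the threads (rather than the M values), which matches the "count a part, then merge" idea from the question. Each thread fills a private counter array, and the private arrays are then added into result inside a critical section.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: counts how often each value in [0, m) occurs in
 * numbers[0..n). result must hold m elements and is zeroed here.
 * Returns 0 on success, -1 if a thread could not allocate its counters. */
int count_occurrences(const int *numbers, size_t n, long *result, int m)
{
    assert(numbers != NULL && result != NULL && m > 0);
    int failed = 0;

    for (int i = 0; i < m; i++)
        result[i] = 0;

    #pragma omp parallel
    {
        /* Each thread gets its own zero-initialized copy of the counters. */
        long *local = calloc((size_t)m, sizeof *local);

        if (local == NULL) {
            #pragma omp atomic write
            failed = 1;
        }

        /* The numbers are split among the threads; values outside [0, m)
         * are ignored. Every thread must reach this loop, even after a
         * failed allocation, so the NULL check sits inside the body. */
        #pragma omp for nowait
        for (long j = 0; j < (long)n; j++) {
            int v = numbers[j];
            if (local != NULL && v >= 0 && v < m)
                local[v]++;
        }

        /* Merge step: one thread at a time adds its counts to result. */
        if (local != NULL) {
            #pragma omp critical
            for (int i = 0; i < m; i++)
                result[i] += local[i];
            free(local);
        }
    }

    return failed ? -1 : 0;
}
```

Compiled without OpenMP support the pragmas are ignored and the function simply runs sequentially, which makes it easy to compare the two results.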

As @Hristo Iliev pointed out in the comments:

Provided that M * sizeof(result[0]) / #threads is a multiple of the cache line size, and even if it isn't when the value of M is large enough, there is absolutely no need to involve a reduction in the process. Unless the program is running on a NUMA system, that is.
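To make the arithmetic in that condition concrete, here is a small sketch. The helper name chunk_is_line_multiple is hypothetical, the 64-byte cache line in the usage note is an assumption (it varies by CPU), and the check only means something if the array starts on a cache line boundary and the iterations are split evenly and contiguously among the threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when every thread's share of the array,
 * m * elem_size / nthreads bytes, is a whole number of cache lines. */
bool chunk_is_line_multiple(size_t m, size_t elem_size,
                            size_t nthreads, size_t line_size)
{
    assert(nthreads > 0 && line_size > 0);

    size_t total = m * elem_size;          /* bytes in the whole array */
    if (total % nthreads != 0)
        return false;                      /* shares are not even equal */
    return (total / nthreads) % line_size == 0;
}
```

For example, 1024 four-byte counters split over 8 threads give 512 bytes per thread, which is exactly eight 64-byte lines, so no line is shared between two threads under the assumptions above.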

Assuming that the aforementioned conditions are met, if you analyze the code carefully you will see that the outermost loop iterations (i.e., the values of variable i) are assigned to the threads, and since i is used to index the result array, each thread updates a different position of the result array. Therefore, you can simplify your code to:

/* j is declared outside the loop, so it has to be made private explicitly. */
#pragma omp parallel for private(j)
for (i = 0; i < M; i++)
    for (j = 0; j < N; j++)
        if (numbers[j] == i)
            result[i]++;
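A self-contained version of this simplified loop might look like the following sketch. The function name count_by_value and its signature are hypothetical. Declaring i and j inside the for statements gives every thread its own copies automatically, which also answers the question of which variables need to be private, and the local accumulator count is an addition of this sketch that keeps each thread from writing to result more than once per value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: result[i] receives the number of elements of
 * numbers[0..n) that are equal to i, for every i in [0, m). */
void count_by_value(const int *numbers, size_t n, long *result, int m)
{
    assert(numbers != NULL && result != NULL);

    /* i, j and count are declared inside the loop, so they are private. */
    #pragma omp parallel for
    for (int i = 0; i < m; i++) {
        long count = 0;
        for (size_t j = 0; j < n; j++)
            if (numbers[j] == i)
                count++;
        result[i] = count;  /* only the thread that owns i writes result[i] */
    }
}
```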
