How to do an ordered reduction in OpenMP

Question

OpenMP 4.5+ provides the capability to do vector/array reductions in C++ (press release).

This capability allows us to write, for example:

#include <vector>
#include <iostream>

int main(){
  std::vector<int> vec;

  #pragma omp declare reduction (merge : std::vector<int> : omp_out.insert(omp_out.end(), omp_in.begin(), omp_in.end()))

  #pragma omp parallel for default(none) schedule(static) reduction(merge: vec)
  for(int i=0;i<100;i++)
    vec.push_back(i);

  for(const auto x: vec)
    std::cout<<x<<"\n";

  return 0;
}

The problem is that when this code runs, the results from the different threads may be combined in any order.

Is there a way to enforce an order such that thread 0's results precede thread 1's, and so on?

Recommended answer

The order of a reduction is explicitly unspecified ("The location in the OpenMP program at which the values are combined and the order in which the values are combined are unspecified.", section 2.15.3.6 of OpenMP 4.5). Therefore you cannot use a reduction when the order matters.

One way would be to use ordered as follows:

std::vector<int> vec;
#pragma omp parallel for default(none) schedule(static) ordered shared(vec)
for(int i=0;i<100;i++) {
    // do some computations here
    #pragma omp ordered
    vec.push_back(i);
}

Note that vec is now shared, and that ordered serializes execution of the ordered region and forces synchronization among threads. This can be very bad for performance unless each of your computations takes a significant and uniform amount of time.
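To make the cost model concrete, here is a minimal sketch of the ordered approach wrapped in a function. The names `compute` and `collect_ordered` are hypothetical and not from the original post; `compute` merely stands in for whatever per-iteration work you have. The sketch assumes that work can run outside the ordered region, so that only the `push_back` is serialized. The `ordered` clause on the loop directive is required for a loop that contains an ordered region.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-iteration work; not from the original post.
static int compute(int i) { return i * i; }

// Returns compute(0), compute(1), ..., compute(n-1) in iteration order.
std::vector<int> collect_ordered(int n) {
    assert(n >= 0);
    std::vector<int> vec;
    #pragma omp parallel for default(none) schedule(static) ordered shared(vec) firstprivate(n)
    for (int i = 0; i < n; i++) {
        const int value = compute(i);  // may run concurrently across threads
        #pragma omp ordered
        vec.push_back(value);          // executed one iteration at a time, in loop order
    }
    return vec;
}
```

If compiled without OpenMP support, the pragmas are ignored and the loop simply runs serially, producing the same result. With a plain `schedule(static)`, a thread working on a later block of iterations has to wait at its first ordered region until the earlier blocks are done, so a small chunk size such as `schedule(static, 1)` may allow more overlap; whether that helps depends on the workload.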

You can also write a custom ordered reduction: split the parallel region from the for loop and manually insert each thread's local results in thread order.

// Requires #include <omp.h> and #include <vector>.
std::vector<int> global_vec;
#pragma omp parallel
{
    std::vector<int> local_vec;
    #pragma omp for schedule(static)
    for (int i = 0; i < 100; i++) {
        // some computations
        local_vec.push_back(i);
    }
    // Append the per-thread results one thread at a time, in thread order.
    for (int t = 0; t < omp_get_num_threads(); t++) {
        #pragma omp barrier
        if (t == omp_get_thread_num()) {
            global_vec.insert(global_vec.end(), local_vec.begin(), local_vec.end());
        }
    }
}
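A variation on the same idea, sketched here as an assumption rather than as the answer's own method, keeps one buffer per thread in a shared container and concatenates the buffers sequentially after the parallel region, which avoids running one barrier per thread. The names `collect_by_thread` and `parts` are hypothetical. Like the code above, it relies on `schedule(static)` without a chunk size handing thread t a contiguous block of iterations that precedes thread t+1's block, which is how common implementations behave.

```cpp
#include <cassert>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still builds without OpenMP support.
static int omp_get_thread_num() { return 0; }
static int omp_get_num_threads() { return 1; }
#endif

// Returns i*i for i in [0, n), gathered per thread and merged in thread order.
std::vector<int> collect_by_thread(int n) {
    assert(n >= 0);
    std::vector<std::vector<int>> parts;  // one buffer per thread
    #pragma omp parallel
    {
        // One thread sizes the container; the implicit barrier at the end
        // of single makes the resize visible before any thread indexes it.
        #pragma omp single
        parts.resize(omp_get_num_threads());

        std::vector<int>& local = parts[omp_get_thread_num()];
        #pragma omp for schedule(static)
        for (int i = 0; i < n; i++) {
            local.push_back(i * i);  // placeholder computation
        }
    }
    // Sequential merge: thread 0's block first, then thread 1's, and so on.
    std::vector<int> result;
    for (const auto& part : parts) {
        result.insert(result.end(), part.begin(), part.end());
    }
    return result;
}
```

The trade-off is extra memory for the per-thread buffers and a serial merge at the end, in exchange for no synchronization inside the loop.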
