OpenMP implementation of reduction


Question

I need to implement a reduction operation in which each thread stores its value in a different array entry. However, it runs slower as the number of threads grows. Any suggestions?

double local_sum[16];
// Initializations...
#pragma omp parallel for shared(h,n,a) private(x, thread_id) 
for (i = 1; i < n; i++) {
    thread_id = omp_get_thread_num();
    x = a  + i* h;
    local_sum[thread_id] += f(x);
}

Recommended answer

You are experiencing the effects of false sharing. On x86 a single cache line is 64 bytes long and therefore holds 64 / sizeof(double) = 8 array elements. When one thread updates its element, the core that it runs on uses the cache coherency protocol to invalidate the same cache line in all other cores. When another thread then updates its own element, instead of operating directly on its cache, its core has to reload the cache line from an upper-level data cache or from main memory. This significantly slows down program execution.
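The arithmetic above can be sketched as a small helper. This is only an illustration: `cache_line_of` is a hypothetical name, and it assumes 64-byte cache lines, an 8-byte `double`, and an array that starts on a cache-line boundary.

```c
#include <stddef.h>

/* Assumption: 64-byte cache lines, as stated for x86 in the answer. */
#define CACHE_LINE_BYTES 64

/* Hypothetical helper: which cache line (counted from the start of the
 * array) holds element `index` of a double array, assuming the array
 * begins exactly on a cache-line boundary. */
static size_t cache_line_of(size_t index)
{
    return (index * sizeof(double)) / CACHE_LINE_BYTES;
}
```

Under these assumptions, `local_sum[0]` through `local_sum[7]` all map to line 0, so eight threads updating neighbouring entries keep invalidating one shared line.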

The simplest solution is to insert padding and thus spread the array elements accessed by different threads into distinct cache lines. On x86 that means 7 unused double elements after each used one, i.e. a stride of 8 elements per thread. Therefore your code should look like:

double local_sum[8*16];
//Initializations....
#pragma omp parallel for shared(h,n,a) private(x, thread_id) 
for (i = 1; i < n; i++) {
    thread_id = omp_get_thread_num();
    x = a + i * h;
    local_sum[8 * thread_id] += f(x);
}

Don't forget to take only every 8th element when summing the array at the end (or initialise all array elements to zero, so that the padding entries contribute nothing to the sum).
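Putting the padded loop and the final summation together might look like the sketch below. Several things here are assumptions rather than part of the original code: the function name `padded_sum`, the integrand `f(x) = x * x` (the question does not show `f`), the `PAD` and `MAX_THREADS` macros, and the `num_threads` clause, which is added so that no thread id can exceed the 16 slots. The `_OPENMP` guards only let the file compile serially when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 16  /* matches the 16 slots used in the question */
#define PAD 8           /* 64-byte line / 8-byte double: an x86 assumption */

/* Hypothetical integrand; the question does not show f. */
static double f(double x) { return x * x; }

/* Sums f(a + i*h) for i = 1 .. n-1, keeping one padded slot per thread. */
static double padded_sum(double a, double h, int n)
{
    /* Zero-initialise everything, including the padding entries. */
    double local_sum[PAD * MAX_THREADS] = {0.0};

    /* num_threads caps the team size, so thread ids stay below MAX_THREADS. */
    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int thread_id = 0;  /* serial fallback when OpenMP is disabled */
#ifdef _OPENMP
        thread_id = omp_get_thread_num();
#endif
        #pragma omp for
        for (int i = 1; i < n; i++) {
            double x = a + i * h;
            /* Slots are 64 bytes apart, so no two threads share a line. */
            local_sum[PAD * thread_id] += f(x);
        }
    }

    /* Combine the partial sums: only every PAD-th element is used. */
    double total = 0.0;
    for (int t = 0; t < MAX_THREADS; t++)
        total += local_sum[PAD * t];
    return total;
}
```

For example, `padded_sum(0.0, 0.001, 1000)` adds up the squares of 0.001, 0.002, ..., 0.999. Because the slots are spaced a full cache line apart, they land on distinct lines even if the array itself does not start on a 64-byte boundary.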
