比串行版本慢1个线程的OpenMP [英] OpenMP with 1 thread slower than sequential version

查看：117 发布时间：2016/8/19 14:37:41 c performance openmp

本文介绍了比串行版本慢1个线程的OpenMP的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我必须使用OpenMP实现的背包（gcc版本4.6.3）

I have implemented knapsack using OpenMP (gcc version 4.6.3)

#define MAX(x,y)   ((x)>(y) ? (x) : (y))
#define table(i,j)    table[(i)*(C+1)+(j)]

   for(i=1; i<=N; ++i) {
#pragma omp parallel for
      for(j=1; j<=C; ++j) {
         if(weights[i]>j) {
            table(i,j) = table(i-1,j);
         }else {
            table(i,j) = MAX(profits[i]+table(i-1,j-weights[i]), table(i-1,j));
         }
      }
   }

执行时间顺序程序= 1秒

execution time for the sequential program = 1s

执行时间为1线程的OpenMP = 1.7S（开销= 40％）

execution time for the openmp with 1 thread = 1.7s (overhead = 40%)

用于相同的编译器优化标志（-O3）在这两种情况下。

Used the same compiler optimization flags (-O3) in the both cases.

有人能解释这种行为背后的原因。

Can someone explain the reason behind this behavior.

感谢。

推荐答案

启用OpenMP的抑制某些编译器的优化，例如它可以被矢量化或不被保存在寄存器中的共享变量prevent循环。因此启用的OpenMP-code是通常比串行慢和一个具有以利用可用的并行来抵消这种

Enabling OpenMP inhibits certain compiler optimisations, e.g. it could prevent loops from being vectorised or shared variables from being kept in registers. Therefore OpenMP-enabled code is usually slower than the serial and one has to utilise the available parallelism to offset this.

话虽这么说，你的code包含嵌套在外环的并行区域内。这意味着进入和退出并行区域的开销相乘的 N 的次。这才有意义，如果的 N 的相对较小， C 的是显著较大（如规模较大的订单），比的 N 的，因此，目前正在做的工作里面的区域大大超过了OpenMP的开销。

That being said, your code contains a parallel region nested inside the outer loop. This means that the overhead of entering and exiting the parallel region is multiplied N times. This only makes sense if N is relatively small and C is significantly larger (like orders of magnitude larger) than N, therefore the work being done inside the region greatly outweighs the OpenMP overhead.

这篇关于比串行版本慢1个线程的OpenMP的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

比串行版本慢1个线程的OpenMP [英] OpenMP with 1 thread slower than sequential version

问题描述

推荐答案

相关文章

C/C++最新文章

热门教程

热门工具

登录关闭

比串行版本慢1个线程的OpenMP [英] OpenMP with 1 thread slower than sequential version

问题描述

推荐答案

相关文章

C/C++最新文章

热门教程

热门工具

登录 关闭

登录关闭