All my codes are running much slower when I use OpenMP


Question

I have already seen several posts on this site about this issue. However, my serious codes, where the overhead of creating threads should not be a big issue, have become much slower with OpenMP. I am using a quad-core machine with gfortran 4.6.3 as my compiler. Below is an example test code.

Program test
use omp_lib
integer*8 i,j,k,l
!$omp parallel 
!$omp do
do i = 1,20000
  do j = 1, 1000
   do k = 1, 1000
       l = i
   enddo
  enddo
enddo
!$omp end do nowait
!$omp end parallel
End program test

This code takes around 80 seconds without OpenMP but around 150 seconds with OpenMP. I see the same issue with my other serious codes, whose serial runtime is around 5 minutes. In those codes I take care that there are no dependencies between threads. Why do these codes become slower instead of faster?

Thanks in advance.

Solution

You have a race condition: several threads write to the same shared variable l. The program is therefore invalid; l should be private. The race also causes a slowdown, because each write invalidates the copy of that cache line held by the other cores, so the threads have to reload it all the time. A similar thing happens when several threads write to different variables that lie in the same cache line; this is known as false sharing.
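As a minimal sketch of that fix, the test program from the question could be written with l declared private. The program name test_private is made up for illustration, and the loop bounds are the ones from the question. In Fortran, j and k are already private as sequential loop variables inside the parallel construct; they are listed only to make that explicit.

```fortran
program test_private
  use omp_lib
  implicit none
  integer(kind=8) :: i, j, k, l

  ! Each thread gets its own copy of l, so the threads no longer
  ! write to one shared memory location.
  ! i is private automatically as the loop variable of the parallel do.
  !$omp parallel do private(j, k, l)
  do i = 1, 20000
    do j = 1, 1000
      do k = 1, 1000
        l = i
      end do
    end do
  end do
  !$omp end parallel do
end program test_private
```

The private copy of l is undefined after the loop, so this version should not read l afterwards.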

You are also probably not using any compiler optimizations. Enable them with -O2, -O3, -O5 or -Ofast. You will then see that the program takes 0 seconds, because the compiler optimizes everything out.
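To get a timing that means something with optimizations on, the loop has to produce a result that the program actually uses. The following is only a sketch under assumptions: the program name test_timed, the iteration count n, and the mod-based workload are all hypothetical and not taken from the question. It accumulates a sum with a reduction clause and measures wall-clock time with omp_get_wtime.

```fortran
! Possible build line (assumed file name): gfortran -O2 -fopenmp test_timed.f90
program test_timed
  use omp_lib
  implicit none
  integer(kind=8), parameter :: n = 200000000_8  ! hypothetical problem size
  integer(kind=8) :: i, total
  double precision :: t0, t1

  total = 0
  t0 = omp_get_wtime()

  ! reduction(+:total) gives every thread a private partial sum and
  ! combines the partial sums once at the end, so there is no race on total.
  !$omp parallel do reduction(+:total)
  do i = 1, n
    total = total + mod(i, 7_8)
  end do
  !$omp end parallel do

  t1 = omp_get_wtime()

  ! Printing total keeps the compiler from discarding the loop as dead code.
  print *, 'total   =', total
  print *, 'threads =', omp_get_max_threads()
  print *, 'seconds =', t1 - t0
end program test_timed
```

Running it with OMP_NUM_THREADS set to 1 and then to 4 gives a serial and a parallel time to compare on a quad-core machine.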
