Why is the gcc math library so inefficient?

Posted: 2020/9/6 20:30:58 | Tags: c, performance, gcc, fortran, archlinux


Question

When I was porting some Fortran code to C, I was surprised that most of the execution-time discrepancy between the Fortran program compiled with ifort (the Intel Fortran compiler) and the C program compiled with gcc comes from evaluating trigonometric functions (sin, cos). It surprised me because I used to believe what this answer explains: that functions like sine and cosine are implemented in microcode inside microprocessors.

To isolate the problem more explicitly, I wrote a small test program in Fortran:

program ftest
  implicit none
  real(8) :: x
  integer :: i
  x = 0d0
  do i = 1, 10000000
    x = cos (2d0 * x)
  end do
  write (*,*) x
end program ftest

On an Intel Q6600 processor running 3.6.9-1-ARCH x86_64 Linux, I get the following with ifort version 12.1.0:

$ ifort -o ftest ftest.f90 
$ time ./ftest
  -0.211417093282753     

real    0m0.280s
user    0m0.273s
sys     0m0.003s

With gcc version 4.7.2 I get:

$ gfortran -o ftest ftest.f90 
$ time ./ftest
  0.16184945593939115     

real    0m2.148s
user    0m2.090s
sys     0m0.003s

This is almost a factor of 10 difference! Can I still believe that the gcc implementation of cos is a wrapper around the microprocessor implementation, in a similar way as is probably done in the Intel implementation? If so, where is the bottleneck?

Edit

According to the comments, enabling optimizations should improve performance. My opinion was that optimizations do not affect library functions, which does not mean that I don't use them in nontrivial programs. However, here are two additional benchmarks (now on my home computer, an Intel Core 2):

$ gfortran -o ftest ftest.f90
$ time ./ftest
  0.16184945593939115     

real    0m2.993s
user    0m2.986s
sys     0m0.000s

and

$ gfortran -Ofast -march=native -o ftest ftest.f90
$ time ./ftest
  0.16184945593939115     

real    0m2.967s
user    0m2.960s
sys     0m0.003s

Which particular optimizations did you (the commenters) have in mind? And how can the compiler exploit a multi-core processor in this particular example, where each iteration depends on the result of the previous one?

Edit 2

The benchmark tests of Daniel Fisher and Ilmari Karonen made me think that the problem might be related to the particular version of gcc (4.7.2), and maybe to the particular build of it (Arch x86_64 Linux), that I am using on my computers. So I repeated the test on an Intel Core i7 box with Debian x86_64 Linux, gcc version 4.4.5 and ifort version 12.1.0:

$ gfortran -O3 -o ftest ftest.f90
$ time ./ftest
  0.16184945593939115     

real    0m0.272s
user    0m0.268s
sys     0m0.004s

and

$ ifort -O3 -o ftest ftest.f90
$ time ./ftest
  -0.211417093282753     

real    0m0.178s
user    0m0.176s
sys     0m0.004s

For me this is a perfectly acceptable performance difference, one that would never have made me ask this question. It seems that I will have to ask about this issue on the Arch Linux forums.

However, an explanation of the whole story is still very welcome.
