Comparing Python with C/Fortran


Problem description


I wrote the following programs to compare the speed of Python with C and Fortran. To measure the run time I used the "time" command. All the programs compute the square root of x*x+y*y+z*z, where x, y, z are floats. I chose the square root because it is one of the most time-consuming operations in the scientific computing I am involved in.

I got the following times:

fortran  0m29.9s
c        0m20.7s
python  30m10.8s

Based on this simple test, I concluded that Python is not suitable for scientific computing. But my code is probably very inefficient.

Do you think I could make my code more efficient, just for this simple test case?

Fortran:

program root_square
implicit none

integer i,j
real x,y,z,r

x=1.0
y=2.0
z=3.0

do j=1,3000
    do i=1,1000000
        r=sqrt(x*x+y*y+z*z)
    enddo
enddo

end program root_square

C:

#include <stdio.h>
#include <math.h>

int main (void)
{

float x=1.0,y=2.0,z=3.0,r;
int i,j;

for(j=0; j<3000; j++){
        for(i=0; i<1000000; i++) {
                r=sqrt(x*x+y*y+z*z);
        }
}

return 0;
}

Python:

#!/usr/bin/env python

from math import sqrt

x = 1.0
y = 2.0
z = 3.0

for j in range(1,3001):
  for i in range(1,1000001):
    r = sqrt(x*x+y*y+z*z)

Solution

I recently ran a similar test with a more realistic, real-world algorithm. It involved numpy, Matlab, FORTRAN and C# (via ILNumerics). Without specific optimizations, numpy appears to produce much less efficient code than the others. Of course, as always, this can only suggest a general trend. You can certainly write FORTRAN code that ends up running slower than a corresponding numpy implementation, but most of the time numpy will be much slower. Here are the (averaged) results of my test:

When timing such simple floating point operations as in your example, it all comes down to the compiler's ability to generate 'optimal' machine instructions. How many compilation steps are involved is not so important here. .NET and numpy (through CPython) use more than one step: the code is first compiled to byte code, which then executes in a virtual machine. The options to optimize the result exist equally, in theory. In practice, modern FORTRAN and C compilers are better at optimizing for execution speed. For example, they utilize floating point extensions (SSE, AVX) and do better loop unrolling. numpy (or rather CPython, on which numpy mostly runs) seems to perform worse on this point. If you want to be sure which framework is best for your task, you can attach a debugger and inspect the final machine instructions of the executable.
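To look at the byte code step described above, the standard-library `dis` module can list the instructions CPython executes for the loop body. This is only a minimal sketch: the function name `root_square` is made up for illustration, and the exact opcode names differ between CPython versions.

```python
import dis
import timeit
from math import sqrt


def root_square(x, y, z):
    # Same expression as the loop body in the question.
    return sqrt(x * x + y * y + z * z)


# Each entry is one byte code instruction that the CPython virtual
# machine dispatches every time the function runs.
opnames = [ins.opname for ins in dis.get_instructions(root_square)]

if __name__ == "__main__":
    print(opnames)
    # Time 1,000,000 calls; the question's loops run 3,000 times as many iterations.
    seconds = timeit.timeit(lambda: root_square(1.0, 2.0, 3.0), number=1_000_000)
    print(f"{seconds:.3f} s for 1,000,000 calls")
```

Every one of those instructions is interpreted on each iteration, whereas a C or Fortran compiler can reduce the same expression to a handful of machine instructions, or hoist it out of the loop entirely because the inputs never change.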

However, keep in mind that in a more realistic scenario floating point performance only matters at the very end of a long optimization chain. The difference is often masked by a much stronger effect: memory bandwidth. As soon as you start handling arrays (which is common in most scientific applications), you have to take the cost of memory management into account. Frameworks differ in how well they support the algorithm author in writing memory-efficient algorithms. In my opinion, numpy makes it harder to write memory-efficient algorithms than FORTRAN or C. But it is not easy in any of those languages. (ILNumerics improves this considerably.)
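As a rough illustration of the memory point, here is a sketch of the same square-root expression applied to numpy arrays in two ways. The function names and the array size are assumptions chosen for the example; the `out=` argument of numpy ufuncs is the only library feature it relies on.

```python
import numpy as np


def norm_with_temporaries(x, y, z):
    # Intermediate results (x*x, y*y, the partial sums) can each need
    # their own full-size temporary array.
    return np.sqrt(x * x + y * y + z * z)


def norm_in_place(x, y, z, out, scratch):
    # Reuse two preallocated buffers instead of allocating temporaries.
    np.multiply(x, x, out=out)
    np.multiply(y, y, out=scratch)
    np.add(out, scratch, out=out)
    np.multiply(z, z, out=scratch)
    np.add(out, scratch, out=out)
    np.sqrt(out, out=out)
    return out


n = 100_000  # illustrative size only
x = np.full(n, 1.0)
y = np.full(n, 2.0)
z = np.full(n, 3.0)
out = np.empty(n)
scratch = np.empty(n)
result = norm_in_place(x, y, z, out, scratch)
```

The in-place variant gives the same values but is noticeably more verbose, which is one way to read the remark that memory-efficient code takes extra effort in numpy.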

Another important point is parallelization. Does the framework support you in executing your computations in parallel? And how efficiently does it do so? Again, my personal opinion: neither C nor FORTRAN nor numpy makes it easy to parallelize your algorithms. But FORTRAN and C at least give you the chance to do so, even if it sometimes requires special compilers. Other frameworks (ILNumerics, Matlab) parallelize automatically.
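For comparison, one explicit way to spread such a computation over several processes in Python is the standard-library `concurrent.futures` module. The sketch below is an assumption-laden illustration, not a tuned implementation: the helper names `sum_of_norms`, `split` and `parallel_sum_of_norms` are hypothetical, and the cost of sending the chunks to the worker processes can easily outweigh the gain for work this small.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def sum_of_norms(chunk):
    """Worker: sum sqrt(x*x + y*y + z*z) over one chunk of (x, y, z) tuples."""
    return sum(math.sqrt(x * x + y * y + z * z) for x, y, z in chunk)


def split(items, parts):
    """Split a list into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for k in range(parts):
        stop = start + size + (1 if k < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


def parallel_sum_of_norms(points, workers=4):
    """Distribute the chunks over separate processes and add up the partial sums."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_norms, split(points, workers)))


if __name__ == "__main__":
    points = [(1.0, 2.0, 3.0)] * 1_000_000
    print(parallel_sum_of_norms(points))
```

The chunking and the combination of partial results are written by hand here, which matches the point that parallelism in these languages is possible but not automatic.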

If you need 'peak performance' for very small but costly algorithms, you will mostly be better off using FORTRAN or C, simply because they generate better machine code in the end (on a uniprocessor system). However, writing larger algorithms in C or FORTRAN while taking memory efficiency and parallelism into account often gets cumbersome. Here, higher-level languages (like numpy, ILNumerics or Matlab) outdo lower-level languages. And if done right, the difference in execution speed is often negligible. Unfortunately, this is often not true for numpy.
