与 Project Euler 的速度比较:C vs Python vs Erlang vs Haskell [英] Speed comparison with Project Euler: C vs Python vs Erlang vs Haskell

查看:21
本文介绍了与 Project Euler 的速度比较:C vs Python vs Erlang vs Haskell的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我从 问题 #12http://projecteuler.net/" rel="noreferrer">Project Euler 作为编程练习并比较我在 C、Python、Erlang 和 Haskell 中的(肯定不是最佳的)实现.为了获得更高的执行时间,我搜索第一个除数超过 1000 个的三角形数,而不是原始问题中所述的 500 个.

结果如下:

C:

lorenzo@enzo:~/erlang$ gcc -lm -o euler12.bin euler12.clorenzo@enzo:~/erlang$ 时间 ./euler12.bin842161320真正的 0m11.074s用户 0m11.070s系统 0m0.000s

Python:

lorenzo@enzo:~/erlang$ time ./euler12.py842161320真正的 1m16.632s用户 1m16.370s系统 0m0.250s

Python 与 PyPy:

lorenzo@enzo:~/Downloads/pypy-c-jit-43780-b590cf6de419-linux64/bin$ time ./pypy/home/lorenzo/erlang/euler12.py842161320真正的 0m13.082s用户 0m13.050s系统 0m0.020s

Erlang:

lorenzo@enzo:~/erlang$ erlc euler12.erllorenzo@enzo:~/erlang$ time erl -s euler12 解决Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]Eshell V5.7.4(使用 ^G 中止)1>842161320真正的 0m48.259s用户 0m48.070s系统 0m0.020s

Haskell:

lorenzo@enzo:~/erlang$ ghc euler12.hs -o euler12.hsx[1 of 1] 编译 Main ( euler12.hs, euler12.o )链接euler12.hsx ...lorenzo@enzo:~/erlang$ 时间 ./euler12.hsx842161320真正的 2m37.326s用户 2m37.240s系统 0m0.080s

总结:

  • C:100%
  • Python:692%(使用 PyPy 为 118%)
  • Erlang:436%(135% 感谢 RichardC)
  • 哈斯克尔:1421%

我认为 C 有很大的优势,因为它使用 long 进行计算,而不是像其他三个一样使用任意长度的整数.它也不需要先加载运行时(其他的呢?).

问题 1:Erlang、Python 和 Haskell 是否会因为使用任意长度的整数而降低速度,或者只要这些值小于 MAXINT 就不会?

问题 2:为什么 Haskell 这么慢?是否有关闭刹车的编译器标志还是我的实现?(后者很有可能,因为 Haskell 对我来说是一本有七个印章的书.)

问题 3:您能否为我提供一些提示,如何在不改变我确定因素的方式的情况下优化这些实现?以任何方式优化:更好、更快、更本机"的语言.

问题 4:我的功能实现是否允许 LCO(最后一次调用优化,也称为尾递归消除),从而避免在调用堆栈中添加不必要的帧?

我真的尝试在四种语言中实现尽可能相似的相同算法,尽管我不得不承认我的 Haskell 和 Erlang 知识非常有限.

<小时>

使用的源代码:

#include #include <math.h>int factorCount (long n){双平方 = sqrt(n);int isquare = (int) square;int count = isquare == square ?-1:0;长候选人;for (candidate = 1;候选人<= isquare;候选人++)如果 (0 == n % 候选) 计数 += 2;返回计数;}int主(){长三角形 = 1;整数索引 = 1;while (factorCount (triangle) <1001){指数++;三角形 += 索引;}printf ("%ld
", 三角形);}

<小时>

#!/usr/bin/env python3.2导入数学def factorCount (n):平方 = math.sqrt (n)isquare = int(正方形)计数 = -1 如果 isquare == square else 0对于范围 (1, isquare + 1) 中的候选人:如果不是 n % 候选人:count += 2返回计数三角形 = 1指数 = 1而factorCount(三角形)<1001:指数 += 1三角形 += 索引打印(三角形)

<小时>

-module (euler12).-编译(export_all).factorCount (Number) ->factorCount (Number, math:sqrt (Number), 1, 0).factorCount (_, Sqrt, Candidate, Count) 当 Candidate >平方->数数;factorCount (_, Sqrt, Candidate, Count) 当 Candidate == Sqrt ->计数 + 1;factorCount (Number, Sqrt, Candidate, Count) ->案件编号 rem 候选人0 ->factorCount (Number, Sqrt, Candidate + 1, Count + 2);_ ->factorCount (Number, Sqrt, Candidate + 1, Count)结尾.nextTriangle(索引,三角形)->Count = factorCount(三角形),如果计数>1000 ->三角形;真的 ->nextTriangle(索引 + 1,三角形 + 索引 + 1)结尾.解决 () ->io:format ("~p~n", [nextTriangle (1, 1) ] ),停止 (0).

<小时>

factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare)其中 square = sqrt $ fromIntegral numberisquare = 地板正方形factorCount' number sqrt 候选计数|fromIntegral 候选 >sqrt = 计数|number `mod`Candidate == 0 = factorCount' number sqrt (candidate + 1) (count + 2)|否则 = factorCount' number sqrt (candidate + 1) countnextTriangle 索引三角形|factorCount 三角形 >1000 = 三角形|否则 = nextTriangle (index + 1) (triangle + index + 1)main = 打印 $ nextTriangle 1 1

解决方案

Using GHC 7.0.3, gcc 4.4.6, Linux 2.6.29 在 x86_64 Core2 Duo (2.5GHz) 机器上,使用 ghc -O2 -fllvm -fforce-recomp for Haskell 和 gcc -O3 -lm for C 编译.

  • 您的 C 程序运行时间为 8.4 秒(比您的运行速度快,可能是因为 -O3)
  • Haskell 解决方案在 36 秒内运行(由于 -O2 标志)
  • 您的 factorCount' 代码未明确输入并默认为 Integer(感谢 Daniel 在这里纠正了我的误诊!).使用 Int 给出显式类型签名(无论如何这是标准做法),时间更改为 11.1 秒
  • factorCount' 中,您不必要地调用了fromIntegral.修复不会产生任何变化(编译器很聪明,你很幸运).
  • 您使用了 mod,其中 rem 更快且足够.这会将时间更改为 8.5 秒.
  • factorCount' 不断地应用两个永远不会改变的额外参数(numbersqrt).工作器/包装器转换为我们提供:

 $ time ./so842161320真正的 0m7.954s用户 0m7.944s系统 0m0.004s

没错,7.95 秒.始终比 C 解决方案快半秒.如果没有 -fllvm 标志,我仍然得到 8.182 秒,所以 NCG 后端在这种情况下也做得很好.

结论:Haskell 很棒.

结果代码

factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare)其中 square = sqrt $ fromIntegral numberisquare = 地板正方形factorCount' :: Int ->内部 ->内部 ->内部 ->整数factorCount' number sqrtCandidate0 count0 = goCandidate0 count0在哪里去候选人计数|候选人 >sqrt = 计数|number `rem` 候选 == 0 = go (candidate + 1) (count + 2)|否则 = 去(候选人 + 1)计数nextTriangle 索引三角形|factorCount 三角形 >1000 = 三角形|否则 = nextTriangle (index + 1) (triangle + index + 1)main = 打印 $ nextTriangle 1 1

既然我们已经探索过了,让我们来解决问题

<块引用>

问题1:erlang、python、haskell会不会因为使用而掉速任意长度的整数或者不是,只要值更小比MAXINT?

在 Haskell 中,使用 IntegerInt 慢,但慢多少取决于执行的计算.幸运的是(对于 64 位机器)Int 就足够了.为了可移植性,您可能应该重写我的代码以使用 Int64Word64(C 不是唯一具有 long 的语言).<块引用>

问题2:为什么haskell 这么慢?是否有编译器标志关闭刹车还是我的实现?(后者相当对我来说,haskell 可能是一本带有七个印章的书.)

问题 3:你能给我一些如何优化这些的提示吗?在不改变我确定因素的方式的情况下实现?以任何方式优化:更好、更快、更本机"的语言.

那是我上面的回答.答案是

  • 0) 通过 -O2 使用优化
  • 1) 尽可能使用快速(尤其是:可拆箱)类型
  • 2) rem 不是 mod(经常被遗忘的优化)和
  • 3) 工作器/包装器转换(可能是最常见的优化).
<块引用>

问题 4:我的功能实现是否允许 LCO,因此避免在调用堆栈中添加不必要的帧?

是的,这不是问题.干得好,很高兴您考虑了这一点.

I have taken Problem #12 from Project Euler as a programming exercise and to compare my (surely not optimal) implementations in C, Python, Erlang and Haskell. In order to get some higher execution times, I search for the first triangle number with more than 1000 divisors instead of 500 as stated in the original problem.

The result is the following:

C:

lorenzo@enzo:~/erlang$ gcc -lm -o euler12.bin euler12.c
lorenzo@enzo:~/erlang$ time ./euler12.bin
842161320

real    0m11.074s
user    0m11.070s
sys 0m0.000s

Python:

lorenzo@enzo:~/erlang$ time ./euler12.py 
842161320

real    1m16.632s
user    1m16.370s
sys 0m0.250s

Python with PyPy:

lorenzo@enzo:~/Downloads/pypy-c-jit-43780-b590cf6de419-linux64/bin$ time ./pypy /home/lorenzo/erlang/euler12.py 
842161320

real    0m13.082s
user    0m13.050s
sys 0m0.020s

Erlang:

lorenzo@enzo:~/erlang$ erlc euler12.erl 
lorenzo@enzo:~/erlang$ time erl -s euler12 solve
Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.7.4  (abort with ^G)
1> 842161320

real    0m48.259s
user    0m48.070s
sys 0m0.020s

Haskell:

lorenzo@enzo:~/erlang$ ghc euler12.hs -o euler12.hsx
[1 of 1] Compiling Main             ( euler12.hs, euler12.o )
Linking euler12.hsx ...
lorenzo@enzo:~/erlang$ time ./euler12.hsx 
842161320

real    2m37.326s
user    2m37.240s
sys 0m0.080s

Summary:

  • C: 100%
  • Python: 692% (118% with PyPy)
  • Erlang: 436% (135% thanks to RichardC)
  • Haskell: 1421%

I suppose that C has a big advantage as it uses long for the calculations and not arbitrary length integers as the other three. Also it doesn't need to load a runtime first (Do the others?).

Question 1: Do Erlang, Python and Haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than MAXINT?

Question 2: Why is Haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as Haskell is a book with seven seals to me.)

Question 3: Can you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more "native" to the language.

EDIT:

Question 4: Do my functional implementations permit LCO (last call optimization, a.k.a tail recursion elimination) and hence avoid adding unnecessary frames onto the call stack?

I really tried to implement the same algorithm as similar as possible in the four languages, although I have to admit that my Haskell and Erlang knowledge is very limited.


Source codes used:

#include <stdio.h>
#include <math.h>

int factorCount (long n)
{
    double square = sqrt (n);
    int isquare = (int) square;
    int count = isquare == square ? -1 : 0;
    long candidate;
    for (candidate = 1; candidate <= isquare; candidate ++)
        if (0 == n % candidate) count += 2;
    return count;
}

int main ()
{
    long triangle = 1;
    int index = 1;
    while (factorCount (triangle) < 1001)
    {
        index ++;
        triangle += index;
    }
    printf ("%ld
", triangle);
}


#! /usr/bin/env python3.2

import math

def factorCount (n):
    square = math.sqrt (n)
    isquare = int (square)
    count = -1 if isquare == square else 0
    for candidate in range (1, isquare + 1):
        if not n % candidate: count += 2
    return count

triangle = 1
index = 1
while factorCount (triangle) < 1001:
    index += 1
    triangle += index

print (triangle)


-module (euler12).
-compile (export_all).

factorCount (Number) -> factorCount (Number, math:sqrt (Number), 1, 0).

factorCount (_, Sqrt, Candidate, Count) when Candidate > Sqrt -> Count;

factorCount (_, Sqrt, Candidate, Count) when Candidate == Sqrt -> Count + 1;

factorCount (Number, Sqrt, Candidate, Count) ->
    case Number rem Candidate of
        0 -> factorCount (Number, Sqrt, Candidate + 1, Count + 2);
        _ -> factorCount (Number, Sqrt, Candidate + 1, Count)
    end.

nextTriangle (Index, Triangle) ->
    Count = factorCount (Triangle),
    if
        Count > 1000 -> Triangle;
        true -> nextTriangle (Index + 1, Triangle + Index + 1)  
    end.

solve () ->
    io:format ("~p~n", [nextTriangle (1, 1) ] ),
    halt (0).


factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare)
    where square = sqrt $ fromIntegral number
          isquare = floor square

factorCount' number sqrt candidate count
    | fromIntegral candidate > sqrt = count
    | number `mod` candidate == 0 = factorCount' number sqrt (candidate + 1) (count + 2)
    | otherwise = factorCount' number sqrt (candidate + 1) count

nextTriangle index triangle
    | factorCount triangle > 1000 = triangle
    | otherwise = nextTriangle (index + 1) (triangle + index + 1)

main = print $ nextTriangle 1 1

解决方案

Using GHC 7.0.3, gcc 4.4.6, Linux 2.6.29 on an x86_64 Core2 Duo (2.5GHz) machine, compiling using ghc -O2 -fllvm -fforce-recomp for Haskell and gcc -O3 -lm for C.

  • Your C routine runs in 8.4 seconds (faster than your run probably because of -O3)
  • The Haskell solution runs in 36 seconds (due to the -O2 flag)
  • Your factorCount' code isn't explicitly typed and defaulting to Integer (thanks to Daniel for correcting my misdiagnosis here!). Giving an explicit type signature (which is standard practice anyway) using Int and the time changes to 11.1 seconds
  • in factorCount' you have needlessly called fromIntegral. A fix results in no change though (the compiler is smart, lucky for you).
  • You used mod where rem is faster and sufficient. This changes the time to 8.5 seconds.
  • factorCount' is constantly applying two extra arguments that never change (number, sqrt). A worker/wrapper transformation gives us:

 $ time ./so
 842161320  

 real    0m7.954s  
 user    0m7.944s  
 sys     0m0.004s  

That's right, 7.95 seconds. Consistently half a second faster than the C solution. Without the -fllvm flag I'm still getting 8.182 seconds, so the NCG backend is doing well in this case too.

Conclusion: Haskell is awesome.

Resulting Code

factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare)
    where square = sqrt $ fromIntegral number
          isquare = floor square

factorCount' :: Int -> Int -> Int -> Int -> Int
factorCount' number sqrt candidate0 count0 = go candidate0 count0
  where
  go candidate count
    | candidate > sqrt = count
    | number `rem` candidate == 0 = go (candidate + 1) (count + 2)
    | otherwise = go (candidate + 1) count

nextTriangle index triangle
    | factorCount triangle > 1000 = triangle
    | otherwise = nextTriangle (index + 1) (triangle + index + 1)

main = print $ nextTriangle 1 1

EDIT: So now that we've explored that, lets address the questions

Question 1: Do erlang, python and haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than MAXINT?

In Haskell, using Integer is slower than Int but how much slower depends on the computations performed. Luckily (for 64 bit machines) Int is sufficient. For portability sake you should probably rewrite my code to use Int64 or Word64 (C isn't the only language with a long).

Question 2: Why is haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as haskell is a book with seven seals to me.)

Question 3: Can you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more "native" to the language.

That was what I answered above. The answer was

  • 0) Use optimization via -O2
  • 1) Use fast (notably: unbox-able) types when possible
  • 2) rem not mod (a frequently forgotten optimization) and
  • 3) worker/wrapper transformation (perhaps the most common optimization).

Question 4: Do my functional implementations permit LCO and hence avoid adding unnecessary frames onto the call stack?

Yes, that wasn't the issue. Good work and glad you considered this.

这篇关于与 Project Euler 的速度比较:C vs Python vs Erlang vs Haskell的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆