与项目欧拉速度对比:C VS的Python VS二郎VS哈斯克尔 [英] Speed comparison with Project Euler: C vs Python vs Erlang vs Haskell
问题描述
我已问题#12 从的项目欧拉的作为编程练习和我的(肯定不是最优的)实现在C,Python和Erlang和Haskell的比较。为了得到一些更高的执行时间,我搜索了超过1000除数,而不是500在原来的问题说第一个三角形数量。
结果如下:
C:
洛伦佐@恩佐:〜/ $二郎的gcc -o -lm euler12.bin euler12.c
洛伦佐@恩佐:〜/ $二郎时间./euler12.bin
842161320真正0m11.074s
用户0m11.070s
SYS 0m0.000s
蟒蛇:
洛伦佐@恩佐:〜/ $二郎时间./euler12.py
842161320真正1m16.632s
用户1m16.370s
SYS 0m0.250s
蟒蛇与pypy:
洛伦佐@恩佐:〜/下载/ pypy-C-JIT-43780-b590cf6de419-LINUX64 /斌$时间./pypy /home/lorenzo/erlang/euler12.py
842161320真正0m13.082s
用户0m13.050s
SYS 0m0.020s
二郎:
洛伦佐@恩佐:〜/ $二郎erlc euler12.erl
洛伦佐@恩佐:〜/ $二郎ERL时间-s euler12解决
二郎R13B03(专家评审组-5.7.4)[来源] [64位] [SMP:4:4] [RQ:4] [异步线程:0] [HIPE] [内核投票:假]ESHELL V5.7.4(含中止^ G)
1> 842161320真正0m48.259s
用户0m48.070s
SYS 0m0.020s
哈斯克尔:
洛伦佐@恩佐:〜/ $二郎GHC euler12.hs -o euler12.hsx
[1 1]编译主(euler12.hs,euler12.o)
链接euler12.hsx ...
洛伦佐@恩佐:〜/ $二郎时间./euler12.hsx
842161320真正2m37.326s
用户2m37.240s
SYS 0m0.080s
摘要:
- C:100%
- 蟒蛇:692%(含118 pypy%)
- 二郎:436%(135感谢RichardC%)
- 哈斯克尔:1421%
我想,C有一个很大的优势,因为它使用长计算,而不是任意长度整数作为其他三个。此外,它并不需要先装入运行时(难道其他人呢?)。
问1:
不要二郎,Python和Haskell中失去适当的速度,使用任意长度的整数或不也只要值小于 MAXINT
?
问题2:
为什么Haskell的这么慢?是否有关闭刹车或者是我的执行编译器标志? (后者是完全有可能作为Haskell是一本书,用七印给我。)
问题3:
你能不能给我一些提示如何优化这些实现在不改变我决定因素的方式吗?优化以任何方式:更好,更快,更本土的语言
编辑:
问题4:
做我的功能实现许可证LCO(最后一次通话优化,a.k.a尾递归消除),从而避免增加不必要的帧到调用栈?
我真的努力实现相同的算法尽可能相似的四种语言,虽然我不得不承认,我的Haskell和Erlang的知识是非常有限的。
来源$ C $ CS使用:
的#include<&stdio.h中GT;
#包括LT&;&math.h中GT;INT factorCount(N久)
{
双平方=开方(N);
INT国际广场=(int)的平方;
诠释计数=国际广场==方? -1:0;
长候选人;
对(候选= 1;候选LT =国际广场候选++)
如果(0 == N%的候选人)数+ = 2;
返回计数;
}诠释的main()
{
长三角= 1;
INT索引= 1;
而(factorCount(三角形)下,1001)
{
指数++;
三角形+ =指数;
}
的printf(%LD \\ N,三角形);
}
#!在/ usr /斌/ env的python3.2进口数学高清factorCount(N):
方=的Math.sqrt(N)
国际广场= INT(方)
数= -1,如果国际广场==广场否则为0
有效范围内的候补(1,国际广场+ 1):
如果不是n%的候选人:数+ = 2
返回计数三角= 1
索引= 1
而factorCount(三角形)< 1001:
指数+ = 1
三角形+ =指数打印(三角形)
-module(euler12)。
-compile(export_all)。factorCount(数字) - GT; factorCount(数字,数学:SQRT(数字),1,0)。factorCount(_,的Sqrt,候选人,计数),当候选GT;开方 - >计数;factorCount(_,的Sqrt,候选人,计数)当候选人==的Sqrt - >计数+ 1;factorCount(数字,的Sqrt,候选人,计数) - GT;
病例数REM候选人
0 - > factorCount(数字,的Sqrt,候选人+ 1,计数+ 2);
_ - > factorCount(数字,的Sqrt,候选人+ 1,计数)。
结束。nextTriangle(指数,三角) - GT;
计数= factorCount(三角),
如果
COUNT> 1000 - >三角形;
真 - > nextTriangle(索引+ 1,+三角指数+ 1)
结束。解决() - >
IO:格式(用〜p〜n的,[nextTriangle(1,1)]),
停止(0)。
factorCount数= factorCount'号国际广场1 0 - (fromEnum $平方米== fromIntegral国际广场)
其中方=开方$ fromIntegral号
国际广场=楼广场factorCount'号开方候选人数
| fromIntegral候选GT;开方=计数
|一些`mod`候选人== 0 = factorCount'号的sqrt(候选+ 1)(数+ 2)
|否则= factorCount'号的sqrt(候选+ 1)的计nextTriangle指数三角形
| factorCount三角> 1000 =三角形
|否则= nextTriangle(索引+1)(三角形+索引+1)主要= $打印1 nextTriangle 1
使用 GHC 7.0.3
, GCC 4.4.6
,的Linux 2.6.29
的x86_64的酷睿双核(2.5GHz的)的机器上,使用编译 GHC -O2 -fllvm -fforce,recomp
为Haskell和 GCC -O3 -lm
对于C
- 8.4秒钟内你的C例程运行(比你跑,因为可能更快
-O3
) - Haskell的解决方案36秒运行一次(由于
-O2
标志) - 您
factorCount
code未显式类型,缺省值为整数
(感谢丹尼尔在这里纠正我的误诊!)。给使用显式类型的签名(这是标准的做法无论如何)内部
和时间的变化为11.1秒 - 在
factorCount
你不必要名为fromIntegral
。一个修复的结果虽然没有变化(编译器是聪明的,算你走运)。 - 您使用
MOD
,其中REM
更快,足够了。这改变了时间的8.5秒 -
factorCount
正在不断施加两个额外的参数是永远不会改变(数量
,开方
)。一名工人/包装改造给我们:
$时间./so
842161320 真正0m7.954s
用户0m7.944s
SYS 0m0.004s
这是正确的,7.95秒。一贯半秒比C解决方案更快。如果没有 -fllvm
标志,我仍然得到8.182秒
,所以后端NCG在这种情况下做的很好过
结论:Haskell是真棒
致使code
factorCount数= factorCount'号国际广场1 0 - (fromEnum $平方米== fromIntegral国际广场)
其中方=开方$ fromIntegral号
国际广场=楼广场factorCount'::诠释 - > INT - > INT - > INT - > INT
factorCount'号开方candidate0 COUNT0 =去candidate0 COUNT0
哪里
去候选人数
|候选GT;开方=计数
|一些`rem`候选人== 0 =去(候选+ 1)(数+ 2)
|否则=去(候选+ 1)的计nextTriangle指数三角形
| factorCount三角> 1000 =三角形
|否则= nextTriangle(索引+1)(三角形+索引+1)主要= $打印1 nextTriangle 1
编辑:所以,现在,我们已经探讨了,让解决问题。
问题1:你二郎神,蟒蛇和Haskell失速由于使用
任意长度的整数或不它们只要值少
比MAXINT?
块引用>在Haskell中,使用
整数
比内部
,但慢多少取决于执行的计算速度较慢。幸运的是(对于64位计算机)内部
就足够了。对于便携性的缘故,你或许应该重写我的code使用的Int64
或word64与
(C不是唯一用语言长
)。
问题2:为什么Haskell的这么慢?是否有一个编译器标志
关闭刹车或者是我的执行? (后者是相当
作为可能Haskell是一本书,用七印给我。)
问题3:你能提供给我一些提示如何优化这些
实现在不改变我决定因素的方式吗?
优化以任何方式:更好,更快,更本土的语言
块引用>这就是我上面的回答。答案是
- 0),通过
-O2
使用优化- 1)使用快(尤其是:拆箱-能)类型时,可能
- 2)
REM
不是MOD
(一个经常被遗忘的优化)和- 3)工人/包装转换(也许是最常见的优化)。
问题4:你我的功能实现许可证LCO,因此
避免增加不必要的帧到调用栈?
块引用>是的,这不是问题。良好的工作和高兴你考虑这一点。
I have taken Problem #12 from Project Euler as a programming exercise and to compare my (surely not optimal) implementations in C, Python, Erlang and Haskell. In order to get some higher execution times, I search for the first triangle number with more than 1000 divisors instead of 500 as stated in the original problem.
The result is the following:
C:
lorenzo@enzo:~/erlang$ gcc -lm -o euler12.bin euler12.c lorenzo@enzo:~/erlang$ time ./euler12.bin 842161320 real 0m11.074s user 0m11.070s sys 0m0.000s
python:
lorenzo@enzo:~/erlang$ time ./euler12.py 842161320 real 1m16.632s user 1m16.370s sys 0m0.250s
python with pypy:
lorenzo@enzo:~/Downloads/pypy-c-jit-43780-b590cf6de419-linux64/bin$ time ./pypy /home/lorenzo/erlang/euler12.py 842161320 real 0m13.082s user 0m13.050s sys 0m0.020s
erlang:
lorenzo@enzo:~/erlang$ erlc euler12.erl lorenzo@enzo:~/erlang$ time erl -s euler12 solve Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false] Eshell V5.7.4 (abort with ^G) 1> 842161320 real 0m48.259s user 0m48.070s sys 0m0.020s
haskell:
lorenzo@enzo:~/erlang$ ghc euler12.hs -o euler12.hsx [1 of 1] Compiling Main ( euler12.hs, euler12.o ) Linking euler12.hsx ... lorenzo@enzo:~/erlang$ time ./euler12.hsx 842161320 real 2m37.326s user 2m37.240s sys 0m0.080s
Summary:
- C: 100%
- python: 692% (118% with pypy)
- erlang: 436% (135% thanks to RichardC)
- haskell: 1421%
I suppose that C has a big advantage as it uses long for the calculations and not arbitrary length integers as the other three. Also it doesn't need to load a runtime first (Do the others?).
Question 1: Do Erlang, Python and Haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than
MAXINT
?Question 2: Why is Haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as Haskell is a book with seven seals to me.)
Question 3: Can you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more "native" to the language.
EDIT:
Question 4: Do my functional implementations permit LCO (last call optimization, a.k.a tail recursion elimination) and hence avoid adding unnecessary frames onto the call stack?
I really tried to implement the same algorithm as similar as possible in the four languages, although I have to admit that my Haskell and Erlang knowledge is very limited.
Source codes used:
#include <stdio.h> #include <math.h> int factorCount (long n) { double square = sqrt (n); int isquare = (int) square; int count = isquare == square ? -1 : 0; long candidate; for (candidate = 1; candidate <= isquare; candidate ++) if (0 == n % candidate) count += 2; return count; } int main () { long triangle = 1; int index = 1; while (factorCount (triangle) < 1001) { index ++; triangle += index; } printf ("%ld\n", triangle); }
#! /usr/bin/env python3.2 import math def factorCount (n): square = math.sqrt (n) isquare = int (square) count = -1 if isquare == square else 0 for candidate in range (1, isquare + 1): if not n % candidate: count += 2 return count triangle = 1 index = 1 while factorCount (triangle) < 1001: index += 1 triangle += index print (triangle)
-module (euler12). -compile (export_all). factorCount (Number) -> factorCount (Number, math:sqrt (Number), 1, 0). factorCount (_, Sqrt, Candidate, Count) when Candidate > Sqrt -> Count; factorCount (_, Sqrt, Candidate, Count) when Candidate == Sqrt -> Count + 1; factorCount (Number, Sqrt, Candidate, Count) -> case Number rem Candidate of 0 -> factorCount (Number, Sqrt, Candidate + 1, Count + 2); _ -> factorCount (Number, Sqrt, Candidate + 1, Count) end. nextTriangle (Index, Triangle) -> Count = factorCount (Triangle), if Count > 1000 -> Triangle; true -> nextTriangle (Index + 1, Triangle + Index + 1) end. solve () -> io:format ("~p~n", [nextTriangle (1, 1) ] ), halt (0).
factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare) where square = sqrt $ fromIntegral number isquare = floor square factorCount' number sqrt candidate count | fromIntegral candidate > sqrt = count | number `mod` candidate == 0 = factorCount' number sqrt (candidate + 1) (count + 2) | otherwise = factorCount' number sqrt (candidate + 1) count nextTriangle index triangle | factorCount triangle > 1000 = triangle | otherwise = nextTriangle (index + 1) (triangle + index + 1) main = print $ nextTriangle 1 1
解决方案Using
GHC 7.0.3
,gcc 4.4.6
,Linux 2.6.29
on an x86_64 Core2 Duo (2.5GHz) machine, compiling usingghc -O2 -fllvm -fforce-recomp
for Haskell andgcc -O3 -lm
for C.
- Your C routine runs in 8.4 seconds (faster than your run probably because of
-O3
)- The Haskell solution runs in 36 seconds (due to the
-O2
flag)- Your
factorCount'
code isn't explicitly typed and defaulting toInteger
(thanks to Daniel for correcting my misdiagnosis here!). Giving an explicit type signature (which is standard practice anyway) usingInt
and the time changes to 11.1 seconds- in
factorCount'
you have needlessly calledfromIntegral
. A fix results in no change though (the compiler is smart, lucky for you).- You used
mod
whererem
is faster and sufficient. This changes the time to 8.5 seconds.factorCount'
is constantly applying two extra arguments that never change (number
,sqrt
). A worker/wrapper transformation gives us:$ time ./so 842161320 real 0m7.954s user 0m7.944s sys 0m0.004s
That's right, 7.95 seconds. Consistently half a second faster than the C solution. Without the
-fllvm
flag I'm still getting8.182 seconds
, so the NCG backend is doing well in this case too.Conclusion: Haskell is awesome.
Resulting Code
factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare) where square = sqrt $ fromIntegral number isquare = floor square factorCount' :: Int -> Int -> Int -> Int -> Int factorCount' number sqrt candidate0 count0 = go candidate0 count0 where go candidate count | candidate > sqrt = count | number `rem` candidate == 0 = go (candidate + 1) (count + 2) | otherwise = go (candidate + 1) count nextTriangle index triangle | factorCount triangle > 1000 = triangle | otherwise = nextTriangle (index + 1) (triangle + index + 1) main = print $ nextTriangle 1 1
EDIT: So now that we've explored that, lets address the questions
Question 1: Do erlang, python and haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than MAXINT?
In Haskell, using
Integer
is slower thanInt
but how much slower depends on the computations performed. Luckily (for 64 bit machines)Int
is sufficient. For portability sake you should probably rewrite my code to useInt64
orWord64
(C isn't the only language with along
).Question 2: Why is haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as haskell is a book with seven seals to me.)
Question 3: Can you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more "native" to the language.
That was what I answered above. The answer was
- 0) Use optimization via
-O2
- 1) Use fast (notably: unbox-able) types when possible
- 2)
rem
notmod
(a frequently forgotten optimization) and- 3) worker/wrapper transformation (perhaps the most common optimization).
Question 4: Do my functional implementations permit LCO and hence avoid adding unnecessary frames onto the call stack?
Yes, that wasn't the issue. Good work and glad you considered this.
这篇关于与项目欧拉速度对比:C VS的Python VS二郎VS哈斯克尔的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!