Python比编译好的Haskell更快? [英] Python faster than compiled Haskell?

查看:128
本文介绍了Python比编译好的Haskell更快?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个用Python和Haskell编写的简单脚本。它读取一个具有1,000,000换行分隔整数的文件,将该文件解析为整数列表,然后快速对其进行排序,然后将其写入排序的不同文件。该文件与未排序的文件具有相同的格式。简单。

这里是Haskell:

  quicksort :: Ord a => [a]  - > [a] 
quicksort [] = []
quicksort(p:xs)=(quicksort lesser)++ [p] ++(quicksort greater)
where
lesser =过滤器(< p)xs
greater = filter(> = p)xs

main = do
file< - readFiledata
let un = lines文件
let f = map(\ x - > read x :: Int)un
let done = quicksort f
writeFilesorted(unlines(map show done))

这里是Python:

  def qs(ar):
如果len(ar)== 0:
返回ar

p = ar [0]
return qs([i for i in ar if if< p])+ [p] + qs([i for i in ar if if i> p])


def read_file(fn):
f = open(fn)
data = f.read()
f.close()
返回数据

def write_file(fn,data):
f = open('sorted','w')
f.write(data)
f.close()


def main():
data = read_file('data')

lines = data.split('\\\
')
lines = [int(l)for l in line]

完成= qs(行)
完成= [str(l)for l in done]

write_file('sorted',\\\
.join(done))

if __name__ =='__main__':
main()

非常简单。现在我用 $ b

  $ ghc -O2 --make quick.hs 



 $ time ./quick 
$ time python qs.py

结果:



Haskell:

  real 0m10.820s 
user 0m10.656s
sys 0m0.154s

Python:

  real 0m9.888s 
user 0m9.669s
sys 0m0.203s

Python如何可能比本机代码Haskell更快?



谢谢

编辑


  • Python版本:2.7.1
  • GHC版本:7.0.4
  • Mac OSX 10.7.3
  • 2.4GHz Intel Core i5



生成的列表
$ b

  from random导入shuffle 
a = [str(a)for a xrange(0,1000 * 1000)]
shuffle(a)
s =\\\
.join(a)
f = open('data','w')
f.write(s)
f.close()

所以所有的数字都是唯一的。

read 。用这样的函数替换 read

  import数字

fastRead :: String - > Int
fastRead s = case readDec s of [(n,)] - > n

我有一个相当公平的加速:

 〜/ programming%time ./test.slow 
./test.slow 9.82s user 0.06s system 99%cpu 9.901 total
〜/ programming%time ./test.fast
./test.fast 6.99s用户0.05s系统99%cpu 7.064总数
〜/编程时间%./test.bytestring
./test.bytestring 4.94s user 0.06s system 99%cpu 5.026 total

只是为了好玩,上面的结果包括一个版本,对于ULTIMATE BARE-METAL SPEED, ByteString (因此完全忽略了文件编码问题,因此未能通过准备进入21世纪的测试)。它也有其他一些差异;例如,它发布到标准库的排序功能。

 导入限定的Data.ByteString为BS 
导入Data.Attoparsec.ByteString.Char8
import Control.Applicative
import Data.List

parser = many(decimal< * char'\\\
')

reallyParse p bs =案例解析
部分f - > f BS.empty
v - > v

main = do
数字< - BS.readFiledata
case真正解析器编号
完成t r | BS.null t - > writeFile排序。不合格。地图显示。 sort $ r


I have a simple script written in both Python and Haskell. It reads a file with 1,000,000 newline separated integers, parses that file into a list of integers, quick sorts it and then writes it to a different file sorted. This file has the same format as the unsorted one. Simple.

Here is Haskell:

quicksort :: Ord a => [a] -> [a]
quicksort []     = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser  = filter (< p) xs
        greater = filter (>= p) xs

main = do
    file <- readFile "data"
    let un = lines file
    let f = map (\x -> read x::Int ) un
    let done = quicksort f
    writeFile "sorted" (unlines (map show done))

And here is Python:

def qs(ar):
    if len(ar) == 0:
        return ar

    p = ar[0]
    return qs([i for i in ar if i < p]) + [p] + qs([i for i in ar if i > p])


def read_file(fn):
    f = open(fn)
    data = f.read()
    f.close()
    return data

def write_file(fn, data):
    f = open('sorted', 'w')
    f.write(data)
    f.close()


def main():
    data = read_file('data')

    lines = data.split('\n')
    lines = [int(l) for l in lines]

    done = qs(lines)
    done = [str(l) for l in done]

    write_file('sorted', "\n".join(done))

if __name__ == '__main__':
    main()

Very simple. Now I compile the Haskell code with

$ ghc -O2 --make quick.hs

And I time those two with:

$ time ./quick
$ time python qs.py

Results:

Haskell:

real    0m10.820s
user    0m10.656s
sys 0m0.154s

Python:

real    0m9.888s
user    0m9.669s
sys 0m0.203s

How can Python possibly be faster than native code Haskell?

Thanks

EDIT:

  • Python version: 2.7.1
  • GHC version: 7.0.4
  • Mac OSX, 10.7.3
  • 2.4GHz Intel Core i5

List generated by

from random import shuffle
a = [str(a) for a in xrange(0, 1000*1000)]
shuffle(a)
s = "\n".join(a)
f = open('data', 'w')
f.write(s)
f.close()

So all numbers are unique.

解决方案

In short, don't use read. Replace read with a function like this:

import Numeric

fastRead :: String -> Int
fastRead s = case readDec s of [(n, "")] -> n

I get a pretty fair speedup:

~/programming% time ./test.slow
./test.slow  9.82s user 0.06s system 99% cpu 9.901 total
~/programming% time ./test.fast
./test.fast  6.99s user 0.05s system 99% cpu 7.064 total
~/programming% time ./test.bytestring
./test.bytestring  4.94s user 0.06s system 99% cpu 5.026 total

Just for fun, the above results include a version that uses ByteString (and hence fails the "ready for the 21st century" test by totally ignoring the problem of file encodings) for ULTIMATE BARE-METAL SPEED. It also has a few other differences; for example, it ships out to the standard library's sort function. The full code is below.

import qualified Data.ByteString as BS
import Data.Attoparsec.ByteString.Char8
import Control.Applicative
import Data.List

parser = many (decimal <* char '\n')

reallyParse p bs = case parse p bs of
    Partial f -> f BS.empty
    v -> v

main = do
    numbers <- BS.readFile "data"
    case reallyParse parser numbers of
        Done t r | BS.null t -> writeFile "sorted" . unlines . map show . sort $ r

这篇关于Python比编译好的Haskell更快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆