为什么Haskell程序比等效的Python程序慢得多? [英] Why is this Haskell program so much slower than an equivalent Python one?

查看:388
本文介绍了为什么Haskell程序比等效的Python程序慢得多?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

作为编程挑战的一部分,我需要从标准输入读取一系列以空格分隔的整数(在一行上),并将这些整数的总和打印到标准输出。这个序列可以包含多达10,000,000个整数。

我有两个解决方案:一个用Haskell编写( foo.hs

code>),另一个等价的Python 2( foo.py )。不幸的是,(编译后的)Haskell程序一直比Python程序慢,并且我无法解释这两个程序之间的性能差异。请参阅下面的基准部分。如果有的话,我会希望Haskell占上风......



我做错了什么?我如何解释这种差异?有没有简单的方法来加快我的Haskell代码?



(有关信息,我使用的是具有8Gb RAM的2010年中期Macbook Pro,GHC 7.8.4, )



foo.hs



  main = print。 sum =<< getIntList 

getIntList :: IO [Int]
getIntList = fmap(map read。words)getLine

(用 ghc -O2 foo.hs 编译)
$ b

foo.py



  ns = map(int, raw_input()。split())
print sum(ns)



基准



下面, test.txt 由一行1000万个空格分隔的整数组成。

 #Haskell 
$ time ./foo< test.txt
1679257

real 0m36.704s
user 0m35.932s
sys 0m0.632s

#Python
$ time python foo.py< test.txt
1679257

real 0m7.916s
user 0m7.756s
sys 0m0.151s


解决方案

读取很慢。对于批量解析,请使用 bytestring text 原语或 attoparsec

我做了一些基准测试。您的原始版本在我的计算机上以 23,9 秒运行。以下版本在 0.35 秒内运行:

 将限定的Data.ByteString.Char8导入为B 
import Control.Applicative
import Data.Maybe
import Data.List
import Data.Char

main = print。 sum =<< getIntList

getIntList :: IO [Int]
getIntList =
map(fst。fromJust。B.readInt)。 B.words< $> B.readFiletest.txt

通过将解析器专用于测试.txt 文件,我可以将运行时降至 0.26 秒:

  getIntList :: IO [Int] 
getIntList =
unfoldr(B.readInt。B.dropWhile(==''))< $> B.readFiletest.txt


As part of a programming challenge, I need to read, from stdin, a sequence of space-separated integers (on a single line), and print the sum of those integers to stdout. The sequence in question can contain as many as 10,000,000 integers.

I have two solutions for this: one written in Haskell (foo.hs), and another, equivalent one, written in Python 2 (foo.py). Unfortunately, the (compiled) Haskell program is consistently slower than the Python program, and I'm at a loss for explaining the discrepancy in performance between the two programs; see the Benchmark section below. If anything, I would have expected Haskell to have the upper hand...

What am I doing wrong? How can I account for this discrepancy? Is there an easy way of speeding up my Haskell code?

(For information, I'm using a mid-2010 Macbook Pro with 8Gb RAM, GHC 7.8.4, and Python 2.7.9.)

foo.hs

main = print . sum =<< getIntList

getIntList :: IO [Int]
getIntList = fmap (map read . words) getLine

(compiled with ghc -O2 foo.hs)

foo.py

ns = map(int, raw_input().split())
print sum(ns)

Benchmark

In the following, test.txt consists of a single line of 10 million space-separated integers.

# Haskell
$ time ./foo < test.txt 
1679257

real    0m36.704s
user    0m35.932s
sys     0m0.632s

# Python
$ time python foo.py < test.txt
1679257 

real    0m7.916s
user    0m7.756s
sys     0m0.151s

解决方案

read is slow. For bulk parsing, use bytestring or text primitives, or attoparsec.

I did some benchmarking. Your original version ran in 23,9 secs on my computer. The version below ran in 0.35 secs:

import qualified Data.ByteString.Char8 as B
import Control.Applicative
import Data.Maybe
import Data.List
import Data.Char

main = print . sum =<< getIntList

getIntList :: IO [Int]
getIntList =
    map (fst . fromJust . B.readInt) . B.words <$> B.readFile "test.txt"

By specializing the parser to your test.txt file, I could get the runtime down to 0.26 sec:

getIntList :: IO [Int]          
getIntList =
    unfoldr (B.readInt . B.dropWhile (==' ')) <$> B.readFile "test.txt"

这篇关于为什么Haskell程序比等效的Python程序慢得多?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆