Basic I/O performance in Haskell


Problem description


Another microbenchmark: Why is this "loop" (compiled with GHC 7.4.1 using ghc -O2 -fllvm, on 64-bit Linux with a 3.2 kernel, output redirected to /dev/null)

mapM_ print [1..100000000]

about 5x slower than a simple for loop in plain C that calls the unbuffered write(2) syscall on every iteration? I am trying to collect Haskell gotchas.

Even this slow C solution is much faster than Haskell

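#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(void) {
    int i;
    char buf[16];
    for (i = 0; i <= 100000000; i++) {
        sprintf(buf, "%d\n", i);       /* format the number into buf */
        write(1, buf, strlen(buf));    /* one unbuffered syscall per line */
    }
    return 0;
}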

Solution

Okay, on my box the C code, compiled with gcc -O3, takes about 21.5 seconds to run, and the original Haskell code about 56 seconds. So it's not a factor of 5, but a bit above 2.5.

The first nontrivial difference is that

mapM_ print [1..100000000]

uses Integers (the unannotated literals default to Integer). That's a bit slower, because showing an Integer involves a check up front and then works with boxed Ints, while the Show instance of Int does the conversion work on unboxed Int#s.

Adding a type signature, so that the Haskell code works on Ints,

mapM_ print [1 :: Int .. 100000000]

brings the time down to 47 seconds, a bit above twice the time the C code takes.
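To make the difference between the two variants visible, here is a minimal sketch that writes out the element type in both cases. The function names printIntegers and printInts are made up for illustration and do not come from the question or the answer. The first signature spells out what defaulting picks for the unannotated list.

```haskell
module Main (main) where

-- The unannotated list: defaulting picks Integer for the elements, so this
-- is what `mapM_ print [1..n]` means once the type is written out.
-- (printIntegers is a hypothetical name used only in this sketch.)
printIntegers :: Integer -> IO ()
printIntegers n = mapM_ print [1 .. n]

-- The annotated variant from the answer: the elements are machine Ints.
-- (printInts is likewise a hypothetical name.)
printInts :: Int -> IO ()
printInts n = mapM_ print [1 .. n]

main :: IO ()
main = do
  printIntegers 3
  printInts 3
```

Both functions print the same text. They differ only in which Show instance does the conversion.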

Now, another big difference is that show produces a linked list of Char and doesn't just fill a contiguous buffer of bytes. That is slower too.

Then that linked list of Chars is used to fill a byte buffer that then is written to the stdout handle.
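The two steps described above can be sketched roughly as follows. The Prelude defines print x as putStrLn (show x). The helper names digitsOf and printViaString are hypothetical and exist only to separate the steps; this illustrates the data flow, not how GHC's library is implemented internally.

```haskell
module Main (main) where

import System.IO (hPutChar, hPutStr, stdout)

-- Step 1: show builds a String, i.e. a linked list of Char cells.
-- (digitsOf is a hypothetical name used only in this sketch.)
digitsOf :: Int -> String
digitsOf = show

-- Step 2: the characters of that list are copied into the Handle's byte
-- buffer; when the buffer is actually written out depends on the handle's
-- buffering mode. (printViaString is a hypothetical name as well.)
printViaString :: Int -> IO ()
printViaString n = do
  hPutStr stdout (digitsOf n)
  hPutChar stdout '\n'

main :: IO ()
main = mapM_ printViaString [1 .. 3]
```

The C loop, by contrast, formats the digits directly into a fixed char array, so it never builds the intermediate list.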

So the Haskell code does more, and more complicated, things than the C code, and it's not surprising that it takes longer.

Admittedly, it would be desirable to have an easy way to output such things more directly (and hence faster). However, the proper way to handle it is to use a more suitable algorithm (that applies to C too). A simple change to

putStr . unlines $ map show [0 :: Int .. 100000000]

almost halves the time taken, and if one wants it really fast, one uses the faster ByteString I/O and builds the output efficiently as exemplified in applicative's answer.
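As a rough idea of what the ByteString route might look like, here is a minimal sketch using Data.ByteString.Builder from the bytestring package. It is not the code from applicative's answer, which is not reproduced here. It assumes a reasonably recent GHC (8.4 or later, so that <> is in the Prelude). The name numberLines, the command-line handling, and the small default upper bound are assumptions made for this example.

```haskell
module Main (main) where

import Data.ByteString.Builder (Builder, char7, hPutBuilder, intDec)
import System.Environment (getArgs)
import System.IO (stdout)

-- One decimal number per line. intDec writes the digits as bytes, so no
-- intermediate String is built. (numberLines is a hypothetical name.)
numberLines :: Int -> Int -> Builder
numberLines lo hi = foldMap (\n -> intDec n <> char7 '\n') [lo .. hi]

main :: IO ()
main = do
  args <- getArgs
  -- Assumption for this sketch: the upper bound is the first argument,
  -- with a small default; the benchmark above used 100000000.
  let hi = case args of
        (a : _) -> read a
        []      -> 100000
  hPutBuilder stdout (numberLines 0 hi)
```

Compiled with ghc -O2 and run as, for example, ./numbers 100000000 > /dev/null (the program name is arbitrary), this can be timed against the variants above. No timings are claimed here.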
