How to speed Haskell IO with buffering?

Question

I read about IO buffering in "Real World Haskell" (ch. 7, p. 189) and tried to test how different buffer sizes affect performance.

import System.IO
import Data.Time.Clock
import Data.Char(toUpper)

main :: IO ()
main = do
  hInp <- openFile "bigFile.txt" ReadMode
  let bufferSize = truncate $ 2**10
  hSetBuffering hInp (BlockBuffering (Just bufferSize))
  bufferMode <- hGetBuffering hInp
  putStrLn $ "Current buffering mode: " ++ (show bufferMode)

  startTime <- getCurrentTime
  inp <- hGetContents hInp
  writeFile "processed.txt" (map toUpper inp)
  hClose hInp
  finishTime <- getCurrentTime
  print $ diffUTCTime finishTime startTime
  return ()

Then I created a "bigFile.txt"

-rw-rw-r-- 1 user user 96M янв.  26 09:49 bigFile.txt

and ran my program against this file with different buffer sizes:

Current buffering mode: BlockBuffering (Just 32)
9.744967s   

Current buffering mode: BlockBuffering (Just 1024)
9.667924s                                      

Current buffering mode: BlockBuffering (Just 1048576)
9.494807s    

Current buffering mode: BlockBuffering (Just 1073741824)
9.792453s   

But the running time is almost the same in every case. Is this normal, or am I doing something wrong?

Solution

On a modern OS, the buffer size probably has little effect on reading a file linearly, for two reasons: 1) the kernel performs read-ahead, and 2) the file may already be in the page cache if you have read it recently.
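
To check how much of the question's running time is spent reading at all, one can time the read on its own. The following is a minimal sketch, not part of the original answer: the helpers `timed` and `readOnly` and the file name `sample.txt` are hypothetical, and the program generates its own small sample file so that it runs without any setup. It goes through the same `hGetContents` call as the question, but only counts characters instead of upper-casing and writing them.

```haskell
import Control.Exception (evaluate)
import Control.Monad (forM_)
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)
import System.IO

-- | Run an action and return its result with the elapsed wall-clock time.
-- (Hypothetical helper, introduced for this sketch.)
timed :: IO a -> IO (a, NominalDiffTime)
timed act = do
  start <- getCurrentTime
  r <- act
  end <- getCurrentTime
  return (r, diffUTCTime end start)

-- | Read a whole file with the requested block buffer size and return the
-- number of characters read. (Hypothetical helper.)
readOnly :: FilePath -> Int -> IO Int
readOnly path size =
  withFile path ReadMode $ \h -> do
    hSetBuffering h (BlockBuffering (Just size))
    s <- hGetContents h
    -- hGetContents is lazy: force the full read before withFile closes h.
    evaluate (length s)

main :: IO ()
main = do
  -- Assumption: a small generated sample keeps the sketch self-contained.
  let path = "sample.txt"
  writeFile path (concat (replicate 100000 "abcdefghijklmnopqrstuvwxyz\n"))
  forM_ [32, 1024, 1048576] $ \size -> do
    (n, t) <- timed (readOnly path size)
    putStrLn (show size ++ ": " ++ show n ++ " chars in " ++ show t)
```

Pointing `path` at bigFile.txt repeats the question's setup. A freshly written or recently read file will likely be served from the page cache, which is the second effect described above.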

Here is a program which measures the effect of buffering on writes. Typical results are:

$ ./mkbigfile 32      -- 12.864733s
$ ./mkbigfile 64      --  9.668272s
$ ./mkbigfile 128     --  6.993664s
$ ./mkbigfile 512     --  4.130989s
$ ./mkbigfile 1024    --  3.536652s
$ ./mkbigfile 16384   --  3.055403s
$ ./mkbigfile 1000000 --  3.004879s

Source:

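{-# LANGUAGE OverloadedStrings #-}

-- Note: hPutStrLn below is the String version from System.IO, so bs is
-- inferred as a String and the two ByteString imports are unused.
import qualified Data.ByteString as BS
import Data.ByteString (ByteString)
import Control.Monad
import System.IO
import System.Environment
import Data.Time.Clock

main = do
  -- The buffer size in bytes is the first command-line argument.
  (arg:_) <- getArgs
  let size = read arg
  let bs = "abcdefghijklmnopqrstuvwxyz"
      n = 96000000 `div` (length bs)
  h <- openFile "bigFile.txt" WriteMode
  hSetBuffering h (BlockBuffering (Just size))
  startTime <- getCurrentTime
  -- Each call writes the 26 letters plus a newline.
  replicateM_ n $ hPutStrLn h bs
  -- hClose flushes the remaining buffered data, so it is part of the timing.
  hClose h
  finishTime <- getCurrentTime
  print $ diffUTCTime finishTime startTime
  return ()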
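
The source above imports Data.ByteString but writes a String. As a hedged variant, not the answerer's code, the same measurement can be sketched with strict ByteString writes. The helper `writeWith`, the output name `bigFile.bin`, and the reduced total of 9600000 bytes are assumptions chosen to keep the run short.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as BS
import Control.Monad (forM_, replicateM_)
import Data.Time.Clock (NominalDiffTime, diffUTCTime, getCurrentTime)
import System.IO

-- | Write roughly `total` bytes to `path` as 27-byte lines through a block
-- buffer of `size` bytes and return the elapsed wall-clock time.
-- (writeWith is a hypothetical helper, not part of the original answer.)
writeWith :: FilePath -> Int -> Int -> IO NominalDiffTime
writeWith path total size = do
  let line = "abcdefghijklmnopqrstuvwxyz\n" :: BS.ByteString
      n    = total `div` BS.length line
  h <- openBinaryFile path WriteMode
  hSetBuffering h (BlockBuffering (Just size))
  start <- getCurrentTime
  replicateM_ n (BS.hPut h line)
  hClose h  -- flushes whatever is still in the buffer
  end <- getCurrentTime
  return (diffUTCTime end start)

main :: IO ()
main =
  -- Assumption: a smaller total than the answer's 96000000 bytes.
  forM_ [32, 1024, 16384] $ \size -> do
    t <- writeWith "bigFile.bin" 9600000 size
    putStrLn (show size ++ " -- " ++ show t)
```

Timings from this variant are not directly comparable with the figures listed above, since it writes bytes without any character encoding step.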
