How to speed up Haskell IO with buffering?
Problem description
I read about IO buffering in "Real World Haskell" (ch. 7, p. 189) and tried to test how different buffer sizes affect performance.
import System.IO
import Data.Time.Clock
import Data.Char (toUpper)

main :: IO ()
main = do
  hInp <- openFile "bigFile.txt" ReadMode
  let bufferSize = truncate $ 2**10
  hSetBuffering hInp (BlockBuffering (Just bufferSize))
  bufferMode <- hGetBuffering hInp
  putStrLn $ "Current buffering mode: " ++ (show bufferMode)
  startTime <- getCurrentTime
  inp <- hGetContents hInp
  writeFile "processed.txt" (map toUpper inp)
  hClose hInp
  finishTime <- getCurrentTime
  print $ diffUTCTime finishTime startTime
  return ()
Then I created a file, "bigFile.txt":
-rw-rw-r-- 1 user user 96M янв. 26 09:49 bigFile.txt
and ran my program against this file with different buffer sizes:
Current buffering mode: BlockBuffering (Just 32)
9.744967s
Current buffering mode: BlockBuffering (Just 1024)
9.667924s
Current buffering mode: BlockBuffering (Just 1048576)
9.494807s
Current buffering mode: BlockBuffering (Just 1073741824)
9.792453s
But the running time is almost the same in every case. Is this normal, or am I doing something wrong?
Answer
On a modern OS, the buffer size is likely to have little effect on reading a file linearly, because 1) the kernel performs read-ahead and 2) the file may already be in the page cache if you have read it recently.
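One way to check this claim is to time the read on its own, without the String processing done in the question's program. The following is a minimal sketch, not a definitive benchmark: the helper name `countBytes` and the command-line arguments are hypothetical, and it assumes the `bytestring` and `time` packages that ship with GHC. It reads the file in chunks of at most the given size and reports the total byte count and elapsed time.

```haskell
import qualified Data.ByteString as BS
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import System.Environment (getArgs)
import System.IO

-- | Read the whole file in chunks of at most 'chunkSize' bytes and
-- return the total number of bytes read. (Hypothetical helper.)
countBytes :: Int -> FilePath -> IO Int
countBytes chunkSize path =
  withBinaryFile path ReadMode $ \h -> do
    hSetBuffering h (BlockBuffering (Just chunkSize))
    let loop total = do
          chunk <- BS.hGetSome h chunkSize
          if BS.null chunk
            then return total            -- an empty chunk means end of file
            else let total' = total + BS.length chunk
                 in total' `seq` loop total'
    loop 0

main :: IO ()
main = do
  args <- getArgs
  case args of
    [sizeArg, path] -> do
      start <- getCurrentTime
      n <- countBytes (read sizeArg) path
      end <- getCurrentTime
      putStrLn (show n ++ " bytes in " ++ show (diffUTCTime end start))
    _ -> putStrLn "usage: readbench CHUNK_SIZE FILE"
```

Running it twice in a row on the same file with the same chunk size is a simple way to see whether the page cache makes a difference on a given machine.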
Here is a program which measures the effect of buffering on writes. Typical results are:
$ ./mkbigfile 32 -- 12.864733s
$ ./mkbigfile 64 -- 9.668272s
$ ./mkbigfile 128 -- 6.993664s
$ ./mkbigfile 512 -- 4.130989s
$ ./mkbigfile 1024 -- 3.536652s
$ ./mkbigfile 16384 -- 3.055403s
$ ./mkbigfile 1000000 -- 3.004879s
Source:
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as BS
import Data.ByteString (ByteString)
import Control.Monad
import System.IO
import System.Environment
import Data.Time.Clock

main = do
  (arg:_) <- getArgs
  let size = read arg
  let bs = "abcdefghijklmnopqrstuvwxyz"
      n = 96000000 `div` (length bs)
  h <- openFile "bigFile.txt" WriteMode
  hSetBuffering h (BlockBuffering (Just size))
  startTime <- getCurrentTime
  replicateM_ n $ hPutStrLn h bs
  hClose h
  finishTime <- getCurrentTime
  print $ diffUTCTime finishTime startTime
  return ()
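A back-of-envelope estimate may help explain the shape of these timings. The sketch below assumes, as a simplification, that the buffer is flushed once each time it fills and that a newline is written as a single byte. GHC's actual flushing behavior may differ, so treat the numbers as rough orders of magnitude. The names `payloadLength`, `lineCount`, `totalBytes`, and `estimatedFlushes` are hypothetical; the constants come from the benchmark source above.

```haskell
-- Length of the payload "abcdefghijklmnopqrstuvwxyz" used in the benchmark.
payloadLength :: Int
payloadLength = 26

-- Number of lines written, computed the same way as 'n' in the benchmark.
lineCount :: Int
lineCount = 96000000 `div` payloadLength

-- Each hPutStrLn call writes the payload plus one newline
-- (assumed to be a single byte).
totalBytes :: Int
totalBytes = lineCount * (payloadLength + 1)

-- Assumption: one flush each time the buffer fills (ceiling division).
estimatedFlushes :: Int -> Int
estimatedFlushes bufSize = (totalBytes + bufSize - 1) `div` bufSize

main :: IO ()
main = mapM_ report [32, 64, 128, 512, 1024, 16384, 1000000]
  where
    report size =
      putStrLn (show size ++ " -> " ++ show (estimatedFlushes size))
```

Under these assumptions the estimated number of flushes falls from roughly three million at a 32-byte buffer to about a hundred at 1000000 bytes, which is in line with the diminishing returns seen in the measured timings once the buffer reaches a few kilobytes.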