Dealing with large files in Haskell
Question
I have a large file (4+ GB) of, let's just say, 4-byte floats. I would like to treat it as a list, in the sense that I would like to be able to use map, filter, foldl, etc. However, instead of producing a new list with the output, I would like to write the output back into the file, and thus only have to load a small portion of the file into memory. You could say I want a type called MutableFileList.
Has anyone run into this situation before? Instead of reinventing the wheel, I was wondering whether there is a hackish way of dealing with this.
Recommended answer
You should not treat it as a [Double] or [Float] in memory: a linked list of boxed values takes many times the 4 bytes per element that the file itself uses. What you could do is use one of the list-like packed array types, such as uvector/vector/..., together with mmapFile or readFile to pull chunks of the file in at a time and process them. Or use a lazy packed array type, equivalent to lazy bytestrings.
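As a rough illustration of the chunk-at-a-time idea, here is a minimal sketch of an in-place map over a file of floats. It is not the vector or mmapFile approach named above: it swaps in a plain fixed-size buffer from base (hGetBuf/hPutBuf), so it needs no extra packages. The function name mapFileInPlace and its chunk-size parameter are made up for this example, and it assumes the file holds raw 4-byte floats in the machine's native byte order.

```haskell
import Control.Monad (when)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (sizeOf)
import System.IO

-- | Hypothetical helper: apply f to every Float stored in the file,
-- overwriting the file in place. At most chunkElems elements are held
-- in memory at once. Assumes raw native-endian 4-byte floats.
mapFileInPlace :: Int -> (Float -> Float) -> FilePath -> IO ()
mapFileInPlace chunkElems f path =
  withBinaryFile path ReadWriteMode $ \h ->
    allocaArray chunkElems $ \buf -> loop h buf 0
  where
    elemSize = sizeOf (undefined :: Float)

    loop :: Handle -> Ptr Float -> Integer -> IO ()
    loop h buf offset = do
      -- Read the next chunk starting at the current offset.
      hSeek h AbsoluteSeek offset
      bytesRead <- hGetBuf h buf (chunkElems * elemSize)
      -- Any trailing bytes that do not form a whole float are left untouched.
      let n = bytesRead `div` elemSize
      when (n > 0) $ do
        xs <- peekArray n buf
        pokeArray buf (map f xs)
        -- Seek back and overwrite the chunk that was just read.
        hSeek h AbsoluteSeek offset
        hPutBuf h buf (n * elemSize)
        loop h buf (offset + fromIntegral (n * elemSize))
```

For example, mapFileInPlace 65536 (* 2) "floats.bin" (with floats.bin standing in for your own file) would double every value while keeping roughly 256 KB of data in the buffer at a time. A fold can follow the same loop without the write-back step. A filter changes the number of elements, so it cannot simply overwrite each chunk at the same offset and would need a separate write position or an output file.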