Lazy IO - string not garbage collected?
Problem description
I'm currently trying to read the contents of an XML file into a Map Int (Map Int String), and it works quite well (using HaXml). However, I'm not satisfied with the memory consumption of my program, and the problem seems to be the garbage collection.

Here's the code I'm using to read the XML file:

    type TextFile = Map Int (Map Int String)

    buildTextFile :: String -> IO TextFile
    buildTextFile filename = do
      content <- readFile filename
      let doc = xmlParse filename content
          con = docContent (posInNewCxt filename Nothing) doc
      return $ buildTF con

My guess is that content is held in memory even after the return, although it doesn't need to be (of course it could also be doc or con). I came to this conclusion because the memory consumption rises quickly with very large XML files, although the resulting TextFile is only a singleton map of a singleton map (using a special testing file; generally it's different, of course). So in the end, I have a Map of a Map Int String, with only one string in it, but the memory consumption is up to 19 MB.

Using strict application ($!) or using Data.Text instead of String in TextFile doesn't change anything.

So my question is: Is there some way to tell the compiler that the string content (or doc or con) isn't needed anymore and that it can be garbage collected?

And more generally: How can I find out where the problem really comes from without all the guessing?

Edit: As FUZxxl suggested, I tried using deepseq and changed the second line of buildTextFile like so:

    let doc = content `deepseq` xmlParse filename content

Unfortunately that didn't really change anything (or am I using it wrong?)...

Answer

Don't Guess What Is Consuming Memory, Find Out For Sure

The first step is to determine what types are consuming the most memory. You can see lots of examples of heap profiling here on SO or read the GHC manual.

Forcing Computation

If the problem is lazy evaluation (you're building an on-heap thunk that can compute the XML document type and leaving the string in the heap too), then use rnf and seq:

    buildTextFile :: String -> IO TextFile
    buildTextFile filename = do
      content <- readFile filename
      let doc = xmlParse filename content
          con = docContent (posInNewCxt filename Nothing) doc
          res = buildTF con
      return $ rnf res `seq` res

Or just use bang patterns (!res = buildTF con). Either way, that should force the thunks and allow the GC to collect the String.
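
For anyone who wants to try the forcing idea without HaXml installed, here is a minimal, self-contained sketch. It rests on several assumptions that are not from the question: buildTF is a hypothetical stand-in (it keeps the first line of the file, keyed by the line count), the scratch file name is made up, and evaluate (force res) is swapped in for the rnf/seq line so that the forcing happens when the IO action runs rather than whenever the caller first demands the result. Whether this cures the memory growth in the real program depends on what the heap profile shows.

```haskell
module Main where

import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Data.Map (Map)
import qualified Data.Map as Map

type TextFile = Map Int (Map Int String)

-- Hypothetical stand-in for the question's HaXml-based buildTF:
-- a singleton map of a singleton map, keyed by the number of lines,
-- holding only the first line of the input.
buildTF :: String -> TextFile
buildTF content =
  Map.singleton 1 (Map.singleton (length (lines content)) (takeWhile (/= '\n') content))

-- Read the file lazily, then fully evaluate the result inside IO.
-- Once evaluate returns, the result holds no thunks that refer to content.
buildTextFile :: FilePath -> IO TextFile
buildTextFile filename = do
  content <- readFile filename
  evaluate (force (buildTF content))

main :: IO ()
main = do
  let path = "lazy-io-example.txt" -- hypothetical scratch file
  writeFile path "first line\nsecond line\n"
  tf <- buildTextFile path
  print tf
```

The design choice here is only about when the forcing happens: return $ rnf res `seq` res hands back a thunk that is forced on first use, while evaluate ties the forcing to the action itself. Both need an NFData instance for the result type, which the containers package provides for Map.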