Should memory usage increase when using ElementTree.iterparse() when clear()ing trees?


Question

import sys
import xml.etree.ElementTree as et

# Read XML from standard input as bytes and clear each element as it is parsed
for ev, el in et.iterparse(sys.stdin.buffer):
    el.clear()

Running the above on the ODP structure RDF dump makes memory usage grow steadily. Why is that? I understand that ElementTree still builds a parse tree, albeit with the child nodes clear()ed. If that is the cause of this memory usage pattern, is there a way around it?

Recommended answer

You are clearing each element, but references to the elements themselves remain in the root of the document. clear() empties an element's children, attributes, and text, but it does not detach the element from its parent, so the individual elements still cannot be garbage collected. See this discussion in the ElementTree documentation.
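
As a small illustration of the point above, the sketch below counts how many children the root still holds after every child has been clear()ed. The function name `children_kept` and the sample document are hypothetical and exist only for this example.

```python
import io
import xml.etree.ElementTree as et


def children_kept(xml_bytes):
    """Clear every non-root element, then report how many children the root still holds."""
    root = None
    kept = 0
    for event, elem in et.iterparse(io.BytesIO(xml_bytes), events=("start", "end")):
        if root is None:
            # The first event is the "start" of the root element.
            root = elem
        elif event == "end" and elem is not root:
            elem.clear()      # empties the element itself...
            kept = len(root)  # ...but the root still references it
    return kept


# Hypothetical sample data: a root with 1000 small <record> children.
sample = b"<root>" + b"<record>x</record>" * 1000 + b"</root>"
kept = children_kept(sample)
print(kept)
```

Every cleared element is still a child of the root, which is why memory keeps growing on a large input.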

The solution is to clear references in the root, like so:

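import xml.etree.ElementTree as et

# source is a filename or a binary file object containing the XML
# get an iterable of (event, element) pairs
context = et.iterparse(source, events=("start", "end"))

# turn it into an iterator
context = iter(context)

# get the root element (the first event is the root's "start")
event, root = next(context)

for event, elem in context:
    if event == "end" and elem.tag == "record":
        # ... process record elements ...
        # drop the root's references to children that are already processed
        root.clear()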
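
For a version that can be run without a real data file, here is a minimal self-contained sketch of the same pattern. The function name `count_records`, the `record` tag, and the in-memory sample document are assumptions made for illustration; they are not part of the original answer.

```python
import io
import xml.etree.ElementTree as et


def count_records(source, tag="record"):
    """Count elements named `tag`, clearing the root after each one is processed."""
    context = et.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the root's "start"
    count = 0
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1    # stand-in for real per-record processing
            root.clear()  # release the root's references to its children
    return count, len(root)


# Hypothetical sample data: 1000 records, each with one nested element.
sample = b"<root>" + b"<record><name>x</name></record>" * 1000 + b"</root>"
processed, left_in_root = count_records(io.BytesIO(sample))
print(processed, left_in_root)
```

Note that root.clear() also discards the root's own attributes and text, so read anything you need from the root before the loop starts.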

Another thing to remember about memory usage, which may not be affecting your situation, is that once the VM has allocated memory for heap storage from the system, it generally does not give that memory back. Most Java VMs work this way too. So you should not expect the size of the interpreter process in top or ps to decrease, even if that heap memory is unused.
