Memory leak when parsing XML using xml.dom.minidom


Question

I'm using xml.dom.minidom to parse XML files, roughly like this:

import xml.dom.minidom as dom

# Close the file handle as soon as parsing is done
with open('file.xml') as file:
    doc = dom.parse(file)
# SNIP
doc.unlink()

Even after unlinking the document, memory usage stays at about 120 MiB. In real use, the program parses multiple XML files and memory usage climbs to about 300 MiB, which is unacceptable.

I'm sure the memory leak is caused by minidom rather than by my code, because even doing just

doc = dom.parse(file)
doc.unlink()

produces the same result.

Am I doing something wrong, or is this a bug in minidom?

P.S.: I'd prefer to stick with minidom, because my code does a lot of XML parsing and I'd rather not rewrite all of it, but I will if there is no other choice.

Recommended answer

I am observing the same issue with minidom, and we are not alone. See, for example, here.

The suggestion there is to use another XML implementation with Python bindings, such as:

  • xml.etree.ElementTree: alternative implementation in the Python standard library
  • libxml2: XML C parser with python bindings
  • lxml: a more pythonic binding to libxml2
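
As a rough illustration of the first option, here is a minimal sketch of what moving from minidom to xml.etree.ElementTree might look like. The function names (`load_root`, `count_items`), the tag name `item`, and the sample document are all hypothetical and chosen only for this example. `ET.parse` is the closest counterpart to `dom.parse`, while `ET.iterparse` combined with `Element.clear()` lets you discard elements as soon as you have processed them. Whether either approach actually lowers memory usage for your files is something you would need to measure yourself.

```python
import io
import xml.etree.ElementTree as ET


def load_root(source):
    """Parse a whole document at once, similar in spirit to dom.parse()."""
    tree = ET.parse(source)  # accepts a filename or a file object
    return tree.getroot()


def count_items(source, tag="item"):
    """Stream through the document and count elements named `tag`.

    `tag` is a hypothetical element name used for illustration only.
    """
    count = 0
    # iterparse yields (event, element) pairs; by default only "end"
    # events are reported, i.e. each element arrives fully parsed.
    for _event, elem in ET.iterparse(source):
        if elem.tag == tag:
            count += 1
            # Drop the element's children, text and attributes once handled.
            elem.clear()
    return count


# Hypothetical sample data standing in for 'file.xml'
SAMPLE = b"<root><item>a</item><item>b</item><other/></root>"

root = load_root(io.BytesIO(SAMPLE))
print(root.tag)
print(count_items(io.BytesIO(SAMPLE)))
```

Note that ElementTree has no `unlink()` method to call. With `iterparse`, the root element still keeps references to the (now emptied) child elements, so this is a sketch of the general pattern rather than a tuned low-memory parser.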
