Python script to remove all comments from XML file
Question
I am trying to build a Python script that takes an XML document and removes all of the comment blocks from it.
I tried something like this:
tree = ElementTree()
tree.parse(file)
commentElements = tree.findall('//comment()')
for element in commentElements:
    element.parentNode.remove(element)
Doing this yields a strange error from Python: KeyError: '()'
I know the file could easily be edited with other tools (such as sed), but I have to do it in a Python script.
Recommended answer
comment() is an XPath node test that ElementTree does not support.
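If a third-party library is not an option, the standard library may already be enough. As far as I can tell, the default parser in xml.etree.ElementTree skips comments while parsing, so a plain parse-and-serialize round trip drops them. The sketch below assumes that default behavior and that the XML declaration, DOCTYPE, and exact original formatting do not need to be preserved.

```python
import xml.etree.ElementTree as ET

XML = """<root>
<!-- COMMENT 1 -->
<x>TEXT 1</x>
<y>TEXT 2 <!-- COMMENT 2 --></y>
</root>"""

# Assumption: the default parser does not put comment nodes into the tree,
# so nothing has to be removed explicitly.
root = ET.fromstring(XML)

# Serialize back to a str; the comments are no longer present.
result = ET.tostring(root, encoding="unicode")
print(result)
```

For a file on disk, ET.parse(path) followed by tree.write(other_path) should follow the same pattern.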
You can use comment() with lxml. This library is quite similar to ElementTree and has full support for XPath 1.0.
Here is how you can remove comments with lxml:
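from lxml import etree

XML = """<root>
<!-- COMMENT 1 -->
<x>TEXT 1</x>
<y>TEXT 2 <!-- COMMENT 2 --></y>
</root>"""

tree = etree.fromstring(XML)

# Select every comment node in the document
comments = tree.xpath('//comment()')

for c in comments:
    # Detach each comment from its parent element
    p = c.getparent()
    p.remove(c)

# encoding="unicode" returns a str rather than bytes on Python 3
print(etree.tostring(tree, encoding="unicode"))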
Output:
<root>
<x>TEXT 1</x>
<y>TEXT 2 </y>
</root>
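One caveat worth checking against your own data: as far as I know, lxml's remove() also discards the removed node's tail, that is, any text between the comment and the next sibling or the parent's closing tag. In the example above only whitespace follows a comment, so nothing important is lost. The sketch below is one way to keep that text. remove_comments_keep_tail is a hypothetical helper name, not part of lxml, and it only touches comments inside the element it is given.

```python
from lxml import etree


def remove_comments_keep_tail(root):
    """Remove comments inside root, keeping the text that follows each one.

    Hypothetical helper for illustration; not part of lxml.
    """
    for comment in root.xpath('.//comment()'):
        parent = comment.getparent()
        if comment.tail:
            previous = comment.getprevious()
            if previous is not None:
                # Hand the tail to the preceding sibling.
                previous.tail = (previous.tail or '') + comment.tail
            else:
                # No preceding sibling: the tail joins the parent's text.
                parent.text = (parent.text or '') + comment.tail
        parent.remove(comment)


root = etree.fromstring('<a>one<!-- c -->two<b/>three<!-- d -->four</a>')
remove_comments_keep_tail(root)
result = etree.tostring(root, encoding='unicode')
print(result)
```

With the plain loop from the answer, the text "two" and "four" in this input would be expected to disappear along with the comments.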