Python minidom: How to access an element
Question
I'm parsing an XML document in Python. The XML has a structure like this:
<layer1>
  <layer2>
    <element>
      <info1></info1>
    </element>
    <element>
      <info1></info1>
    </element>
    <element>
      <info1></info1>
    </element>
  </layer2>
</layer1>
Without layer2, I have no problem accessing the data in info1. There I can address info1 with:
root.firstChild.childNodes[0].childNodes[0].data
But with layer2, I'm really in trouble.
So my thought was that I could do something similar, like this:
root.firstChild.firstChild.childNodes[0].childNodes[0].data
So this is how I solved my problem:
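# cElementTree was removed in Python 3.9; ElementTree is the drop-in replacement.
from xml.etree import ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()
# Remove every <element> under <layer2> whose <info1> text is not "abc".
for layer in root.findall('./layer2'):
    for node in layer.findall('element'):
        x = node.find('info1').text
        if x != "abc":
            layer.remove(node)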
Recommended answer
Don't use the minidom API if you can help it. Use the ElementTree API instead; the xml.dom.minidom documentation explicitly states that:
"Users who are not already proficient with the DOM should consider using the xml.etree.ElementTree module for their XML processing instead."
Here is a short sample that uses the ElementTree API to access your elements:
from xml.etree import ElementTree as ET

tree = ET.parse('inputfile.xml')
# Print the text of every <info1> that sits directly inside an <element>.
for info in tree.findall('.//element/info1'):
    print(info.text)
This uses an XPath expression to list all info1 elements that are contained inside an element element, regardless of their position in the overall XML document.
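As a self-contained illustration of that lookup, the sketch below parses an inline copy of the question's structure instead of a file. The info1 values ('abc', 'def', 'ghi') and the variable names are invented for the example.

```python
from xml.etree import ElementTree as ET

# Inline stand-in for the question's file; the info1 values are invented.
SAMPLE = """
<layer1>
  <layer2>
    <element><info1>abc</info1></element>
    <element><info1>def</info1></element>
    <element><info1>ghi</info1></element>
  </layer2>
</layer1>
"""

root = ET.fromstring(SAMPLE)

# './/element/info1' matches info1 children of any <element>, at any depth.
texts = [info.text for info in root.findall('.//element/info1')]
print(texts)
```

Because the path starts with './/', the same expression keeps working whether or not a layer2 wrapper is present.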
If all you need is the first info1 element, use .find():
print(tree.find('.//info1').text)
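One caveat worth sketching: .find() returns None when nothing matches, so chaining .text directly raises AttributeError on a miss. The example below uses a made-up document and a hypothetical tag name, info2, to show the difference, with .findtext() and a default value as one possible way to avoid the error.

```python
from xml.etree import ElementTree as ET

# Made-up document for illustration only.
root = ET.fromstring(
    "<layer1><layer2><element><info1>abc</info1></element></layer2></layer1>"
)

first = root.find('.//info1')  # first match in document order
first_text = first.text if first is not None else None

# 'info2' is a hypothetical tag that does not exist in the sample.
missing = root.find('.//info2')  # None, because nothing matches
fallback = root.findtext('.//info2', default='n/a')

print(first_text, missing, fallback)
```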
With the DOM API, .firstChild could easily be a Text node instead of an Element node; you always need to loop over the .childNodes sequence to find the first Element match:
def findFirstElement(node):
    # Return the first child that is an Element, skipping Text and other node types.
    for child in node.childNodes:
        if child.nodeType == node.ELEMENT_NODE:
            return child
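To make the whitespace issue concrete, here is a minimal sketch that assumes the file is indented as in the question, so newline-and-indent Text nodes sit between the tags. The helper first_element_child is a renamed variant of the loop above with an explicit None return; the sample value 'abc' is invented.

```python
from xml.dom import minidom

# Indented sample, so whitespace Text nodes sit between the tags.
doc = minidom.parseString(
    "<layer1>\n"
    "  <layer2>\n"
    "    <element><info1>abc</info1></element>\n"
    "  </layer2>\n"
    "</layer1>"
)
root = doc.documentElement  # the <layer1> element


def first_element_child(node):
    """Return the first child that is an Element, or None if there is none."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            return child
    return None


# The first child of <layer1> is the whitespace Text node, not <layer2>.
first_is_text = root.firstChild.nodeType == root.TEXT_NODE

layer2 = first_element_child(root)
element = first_element_child(layer2)
info1 = first_element_child(element)
value = info1.firstChild.data  # the Text node inside <info1>
print(first_is_text, layer2.tagName, value)
```

This is likely why chaining .firstChild.firstChild fails in the question: each hop can land on whitespace rather than on the next element.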
But for your case, perhaps using .getElementsByTagName() suffices:
# getElementsByTagName() returns a NodeList: index it, then read the Text child.
root.getElementsByTagName('info1')[0].firstChild.data
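Since .getElementsByTagName() returns every matching descendant, the text has to be read node by node. The following sketch assumes a made-up document with one filled and one empty info1, and shows one way to guard against the empty case, where .firstChild is None.

```python
from xml.dom import minidom

# Made-up document: one <info1> with text, one empty.
doc = minidom.parseString(
    "<layer1><layer2>"
    "<element><info1>abc</info1></element>"
    "<element><info1></info1></element>"
    "</layer2></layer1>"
)

values = []
for info in doc.getElementsByTagName('info1'):
    # An empty <info1></info1> has no children, so firstChild is None.
    text_node = info.firstChild
    values.append(text_node.data if text_node is not None else '')
print(values)
```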