Python ElementTree: find element by its child's text using XPath
Question
I'm trying to locate an element that has a certain text value in one of its children. For example:
<peers>
<peer>
<offset>1</offset>
<tag>TRUE</tag>
</peer>
<peer>
<offset>2</offset>
<tag>FALSE</tag>
</peer>
</peers>
From this XML document, I would like to directly locate the tag element inside the peer element whose offset value is 1.
For that purpose I have an XPath expression as follows:
./peers/peer[offset='1']/tag
However, using this expression with ElementTree's Element.find() method fails and returns None rather than the tag element I am interested in:
from xml.etree.ElementTree import fromstring
doc = fromstring("<peers><peer><offset>1</offset><tag>TRUE</tag></peer><peer><offset>2</offset><tag>FALSE</tag></peer></peers>")
tag = doc.find("./peers/peer[offset='1']/tag")
print(tag)
# => None
I am inclined to believe that either my XPath expression above is wrong, or the failure comes from ElementTree supporting only a subset of XPath, as its documentation says. Any help is appreciated. Thank you.
Recommended answer
Using lxml.etree directly (the same should apply to ElementTree), you can achieve the result like this:
import lxml.etree
doc = lxml.etree.fromstring(...)  # XML input elided in the original answer
tag_elements = doc.xpath("/peers/peer/offset[text()='1']/../tag")
tag_elements will be the list of <tag> elements belonging to <peer> elements that contain an <offset> element whose text is 1.
Given this input (I've added a <peer> element to emphasize that tag_elements is a list):
<peers>
<peer>
<offset>1</offset>
<tag>TRUE</tag>
</peer>
<peer>
<offset>1</offset>
<tag>OTHER</tag>
</peer>
<peer>
<offset>2</offset>
<tag>FALSE</tag>
</peer>
</peers>
tag_elements will contain two elements:
for tag in tag_elements:
    print(tag.text)
# -> TRUE
# -> OTHER
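The standard-library ElementTree has no xpath() method, so a rough equivalent there is findall() with a path relative to the root element. The following is a minimal sketch, assuming the three-peer input shown above; the names xml_text and tag_texts are illustrative and do not come from the original answer.

```python
from xml.etree.ElementTree import fromstring

# Same three-peer document as above, inlined so the snippet is self-contained.
xml_text = (
    "<peers>"
    "<peer><offset>1</offset><tag>TRUE</tag></peer>"
    "<peer><offset>1</offset><tag>OTHER</tag></peer>"
    "<peer><offset>2</offset><tag>FALSE</tag></peer>"
    "</peers>"
)
doc = fromstring(xml_text)  # doc is the root <peers> element

# findall() returns every match in document order; the predicate
# [offset='1'] keeps only <peer> elements that have an <offset> child
# whose text is "1".
tag_texts = [tag.text for tag in doc.findall("./peer[offset='1']/tag")]
print(tag_texts)
```

Using findall() rather than find() keeps both matching <tag> elements, which mirrors the list returned by lxml's xpath().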
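Update:
doc.xpath("/peers/peer[offset=1]/tag") also works fine.
But doc.xpath("./peers/peer[offset=1]/tag") does not, because doc is the root <peers> element itself, so the relative path ./peers looks for a <peers> child of the root, which does not exist.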
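The same behaviour can be reproduced with the Element.find() call from the question. The sketch below assumes only the standard library and the two-peer document from the question; the names missing and found are illustrative. It suggests that the predicate in the question was fine and that the leading peers step was the problem.

```python
from xml.etree.ElementTree import fromstring

doc = fromstring(
    "<peers>"
    "<peer><offset>1</offset><tag>TRUE</tag></peer>"
    "<peer><offset>2</offset><tag>FALSE</tag></peer>"
    "</peers>"
)

# fromstring() returns the root element, so doc.tag is "peers".
# "./peers/..." therefore searches for a <peers> child of <peers>.
missing = doc.find("./peers/peer[offset='1']/tag")

# Dropping the redundant first step makes the path relative to the root.
found = doc.find("./peer[offset='1']/tag")

print(missing)     # no element matches the first path
print(found.text)  # text of the matching <tag> element
```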