ElementTree - findall to recursively select all child elements
Problem description
Python code:
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print(root.findall('saybye'))
h.xml code:
<hello>
  <saybye>
    <saybye>
    </saybye>
  </saybye>
  <saybye>
  </saybye>
</hello>
The code outputs:
[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]
The saybye element that is a child of another saybye is not selected here. So, how do I instruct findall to walk down the tree recursively and collect all three saybye elements?
Quoting the documentation for findall:
Element.findall() finds only elements with a tag which are direct children of the current element.
Since it finds only the direct children, we need to recursively find the other children, like this:
>>> import xml.etree.ElementTree as ET
>>>
>>> def find_rec(node, element, result):
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
...
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]
Even better, make it a generator function, like this:
>>> def find_rec(node, element):
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
...
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
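As an alternative to a hand-written recursion, ElementTree's limited XPath support and Element.iter() can both select descendants at any depth. The sketch below is not part of the original answer; it inlines the contents of h.xml as a string (an assumption made so that it runs without a file on disk) and compares the direct-children search with the two descendant searches.

```python
import xml.etree.ElementTree as ET

# Same structure as h.xml, inlined here so the sketch runs without a file.
XML = """
<hello>
  <saybye>
    <saybye>
    </saybye>
  </saybye>
  <saybye>
  </saybye>
</hello>
"""

root = ET.fromstring(XML)

# Direct children only: the nested saybye is skipped.
direct = root.findall('saybye')

# './/saybye' is the XPath form for "any descendant named saybye".
descendants = root.findall('.//saybye')

# iter() walks the whole subtree in document order, filtered by tag.
# It would also yield the starting element itself if its tag matched.
iterated = list(root.iter('saybye'))

print(len(direct), len(descendants), len(iterated))
```

Note one difference from find_rec: it only descends into elements that themselves matched, whereas './/saybye' and iter('saybye') also find a saybye nested under a differently named element.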