ElementTree - findall to recursively select all child elements

Problem description

Python code:

import xml.etree.ElementTree as ET
# ET.parse() returns an ElementTree; its findall() searches from the root element
root = ET.parse("h.xml")
print(root.findall('saybye'))

h.xml code:

<hello>
  <saybye>
   <saybye>
   </saybye>
  </saybye>
  <saybye>
  </saybye>
</hello>

The code outputs:

[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]

The saybye element that is a child of another saybye is not selected here. How can I instruct findall to walk down the tree recursively and collect all three saybye elements?

Solution

Quoting the documentation for findall:

Element.findall() finds only elements with a tag which are direct children of the current element.

Since findall finds only the direct children, we need to recurse into each match to find the nested ones, like this:

>>> import xml.etree.ElementTree as ET
>>> 
>>> def find_rec(node, element, result):
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
... 
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]
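One thing worth noting about this recursion: it only descends into elements that themselves matched, so a saybye sitting under some other tag would be missed. The sketch below illustrates that with a hypothetical document (the wrapper element and the inline XML are assumptions made for the example, not part of the original h.xml):

import xml.etree.ElementTree as ET

def find_rec(node, element, result):
    # Same accumulator-style recursion as above.
    for item in node.findall(element):
        result.append(item)
        find_rec(item, element, result)
    return result

# Hypothetical document: one saybye sits under a non-matching <wrapper> element.
root = ET.fromstring(
    "<hello><saybye><saybye/></saybye><wrapper><saybye/></wrapper></hello>"
)

found = find_rec(root, 'saybye', [])
print(len(found))  # 2: the saybye inside <wrapper> is never reached

For the h.xml in the question this does not matter, because every nested saybye has a saybye parent.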

Even better, make it a generator function, like this:

>>> def find_rec(node, element):
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
... 
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
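
As an alternative to hand-written recursion, ElementTree's limited XPath support and Element.iter() can also collect descendants at any depth. The following is a minimal sketch rather than part of the original answer; it inlines the contents of h.xml as a string (an assumption made so that it runs without the file):

import xml.etree.ElementTree as ET

# Same structure as h.xml, inlined so the sketch runs without a file.
XML = """
<hello>
  <saybye>
   <saybye>
   </saybye>
  </saybye>
  <saybye>
  </saybye>
</hello>
"""

root = ET.fromstring(XML)

# A bare tag name matches direct children only.
direct = root.findall('saybye')

# './/saybye' matches saybye elements at every depth below root.
nested = root.findall('.//saybye')

# iter() walks the whole subtree in document order, filtered by tag.
walked = list(root.iter('saybye'))

print(len(direct), len(nested), len(walked))  # 2 3 3

Both forms also match saybye elements nested under other tags. Note that iter() includes the starting element itself when its tag matches, whereas './/saybye' only returns descendants.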
