Python ElementTree - iterate through child nodes and text in order


Question

I am using Python 3 and the ElementTree API. I have some XML of the form:

<root>
  <item>Over the <ref id="river" /> and through the <ref id="woods" />.</item>
  <item>To Grandmother's <ref id="house" /> we go.</item>
</root>

I want to be able to iterate through the text and child nodes for a given item in order. So, for the first item, the list I want printed line by line would be:

Over the 
<Element 'ref' at 0x######>
 and through the 
<Element 'ref' at 0x######>
.

But I can't figure out how to do this with ElementTree. I can get the text in order via itertext(), and the child elements in order in several ways, but not the two interleaved in order. I was hoping I could use an XPath expression like ./@text|./ref, but ElementTree's subset of XPath doesn't seem to support attribute selection. If I could even just get the original raw XML contents of each item node, I could parse it out myself if necessary.

Recommended answer

Try the following:

from xml.etree import ElementTree as ET

xml = """<root>
  <item>Over the <ref id="river" /> and through the <ref id="woods" />.</item>
  <item>To Grandmother's <ref id="house" /> we go.</item>
</root>"""

root = ET.fromstring(xml)

for item in root:
    # Text between the opening <item> tag and the first child element
    if item.text:
        print(item.text)
    for ref in item:
        print(ref)
        # Text that follows this child, up to the next child or </item>
        if ref.tail:
            print(ref.tail)

ElementTree's representation of "mixed content" is based on the .text and .tail attributes. An element's .text holds the text of that element up to its first child element. Each child's .tail then holds the parent's text that follows that child, up to the next sibling or the parent's closing tag. See the API doc.
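If the interleaved sequence is needed as data rather than printed output, the same .text/.tail logic can be wrapped in a generator. This is only a sketch: the helper name iter_mixed and the "<ref id=...>" labels are made up for illustration and are not part of ElementTree. It walks direct children only and does not descend into nested elements.

```python
from xml.etree import ElementTree as ET


def iter_mixed(element):
    """Yield an element's text chunks and direct children in document order.

    Hypothetical helper, not an ElementTree API. Text chunks are str,
    children are Element objects.
    """
    if element.text:
        yield element.text
    for child in element:
        yield child
        # The tail is stored on the child but belongs to the parent's content.
        if child.tail:
            yield child.tail


xml = """<root>
  <item>Over the <ref id="river" /> and through the <ref id="woods" />.</item>
  <item>To Grandmother's <ref id="house" /> we go.</item>
</root>"""

root = ET.fromstring(xml)
first_item = root[0]

# Replace each Element with a readable label so the result is easy to inspect.
parts = [
    part if isinstance(part, str) else "<ref id=%s>" % part.get("id")
    for part in iter_mixed(first_item)
]

for part in parts:
    print(part)
```

For the first item, parts holds five entries: "Over the ", a label for the river ref, " and through the ", a label for the woods ref, and ".". Whitespace-only chunks are yielded too, so filter with str.strip() if they are not wanted.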
