python lxml loop through all tags


Problem description

I have a dict that maps each XML tag to a dict key. I want to loop through every tag and text field in the XML and compare each one with the value of its associated key, which is itself a key in another dict.

<2gMessage>
    <Request>
        <pid>daemon</pid>
        <emf>123456</emf>
        <SENum>2041788209</SENum>
        <MM>
            <MID>jbr1</MID>
            <URL>http://jimsjumbojoint.com</URL>
        </MM>
        <AppID>reddit</AppID>
        <CCS>
            <Mode>
                <CardPresent>true</CardPresent>
                <Recurring>false</Recurring>
            </Mode>
            <Date>
                <ASCII>B4788250000028291^RRR^15121015432112345601</ASCII>
            </Date>
            <Amount>100.00</Amount>
        </CCS>
    </Request>
</2gMessage>

My code so far:

from lxml import etree

# strRequest holds the XML document shown above as a string
parser = etree.XMLParser(ns_clean=True, remove_blank_text=True)
tree = etree.fromstring(strRequest, parser)
for tag in tree.xpath('//Request'):
    # Iterating over an element visits only its direct children
    for subfield in tag:
        print(subfield.tag, subfield.text)

However, this prints only the tags that are direct children of Request. I want to reach the children of those children within the same loop. I don't want to hardcode any values, because the tags and structure may change.

Recommended answer

You could try the iter() method. It traverses the whole tree, visiting every element and all of its descendants in document order. The len() check prints only the elements that have no children:

A complete script like this:

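from lxml import etree

tree = etree.parse('xmlfile')
# iter() walks every element in the tree, at any depth
for tag in tree.iter():
    # len(tag) is the number of children; 0 means a leaf element
    if not len(tag):
        print(tag.tag, tag.text)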

It yields:

pid daemon
emf 123456
SENum 2041788209
MID jbr1
URL http://jimsjumbojoint.com
AppID reddit
CardPresent true
Recurring false
ASCII B4788250000028291^RRR^15121015432112345601
Amount 100.00
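
To connect this back to the dict comparison in the question, here is a minimal sketch that reuses the same iter() traversal. The names TAG_TO_KEY, EXPECTED, and compare_leaves are hypothetical, as are the key names and expected values, since the question does not show its dicts. The sample XML is a shortened copy of the question's document, with the root renamed to Message (an assumption) so that the snippet parses on its own.

```python
from lxml import etree

# Shortened sample modeled on the question's XML; the root name "Message"
# is an assumption made so that this snippet is self-contained.
XML = b"""
<Message>
    <Request>
        <pid>daemon</pid>
        <MM>
            <MID>jbr1</MID>
            <URL>http://jimsjumbojoint.com</URL>
        </MM>
        <!-- comments are not visited by iter(tag=etree.Element) -->
        <CCS>
            <Mode>
                <CardPresent>true</CardPresent>
            </Mode>
            <Amount>100.00</Amount>
        </CCS>
    </Request>
</Message>
"""

# Hypothetical lookup tables: XML tag -> key, and key -> expected value.
TAG_TO_KEY = {"pid": "process_id", "MID": "merchant_id", "Amount": "amount"}
EXPECTED = {"process_id": "daemon", "merchant_id": "jbr1", "amount": "99.00"}


def compare_leaves(root, tag_to_key, expected):
    """Return {key: (xml_text, expected_value, matches)} for mapped leaf tags."""
    results = {}
    # tag=etree.Element restricts the walk to real elements, so comments
    # and processing instructions are skipped.
    for elem in root.iter(tag=etree.Element):
        if len(elem):  # has children, so it is not a leaf
            continue
        key = tag_to_key.get(elem.tag)
        if key is None:  # tag is not present in the mapping
            continue
        wanted = expected.get(key)
        results[key] = (elem.text, wanted, elem.text == wanted)
    return results


root = etree.fromstring(XML)
report = compare_leaves(root, TAG_TO_KEY, EXPECTED)
for key, (actual, wanted, ok) in report.items():
    print(key, actual, wanted, ok)
```

Because the lookup goes through dict.get(), nothing about the document structure is hardcoded: tags missing from the mapping are simply skipped, and nesting depth does not matter.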
