lxml xpath in python, how to handle missing tags?


Question

Suppose I want to parse the following XML with an lxml XPath expression:

<pack xmlns="http://ns.qubic.tv/2010/item">
    <packitem>
        <duration>520</duration>
        <max_count>14</max_count>
    </packitem>
    <packitem>
        <duration>12</duration>
    </packitem>
</pack>

http://python-thoughts.blogspot.fr/2012/01/default-value-for-text-function-using.html

How can I parse the different elements so that, once zipped (in the sense of the Python zip or izip functions), they give me

[(520,14),(12,None)]

?

The missing max_count tag in the second packitem keeps me from getting what I want.

Recommended answer

from lxml import etree

def lxml_empty_str(context, nodes):
    # Replace a missing text value (None) with an empty string
    # on each selected element.
    for node in nodes:
        node.text = node.text or ""
    return nodes

# Register the function in a custom XPath function namespace.
ns = etree.FunctionNamespace('http://ns.qubic.tv/lxmlfunctions')
ns['lxml_empty_str'] = lxml_empty_str

namespaces = {'i': "http://ns.qubic.tv/2010/item",
              'f': "http://ns.qubic.tv/lxmlfunctions"}

# root is the parsed document (for example, the result of etree.fromstring).
packitems_duration = root.xpath(
    'f:lxml_empty_str(//i:pack/i:packitem/i:duration)/text()',
    namespaces=namespaces)
packitems_max_count = root.xpath(
    'f:lxml_empty_str(//i:pack/i:packitem/i:max_count)/text()',
    namespaces=namespaces)
packitems = list(zip(packitems_duration, packitems_max_count))

>>> packitems
[('520','14'), ('','23')]

http://python-thoughts.blogspot.fr/2012/01/default-value-for-text-function-using.html
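
One caveat about the extension function above: an XPath node set only contains elements that exist, so lxml_empty_str never sees the max_count that is absent from the second packitem, and zipping two independently selected lists can then pair values from different packitems. A possible alternative, sketched here under the assumption that the XML looks exactly like the sample in the question, is to loop over each packitem and look up its children individually. The helper names to_int and parse_packitems are hypothetical and are not part of lxml or of the original answer.

```python
from lxml import etree

# Sample document copied from the question.
XML = b"""<pack xmlns="http://ns.qubic.tv/2010/item">
    <packitem>
        <duration>520</duration>
        <max_count>14</max_count>
    </packitem>
    <packitem>
        <duration>12</duration>
    </packitem>
</pack>"""

NS = {"i": "http://ns.qubic.tv/2010/item"}


def to_int(text):
    """Hypothetical helper: convert text to int, or None if absent or empty."""
    return int(text) if text is not None and text.strip() else None


def parse_packitems(root):
    """Hypothetical helper: one (duration, max_count) tuple per packitem."""
    pairs = []
    for item in root.xpath("//i:packitem", namespaces=NS):
        # findtext returns None (its default) when the child does not exist.
        duration = item.findtext("i:duration", namespaces=NS)
        max_count = item.findtext("i:max_count", namespaces=NS)
        pairs.append((to_int(duration), to_int(max_count)))
    return pairs


root = etree.fromstring(XML)
packitems = parse_packitems(root)
print(packitems)  # [(520, 14), (12, None)]
```

Because each tuple is built from a single packitem, a missing child can only affect its own row, which is what the expected result in the question asks for.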
