Find element that has unknown namespace in lxml


Question

I have an XML file with many levels. Each level may have its own namespace attached. I want to find a specific element whose name I know, but whose namespace I do not. For example:

my_file.xml

<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="aaa:bbb:ccc:ddd:eee">
  <country name="Liechtenstein" xmlns="aaa:bbb:ccc:liechtenstein:eee">
    <rank updated="yes">2</rank>
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E"/>
    <neighbor name="Switzerland" direction="W"/>
  </country>
  <country name="Singapore" xmlns="aaa:bbb:ccc:singapore:eee">
    <continent>Asia</continent>
    <holidays>
      <christmas>Yes</christmas>
    </holidays>
    <rank updated="yes">5</rank>
    <year>2011</year>
    <gdppc>59900</gdppc>
    <neighbor name="Malaysia" direction="N"/>
  </country>
  <country name="Panama" xmlns="aaa:bbb:ccc:panama:eee">
    <rank updated="yes">69</rank>
    <year>2011</year>
    <gdppc>13600</gdppc>
    <neighbor name="Costa Rica" direction="W"/>
    <neighbor name="Colombia" direction="E"/>
  </country>
</data>

import lxml.etree as etree

tree = etree.parse('my_file.xml')
root = tree.getroot()

cntry_node = root.find('.//country')

The find call above returns None, so cntry_node ends up empty. In my real data, the nesting is deeper than in this example. The lxml documentation discusses namespaces. When I do this:

root.nsmap

I see:

{None: 'aaa:bbb:ccc:ddd:eee'}

Could someone explain how to access the full nsmap and/or how to use it to find a specific element? Thanks very much.

Recommended answer

You could declare all the namespaces, but given the structure of your sample XML, I would argue you are better off disregarding namespaces altogether and matching on local-name() instead:

cntry_node = root.xpath('.//*[local-name()="country"]')
cntry_node

returns

[<Element {aaa:bbb:ccc:liechtenstein:eee}country at 0x1cddf1d4680>,
 <Element {aaa:bbb:ccc:singapore:eee}country at 0x1cddf1d47c0>,
 <Element {aaa:bbb:ccc:panama:eee}country at 0x1cddf1d45c0>]
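
As a complement to the local-name() approach, the sketch below shows two related options. It is not part of the original answer and assumes a reasonably recent lxml. First, find() and findall() accept {*} as a namespace wildcard, so the original find call can be kept almost unchanged. Second, nsmap only reports the namespaces in scope for the element it is read from, which is why root.nsmap shows a single entry; walking the tree collects the rest. The XML is a trimmed copy of my_file.xml inlined so the example is self-contained, and the names all_namespaces and the sg prefix are illustrative choices.

```python
import lxml.etree as etree

# Trimmed copy of my_file.xml, inlined so the sketch runs on its own.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<data xmlns="aaa:bbb:ccc:ddd:eee">
  <country name="Liechtenstein" xmlns="aaa:bbb:ccc:liechtenstein:eee">
    <rank updated="yes">2</rank>
  </country>
  <country name="Singapore" xmlns="aaa:bbb:ccc:singapore:eee">
    <rank updated="yes">5</rank>
  </country>
  <country name="Panama" xmlns="aaa:bbb:ccc:panama:eee">
    <rank updated="yes">69</rank>
  </country>
</data>"""

root = etree.fromstring(XML)

# "{*}" matches the tag name in any namespace.
first_country = root.find('.//{*}country')
all_countries = root.findall('.//{*}country')

# nsmap is per element, so gather namespace URIs across the whole tree.
# iter('*') visits elements only (no comments or processing instructions).
all_namespaces = set()
for elem in root.iter('*'):
    all_namespaces.update(elem.nsmap.values())

# Once a URI is known, it can be bound to a prefix of your choosing
# ("sg" is an arbitrary, illustrative prefix) and used in a normal query.
ns = {'sg': 'aaa:bbb:ccc:singapore:eee'}
singapore = root.find('.//sg:country', namespaces=ns)

print(first_country.get('name'))
print([c.get('name') for c in all_countries])
print(sorted(all_namespaces))
print(singapore.get('name'))
```

The wildcard form is convenient when only the first match is needed, since find() returns a single element or None, whereas xpath() always returns a list.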
