How to use xpath from lxml on null namespaced nodes?

Question

What is the best way to handle nodes that have no namespace prefix in an XML document when using lxml? Should I first modify every node under the None prefix to carry a "gmd" prefix, and then change the tree's attributes so that http://www.isotc211.org/2005/gmd is declared as "gmd"? If so, is there a clean and safe way to do this with lxml or another tool?

from lxml import etree

# Assumption: the catalog has been saved locally under this file name.
charts_tree = etree.parse('ENCProdCat_19115.xml').getroot()
nsmap = charts_tree.nsmap  # modifying this dict does not affect the tree
nsmap.pop(None)  # without this, xpath() raises:
# TypeError: empty namespace prefix is not supported in XPath
len(charts_tree.xpath('//*/gml:Polygon', namespaces=nsmap))
# 1180
len(charts_tree.xpath('//*/DS_DataSet', namespaces=nsmap))
# 0 ... Bummer!
len(charts_tree.xpath('//*/DS_DataSet'))
# 0 ... Also a bummer

For example, http://www.charts.noaa.gov/ENCs/ENCProdCat_19115.xml:

<DS_Series xmlns="http://www.isotc211.org/2005/gmd" xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:gsr="http://www.isotc211.org/2005/gsr" xmlns:gss="http://www.isotc211.org/2005/gss" xmlns:gts="http://www.isotc211.org/2005/gts" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/iso/19139/20070417/gmd/gmd.xsd">
<composedOf>
    <DS_DataSet>
        <has>
            <MD_Metadata>
                <parentIdentifier>
                    <gco:CharacterString>NOAA ENC Product Catalog</gco:CharacterString>
                </parentIdentifier>
...
<EX_BoundingPolygon>
    <polygon>
        <gml:Polygon gml:id="US1AK90M_P1">
            <gml:exterior>
                <gml:LinearRing>
                    <gml:pos>67.61505 -178.99979</gml:pos>
                    <gml:pos>73.99999 -178.99979</gml:pos>
...
                    <gml:pos>64.99997 -178.99979</gml:pos>
                    <gml:pos>67.61505 -178.99979</gml:pos>
                </gml:LinearRing>

Recommended answer

I believe your DS_DataSet does carry a namespace: because it sits inside DS_Series, which declares the default namespace "http://www.isotc211.org/2005/gmd", it inherits that namespace even though it has no prefix.
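
One way to check this is to look at the tag lxml reports for the element. The snippet below is a minimal sketch on a tiny stand-in document (not the real NOAA catalog); only the default namespace declaration is taken from the question.

```python
from lxml import etree

# Tiny stand-in for the catalog; only the xmlns declaration comes from the question.
xml = b"""<DS_Series xmlns="http://www.isotc211.org/2005/gmd">
  <composedOf><DS_DataSet/></composedOf>
</DS_Series>"""
root = etree.fromstring(xml)
dataset = root[0][0]  # DS_Series -> composedOf -> DS_DataSet

# lxml stores the namespace in the tag itself, in {uri}localname form.
print(dataset.tag)     # {http://www.isotc211.org/2005/gmd}DS_DataSet
print(dataset.prefix)  # None: the element is namespaced but has no prefix
print(etree.QName(dataset).namespace)  # http://www.isotc211.org/2005/gmd
```

An unprefixed name in an XPath expression only matches elements in no namespace, which is consistent with '//*/DS_DataSet' returning 0 in the question.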

Try mapping that namespace into your namespace dictionary. You can first print the dictionary to see whether the URI is already in there under a usable prefix; if it is not, add it under a new key and refer to the namespace by that key.
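
The check by printing could look roughly like this. It is a sketch on a small stand-in document; the name GMD_URI is introduced here for illustration only.

```python
from lxml import etree

GMD_URI = "http://www.isotc211.org/2005/gmd"  # illustrative constant

# Stand-in document with the same two declarations as the catalog excerpt.
xml = b"""<DS_Series xmlns="http://www.isotc211.org/2005/gmd"
                   xmlns:gml="http://www.opengis.net/gml/3.2">
  <composedOf><DS_DataSet/></composedOf>
</DS_Series>"""
root = etree.fromstring(xml)

# Print every prefix -> URI pair known at the root element.
for prefix, uri in root.nsmap.items():
    print(repr(prefix), uri)

# Under which keys is the gmd namespace registered?
prefixes = [p for p, u in root.nsmap.items() if u == GMD_URI]
print(prefixes)  # [None]: only as the default namespace, so XPath needs a real prefix
```

Here the URI is present only under the None key, so it has to be added under a named key before XPath can use it: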

# Register the default namespace URI under an arbitrary prefix of your choice.
nsmap['some_ns'] = "http://www.isotc211.org/2005/gmd"
len(charts_tree.xpath('//*/some_ns:DS_DataSet', namespaces=nsmap))

Which would become:

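nsmap = charts_tree.nsmap  # fresh mapping that still contains the None key
nsmap['gmd'] = nsmap.pop(None)  # re-key the default namespace under 'gmd'
len(charts_tree.xpath('//*/gmd:DS_DataSet', namespaces=nsmap))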
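
If this re-keying is needed in several places, it can be wrapped in a small helper. The function below is a hypothetical sketch, not part of lxml: the name xpath_nsmap and its default_prefix parameter are made up for illustration, and the document is a small stand-in for the catalog.

```python
from lxml import etree


def xpath_nsmap(element, default_prefix="gmd"):
    """Hypothetical helper: copy element.nsmap into a form xpath() accepts."""
    nsmap = dict(element.nsmap)
    default_uri = nsmap.pop(None, None)  # XPath cannot use the None prefix
    if default_uri is not None:
        # Refuse to silently overwrite a prefix bound to a different URI.
        if nsmap.get(default_prefix, default_uri) != default_uri:
            raise ValueError("prefix %r is already in use" % default_prefix)
        nsmap[default_prefix] = default_uri
    return nsmap


# Stand-in document; the real catalog is far larger.
xml = b"""<DS_Series xmlns="http://www.isotc211.org/2005/gmd"
                   xmlns:gml="http://www.opengis.net/gml/3.2">
  <composedOf><DS_DataSet><gml:Polygon gml:id="P1"/></DS_DataSet></composedOf>
</DS_Series>"""
root = etree.fromstring(xml)
ns = xpath_nsmap(root)

print(sorted(ns))                                          # ['gmd', 'gml']
print(len(root.xpath("//gmd:DS_DataSet", namespaces=ns)))  # 1
print(len(root.xpath("//gml:Polygon", namespaces=ns)))     # 1
```

The prefix used in the XPath expression does not have to match anything in the document; only the URI it maps to matters.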
