通过python lxml tree.xpath解析xml [英] parsing xml by python lxml tree.xpath

查看:396
本文介绍了通过python lxml tree.xpath解析xml的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我尝试解析一个大文件.示例如下.我尝试使用<Name>,但是我不能 它仅在没有此字符串的情况下起作用

I try to parse a huge file. The sample is below. I try to take <Name>, but I can't It works only without this string

<LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">

 

xml2 = '''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
<LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
                <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
                    <LevelLayoutSectionBase>
                        <LevelLayoutItemBase>
                            <Name>Tracking ID</Name>
                        </LevelLayoutItemBase>
                    </LevelLayoutSectionBase>
                </LevelLayout>
            </LevelLayout>
    </LevelLayouts>
</PackageLevelLayout>'''

from lxml import etree
tree = etree.XML(xml2)
nodes = tree.xpath('/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]/LevelLayout/LevelLayoutSectionBase/LevelLayoutItemBase/Name')
print nodes

推荐答案

您嵌套的LevelLayout XML文档使用命名空间.我会用:

Your nested LevelLayout XML document uses a namespace. I'd use:

tree.xpath('.//LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]//*[local-name()="Name"]')

Name元素与较短的XPath表达式匹配(完全忽略名称空间).

to match the Name element with a shorter XPath expression (ignoring the namespace altogether).

另一种方法是使用前缀到命名空间的映射,并在标签上使用这些映射:

The alternative is to use a prefix-to-namespace mapping and use those on your tags:

nsmap = {'acd': 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'}

tree.xpath('/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]/acd:LevelLayout/acd:LevelLayoutSectionBase/acd:LevelLayoutItemBase/acd:Name',
    namespaces=nsmap)

这篇关于通过python lxml tree.xpath解析xml的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆