Parsing XML with Python lxml tree.xpath
Question
I am trying to parse a huge file; a sample is below. I want to extract the <Name> element, but my XPath query does not match it. The query works only when the following line, the nested LevelLayout element with its namespace declarations, is removed from the document:
<LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
from lxml import etree

# Bytes literal: on Python 3, lxml rejects str input that carries an encoding declaration.
xml2 = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
  <LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
      <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
        <LevelLayoutSectionBase>
          <LevelLayoutItemBase>
            <Name>Tracking ID</Name>
          </LevelLayoutItemBase>
        </LevelLayoutSectionBase>
      </LevelLayout>
    </LevelLayout>
  </LevelLayouts>
</PackageLevelLayout>'''

tree = etree.XML(xml2)
# The unprefixed steps below do not match the namespaced nested elements, so this prints [].
nodes = tree.xpath('/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]/LevelLayout/LevelLayoutSectionBase/LevelLayoutItemBase/Name')
print(nodes)
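Recommended answer
Your nested LevelLayout element declares a default XML namespace, so it and all of its descendants (including Name) belong to that namespace. An unprefixed name in an XPath expression only matches elements in no namespace, which is why the original query finds nothing. I'd use: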
tree.xpath('.//LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]//*[local-name()="Name"]')
This matches the Name element with a shorter XPath expression, ignoring the namespace altogether.
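As a minimal sketch of the local-name() approach, the snippet below runs both an unprefixed query and the local-name() query against a trimmed copy of the sample document. The variable names guid, plain and by_local_name are illustrative and not part of the original post, and the xmlns:i declaration is dropped from the sample only to keep the lines short.

```python
from lxml import etree

# Trimmed copy of the sample document (bytes, because lxml rejects
# str input that carries an encoding declaration on Python 3).
xml2 = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
  <LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
      <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain">
        <LevelLayoutSectionBase>
          <LevelLayoutItemBase>
            <Name>Tracking ID</Name>
          </LevelLayoutItemBase>
        </LevelLayoutSectionBase>
      </LevelLayout>
    </LevelLayout>
  </LevelLayouts>
</PackageLevelLayout>'''

tree = etree.XML(xml2)
guid = "4a54f032-325e-4988-8621-2cb7b49d8432"  # illustrative helper variable

# Unprefixed final step: only matches a Name element in no namespace.
plain = tree.xpath('.//LevelLayout[@levelGuid="%s"]//Name' % guid)

# local-name() compares the tag with its namespace part stripped.
by_local_name = tree.xpath(
    './/LevelLayout[@levelGuid="%s"]//*[local-name()="Name"]' % guid
)

print(plain)                            # []
print([n.text for n in by_local_name])  # ['Tracking ID']
print(by_local_name[0].tag)             # tag in {namespace}Name form
```

The last line prints the tag in lxml's {namespace-uri}localname form, which shows that Name really does live in the ArcherTech namespace. One trade-off of local-name() is that it would also match a Name element from any other namespace under the same parent.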
The alternative is to define a prefix-to-namespace mapping and use those prefixes on the tags in the expression:
nsmap = {'acd': 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'}
tree.xpath('/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]/acd:LevelLayout/acd:LevelLayoutSectionBase/acd:LevelLayoutItemBase/acd:Name',
namespaces=nsmap)
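For completeness, here is a self-contained sketch of the prefix-mapping approach against a trimmed copy of the sample document. The prefix acd is arbitrary: it only has to map to the same namespace URI that the document declares. The variable names nodes and name_text, and the shorter findtext query at the end, are illustrative additions rather than part of the original answer.

```python
from lxml import etree

# Trimmed copy of the sample document (bytes, because lxml rejects
# str input that carries an encoding declaration on Python 3).
xml2 = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
  <LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
      <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain">
        <LevelLayoutSectionBase>
          <LevelLayoutItemBase>
            <Name>Tracking ID</Name>
          </LevelLayoutItemBase>
        </LevelLayoutSectionBase>
      </LevelLayout>
    </LevelLayout>
  </LevelLayouts>
</PackageLevelLayout>'''

tree = etree.XML(xml2)

# The prefix name is our own choice; the URI must equal the document's xmlns value.
nsmap = {'acd': 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'}

# The outer elements are in no namespace, so they stay unprefixed;
# everything from the nested LevelLayout downwards needs the prefix.
nodes = tree.xpath(
    '/PackageLevelLayout/LevelLayouts'
    '/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]'
    '/acd:LevelLayout/acd:LevelLayoutSectionBase'
    '/acd:LevelLayoutItemBase/acd:Name',
    namespaces=nsmap,
)
print([n.text for n in nodes])  # ['Tracking ID']

# The same mapping also works with the simpler find/findtext path syntax.
name_text = tree.findtext('.//acd:Name', namespaces=nsmap)
print(name_text)  # Tracking ID
```

Compared with local-name(), this form is more verbose but matches only elements in the intended namespace.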