Parsing XML with namespaces using ElementTree in Python

Views: 143 Published: 2020/10/28 20:42:13 Tags: python xml python-2.7 xml-parsing elementtree

Question

I have an XML document; a small part of it looks like this:

<?xml version="1.0" ?>
<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">
  <data>
    <image imageId="1"></image>
    <content>Content</content>
  </data>
</i:insert>

When I parse it using ElementTree and save it to a file, I see the following:

<ns0:insert xmlns:ns0="urn:com:xml:insert" xmlns:ns1="urn:com:xml:data">
  <ns1:data>
    <ns1:image imageId="1"></ns1:image>
    <ns1:content>Content</ns1:content>
  </ns1:data>
</ns0:insert>

Why does it change the prefixes and put them everywhere? With minidom I don't have this problem. Can this be configured? The documentation for ElementTree is very poor.

The problem is that I can't find any node after such parsing. For example, I can't find image with or without a namespace, whether I search for {namespace}image or just image. Why is that? Any suggestions are strongly appreciated.

What I have already tried:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
for a in root.findall('ns1:image'):
    print a.attrib

This raises an error, and the following one returns nothing:

for a in root.findall('{urn:com:xml:data}image'):
    print a.attrib

I also tried defining a namespace mapping like this and using it:

namespaces = {'ns1': 'urn:com:xml:data'}
for a in root.findall('ns1:image', namespaces):
    print a.attrib

It returns nothing. What am I doing wrong?

Recommended answer

This snippet from your question,

for a in root.findall('{urn:com:xml:data}image'):
    print a.attrib

does not output anything because it only looks for {urn:com:xml:data}image elements that are direct children of the root of the tree. In your document, image is a child of data, not of the root.

This slightly modified code,

# .// searches all descendants of root, not only its direct children
for a in root.findall('.//{urn:com:xml:data}image'):
    print(a.attrib)

will print {'imageId': '1'} because it uses .//, which selects matching subelements on all levels.

Reference: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax
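The same reasoning applies to the prefix-map attempt in the question: the namespaces dictionary was fine, but the path still only looked at direct children of the root. Below is a minimal sketch of that variant, assuming the sample document from the question is held in a string (the XML constant and the prefix d are made up for this illustration; any prefix works as long as it maps to the right URI).

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">
  <data>
    <image imageId="1"></image>
    <content>Content</content>
  </data>
</i:insert>"""

root = ET.fromstring(XML)

# The prefix used here is local to the search; it does not have to match
# the prefixes in the file. Only the namespace URI has to match.
namespaces = {"d": "urn:com:xml:data"}

# Direct children of root only: image is nested inside data, so no match.
direct = root.findall("d:image", namespaces)

# Any depth below root: finds the image element.
nested = root.findall(".//d:image", namespaces)

# Spelling out the full path from root also works.
full_path = root.findall("d:data/d:image", namespaces)

for image in nested:
    print(image.attrib)
```

Note that the unprefixed elements (data, image, content) belong to the default namespace urn:com:xml:data, which is why a bare image path does not match either.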

It is a bit annoying that ElementTree does not retain the original namespace prefixes by default, but keep in mind that the prefixes are not what matters: elements are identified by their namespace URI. The register_namespace() function can be used to set the desired prefix when serializing the XML. It has no effect on parsing or searching.
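As a rough illustration of register_namespace(), the sketch below registers the two URIs from the question before serializing, with the empty string standing for the default namespace. The XML constant is an inlined copy of the sample document, added only for this example. Keep in mind that the registration is global to the ElementTree module, so it affects every later serialization in the same process.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">
  <data>
    <image imageId="1"></image>
    <content>Content</content>
  </data>
</i:insert>"""

root = ET.fromstring(XML)

# Map each URI to the prefix to use when writing; "" means default namespace.
# This only influences serialization, not parsing or searching.
ET.register_namespace("i", "urn:com:xml:insert")
ET.register_namespace("", "urn:com:xml:data")

serialized = ET.tostring(root).decode("ascii")
print(serialized)

# Searching still requires the full URI (or a prefix map), as before.
images = root.findall(".//{urn:com:xml:data}image")
```

With these registrations the generated ns0 and ns1 prefixes should no longer appear in the output; the order of the xmlns declarations on the root element may differ from the original file.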
