Python: namespaces in xml ElementTree (or lxml)


Problem description

I want to load a legacy XML file, manipulate it, and save it.

Here is my code:

from xml.etree import ElementTree as ET  # cElementTree was removed in Python 3.9
NS = "{http://www.somedomain.com/XI/Traffic/10}"

def fix_xml(filename):
    f = ET.parse(filename)
    root = f.getroot()
    eventlist = root.findall("%(ns)sEvent" % {'ns': NS})
    xpath = "%(ns)sEventDetail/%(ns)sEventDescription" % {'ns': NS}
    for event in eventlist:
        desc = event.find(xpath)
        desc.text = desc.text.upper()  # do some editing to the text

    # attempt to control the namespace prefix used in the output
    ET.ElementTree(root, nsmap=NS).write("out.xml", encoding="utf-8")


fix_xml("test.xml")

The file I load contains:

xmlns="http://www.somedomain.com/XI/Traffic/10"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.somedomain.com/XI/Traffic/10 10.xds"

at the root tag.

I have the following problems, related to namespace:

  • As you see, for each tag lookup I have to give the namespace at the beginning to retrieve a child.
  • The generated XML file doesn't have <?xml version="1.0" encoding="utf-8"?> at the beginning.
  • The tags in the output look like <ns0:eventDescription>, while I need them as in the original, <eventDescription>, without a namespace prefix at the beginning.

How can these be solved?

Solution

Have a look at the lxml tutorial section on namespaces. Also see the effbot article about namespaces in ElementTree.

Problem 1: Put up with it, like everybody else does. Instead of formatting each tag with "%(ns)sEvent" % {'ns': NS}, try NS + "Event".
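
As a minimal sketch of the two usual ways to shorten namespaced lookups with the standard library: the sample document below is hypothetical (only the namespace URI and the element names Event, EventDetail and EventDescription come from the question), and the prefix "t" is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.somedomain.com/XI/Traffic/10}"

# Hypothetical sample document; the root element name is an assumption.
SAMPLE = (
    '<Traffic xmlns="http://www.somedomain.com/XI/Traffic/10">'
    "<Event><EventDetail>"
    "<EventDescription>road closed</EventDescription>"
    "</EventDetail></Event>"
    "</Traffic>"
)

root = ET.fromstring(SAMPLE)

# Option A: plain concatenation with the "{uri}" prefix.
events = root.findall(NS + "Event")

# Option B: pass a prefix-to-URI mapping to find()/findall().
# Here the URI is given without curly braces.
ns_map = {"t": "http://www.somedomain.com/XI/Traffic/10"}
descriptions = [
    event.find("t:EventDetail/t:EventDescription", ns_map).text
    for event in events
]
```

Either way the namespace still has to appear in every lookup; the mapping only makes the paths easier to read.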

Problem 2: By default, the XML declaration is written only if it is required. You can force it by using xml_declaration=True in your write() call. lxml supports this, and so does the standard library's ElementTree.write() from Python 2.7 onward.
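
A small sketch of forcing the declaration with the standard library, assuming a Python version whose ElementTree.write() accepts xml_declaration. The tiny document and the in-memory buffer are for illustration only; writing to "out.xml" works the same way.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the parsed file.
root = ET.fromstring("<root><child>text</child></root>")

# Write to an in-memory buffer instead of a file so the result can be inspected.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)

output = buffer.getvalue().decode("utf-8")
```

With xml_declaration=True the output begins with an XML declaration even though the encoding is UTF-8, which the writer would otherwise leave undeclared.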

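Problem 3: The nsmap argument appears to be lxml-only. As far as I can tell, it needs a mapping, not a string. Try nsmap={None: "http://www.somedomain.com/XI/Traffic/10"}, that is, the bare URI without the curly braces that NS carries. The effbot article has a section describing a workaround for this.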
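
One workaround with the standard library, sketched here under the assumption that ET.register_namespace() is available (Python 2.7 or later), is to register the namespace URI under the empty prefix before writing. The sample document is hypothetical, and note that the registration is global to the process.

```python
import io
import xml.etree.ElementTree as ET

URI = "http://www.somedomain.com/XI/Traffic/10"

# Map the URI to the empty prefix so it is serialized as the default
# namespace (xmlns="...") instead of an auto-generated ns0 prefix.
ET.register_namespace("", URI)

# Hypothetical sample document; the element names are assumptions.
root = ET.fromstring('<Traffic xmlns="%s"><Event/></Traffic>' % URI)

buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)

output = buffer.getvalue().decode("utf-8")
```

The tags are then written without an ns0: prefix, and the root element carries the xmlns declaration as in the original file.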
