Remove namespace and prefix from XML in Python using lxml
Question
I have an XML file that I need to open and change. One of the changes is to remove the namespace and prefix, and then save the result to another file. Here is the XML:
<?xml version='1.0' encoding='UTF-8'?>
<package xmlns="http://apple.com/itunes/importer">
<provider>some data</provider>
<language>en-GB</language>
</package>
I can make the other changes I need, but I can't work out how to remove the namespace and prefix. This is the resulting XML I need:
<?xml version='1.0' encoding='UTF-8'?>
<package>
<provider>some data</provider>
<language>en-GB</language>
</package>
And here is my script, which opens and parses the XML and saves it:
from lxml import etree

metadata = '/Users/user1/Desktop/Python/metadata.xml'
parser = etree.XMLParser(remove_blank_text=True)
# etree.parse() opens the file itself, so no separate open() call is needed
tree = etree.parse(metadata, parser)
root = tree.getroot()
tree.write('/Users/user1/Desktop/Python/done.xml', pretty_print=True, xml_declaration=True, encoding='UTF-8')
So what code should I add to my script to remove the namespace and prefix?
Recommended answer
Replace the tag as Uku Loskit suggests. In addition to that, use lxml.objectify.deannotate.
from lxml import etree, objectify

metadata = '/Users/user1/Desktop/Python/metadata.xml'
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse(metadata, parser)
root = tree.getroot()

####
for elem in root.iter():  # iter() replaces the deprecated getiterator()
    if not hasattr(elem.tag, 'find'): continue  # (1)
    # Namespaced tags look like '{namespace-uri}localname'
    i = elem.tag.find('}')
    if i >= 0:
        elem.tag = elem.tag[i+1:]
# Drop the xmlns declarations that are no longer used
objectify.deannotate(root, cleanup_namespaces=True)
####

tree.write('/Users/user1/Desktop/Python/done.xml',
           pretty_print=True, xml_declaration=True, encoding='UTF-8')
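To try the idea without touching any files, here is a minimal sketch that runs on the XML from the question held in a bytes string. The helper name `strip_namespaces` is hypothetical. It also swaps in two alternatives to the code above: `etree.QName(...).localname` instead of slicing at `}`, and `etree.cleanup_namespaces` instead of `objectify.deannotate`. Treat it as an illustration under those assumptions, not as the answer's exact method.

```python
from lxml import etree

# The sample document from the question, as bytes so the
# encoding declaration is accepted by fromstring().
XML = b"""<?xml version='1.0' encoding='UTF-8'?>
<package xmlns="http://apple.com/itunes/importer">
<provider>some data</provider>
<language>en-GB</language>
</package>
"""


def strip_namespaces(root):
    """Hypothetical helper: rename every element to its local name."""
    for elem in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(elem.tag, str):
            continue
        # '{http://apple.com/itunes/importer}package' becomes 'package'
        elem.tag = etree.QName(elem.tag).localname
    # Remove xmlns declarations that no element uses any more.
    etree.cleanup_namespaces(root)
    return root


parser = etree.XMLParser(remove_blank_text=True)
root = strip_namespaces(etree.fromstring(XML, parser))
result = etree.tostring(root, pretty_print=True, encoding='unicode')
print(result)
```

Note that this loop only renames element tags. Namespaced attribute names, if a document has any, are left as they are.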
Update
Some nodes, such as Comment, return a function rather than a string when you access their tag attribute. I added a guard for that. (1)
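The guard is easier to see on a small in-memory document. This sketch assumes a made-up namespace `urn:example` and a made-up document with one comment; it only inspects the `tag` of each node and does not modify anything.

```python
from lxml import etree

# Made-up document: one comment between namespaced elements.
root = etree.fromstring(b'<a xmlns="urn:example"><!-- note --><b/></a>')
comment = root[0]  # lxml keeps comments as child nodes by default

# For a comment, .tag is the etree.Comment factory function, not a string,
# so it has no str.find method and the hasattr() guard skips it.
print(comment.tag is etree.Comment)
print(hasattr(comment.tag, 'find'))

# Collect only the nodes whose tag is a real string (ordinary elements).
tags = [node.tag for node in root.iter() if isinstance(node.tag, str)]
print(tags)
```

Checking `isinstance(node.tag, str)` is an equivalent and slightly more explicit guard than `hasattr(elem.tag, 'find')`.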