Drop all namespaces in lxml?
Question
I'm working with some of Google's data APIs, using the lxml library in Python. Namespaces are a huge hassle here. For a lot of the work I'm doing (mainly XPath queries), it would be nice to simply ignore them.
Is there a simple way to ignore XML namespaces in Python/lxml?
Thanks!
Recommended answer
If you'd like to remove all namespaces from elements and attributes, I suggest the code shown below.
Context: In my application I obtain XML representations of SOAP response streams, but I'm not interested in building objects on the client side; I'm only interested in the XML representations themselves. Nor am I interested in anything namespace-related, which for my purposes only makes things more complicated than they need to be. So I simply strip the namespaces from element names and drop every attribute that carries a namespace.
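def dropns(root):
    """Strip namespaces from element tags and drop namespaced attributes."""
    for elem in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(elem.tag, str):
            continue
        # lxml reports declared namespaces as '{uri}local'; an undeclared
        # prefix may survive as 'prefix:local' when parsing with recover=True.
        elem.tag = elem.tag.rpartition('}')[2].rpartition(':')[2]
        # Copy the attribute names first so deleting while looping is safe.
        for attrib in list(elem.attrib):
            if attrib.startswith('{') or ':' in attrib:
                del elem.attrib[attrib]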
# Test case
import os
import lxml.etree as etree

# open() does not expand '~', so expand it explicitly.
name = os.path.expanduser('~/tmp/mantisbt/test.xml')
# recover=True lets the parser continue past errors in the input.
parser = etree.XMLParser(ns_clean=True, recover=True)
with open(name, 'rb') as f:
    root = etree.parse(f, parser=parser)
print('=====================================================================')
print(etree.tostring(root, pretty_print=True).decode())
print('=====================================================================')
dropns(root)
print(etree.tostring(root, pretty_print=True).decode())
print('=====================================================================')
Output:
=====================================================================
<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <ns1:mc_issue_getResponse>
      <return xsi:type="tns:IssueData">
        <id xsi:type="xsd:integer">356</id>
        <view_state xsi:type="tns:ObjectRef">
          <id xsi:type="xsd:integer">10</id>
          <name xsi:type="xsd:string">public</name>
        </view_state>
      </return>
    </ns1:mc_issue_getResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
=====================================================================
<Envelope>
  <Body>
    <mc_issue_getResponse>
      <return>
        <id>356</id>
        <view_state>
          <id>10</id>
          <name>public</name>
        </view_state>
      </return>
    </mc_issue_getResponse>
  </Body>
</Envelope>
=====================================================================
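The sample output above shows no xmlns declarations. When a document does declare its namespaces, lxml reports tag and attribute names in the form {uri}local rather than prefix:local. The following is a minimal sketch for that case, not part of the original answer: the function name strip_declared_namespaces and the inline sample document are made up for illustration. It uses etree.QName to obtain the local name and etree.cleanup_namespaces to remove the now-unused xmlns declarations.

```python
import lxml.etree as etree


def strip_declared_namespaces(root):
    """Reduce '{uri}local' tags to 'local' and drop namespaced attributes.

    Hypothetical helper for illustration; it mirrors the approach of the
    answer above (namespaced attributes are dropped, not renamed).
    """
    for elem in root.iter():
        # Comments and processing instructions have a non-string tag.
        if not isinstance(elem.tag, str):
            continue
        elem.tag = etree.QName(elem.tag).localname
        # Copy the names first so deleting while looping is safe.
        for name in list(elem.attrib):
            if name.startswith('{'):
                del elem.attrib[name]


# Illustrative input with properly declared namespaces.
xml = (
    b'<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
    b'xmlns:x="http://www.w3.org/2001/XMLSchema-instance">'
    b'<s:Body><id x:type="integer">356</id></s:Body>'
    b'</s:Envelope>'
)
root = etree.fromstring(xml)
strip_declared_namespaces(root)
# Remove xmlns declarations that no element or attribute uses any more.
etree.cleanup_namespaces(root)

# Plain, prefix-free XPath now matches.
ids = root.xpath('/Envelope/Body/id/text()')
print(ids)
print(etree.tostring(root, pretty_print=True).decode())
```

Keep in mind that stripping namespaces can make two originally distinct names collide (for example a:id and b:id both become id), so this is probably best reserved for documents where the namespaces carry no information you need.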