Drop all namespaces in lxml?

Question

I'm working with some of Google's data APIs, using the lxml library in Python. Namespaces are a huge hassle here. For a lot of the work I'm doing (mainly XPath), it would be nice to ignore them entirely.

Is there a simple way to ignore XML namespaces in Python/lxml?

Thanks!

Recommended answer

If you'd like to remove all namespaces from elements and attributes, I suggest the code shown below.

Context: In my application I obtain XML representations of SOAP response streams, but I'm not interested in building objects on the client side; I'm only interested in the XML representations themselves. I'm also not interested in anything namespace-related, which for my purposes only makes things more complicated than they need to be. So I simply strip the namespaces from element tags and drop every attribute that carries a namespace.

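import os
from lxml import etree

def dropns(root):
    """Strip namespaces from tags and drop namespaced attributes, in place."""
    for elem in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(elem.tag, str):
            continue
        # Declared namespaces show up as '{uri}local'; undeclared prefixes
        # (kept as-is when parsing with recover=True) show up as 'prefix:local'.
        elem.tag = elem.tag.rpartition('}')[2].rpartition(':')[2]
        # Copy the keys first so that deleting does not disturb the iteration.
        for name in list(elem.attrib):
            if name.startswith('{') or ':' in name:
                del elem.attrib[name]
    # Remove xmlns declarations that are no longer used.
    etree.cleanup_namespaces(root)

# Test case
name = os.path.expanduser('~/tmp/mantisbt/test.xml')  # open() does not expand '~'
parser = etree.XMLParser(ns_clean=True, recover=True)
with open(name, 'rb') as f:
    root = etree.parse(f, parser=parser)
separator = '====================================================================='
print(separator)
print(etree.tostring(root, pretty_print=True).decode())
print(separator)
dropns(root)
print(etree.tostring(root, pretty_print=True).decode())
print(separator)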

This prints:

=====================================================================
<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <ns1:mc_issue_getResponse>
      <return xsi:type="tns:IssueData">
        <id xsi:type="xsd:integer">356</id>
        <view_state xsi:type="tns:ObjectRef">
          <id xsi:type="xsd:integer">10</id>
          <name xsi:type="xsd:string">public</name>
        </view_state>
      </return>
    </ns1:mc_issue_getResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
=====================================================================
<Envelope>
  <Body>
    <mc_issue_getResponse>
      <return>
        <id>356</id>
        <view_state>
          <id>10</id>
          <name>public</name>
        </view_state>
      </return>
    </mc_issue_getResponse>
  </Body>
</Envelope>
=====================================================================
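
If modifying the tree is not desirable, namespaces can also be ignored at query time. The following is only a minimal sketch of that alternative, not part of the original answer: it matches elements by local name with the XPath `local-name()` function and reads local names through `etree.QName`. The sample document and the `http://example.com/hypothetical` namespace URI are made up for illustration.

```python
from lxml import etree

# Hypothetical sample document (not from the original post) whose prefixes
# are properly declared, so lxml resolves them to namespace URIs.
SAMPLE = b"""<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns1="http://example.com/hypothetical">
  <SOAP-ENV:Body>
    <ns1:mc_issue_getResponse>
      <return xsi:type="ns1:IssueData">
        <id>356</id>
      </return>
    </ns1:mc_issue_getResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

root = etree.fromstring(SAMPLE)

# lxml reports namespaced tags in '{uri}local' form, not 'prefix:local'.
body = root[0]
print(body.tag)  # {http://schemas.xmlsoap.org/soap/envelope/}Body

# etree.QName splits such a tag; .localname is the part after the namespace.
# The isinstance check skips comments and processing instructions.
local_names = [
    etree.QName(el).localname
    for el in root.iter()
    if isinstance(el.tag, str)
]
print(local_names)

# XPath can match on the local name alone, whatever the namespace is.
responses = root.xpath('//*[local-name() = "mc_issue_getResponse"]')
ids = root.xpath('//*[local-name() = "return"]/*[local-name() = "id"]/text()')
print(len(responses), ids)  # 1 ['356']
```

This leaves the document untouched, at the cost of more verbose XPath expressions; stripping the namespaces once is more convenient when many queries follow.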
