Registering namespaces with lxml before parsing


Question

I am using lxml to parse XML from an external service that uses namespace prefixes but does not declare them with xmlns attributes. I am trying to register the prefix by hand with register_namespace, but that doesn't seem to work.

from lxml import etree

xml = """
    <Foo xsi:type="xsd:string">bar</Foo>
"""

etree.register_namespace('xsi', 'http://www.w3.org/2001/XMLSchema-instance')
el = etree.fromstring(xml) # lxml.etree.XMLSyntaxError: Namespace prefix xsi for type on Foo is not defined

What am I missing? Oddly enough, when I look at the lxml source code to understand what I might be doing wrong, it seems as if the xsi namespace should already be there as one of the default namespaces.

Recommended answer

When an XML document is parsed and then saved again, lxml does not change any prefixes, so register_namespace has no effect on it.
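The behaviour for parsed documents can be checked with a minimal sketch like the one below. The prefix p and the variable name serialized are made up for this illustration, and http://example.com is only a placeholder URI.

```python
from lxml import etree

# Register a preferred prefix for the namespace URI.
etree.register_namespace("abc", "http://example.com")

# The parsed document declares its own prefix "p" for the same URI.
el = etree.fromstring('<p:Foo xmlns:p="http://example.com"/>')

# Serializing keeps the prefix that came from the source document.
serialized = etree.tostring(el).decode()
print(serialized)
```

The element keeps the prefix it was parsed with, and the registered prefix abc does not appear in the output.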

If your XML document does not declare its namespace prefixes, it is not namespace-well-formed. Calling register_namespace before parsing cannot fix this.
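Because the declaration has to be present in the text the parser sees, one possible workaround is to add it before parsing. The sketch below is not part of the original answer. It assumes the payload is a single element with no XML declaration or DOCTYPE, and wraps it in a throwaway element that declares the missing prefixes. The wrapper element name and the variable names are hypothetical, and the xsd URI is the standard XML Schema namespace, which the question itself does not spell out.

```python
from lxml import etree

# Payload as received: uses the xsi: prefix without declaring it.
xml = """
    <Foo xsi:type="xsd:string">bar</Foo>
"""

# Hypothetical wrapper element that supplies the missing declarations.
# Only xsi is needed for parsing; xsd appears only inside an attribute
# value, but declaring it lets that value be resolved later.
wrapped = (
    '<wrapper'
    ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
    ' xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    + xml +
    '</wrapper>'
)

root = etree.fromstring(wrapped)
el = root[0]  # the original <Foo> element

# Namespaced attributes are addressed in Clark notation: {uri}localname.
xsi_type = el.get('{http://www.w3.org/2001/XMLSchema-instance}type')
print(el.tag, el.text, xsi_type)
```

Whether patching the input like this is acceptable depends on the service; asking the provider to emit namespace-well-formed XML is the cleaner fix.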

register_namespace defines the prefixes to be used when serializing a newly created XML document.

Example 1 (without register_namespace):

from lxml import etree

el = etree.Element('{http://example.com}Foo')
print(etree.tostring(el).decode())

Output:

<ns0:Foo xmlns:ns0="http://example.com"/>

Example 2 (with register_namespace):

from lxml import etree

etree.register_namespace("abc", "http://example.com")

el = etree.Element('{http://example.com}Foo')
print(etree.tostring(el).decode())

Output:

<abc:Foo xmlns:abc="http://example.com"/>

Example 3 (without register_namespace, but with a "well-known" namespace associated with a conventional prefix):

from lxml import etree

el = etree.Element('{http://www.w3.org/2001/XMLSchema-instance}Foo')
print(etree.tostring(el).decode())

Output:

<xsi:Foo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
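If a global registration is undesirable, lxml also accepts an nsmap argument when an element is created, which sets the prefix for that tree only. This goes slightly beyond the answer above and is shown only as a minimal sketch; the prefix abc is reused from Example 2 purely for illustration.

```python
from lxml import etree

# Map the prefix to the namespace URI for this element only.
el = etree.Element(
    '{http://example.com}Foo',
    nsmap={'abc': 'http://example.com'},
)
serialized = etree.tostring(el).decode()
print(serialized)
```

Like register_namespace, this applies to elements you create yourself; it does not help with parsing input that lacks the declarations.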
