Strange lxml behavior
Question
I create an XML element manually and then try to validate it against an XSD schema. Validation fails at first, but if I serialize the XML to a string and parse it back, the new XML passes validation.
from lxml import etree
xsd = etree.fromstring("""
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="some_namespace">
<element name="el"></element>
</schema>""")
schema = etree.XMLSchema(xsd)
xml1 = etree.Element('el', nsmap={None: "some_namespace"})
xml2 = etree.fromstring(etree.tostring(xml1))
schema.assertValid(xml2) # this passes
schema.assertValid(xml1) # this fails
I see that xml1 and xml2 have different tags:
print(xml1.tag) # --> el
print(xml2.tag) # --> {some_namespace}el
But why do xml1 and xml2 differ like this? It looks like they should be the same.
Recommended answer
Here you create an el element (no namespace):
xml1 = etree.Element('el', nsmap={None: "some_namespace"})
Using the nsmap parameter does not bind the element to a namespace; it only provides a prefix-to-namespace mapping that is used for serialization.
When etree.tostring(xml1) is executed, the serialization behaviour "kicks in": the default namespace from nsmap is written out as a declaration on the element. When the serialized result is parsed, the parser applies that declaration, so xml2 is a {some_namespace}el element instead of el.
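The round trip described above can be inspected directly. The following is a minimal sketch that reuses the names from the question (xml1, xml2, some_namespace). The variable serialized and the use of etree.QName to read each tag's namespace are additions for illustration.

```python
from lxml import etree

# Element created with a bare tag: nsmap only declares the default
# namespace for serialization, it does not put the element in it.
xml1 = etree.Element('el', nsmap={None: "some_namespace"})

# Serialize and parse back; the parser applies the xmlns declaration.
serialized = etree.tostring(xml1)
xml2 = etree.fromstring(serialized)

print(serialized)  # the namespace declaration written here comes from nsmap
print(xml1.tag)
print(xml2.tag)

# QName splits a tag into namespace and local name.
print(etree.QName(xml1).namespace)  # no namespace on the in-memory element
print(etree.QName(xml2).namespace)  # namespace picked up during parsing
```

Comparing the two QName results makes the mismatch explicit: both elements have the same local name, but only the parsed one is in the schema's target namespace.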
To make it work, change the line to:
xml1 = etree.Element('{some_namespace}el', nsmap={None: "some_namespace"})
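To check the fix, both variants can be validated against the schema from the question. This is a sketch under the same assumptions as the original snippet. The names NS, unqualified and qualified are introduced here for illustration, and schema.validate is used instead of assertValid so the result can be printed as a boolean rather than raised as an exception.

```python
from lxml import etree

xsd = etree.fromstring("""
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="some_namespace">
<element name="el"></element>
</schema>""")
schema = etree.XMLSchema(xsd)

NS = "some_namespace"  # hypothetical constant, same value as in the question

# Bare tag: the element has no namespace, regardless of nsmap.
unqualified = etree.Element('el', nsmap={None: NS})

# Clark notation ({namespace}localname) binds the element to the namespace.
qualified = etree.Element('{%s}el' % NS, nsmap={None: NS})

print(schema.validate(unqualified))  # does not match the schema's target namespace
print(schema.validate(qualified))    # matches without a serialize/parse round trip
```

Keeping nsmap={None: NS} on the qualified element is still useful: it makes the serialized output use a default namespace declaration instead of an auto-generated prefix.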