Working with namespaces while parsing XML using ElementTree
Question
This is a follow-up to my earlier question about modifying XML using ElementTree.
I now have namespaces in my XML. After trying to understand the answer at "Parsing XML with namespace in Python via 'ElementTree'", I have the following.
The XML file:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <grandParent>
    <parent>
      <child>Sam/Astronaut</child>
    </parent>
  </grandParent>
</project>
My Python code, written after reading "Parsing XML with namespace in Python via 'ElementTree'":
import xml.etree.ElementTree as ET
spaces={'xmlns':'http://maven.apache.org/POM/4.0.0','schemaLocation':'http://maven.apache.org/xsd/maven-4.0.0.xsd'}
tree = ET.parse("test.xml")
a=tree.find('parent')
for b in a.findall('child', namespaces=spaces):
    if b.text.strip()=='Jay/Doctor':
        print("child exists")
        break
    else:
        ET.SubElement(a,'child').text="Jay/Doctor"
tree.write("test.xml")
I get the error: AttributeError: 'NoneType' object has no attribute 'findall'
Recommended answer
There are two problems with this line:
a=tree.find('parent')
First, <parent> is not an immediate child of the root element; it is a grandchild. The path to it looks like /project/grandParent/parent. To search for <parent>, try the XPath expression */parent or possibly .//parent.
Second, <parent> exists in the default namespace, so you won't be able to .find() it with just its simple name. You'll need to add the namespace.
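To see why the simple name fails, it can help to look at how ElementTree stores tags. The sketch below is only an illustration: it parses an inline copy of the question's XML (with the xsi:schemaLocation attribute left out for brevity) instead of reading test.xml, and the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Inline copy of the question's XML (schemaLocation attribute omitted).
XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <grandParent>
    <parent>
      <child>Sam/Astronaut</child>
    </parent>
  </grandParent>
</project>"""

root = ET.fromstring(XML)

# ElementTree stores every tag with its namespace URI in braces.
print(root.tag)

# The path is right, but the plain name matches nothing.
plain = root.find('*/parent')

# The same path with the fully qualified name matches.
qualified = root.find('*/{http://maven.apache.org/POM/4.0.0}parent')

print(plain)
print(qualified.tag)
```

Because find() returns None when nothing matches, the later a.findall(...) call in the question is what raises the AttributeError.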
Here are two equally valid calls to tree.find(), each of which should find the <parent> node:
a=tree.find('*/{http://maven.apache.org/POM/4.0.0}parent')
a=tree.find('*/xmlns:parent', namespaces=spaces)
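As a quick check that the two forms are interchangeable, here is a small sketch on an inline copy of the XML (an assumption made so the example runs without test.xml). It also tries a third, hypothetical prefix 'm' to show that the key in the namespaces dictionary is just a lookup name and does not have to be 'xmlns'.

```python
import xml.etree.ElementTree as ET

NS_URI = 'http://maven.apache.org/POM/4.0.0'

# Inline copy of the question's XML (schemaLocation attribute omitted).
XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <grandParent>
    <parent>
      <child>Sam/Astronaut</child>
    </parent>
  </grandParent>
</project>"""

root = ET.fromstring(XML)

# Long form: namespace URI written out in braces.
by_uri = root.find('*/{%s}parent' % NS_URI)

# Short form: a prefix resolved through the namespaces dictionary.
by_prefix = root.find('*/xmlns:parent', namespaces={'xmlns': NS_URI})

# The prefix is only a dictionary key; 'm' is a hypothetical alternative.
by_other_prefix = root.find('*/m:parent', namespaces={'m': NS_URI})

# All three lookups return the same Element object.
print(by_uri is by_prefix is by_other_prefix)
```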
Next, the call to findall() needs a namespace qualifier:
for b in a.findall('xmlns:child', namespaces=spaces):
Fourth, the call that creates the new child element needs a namespace qualifier. There may be a way to use the shortcut name, but I couldn't find one, so I had to use the long form of the name:
ET.SubElement(a,'{http://maven.apache.org/POM/4.0.0}child').text="Jay/Doctor"
Finally, your XML output will look ugly (every tag gets a generated prefix such as ns0:) unless you provide a default namespace:
tree.write('test.xml', default_namespace=spaces['xmlns'])
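The difference is easiest to see side by side. This sketch writes to an in-memory buffer rather than test.xml, and serialize is a hypothetical helper introduced only for this comparison; the XML is an inline copy of the question's document without the xsi:schemaLocation attribute.

```python
import io
import xml.etree.ElementTree as ET

NS_URI = 'http://maven.apache.org/POM/4.0.0'

# Inline copy of the question's XML (schemaLocation attribute omitted).
XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <grandParent>
    <parent>
      <child>Sam/Astronaut</child>
    </parent>
  </grandParent>
</project>"""

tree = ET.ElementTree(ET.fromstring(XML))


def serialize(tree, **kwargs):
    """Hypothetical helper: write the tree to a string instead of a file."""
    buf = io.StringIO()
    tree.write(buf, encoding='unicode', **kwargs)
    return buf.getvalue()


# Without default_namespace, tags are written with a generated prefix.
ugly = serialize(tree)

# With default_namespace, tags keep their simple names.
clean = serialize(tree, default_namespace=NS_URI)

print(ugly)
print(clean)
```

Note that default_namespace requires every element in the tree to be namespace-qualified, which is one more reason the new child has to be created with the long form of the name.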
Unrelated to the XML aspects, you copied my answer from the previous question incorrectly. The else lines up with the for, not with the if:
for ...
    if ...
else ...
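Putting the pieces together, a corrected version might look roughly like the sketch below. It is not the exact script from either post: it works on an inline copy of the XML instead of test.xml, wraps the logic in a hypothetical function ensure_child so it can be called twice, and guards against empty <child> elements with (b.text or ''), which the original code did not do.

```python
import xml.etree.ElementTree as ET

NS_URI = 'http://maven.apache.org/POM/4.0.0'
spaces = {'xmlns': NS_URI}

# Inline copy of the question's XML (schemaLocation attribute omitted).
XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <grandParent>
    <parent>
      <child>Sam/Astronaut</child>
    </parent>
  </grandParent>
</project>"""


def ensure_child(root, wanted):
    """Add <child>wanted</child> under <parent> unless one already exists.

    Hypothetical helper; returns True when a new element was added.
    """
    a = root.find('*/xmlns:parent', namespaces=spaces)
    for b in a.findall('xmlns:child', namespaces=spaces):
        if (b.text or '').strip() == wanted:
            print("child exists")
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        ET.SubElement(a, '{%s}child' % NS_URI).text = wanted
        return True
    return False


root = ET.fromstring(XML)
print(ensure_child(root, 'Jay/Doctor'))  # no match yet, so a child is added
print(ensure_child(root, 'Jay/Doctor'))  # now found, so nothing is added
```

With the else attached to the if instead, a new element would be appended for every non-matching child seen before the match, which is why the alignment matters.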