Working with namespaces while parsing XML using ElementTree


Problem description

This is a follow-up to the question on modifying XML using ElementTree.

I now have namespaces in my XML. I tried to apply the answer at "Parsing XML with namespace in Python via 'ElementTree'" and ended up with the following.

The XML file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 <grandParent>
  <parent>
   <child>Sam/Astronaut</child>
  </parent>
 </grandParent>
</project>

After reading "Parsing XML with namespace in Python via 'ElementTree'", my Python code is:

import xml.etree.ElementTree as ET

spaces={'xmlns':'http://maven.apache.org/POM/4.0.0','schemaLocation':'http://maven.apache.org/xsd/maven-4.0.0.xsd'}

tree = ET.parse("test.xml")
a=tree.find('parent')          
for b in a.findall('child', namespaces=spaces):
 if b.text.strip()=='Jay/Doctor':
    print "child exists"
    break
else:
    ET.SubElement(a,'child').text="Jay/Doctor"

tree.write("test.xml")

I get the error: AttributeError: 'NoneType' object has no attribute 'findall'

Recommended answer

There are two problems with this line:

a=tree.find('parent')          

First, <parent> is not an immediate child of the root element; it is a grandchild. The path to it looks like /project/grandParent/parent. To search for <parent>, try the XPath expression */parent or possibly .//parent (ElementTree expects the relative form .//parent rather than a leading //).

Second, <parent> exists in the default namespace, so you won't be able to .find() it with just its simple name. You'll need to add the namespace.
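
To see why the simple name fails, it can help to print the tags that ElementTree actually stores. The sketch below uses a trimmed copy of the XML from the question (attributes omitted, parsed from a string rather than from test.xml); it is an illustration, not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (attributes omitted for brevity).
XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
 <grandParent>
  <parent>
   <child>Sam/Astronaut</child>
  </parent>
 </grandParent>
</project>"""

root = ET.fromstring(XML)

# Every element inherits the default namespace, so ElementTree stores
# its tag as "{namespace-uri}localname".
tags = [elem.tag for elem in root.iter()]
print(tags)

# The plain name matches nothing; the namespace-qualified name does.
plain = root.find('.//parent')
qualified = root.find('.//{http://maven.apache.org/POM/4.0.0}parent')
print(plain, qualified is not None)
```

Because find() returns None when nothing matches, the later call to a.findall() in the question is what raises the AttributeError.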

Here are two equally valid calls to tree.find(), each of which should find the <parent> node:

a=tree.find('*/{http://maven.apache.org/POM/4.0.0}parent')
a=tree.find('*/xmlns:parent', namespaces=spaces)
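
As a self-contained check of the two forms, the following sketch runs both lookups against a trimmed, in-memory copy of the question's XML. The prefix name pom and the variable NS are assumptions chosen for illustration; the key in the mapping is arbitrary as long as the path uses the same prefix.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (attributes omitted for brevity).
XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
 <grandParent>
  <parent>
   <child>Sam/Astronaut</child>
  </parent>
 </grandParent>
</project>"""

root = ET.fromstring(XML)

# Hypothetical prefix map: "pom" is just a local alias for the namespace URI.
NS = {'pom': 'http://maven.apache.org/POM/4.0.0'}

# 1. Spell out the namespace URI in braces.
by_uri = root.find('*/{http://maven.apache.org/POM/4.0.0}parent')

# 2. Use a prefix and pass the prefix-to-URI mapping.
by_prefix = root.find('*/pom:parent', namespaces=NS)

# 3. Search all descendants instead of exactly two levels down.
by_descendant = root.find('.//pom:parent', namespaces=NS)

print(by_uri is by_prefix is by_descendant)
```

All three calls should return the same element object, so which form to use is mostly a matter of readability.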

Next, the call to findall() needs a namespace qualifier:

for b in a.findall('xmlns:child', namespaces=spaces):

Fourth, the call to create the new child element needs a namespace qualifier. There may be a way to use the shortcut name, but I couldn't find it. I had to use the long form of the name.

ET.SubElement(a,'{http://maven.apache.org/POM/4.0.0}child').text="Jay/Doctor"
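
One way to avoid retyping the long form is a small helper that builds it. The sketch below is only a convenience wrapper, not a shortcut built into SubElement; the helper name pom_tag and the constant POM are hypothetical, and ET.QName is used only to assemble the '{uri}name' string.

```python
import xml.etree.ElementTree as ET

POM = 'http://maven.apache.org/POM/4.0.0'


def pom_tag(local_name):
    """Return the long-form tag, e.g. '{uri}child' (hypothetical helper)."""
    return ET.QName(POM, local_name).text


# Build a small namespaced fragment using the helper for every tag.
parent = ET.Element(pom_tag('parent'))
ET.SubElement(parent, pom_tag('child')).text = 'Sam/Astronaut'
ET.SubElement(parent, pom_tag('child')).text = 'Jay/Doctor'

# The new children are found by the same prefixed query used for reading.
names = [c.text for c in parent.findall('pom:child', {'pom': POM})]
print(names)
```

Creating the child with the plain name 'child' instead would place it outside the namespace, so a later namespaced findall() would not see it.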

Finally, your XML output will look ugly unless you provide a default namespace:

tree.write('test.xml', default_namespace=spaces['xmlns'])
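
To make "ugly" concrete, this sketch serializes the same tree with and without default_namespace. It writes to an in-memory buffer instead of test.xml, and the helper serialize is a hypothetical name introduced for the comparison.

```python
import io
import xml.etree.ElementTree as ET

POM = 'http://maven.apache.org/POM/4.0.0'
XML = '<project xmlns="%s"><grandParent><parent/></grandParent></project>' % POM


def serialize(tree, **write_kwargs):
    """Write the tree to a string instead of a file (hypothetical helper)."""
    buffer = io.StringIO()
    tree.write(buffer, encoding='unicode', **write_kwargs)
    return buffer.getvalue()


tree = ET.ElementTree(ET.fromstring(XML))

# Without a default namespace, ElementTree invents a prefix for every tag.
ugly = serialize(tree)

# With default_namespace, tags are written unprefixed under xmlns="...".
clean = serialize(tree, default_namespace=POM)

print(ugly)
print(clean)
```

Note that with default_namespace set, every element in the tree has to be namespace-qualified; write() raises ValueError for a non-qualified tag, which is another reason to create new children with the long form of the name.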

Unrelated to the XML aspects, you copied my answer from the previous question incorrectly. The else lines up with the for, not with the if:

for ...
  if ...
else ...
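
For reference, here is a minimal runnable sketch of the for/else pattern on a plain list, independent of XML. The function name ensure_item is hypothetical; the else block runs only when the loop finishes without hitting break.

```python
def ensure_item(items, wanted):
    """Append `wanted` unless it is already present (hypothetical helper).

    Returns True if the item was added, False if it already existed.
    """
    for item in items:
        if item.strip() == wanted:
            # Found it: leave the loop, which skips the else block.
            break
    else:
        # Reached only when the loop ran to completion without a break.
        items.append(wanted)
        return True
    return False


people = ['Sam/Astronaut']
first = ensure_item(people, 'Jay/Doctor')   # not present yet, so it is added
second = ensure_item(people, 'Jay/Doctor')  # already present, nothing added
print(people, first, second)
```

If the else were indented under the if instead, it would run once for every non-matching item and add a duplicate each time.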
