Python ElementTree find() using namespaces


Problem description


I am attempting to use Python's ElementTree to parse and modify an XML file. The confusion comes from the XML namespace. I can use findall and finditer to get all of the server names. However, I can't get the XPath query to find a specific server; instead, find just brings back the parent element.

What I need to do is find the correct server by the "name" or "machine" element and modify the "arguments".

<? xml version=’1.0’ encoding=’UTF-8’?>
<domain xmlns="http://xmlns.oracle.com.weblogic/domain">
  <server>
    <name>Server1-rma</name>
    <machine>server1</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
  <server>
    <name>Server2-rma</name>
    <machine>server2</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
  <server>
    <name>Server3-rma</name>
    <machine>server3</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
</domain>

I have attempted various iterations of the query. However, I am new to XPath and must be doing something wrong:

Failed: root + "ns0:server/[ns0:machine=’server2’]

Failed: root + "ns0:server/ns0:[machine=’server2’]

Failed: root + "ns0:server/[ns0:machine=ns0:’server2’]

sample code:

import xml.etree.ElementTree as ET
namespace = {‘ns0’: ‘ http://xmlns.oracle.com.weblogic/domain’}

tree = ET.parse(‘config.xml’)
root = tree.getroot()
for item in root.find((root + "ns0:server/[ns0:machine=’server2’]), namespace)
    print(item.tag)


output:
{http://xmlns.oracle.com.weblogic/domain}server

I was hoping to be able to match the "machine" element and pull the parent element in order to access the correct "arguments" element.

I am a beginner with XPath and ElementTree, so I am sure I am just doing something incorrectly; I am just not sure what. Any help would be greatly appreciated.

Solution

Like Alejandro mentioned in a comment, ElementTree has limited support for XPath. That shouldn't matter too much for what you're trying to do. If you need full XPath 1.0 support, consider lxml.

However, ElementTree also has some other quirks. One of them is that, when writing the tree back out, it adds its own namespace prefix (such as ns0) to elements in your default namespace. To keep the default namespace you'll have to register it with register_namespace().
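
To make the prefix quirk concrete, here is a minimal sketch, assuming a cut-down document (the `XML` string and the variable names are illustrative, not taken from the original file). It serializes the same tree before and after calling `register_namespace()`.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://xmlns.oracle.com.weblogic/domain"

# Illustrative, cut-down document (an assumption, not the real config.xml).
XML = f'<domain xmlns="{NS_URI}"><server><name>Server1-rma</name></server></domain>'

root = ET.fromstring(XML)

# Without registration, the serializer generates a prefix (ns0) for the URI.
before = ET.tostring(root, encoding="unicode")

# Registering the empty prefix makes the URI serialize as the default namespace.
ET.register_namespace("", NS_URI)
after = ET.tostring(root, encoding="unicode")

print(before)  # begins with <ns0:domain xmlns:ns0="...">
print(after)   # begins with <domain xmlns="...">
```

Note that the registration is global to the `xml.etree.ElementTree` module, so it affects every tree serialized afterwards in the same process.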

Alejandro is also right that the correct XPath to select the server would be:

/ns0:domain/ns0:server[ns0:machine='server2']

However, when you build the tree (with ET.parse()) or get the root (with getroot()), the context is already ns0:domain so the XPath in that context would actually be:

./ns0:server[ns0:machine='server2']
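
The effect of the context can be checked with a short sketch, assuming a reduced two-server document (the `XML` string and variable names are illustrative). Because the root element is already `domain`, a path that names `domain` again looks for a child that does not exist, while the shorter path matches.

```python
import xml.etree.ElementTree as ET

# Illustrative, reduced document (an assumption, not the real config.xml).
XML = """<domain xmlns="http://xmlns.oracle.com.weblogic/domain">
  <server><name>Server1-rma</name><machine>server1</machine></server>
  <server><name>Server2-rma</name><machine>server2</machine></server>
</domain>"""

ns = {"ns0": "http://xmlns.oracle.com.weblogic/domain"}
root = ET.fromstring(XML)

# The root element is the domain element itself.
print(root.tag)

# Naming domain again searches for a domain child of domain: no match.
from_document = root.find("./ns0:domain/ns0:server[ns0:machine='server2']", ns)

# Starting at the context node (domain) and going straight to server matches.
from_root = root.find("./ns0:server[ns0:machine='server2']", ns)

print(from_document)
print(from_root.find("ns0:name", ns).text)
```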

Since you want to update the arguments of the server, we can add that to the XPath too:

./ns0:server[ns0:machine='server2']/ns0:server-start/ns0:arguments
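
The question asked about matching "machine" and then getting at the parent. A hedged sketch of two equivalent routes follows, assuming a one-server sample document (the `XML` string and variable names are illustrative): the predicate selects the `server` element directly, so there is no need to climb up from `machine`, and the longer path is the same walk done in one call.

```python
import xml.etree.ElementTree as ET

# Illustrative, reduced document (an assumption, not the real config.xml).
XML = """<domain xmlns="http://xmlns.oracle.com.weblogic/domain">
  <server>
    <name>Server2-rma</name>
    <machine>server2</machine>
    <server-start><arguments>-Xms4g</arguments></server-start>
  </server>
</domain>"""

ns = {"ns0": "http://xmlns.oracle.com.weblogic/domain"}
root = ET.fromstring(XML)

# One step: walk straight to the arguments element.
direct = root.find(
    "./ns0:server[ns0:machine='server2']/ns0:server-start/ns0:arguments", ns
)

# Two steps: select the server via the predicate, then descend from it.
server = root.find("./ns0:server[ns0:machine='server2']", ns)
via_server = server.find("ns0:server-start/ns0:arguments", ns)

# Both routes return the very same Element object.
print(direct is via_server)
print(server.find("ns0:name", ns).text, direct.text)
```

The two-step form is handy when other children of the same server (such as its name) are needed as well.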

See here for more info on XPath location paths.
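
The prefix in these paths is only a local alias for the namespace URI. As a minimal sketch, assuming a one-server sample document (the `XML` string and the `qname` helper are hypothetical, introduced only for illustration), the same element can be found with two different prefixes, or with no prefix mapping at all by writing the URI in braces, which is the `{uri}tag` form that appears in the output shown in the question.

```python
import xml.etree.ElementTree as ET

URI = "http://xmlns.oracle.com.weblogic/domain"

# Illustrative, reduced document (an assumption, not the real config.xml).
XML = f"""<domain xmlns="{URI}">
  <server>
    <machine>server2</machine>
    <server-start><arguments>-Xms4g</arguments></server-start>
  </server>
</domain>"""

root = ET.fromstring(XML)


def qname(tag):
    """Hypothetical helper: build the {uri}tag form ElementTree uses internally."""
    return f"{{{URI}}}{tag}"


# Same path, two different prefixes mapped to the same URI.
by_ns0 = root.find("./ns0:server[ns0:machine='server2']", {"ns0": URI})
by_wl = root.find("./wl:server[wl:machine='server2']", {"wl": URI})

# Same path again with fully qualified names and no prefix mapping.
by_uri = root.find(f"./{qname('server')}[{qname('machine')}='server2']")

print(by_ns0 is by_wl and by_wl is by_uri)
print(by_uri.tag)
```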

Here's a full example. (I'm using the prefix wl instead of ns0 just to show that the prefix doesn't really matter as long as it follows the rules for namespace prefixes.)

XML Input (test.xml; fixed quotes and XML declaration so it would be well-formed)

<?xml version='1.0' encoding='UTF-8'?>
<domain xmlns="http://xmlns.oracle.com.weblogic/domain">
  <server>
    <name>Server1-rma</name>
    <machine>server1</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
  <server>
    <name>Server2-rma</name>
    <machine>server2</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
  <server>
    <name>Server3-rma</name>
    <machine>server3</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
</domain>

Python

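import xml.etree.ElementTree as ET

tree = ET.parse("test.xml")

# Map a prefix of our choosing to the document's default namespace URI.
ns = {"wl": "http://xmlns.oracle.com.weblogic/domain"}

# Serialize that URI as the default namespace instead of a generated prefix.
ET.register_namespace("", ns["wl"])

try:
    # find() returns None when nothing matches, so setting .text raises AttributeError.
    tree.find("./wl:server[wl:machine='server2']/wl:server-start/wl:arguments", namespaces=ns).text = "BAM!!!"
except AttributeError:
    print("Unable to find the correct server element.")

tree.write("output.xml", xml_declaration=True, encoding="UTF-8")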

XML Output (output.xml)

<?xml version='1.0' encoding='UTF-8'?>
<domain xmlns="http://xmlns.oracle.com.weblogic/domain">
  <server>
    <name>Server1-rma</name>
    <machine>server1</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
  <server>
    <name>Server2-rma</name>
    <machine>server2</machine>
    <server-start>
      <arguments>BAM!!!</arguments>
    </server-start>
  </server>
  <server>
    <name>Server3-rma</name>
    <machine>server3</machine>
    <server-start>
      <arguments> -Xms4g</arguments>
    </server-start>
  </server>
</domain>
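
The full example relies on catching AttributeError when nothing matches. An alternative sketch, under stated assumptions, checks for None explicitly and lets the caller choose whether to match on "name" or "machine", as the question asked. The function `set_arguments`, its parameters, and the sample `XML` string are hypothetical names introduced here, not part of ElementTree or of the original answer.

```python
import xml.etree.ElementTree as ET

NS = {"wl": "http://xmlns.oracle.com.weblogic/domain"}


def set_arguments(root, field, value, new_arguments):
    """Set <arguments> for the server whose <field> child has text equal to value.

    Returns True if a server matched, False otherwise. The value is placed
    inside single quotes in the path, so it must not contain a single quote.
    """
    path = f"./wl:server[wl:{field}='{value}']/wl:server-start/wl:arguments"
    target = root.find(path, NS)
    if target is None:
        return False
    target.text = new_arguments
    return True


# Illustrative, reduced document (an assumption, not the real config.xml).
XML = """<domain xmlns="http://xmlns.oracle.com.weblogic/domain">
  <server>
    <name>Server2-rma</name>
    <machine>server2</machine>
    <server-start><arguments>-Xms4g</arguments></server-start>
  </server>
</domain>"""

root = ET.fromstring(XML)

matched = set_arguments(root, "name", "Server2-rma", "-Xms8g")
missed = set_arguments(root, "machine", "server9", "-Xms8g")
updated = root.find("./wl:server/wl:server-start/wl:arguments", NS).text

print(matched, missed, updated)
```

Returning a boolean keeps the "not found" case visible to the caller without wrapping every lookup in try/except.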
