How do I search for a tag in an XML file using ElementTree where I have a certain "parent" tag with a specific value? (Python)
Problem description
I just started learning Python and have to write a program that parses XML files. I have to find a certain tag called OrganisationReference in two different files and return it. There are in fact multiple tags with this name, but only the one I am trying to return has the tag OrganisationType with the value DEALER as a "parent" tag (not quite sure whether that term is right). I tried to use ElementTree for this. Here is the code:
import xml.etree.ElementTree as ET
tree1 = ET.parse('Master1.xml')
root1 = tree1.getroot()
tree2 = ET.parse('Master2.xml')
root2 = tree2.getroot()
for OrganisationReference in root1.findall("./Organisation/OrganisationId/[@OrganisationType='DEALER']/OrganisationReference"):
    print(OrganisationReference.attrib)
for OrganisationReference in root2.findall("./Organisation/OrganisationId/[@OrganisationType='DEALER']/OrganisationReference"):
    print(OrganisationReference.attrib)
But this returns nothing (also no error). Can somebody help me?
My file looks like this:
<MessageOrganisationCount>a</MessageOrganisationCount>
<MessageVehicleCount>x</MessageVehicleCount>
<MessageCreditLineCount>y</MessageCreditLineCount>
<MessagePlanCount>z</MessagePlanCount>
<OrganisationData>
    <Organisation>
        <OrganisationId>
            <OrganisationType>DEALER</OrganisationType>
            <OrganisationReference>WHATINEED</OrganisationReference>
        </OrganisationId>
        <OrganisationName>XYZ.</OrganisationName>
        ....
Because OrganisationReference appears a few more times in this file with different text between the start and end tags, I want to get exactly the one you see in line 9: it has OrganisationId as its parent tag, and the OrganisationType tag with the value DEALER is also a child of OrganisationId.
You were super close with your original attempt. You just need to make a couple of changes to your XPath and a tiny change to your Python.
The first part of your XPath starts with ./Organisation. Since you're evaluating the XPath from the root, it expects Organisation to be a child of the root. It's not; it's a descendant.
Try changing ./Organisation to .//Organisation. (// is short for /descendant-or-self::node()/ in XPath's abbreviated syntax.)
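To make the child-versus-descendant difference concrete, here is a minimal sketch. It uses a trimmed-down, in-memory copy of the XML (parsed with ET.fromstring instead of a file) and assumes a doc root element like the one added in the example further down.

```python
import xml.etree.ElementTree as ET

# Trimmed-down document; <doc> is an assumed root element.
XML = """<doc>
  <OrganisationData>
    <Organisation>
      <OrganisationName>XYZ.</OrganisationName>
    </Organisation>
  </OrganisationData>
</doc>"""

root = ET.fromstring(XML)

# "./Organisation" only matches direct children of the root (<doc>).
children = root.findall("./Organisation")

# ".//Organisation" matches Organisation elements at any depth below the root.
descendants = root.findall(".//Organisation")

print(len(children))     # 0
print(len(descendants))  # 1
```

The first search comes back empty because Organisation sits one level deeper, inside OrganisationData.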
The second issue is with OrganisationId/[@OrganisationType='DEALER']. That's invalid XPath. The / should be removed from between OrganisationId and the predicate.
Also, @ is abbreviated syntax for the attribute:: axis, and OrganisationType is an element, not an attribute.
Try changing OrganisationId/[@OrganisationType='DEALER'] to OrganisationId[OrganisationType='DEALER'].
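The following sketch contrasts an attribute predicate with a child-element predicate. The second OrganisationId block (type IMPORTER, reference OTHER) is made up for illustration and is not from the question's file.

```python
import xml.etree.ElementTree as ET

# The IMPORTER entry is a hypothetical second organisation added for contrast.
XML = """<doc>
  <OrganisationData>
    <Organisation>
      <OrganisationId>
        <OrganisationType>DEALER</OrganisationType>
        <OrganisationReference>WHATINEED</OrganisationReference>
      </OrganisationId>
      <OrganisationId>
        <OrganisationType>IMPORTER</OrganisationType>
        <OrganisationReference>OTHER</OrganisationReference>
      </OrganisationId>
    </Organisation>
  </OrganisationData>
</doc>"""

root = ET.fromstring(XML)

# [@OrganisationType='DEALER'] tests an attribute on <OrganisationId>;
# no such attribute exists, so nothing matches.
by_attribute = root.findall(".//OrganisationId[@OrganisationType='DEALER']")

# [OrganisationType='DEALER'] tests the text of a child element instead.
by_child_text = root.findall(".//OrganisationId[OrganisationType='DEALER']")

print(len(by_attribute))                                  # 0
print(len(by_child_text))                                 # 1
print(by_child_text[0].findtext("OrganisationReference"))  # WHATINEED
```

Only the child-element predicate picks out the DEALER entry, and the IMPORTER entry is skipped.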
The Python issue is with print(OrganisationReference.attrib). The OrganisationReference element doesn't have any attributes; just text.
Try changing print(OrganisationReference.attrib) to print(OrganisationReference.text).
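As a quick illustration of the difference between the two properties, the sketch below parses a single element from a string. The scheme attribute in the second element is invented purely to show what .attrib would contain if an attribute were present.

```python
import xml.etree.ElementTree as ET

# The element as it appears in the question: text only, no attributes.
ref = ET.fromstring("<OrganisationReference>WHATINEED</OrganisationReference>")
print(ref.attrib)  # {}
print(ref.text)    # WHATINEED

# Hypothetical variant with an attribute ("scheme" is made up for this demo).
ref_with_attr = ET.fromstring(
    '<OrganisationReference scheme="demo">WHATINEED</OrganisationReference>'
)
print(ref_with_attr.attrib)  # {'scheme': 'demo'}
```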
Here's an example using just one XML file for demo purposes...
XML Input (Master1.xml; with doc element added to make it well-formed)
<doc>
    <MessageOrganisationCount>a</MessageOrganisationCount>
    <MessageVehicleCount>x</MessageVehicleCount>
    <MessageCreditLineCount>y</MessageCreditLineCount>
    <MessagePlanCount>z</MessagePlanCount>
    <OrganisationData>
        <Organisation>
            <OrganisationId>
                <OrganisationType>DEALER</OrganisationType>
                <OrganisationReference>WHATINEED</OrganisationReference>
            </OrganisationId>
            <OrganisationName>XYZ.</OrganisationName>
        </Organisation>
    </OrganisationData>
</doc>
Python
import xml.etree.ElementTree as ET
tree1 = ET.parse('Master1.xml')
root1 = tree1.getroot()
for OrganisationReference in root1.findall(".//Organisation/OrganisationId[OrganisationType='DEALER']/OrganisationReference"):
    print(OrganisationReference.text)
Printed Output
WHATINEED
Also note that it doesn't appear that you need to use getroot() at all. You can use findall() directly on the tree...
import xml.etree.ElementTree as ET
tree1 = ET.parse('Master1.xml')
for OrganisationReference in tree1.findall(".//Organisation/OrganisationId[OrganisationType='DEALER']/OrganisationReference"):
    print(OrganisationReference.text)
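Since the question involves two files, one possible way to avoid repeating the loop is a small helper. This is only a sketch: dealer_references and DEALER_PATH are hypothetical names, and in-memory byte streams stand in for Master1.xml and Master2.xml so the snippet runs on its own (the OTHERDEALER value is made up).

```python
import io
import xml.etree.ElementTree as ET

# The corrected path from the answer, kept in one place.
DEALER_PATH = (
    ".//Organisation/OrganisationId[OrganisationType='DEALER']"
    "/OrganisationReference"
)

def dealer_references(source):
    """Return the text of every DEALER OrganisationReference in one XML source.

    `source` can be a file name or a file object, as accepted by ET.parse.
    """
    tree = ET.parse(source)
    return [ref.text for ref in tree.findall(DEALER_PATH)]

def make_demo_file(reference):
    """Build an in-memory stand-in for one of the Master*.xml files."""
    xml = (
        "<doc><OrganisationData><Organisation><OrganisationId>"
        "<OrganisationType>DEALER</OrganisationType>"
        f"<OrganisationReference>{reference}</OrganisationReference>"
        "</OrganisationId></Organisation></OrganisationData></doc>"
    )
    return io.BytesIO(xml.encode("utf-8"))

results = [
    dealer_references(make_demo_file("WHATINEED")),
    dealer_references(make_demo_file("OTHERDEALER")),
]
print(results)  # [['WHATINEED'], ['OTHERDEALER']]
```

With real files, the call would look like dealer_references('Master1.xml'), assuming both files share the structure shown above.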