Extracting a specific XML tag value using Python

Question

I have XML data which looks like this:

    <root>
      <results preview='0'>
        <meta>
          <fieldOrder>
            <field>title</field>
            <field>search</field>
          </fieldOrder>
        </meta>
        <messages>
          <msg type="DEBUG">msg1</msg>
          <msg type="DEBUG">msg2</msg>
        </messages>
        <result offset='0'>
          <field k='title'>
            <value>
              <text>text1</text>
            </value>
          </field>
          <field k='search'>
            <value>
              <text>text2</text>
            </value>
          </field>
        </result>
      </results>
    </root>

I want to extract the value text2, which sits in the text element nested under the field element with k='search' (field k='search' > value > text).

In my code, I am trying the following:

for atype in root.findall(".//text"):
    print(atype.text)

This gives me both text1 and text2 as output, but I need only text2. I could add an if statement in my program to keep only the text2 value, but I would like a more robust way to do this within findall() itself.

I tried this code instead, to extract only text2:

for atype in root.findall(".//field[@k='search']//text"):
    print(atype.text)

But this gives me an error:

  File "command_curl", line 49, in <module>
    for atype in root.findall(".//field[@k='search']//text"):
  File "/usr/lib64/python2.6/xml/etree/ElementTree.py", line 355, in findall
    return ElementPath.findall(self, path)
  File "/usr/lib64/python2.6/xml/etree/ElementPath.py", line 198, in findall
    return _compile(path).findall(element)
  File "/usr/lib64/python2.6/xml/etree/ElementPath.py", line 176, in _compile
    p = Path(path)
  File "/usr/lib64/python2.6/xml/etree/ElementPath.py", line 93, in __init__
    "expected path separator (%s)" % (op or tag)
SyntaxError: expected path separator ([)

What should I change to get only text2 as my output?

Recommended answer

You can extract the text from a tag using the example below:

import xml.etree.ElementTree as ET

tree = ET.parse("sample.xml")
root = tree.getroot()
for tags in root.findall(".//text"):
    print(tags.text)
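
Note that the loop above prints every text element, so it still yields both text1 and text2. The traceback in the question comes from Python 2.6, whose bundled ElementTree does not understand attribute predicates such as [@k='search']; predicate support arrived with ElementTree 1.3, which ships with Python 2.7 and later. The following is a minimal sketch of two ways to get only text2, assuming the XML has the shape shown in the question. The helper names texts_with_predicate and texts_with_filter are hypothetical, and the XML is inlined (with the messages element trimmed) only so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (the <messages> element is
# omitted), inlined so the sketch is self-contained.
XML_DATA = """
<root>
  <results preview='0'>
    <meta>
      <fieldOrder>
        <field>title</field>
        <field>search</field>
      </fieldOrder>
    </meta>
    <result offset='0'>
      <field k='title'>
        <value>
          <text>text1</text>
        </value>
      </field>
      <field k='search'>
        <value>
          <text>text2</text>
        </value>
      </field>
    </result>
  </results>
</root>
"""


def texts_with_predicate(root, key):
    # Hypothetical helper: uses an attribute predicate in the path.
    # Requires ElementTree 1.3+ (Python 2.7 and later).
    path = ".//field[@k='%s']/value/text" % key
    return [el.text for el in root.findall(path)]


def texts_with_filter(root, key):
    # Hypothetical helper: fallback for older ElementTree versions.
    # Finds every <field>, then checks the k attribute in Python code.
    found = []
    for field in root.findall(".//field"):
        if field.get("k") == key:
            found.extend(el.text for el in field.findall("value/text"))
    return found


root = ET.fromstring(XML_DATA)
print(texts_with_predicate(root, "search"))
print(texts_with_filter(root, "search"))
```

On Python 2.7 or newer, the predicate version is the closest match to the path attempted in the question. The filter version avoids predicates altogether, so it is the one to try if upgrading from Python 2.6 is not an option.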
