How do I extract specific data from XML using Python?


Problem description

I'm relatively new to Python. I've been trying to learn it through a hands-on approach (I learnt C/C++ by working through Project Euler). Right now I'm learning how to extract data from files. I've gotten the hang of extracting data from simple text files, but I'm kind of stuck on XML files. Here is an example of what I'm trying to do. I have my call logs backed up on Google Drive, and there are a lot of them (about 4000). Here is an example line from the XML file:

<call number="+91234567890" duration="49" date="1483514046018" type="3" presentation="1" readable_date="04-Jan-2017 12:44:06 PM" contact_name="Dad" />

I want to take all the calls to my dad and display them like this:

number = 234567890
duration = "49"  date="04-Jan-2017 12:44:06 PM"
duration = "x"   date="y"
duration = "n"   date="z"

and so on like that. How do you propose I do that?

Solution

It's advisable to provide sufficient information in a question so that the problem can be reproduced. For this answer I assumed a test file like the following:

<?xml version="1.0" encoding="UTF-8"?>
<call number="+91234567890" duration="49" date="1483514046018" type="3"
      presentation="1" readable_date="04-Jan-2017 12:44:06 PM"
      contact_name="Dad" />

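First we need to figure out which elements we can iterate over. Since <call ... /> is the root element here, we iterate over that. (root.iter('call') yields the root itself when its tag matches, as well as any matching descendants.)

NOTE: if your file has tags/elements before the line provided, the root element will be something other than call, so you will need to check what the actual root element is.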
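To illustrate the note above, here is a minimal sketch of a file whose records sit inside a wrapper element. The <calls> wrapper name and the second record are assumptions made up for illustration; the real backup may use a different layout. Because iter() walks the whole tree, the same root.iter('call') call still finds every record:

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: a <calls> wrapper (assumed name) around the records.
# The second record is invented sample data.
SAMPLE = """<calls>
  <call number="+91234567890" duration="49" date="1483514046018" type="3"
        presentation="1" readable_date="04-Jan-2017 12:44:06 PM" contact_name="Dad" />
  <call number="+91987654321" duration="12" date="1483600446018" type="1"
        presentation="1" readable_date="05-Jan-2017 12:44:06 PM" contact_name="Mom" />
</calls>"""

root = ET.fromstring(SAMPLE)

# The root is now the wrapper, not a call element.
print(root.tag)

# iter() searches the root and all of its descendants for the given tag.
calls = list(root.iter('call'))
print(len(calls))
```

For a file on disk, ET.parse(path).getroot() gives the same kind of root object as ET.fromstring does here.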

>>> [i for i in root.iter('call')]
[<Element 'call' at 0x29d3410>]

Here you can see that we can iterate over the call element.

Then we simply iterate over the matching elements and pick out the attribute keys and values we need.

Working Code

import xml.etree.ElementTree as ET
data_file = 'test.xml'
tree = ET.parse(data_file)
root = tree.getroot()

for i in root.iter('call'):
    # Python 3: print is a function, not a statement
    print('duration', '=', i.attrib['duration'])
    print('date', '=', i.attrib['date'])

Result

>>> 
duration = 49
date = 1483514046018
>>> 
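The code above prints every call. To get closer to the output asked for in the question (only the calls for one contact, with the human-readable date), one possible sketch is to filter on the contact_name attribute and read readable_date instead of date. The helper name calls_for_contact, the <calls> wrapper, and the second sample record are assumptions for illustration, not part of the original answer:

```python
import xml.etree.ElementTree as ET

def calls_for_contact(root, name):
    """Return (number, duration, readable_date) tuples for one contact."""
    return [
        (call.get('number'), call.get('duration'), call.get('readable_date'))
        for call in root.iter('call')
        if call.get('contact_name') == name
    ]

# Invented sample data in an assumed <calls> wrapper.
SAMPLE = """<calls>
  <call number="+91234567890" duration="49" date="1483514046018" type="3"
        presentation="1" readable_date="04-Jan-2017 12:44:06 PM" contact_name="Dad" />
  <call number="+91987654321" duration="12" date="1483600446018" type="1"
        presentation="1" readable_date="05-Jan-2017 12:44:06 PM" contact_name="Mom" />
</calls>"""

root = ET.fromstring(SAMPLE)
rows = calls_for_contact(root, 'Dad')

if rows:
    # Print the number once, taken from the first matching call.
    print('number =', rows[0][0])
    for _number, duration, when in rows:
        print('duration =', duration, ' date =', when)
```

With the real backup, root would come from ET.parse('test.xml').getroot() as in the working code. Element.get() returns None for a missing attribute, so records without a contact_name are simply skipped by the filter.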
