Parse XML file into Python object

Question

I have an XML file which looks like this:

<encspot>
  <file>
   <Name>some filename.mp3</Name>
   <Encoder>Gogo (after 3.0)</Encoder>
   <Bitrate>131</Bitrate>
   <Mode>joint stereo</Mode>
   <Length>00:02:43</Length>
   <Size>5,236,644</Size>
   <Frame>no</Frame>
   <Quality>good</Quality>
   <Freq.>44100</Freq.>
   <Frames>6255</Frames>
   ..... and so forth ......
  </file>
  <file>....</file>
</encspot>

I want to read it into a Python object, something like a list of dictionaries. Because the markup is absolutely fixed, I'm tempted to use regexes (I'm quite good with those). However, I thought I'd check whether someone knows an easy way to avoid regexes here. I don't have much experience with SAX or other parsing approaches, but I'm willing to learn.

I look forward to being shown how this can be done quickly in Python without regexes. Thanks for your help!

Recommended answer

My beloved SD Chargers hat is off to you if you think a regex is easier than this:

#!/usr/bin/env python
import xml.etree.ElementTree as et  # cElementTree was removed in Python 3.9

sxml="""
<encspot>
  <file>
   <Name>some filename.mp3</Name>
   <Encoder>Gogo (after 3.0)</Encoder>
   <Bitrate>131</Bitrate>
  </file>
  <file>
   <Name>another filename.mp3</Name>
   <Encoder>iTunes</Encoder>
   <Bitrate>128</Bitrate>  
  </file>
</encspot>
"""
tree = et.fromstring(sxml)  # root <encspot> element

for el in tree.findall('file'):
    print('-------------------')
    for ch in el:  # iterate children directly; getchildren() was removed in Python 3.9
        print('{:>15}: {:<30}'.format(ch.tag, ch.text))

print("\nan alternate way:")
el = tree.find('file[2]/Name')  # XPath: Name of the second <file>
print('{:>15}: {:<30}'.format(el.tag, el.text))

Output:

-------------------
           Name: some filename.mp3             
        Encoder: Gogo (after 3.0)              
        Bitrate: 131                           
-------------------
           Name: another filename.mp3          
        Encoder: iTunes                        
        Bitrate: 128                           

an alternate way:
           Name: another filename.mp3  
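
The question mentions an XML file, while the snippet above parses a string. As a minimal sketch, assuming the data lives on disk, et.parse() can read it and getroot() returns the same kind of root element that fromstring() gives. The temporary file and the shortened sample here are only stand-ins so the example runs on its own:

```python
import os
import tempfile
import xml.etree.ElementTree as et

# Shortened, hypothetical sample in the same shape as the question's XML.
SAMPLE = """<encspot>
  <file>
   <Name>some filename.mp3</Name>
   <Bitrate>131</Bitrate>
  </file>
  <file>
   <Name>another filename.mp3</Name>
   <Bitrate>128</Bitrate>
  </file>
</encspot>
"""

# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as fh:
    fh.write(SAMPLE)
    path = fh.name

try:
    # parse() returns an ElementTree; getroot() gives the <encspot> element.
    root = et.parse(path).getroot()
    # findtext() returns the text of the first matching child, or None.
    names = [f.findtext("Name") for f in root.findall("file")]
finally:
    os.remove(path)

print(names)
```

With a real file, only the et.parse(path).getroot() line is needed; the rest of the answer's code works unchanged on the returned root.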

If what attracts you to a regex is terseness, here is an equally incomprehensible bit of list comprehension to create a data structure:

[(ch.tag, ch.text) for e in tree.findall('file') for ch in e]

This creates a list of tuples of the XML children of <file> in document order:

[('Name', 'some filename.mp3'), 
 ('Encoder', 'Gogo (after 3.0)'), 
 ('Bitrate', '131'), 
 ('Name', 'another filename.mp3'), 
 ('Encoder', 'iTunes'), 
 ('Bitrate', '128')]

With a few more lines and a little more thought, you can create any data structure you want from XML with ElementTree. It is part of the Python standard library.
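
As one possible illustration of "a few more lines", the sketch below builds the list of dictionaries the question asks for and converts the numeric fields along the way. The helper name file_to_record and the choice of which fields to convert are assumptions made for this example, not something the answer specifies:

```python
import xml.etree.ElementTree as et

# Sample in the shape of the question's XML (trimmed for brevity).
SAMPLE = """<encspot>
  <file>
   <Name>some filename.mp3</Name>
   <Bitrate>131</Bitrate>
   <Size>5,236,644</Size>
  </file>
  <file>
   <Name>another filename.mp3</Name>
   <Bitrate>128</Bitrate>
  </file>
</encspot>
"""


def file_to_record(file_el):
    """Turn one <file> element into a dict (hypothetical helper)."""
    # Empty elements have text None, so fall back to an empty string.
    record = {child.tag: (child.text or "").strip() for child in file_el}
    # Assumed conversions: Bitrate is an integer, Size uses comma separators.
    if "Bitrate" in record:
        record["Bitrate"] = int(record["Bitrate"])
    if "Size" in record:
        record["Size"] = int(record["Size"].replace(",", ""))
    return record


root = et.fromstring(SAMPLE)
records = [file_to_record(f) for f in root.findall("file")]
print(records)
```

Keeping the conversion in a small function makes it easy to add further fields (for example Length or Frames) without touching the parsing loop.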

Edit

Code golf is on!

[{item.tag: item.text for item in ch} for ch in tree.findall('file')] 
[ {'Bitrate': '131', 
   'Name': 'some filename.mp3', 
   'Encoder': 'Gogo (after 3.0)'}, 
  {'Bitrate': '128', 
   'Name': 'another filename.mp3', 
   'Encoder': 'iTunes'}]

If your XML only has the file section, you can take your pick of the golfed one-liners. If your XML has other tags or other sections, you need to account for the section the children are in, and you will need to use findall.
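
To make that distinction concrete, here is a small sketch with a made-up document in which <summary> and <archive> are hypothetical extra sections. It compares findall('file'), which matches only direct children of the root, with findall('.//file'), which matches at any depth, and with iter('Name'), which ignores sections entirely:

```python
import xml.etree.ElementTree as et

# Hypothetical XML: <summary> and <archive> are invented for illustration.
SAMPLE = """<encspot>
  <summary>
    <Name>scan report</Name>
  </summary>
  <file>
    <Name>some filename.mp3</Name>
    <Bitrate>131</Bitrate>
  </file>
  <archive>
    <file>
      <Name>old filename.mp3</Name>
      <Bitrate>96</Bitrate>
    </file>
  </archive>
</encspot>
"""

root = et.fromstring(SAMPLE)

# 'file' matches only <file> elements that are direct children of the root.
top_level = [{c.tag: c.text for c in f} for f in root.findall("file")]

# './/file' matches <file> elements at any depth, in document order.
everywhere = [{c.tag: c.text for c in f} for f in root.findall(".//file")]

# iter('Name') walks every <Name>, including the one inside <summary>.
all_names = [e.text for e in root.iter("Name")]

print(top_level)
print(everywhere)
print(all_names)
```

Scoping the search with findall keeps unrelated tags such as the <Name> under <summary> out of the result, which a flat walk over all children would not do.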

Effbot.org
