Really simple way to deal with XML in Python?

Question

Musing over a recently asked question, I started to wonder if there is a really simple way to deal with XML documents in Python. A pythonic way, if you will.

Perhaps I can explain best with an example. Let's say the following, which I think is a good example of how XML is (mis)used in web services, is the response I get from an HTTP request to http://www.google.com/ig/api?weather=94043

<xml_api_reply version="1">
  <weather module_id="0" tab_id="0" mobile_row="0" mobile_zipped="1" row="0" section="0" >
    <forecast_information>
      <city data="Mountain View, CA"/>
      <postal_code data="94043"/>
      <latitude_e6 data=""/>
      <longitude_e6 data=""/>
      <forecast_date data="2010-06-23"/>
      <current_date_time data="2010-06-24 00:02:54 +0000"/>
      <unit_system data="US"/>
    </forecast_information>
    <current_conditions>
      <condition data="Sunny"/>
      <temp_f data="68"/>
      <temp_c data="20"/>
      <humidity data="Humidity: 61%"/>
      <icon data="/ig/images/weather/sunny.gif"/>
      <wind_condition data="Wind: NW at 19 mph"/>
    </current_conditions>
    ...
    <forecast_conditions>
      <day_of_week data="Sat"/>
      <low data="59"/>
      <high data="75"/>
      <icon data="/ig/images/weather/partly_cloudy.gif"/>
      <condition data="Partly Cloudy"/>
    </forecast_conditions>
  </weather>
</xml_api_reply>

After loading/parsing such a document, I would like to be able to access the information as simply as, say,

>>> xml['xml_api_reply']['weather']['forecast_information']['city'].data
'Mountain View, CA'

or

>>> xml.xml_api_reply.weather.current_conditions.temp_f['data']
'68'

From what I have seen so far, ElementTree seems to be the closest to what I dream of. But it is not quite there: there is still some fumbling to do when consuming XML. OTOH, what I have in mind is not that complicated (probably just a thin veneer on top of a parser), and yet it could reduce the annoyance of dealing with XML. Is there such magic? (And if not, why?)

PS. Note I have tried BeautifulSoup already and while I like its approach, it has real issues with empty <element/>s - see below in comments for examples.
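
For reference, the same two lookups with plain ElementTree look roughly like the sketch below. The SAMPLE string is a trimmed copy of the reply above and the variable names are illustrative assumptions, not part of the original question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the weather reply shown above (illustrative sample only).
SAMPLE = """<xml_api_reply version="1">
  <weather module_id="0">
    <forecast_information>
      <city data="Mountain View, CA"/>
    </forecast_information>
    <current_conditions>
      <temp_f data="68"/>
    </current_conditions>
  </weather>
</xml_api_reply>"""

# fromstring() returns the root element, here <xml_api_reply>.
root = ET.fromstring(SAMPLE)

# find() takes a path relative to the element and returns None when
# nothing matches, so every missing node has to be checked by hand.
city = root.find("weather/forecast_information/city").get("data")
temp_f = root.find("weather/current_conditions/temp_f").get("data")

print(city)
print(temp_f)
```

This works, but the path strings and the None checks are the "fumbling" referred to above.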

Solution

You want a thin veneer? That's easy to cook up. Try the following trivial wrapper around ElementTree as a start:

# geetree.py
import xml.etree.ElementTree as ET

class GeeElem(object):
    """Wrapper around an ElementTree element. a['foo'] gets the
       attribute foo, a.foo gets the first subelement foo."""
    def __init__(self, elem):
        self.etElem = elem

    def __getitem__(self, name):
        res = self._getattr(name)
        if res is None:
            # Item access conventionally reports a missing key with KeyError.
            raise KeyError("No attribute named '%s'" % name)
        return res

    def __getattr__(self, name):
        res = self._getelem(name)
        if res is None:
            # AttributeError keeps hasattr() and getattr() defaults working.
            raise AttributeError("No element named '%s'" % name)
        return res

    def _getelem(self, name):
        # find() returns the first matching child, or None.
        res = self.etElem.find(name)
        if res is None:
            return None
        return GeeElem(res)

    def _getattr(self, name):
        return self.etElem.get(name)

class GeeTree(object):
    "Wrapper around an ElementTree."
    def __init__(self, fname):
        self.doc = ET.parse(fname)

    def __getattr__(self, name):
        # The only valid name at document level is the root tag.
        if self.doc.getroot().tag != name:
            raise AttributeError("No element named '%s'" % name)
        return GeeElem(self.doc.getroot())

    def getroot(self):
        return self.doc.getroot()

You invoke it like this:

>>> import geetree
>>> t = geetree.GeeTree('foo.xml')
>>> t.xml_api_reply.weather.forecast_information.city['data']
'Mountain View, CA'
>>> t.xml_api_reply.weather.current_conditions.temp_f['data']
'68'
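
Since the reply in the question arrives over HTTP rather than from a file, a variant that parses a string may be handy. The sketch below is an assumption layered on the same idea, not part of the wrapper above: the names Node, from_string and SAMPLE are hypothetical, and the synthetic "holder" parent is just one way to make the root tag reachable as an attribute.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical compact wrapper: node.child descends, node['attr'] reads an attribute."""

    def __init__(self, elem):
        self._elem = elem

    def __getattr__(self, name):
        # Only called when normal lookup fails; the underscore guard
        # avoids recursion if _elem itself is ever missing.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._elem.find(name)
        if child is None:
            raise AttributeError("No element named '%s'" % name)
        return Node(child)

    def __getitem__(self, name):
        value = self._elem.get(name)
        if value is None:
            raise KeyError(name)
        return value


def from_string(text):
    """Parse XML text and return a Node whose only child is the document root."""
    holder = ET.Element("holder")  # synthetic parent, an assumption of this sketch
    holder.append(ET.fromstring(text))
    return Node(holder)


# Trimmed, illustrative copy of the weather reply from the question.
SAMPLE = """<xml_api_reply version="1">
  <weather module_id="0">
    <forecast_information><city data="Mountain View, CA"/></forecast_information>
    <current_conditions><temp_f data="68"/></current_conditions>
  </weather>
</xml_api_reply>"""

doc = from_string(SAMPLE)
print(doc.xml_api_reply.weather.forecast_information.city["data"])
print(doc.xml_api_reply.weather.current_conditions.temp_f["data"])
```

Like the wrapper above, this only ever returns the first matching child, so repeated elements such as forecast_conditions would still need findall() on the underlying element.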
