Saving an 'lxml.etree._ElementTree' object

Question

I've spent the last couple of days getting to grips with the basics of lxml; in particular using lxml.html to parse websites and create an ElementTree of the content. Ideally, I want to save the returned ElementTree so that I can load it up and experiment with it, without having to parse the website every time I modify my script. I assumed that pickling would be the way to go, however I'm now beginning to wonder. Although I am able to retrieve an ElementTree object after pickling...

type(myObject) 

returns

<class 'lxml.etree._ElementTree'>

the object itself appears to be 'empty', since none of the subsequent method/attribute calls I make on it yield any output.

My guess is that pickling isn't appropriate here, but can anyone suggest an alternative?

(In case it matters, the above is happening with Python 3.2, lxml 2.3.2, on Snow Leopard.)

Recommended answer

You are already dealing with XML, and lxml is great at parsing XML. So I think the simplest thing to do would be to serialize to XML:

To write to a file:

import lxml.etree as ET

filename = '/tmp/test.xml'
myobject.write(filename)

To call the write method, note that myobject must be an lxml.etree._ElementTree. If it is an lxml.etree._Element, then you would need myobject.getroottree().write(filename).
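As a minimal sketch of that distinction, the snippet below builds a small tree by hand and writes it to disk starting from an element. The element names and the temporary file path are made up for illustration and are not from the original question.

```python
import os
import tempfile

import lxml.etree as ET

# Hypothetical sample data: a tiny document built in memory.
root = ET.Element("page")
ET.SubElement(root, "title").text = "Example"

# An _Element has no write() method; the _ElementTree that owns it does.
tree = root.getroottree()

# Illustrative output path inside a fresh temporary directory.
filename = os.path.join(tempfile.mkdtemp(), "test.xml")
tree.write(filename)

# Read the file back as bytes to confirm what was saved.
with open(filename, "rb") as f:
    saved = f.read()
```

If you already hold the tree returned by a parse call, you can call write on it directly and skip the getroottree() step.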

To parse from file name/path, file object, or URL:

myobject = ET.parse(file_or_url)

To parse from a string:

myobject = ET.fromstring(content)
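One detail worth keeping in mind: parse returns an _ElementTree, whereas fromstring returns the root _Element. The sketch below illustrates the difference using an in-memory byte string as a stand-in for a saved file. The sample markup is invented for the example.

```python
import io

import lxml.etree as ET

# Hypothetical sample content standing in for a previously saved document.
content = b"<page><title>Example</title></page>"

# fromstring() hands back the root element, not a tree.
root = ET.fromstring(content)

# parse() accepts a file-like object (or a path or URL) and returns a tree.
tree = ET.parse(io.BytesIO(content))

# Serialising the tree back to bytes completes the round trip.
round_tripped = ET.tostring(tree)
```

So if the result of fromstring needs to be written out again, getroottree() gives you the tree to call write on.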
