cElementTree clear semantics

Views: 74  Published: 2019/6/4 23:29:15  python

Problem description

Hi,
I am trying to understand how cElementTree's clear works: I have a
(relatively) large XML file, that I do not wish to load into memory.
So, naturally, I tried something like this:

from cElementTree import iterparse
count = 0
for event, elem in iterparse("data.xml"):
    if elem.tag == "schnappi":
        count += 1
        # clear only the <schnappi> elements
        elem.clear()

... which resulted in caching of all elements in memory except for
those named <schnappi> (i.e. the process' memory footprint grew more
and more). Then I thought about clear()'ing all elements that I did not
really need:

from cElementTree import iterparse
count = 0
for event, elem in iterparse("data.xml"):
    if elem.tag == "schnappi":
        count += 1
    # clear every element, whatever its tag
    elem.clear()

... which gave a suitably small memory footprint, *BUT* since
<schnappi> has a number of subelements, and I subscribe to
'end' events, the <schnappi> element is returned after all of its
subelements have been read and clear()'ed. So I do see a
<schnappi> element, but calling its getiterator() gives me completely
empty subelements, which is not what I wanted :(
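
The effect described above can be reproduced on a tiny in-memory document. This is only a sketch under assumptions: it uses the standard library's xml.etree.ElementTree in place of the standalone cElementTree module, the sample document and the helper name children_seen_at_end are made up for illustration, and it iterates over the element directly instead of calling getiterator(), which newer Python versions no longer provide.

```python
import io
import xml.etree.ElementTree as ET  # assumption: stdlib stand-in for cElementTree

# Hypothetical sample document; only the <schnappi> tag comes from the post.
XML = b"<root><schnappi><a>1</a><b>2</b></schnappi></root>"

def children_seen_at_end(clear_everything):
    """Return (tag, text) of the <schnappi> children as seen at its 'end' event."""
    seen = None
    # With no events argument, iterparse reports 'end' events only.
    for event, elem in ET.iterparse(io.BytesIO(XML)):
        if elem.tag == "schnappi":
            seen = [(child.tag, child.text) for child in elem]
        if clear_everything:
            # The 'end' events of <a> and <b> arrive before the one for
            # <schnappi>, so their content is gone by the time we look.
            elem.clear()
    return seen

print(children_seen_at_end(True))
print(children_seen_at_end(False))
```

With clear_everything=True the children are still attached to <schnappi>, but clear() has already reset their text, which matches the "completely empty subelements" observation.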

Finally, I thought about keeping track of when to clear and when not
to by subscribing to start and end events (so that I would collect
the entire <schnappi> subtree in memory and only then release it):

from cElementTree import iterparse
clear_flag = True
for event, elem in iterparse("data.xml", ("start", "end")):
    if event == "start" and elem.tag == "schnappi":
        # start collecting elements
        clear_flag = False
    if event == "end" and elem.tag == "schnappi":
        clear_flag = True
        # do something with elem
    # unless we are collecting elements, clear()
    if clear_flag:
        elem.clear()

This gave me the desired behaviour, but:

* It looks *very* ugly
* It's twice as slow as the version which sees 'end' events only.

Now, there *has* to be a better way. What am I missing?

Thanks in advance,

ivr
--
"...but it's HDTV -- it's got a better resolution than the real world."
-- Fry, "When aliens attack"
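
One pattern that is sometimes used for this kind of problem, and that avoids the flag in the loop above, is to keep a reference to the root element and clear the root each time a complete <schnappi> subtree has been handled. The following is a sketch under assumptions, not something taken from the post: it uses the standard library's xml.etree.ElementTree rather than the standalone cElementTree module, the helper name iter_subtrees and the sample data are hypothetical, and it assumes the target tag is neither the root element nor nested inside itself.

```python
import io
import xml.etree.ElementTree as ET  # assumption: stdlib stand-in for cElementTree

def iter_subtrees(source, tag):
    """Yield each complete <tag> subtree, then release what was parsed so far."""
    context = iter(ET.iterparse(source, events=("start", "end")))
    _, root = next(context)  # the first event is the 'start' of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem    # the children are still intact at this point
            # Detach everything hanging off the root, including elem itself.
            # Note that clear() also resets the root's own attributes and text.
            root.clear()

# Hypothetical sample data for a quick check.
SAMPLE = b"<root><x>junk</x><schnappi><a>1</a></schnappi><schnappi><a>2</a></schnappi></root>"
texts = [elem.find("a").text for elem in iter_subtrees(io.BytesIO(SAMPLE), "schnappi")]
print(texts)
```

This still subscribes to 'start' events in order to get hold of the root, so it may not help with the speed concern raised above; that would need to be measured on the real file.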
