Length of a finite generator
Question
I have these two implementations to compute the length of a finite generator, while keeping the data for further processing:
import itertools


def count_generator1(generator):
    '''- build a list with the generator data
    - get the length of the data
    - return both the length and the original data (in a list)
    WARNING: the memory use is unbounded, and infinite generators will block this'''
    l = list(generator)
    return len(l), l


def count_generator2(generator):
    '''- get two generators from the original generator
    - get the length of the data from one of them
    - return both the length and the original data, as returned by tee
    WARNING: tee can use up an unbounded amount of memory, and infinite generators will block this'''
    for_length, saved = itertools.tee(generator, 2)
    return sum(1 for _ in for_length), saved
Both have drawbacks, both do the job. Could somebody comment on them, or even offer a better alternative?
Recommended answer
If you have to do this, the first method is much better: because you consume all the values, itertools.tee() will have to store all of them anyway, so a list will be more efficient.
To quote the docs:
This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().
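As a rough illustration of the point above, the sketch below runs both approaches on the same small input. The generator squares and the function names count_with_list and count_with_tee are hypothetical, introduced only for this example. It shows that draining one tee branch first still leaves every item available on the other branch, which means tee had to buffer the whole input, just as a list would.

```python
import itertools


def squares(n):
    """Hypothetical finite generator used only for this illustration."""
    for i in range(n):
        yield i * i


def count_with_list(iterable):
    # Materialise everything once; len() on a list is then O(1).
    data = list(iterable)
    return len(data), data


def count_with_tee(iterable):
    # tee buffers every item one branch has produced and the other has not
    # yet consumed, so draining for_length first forces the whole input
    # into tee's internal buffer.
    for_length, saved = itertools.tee(iterable, 2)
    return sum(1 for _ in for_length), saved


list_len, list_data = count_with_list(squares(5))
tee_len, tee_data = count_with_tee(squares(5))
```

Both calls report the same length. The list version hands back data that can be indexed and iterated repeatedly, while the tee version returns a one-shot iterator backed by the same amount of stored data.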