Why is converting a list to a set faster than converting a generator to a set?
Question
Here is an example:
>>> from timeit import timeit
>>> print(timeit('[y for y in range(100)]', number=100000))
0.7025867114395824
>>> print(timeit('(y for y in range(100))', number=100000))
0.09295392291478244
>>> print(timeit('set([y for y in range(100)])', number=100000))
1.0864544935180334
>>> print(timeit('set((y for y in range(100)))', number=100000))
1.1277489876506621
This is very confusing. A generator takes less time to create (which is understandable), but why is converting a generator to a set slower than converting a list, when (at least to my knowledge) it should be the opposite?
Recommended answer
First of all, there is no point in timing the creation of a generator expression. Creating a generator doesn't iterate over the contents, so it's very fast. Compare creating a generator expression over one element with creating one over 10 million:
>>> print(timeit('(y for y in range(1))', number=100000))
0.060932624037377536
>>> print(timeit('(y for y in range(10000000))', number=100000))
0.06168231705669314
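The same point can be checked without timing anything. The sketch below uses a hypothetical helper, `noisy` (not part of the original answer), that records every value it is asked to produce, to show that creating the generator expression runs none of its body:

```python
calls = []

def noisy(value):
    # Hypothetical helper: records every value it is asked to produce.
    calls.append(value)
    return value

# Creating the generator expression does not call noisy() at all.
gen = (noisy(y) for y in range(5))
created_calls = list(calls)   # snapshot taken right after creation

first = next(gen)             # runs the body exactly once
after_first = list(calls)

rest = list(gen)              # drains the remaining items

print(created_calls, first, after_first, rest)
```

Work happens only when something asks the generator for a value, which is why the creation timings above do not depend on the size of the range.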
Generators take more time to iterate over than, say, a list object:
>>> from collections import deque
>>> def drain_iterable(it, _deque=deque):
...     # maxlen=0 consumes the iterable while storing nothing
...     _deque(it, maxlen=0)
...
>>> def produce_generator():
...     return (y for y in range(100))
...
>>> print(timeit('drain_iterable(next(generators))',
...              'from __main__ import drain_iterable, produce_generator;'
...              'generators=iter([produce_generator() for _ in range(100000)])',
...              number=100000))
0.5204695729771629
>>> print(timeit('[y for y in range(100)]', number=100000))
0.3088444779859856
Here I tested iteration over the generator expression by discarding all elements as quickly as possible.
That's because a generator is essentially a function that executes until it yields a value, is paused, is activated again for the next value, and is then paused again. See What does the "yield" keyword do? for a good overview. The bookkeeping involved in this process takes time. In contrast, a list comprehension doesn't pay this cost: it does all of its looping without re-activating and de-activating a function for every value produced.
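The pause-and-resume cycle can be made visible with a small sketch. The generator function `counted` and the `events` log are illustrative names, not part of the original answer; `counted` is only a rough hand-written equivalent of a generator expression:

```python
events = []

def counted(limit):
    # Rough equivalent of (y for y in range(limit)) as a generator function.
    for y in range(limit):
        events.append(("yield", y))   # the function pauses at the yield below
        yield y
    events.append(("done", None))     # reached only when the consumer asks again

result = set()
for item in counted(3):
    # Control returns here between every pair of yields.
    events.append(("consumer got", item))
    result.add(item)

for event in events:
    print(event)
```

The log alternates between the generator and its consumer for every single value. That back-and-forth is the per-item overhead that a list comprehension avoids.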
Generators are memory efficient, not execution efficient. They can sometimes save execution time, but usually only because you avoid allocating and deallocating larger blocks of memory.
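A minimal sketch of that trade-off follows. The sizes and repeat counts are arbitrary values chosen for illustration, the timings will vary by machine and Python version, and `sys.getsizeof` reports only the container itself, not the integers it refers to:

```python
import sys
from timeit import timeit

N = 100_000  # assumed size, chosen only for illustration

as_list = [y for y in range(N)]
as_gen = (y for y in range(N))

list_size = sys.getsizeof(as_list)   # grows with N
gen_size = sys.getsizeof(as_gen)     # small, independent of N

# Time building a set from each kind of source.
list_time = timeit('set([y for y in range(1000)])', number=1000)
gen_time = timeit('set(y for y in range(1000))', number=1000)

print(f"list object: {list_size} bytes, generator object: {gen_size} bytes")
print(f"set from list: {list_time:.4f}s, set from generator: {gen_time:.4f}s")
```

The generator object stays tiny no matter how many values it will produce, while the two timings are usually close to each other, as in the question.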