How does Python optimize conditional list comprehensions?
Question
I read about "List comprehension without [ ] in Python", so now I know that
''.join([str(x) for x in mylist])
is faster than
''.join(str(x) for x in mylist)
because "list comprehensions are highly optimized".
So I suppose that the optimization relies on parsing the for expression: Python sees mylist, computes its length, and uses it to pre-allocate the exact array size, which saves a lot of reallocation.
When using ''.join(str(x) for x in mylist), join receives a generator blindly and has to build its list without knowing the size in advance.
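As a small illustration of what "not knowing the size in advance" means, the sketch below asks two iterators for a size estimate with operator.length_hint. It only shows what information each iterator exposes; it makes no claim about what join does with that information.

```python
from operator import length_hint

mylist = [1, 2, 5, 6, 3, 4, 5]

# A list iterator implements __length_hint__, so it can report how many
# items remain.
list_iter = iter(mylist)
list_iter_hint = length_hint(list_iter)

# A generator expression has no __length_hint__, so length_hint falls
# back to its default value of 0.
gen = (str(x) for x in mylist)
gen_hint = length_hint(gen)

print(list_iter_hint, gen_hint)
```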
But now consider this:
mylist = [1,2,5,6,3,4,5]
''.join([str(x) for x in mylist if x < 4])
How does Python decide on the size of the list comprehension? Is it computed from the size of mylist and downsized when the iteration is done (which could be very bad if the list is big and the condition filters out 99% of the elements), or does it revert to the "don't know the size in advance" case?
I've done some small benchmarks, and they seem to confirm that there's an optimization:
Without a condition:
import timeit
print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234]])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234])"))
yields (as expected):
3.11010817019474
3.3457350077491026
With a condition:
print(timeit.timeit("''.join([str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50])"))
print(timeit.timeit("''.join(str(x) for x in [1,5,6,3,5,23,334,23234] if x < 50)"))
yields:
2.7942209702566965
3.0316467566203276
So the conditional list comprehension is still faster.
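For anyone who wants to repeat the measurement, here is a slightly more careful sketch. It wraps both variants in functions, runs timeit.repeat several times, and keeps the minimum. The helper names (join_listcomp, join_genexp, best_time) are made up for this example, and absolute timings will differ between machines and Python versions.

```python
import timeit

data = [1, 5, 6, 3, 5, 23, 334, 23234]

def join_listcomp(values):
    # Builds the full list first, then hands it to join.
    return ''.join([str(x) for x in values if x < 50])

def join_genexp(values):
    # Hands join a generator that it has to consume itself.
    return ''.join(str(x) for x in values if x < 50)

def best_time(func, values, number=100_000, repeat=5):
    # The minimum over several runs is less sensitive to background noise.
    return min(timeit.repeat(lambda: func(values), number=number, repeat=repeat))

if __name__ == "__main__":
    print("listcomp:", best_time(join_listcomp, data))
    print("genexp:  ", best_time(join_genexp, data))
```

Both functions return the same string, so any difference in the timings comes from how the items are produced, not from the result.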
Recommended answer
List comprehensions don't pre-size the list, even when they totally could. You're assuming the presence of an optimization that isn't actually done.
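One rough way to observe this, assuming CPython (sys.getsizeof and the list growth strategy are implementation details), is to compare the memory footprint of a list built by a comprehension with one allocated at its exact length in a single step. This is only a sketch: the exact byte counts vary by version and platform.

```python
import sys

n = 1000
source = list(range(n))

# Built item by item; CPython grows the list in steps as items are
# appended, which usually leaves some spare capacity.
comp = [x for x in source]

# Allocated once with exactly n slots.
exact = [None] * n

comp_size = sys.getsizeof(comp)
exact_size = sys.getsizeof(exact)
print(comp_size, exact_size)
```

On CPython the comprehension's list typically reports more bytes than the exactly sized one, even though both hold the same number of items. That is what you would expect if the list grows incrementally and is not pre-sized from len(source).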
The list comprehension is faster because all the iterator machinery, plus the work of entering and exiting the genexp stack frame, has a cost. The list comprehension doesn't need to pay that cost.