Could sum be faster on lists?
Question
This is a follow-up to an earlier question.
So first, you'll notice that you cannot perform a sum on a list of strings to concatenate them; Python tells you to use str.join instead, and that's good advice because no matter how you use + on strings, the performance is bad.
The "cannot use sum" restriction doesn't apply to list, though itertools.chain.from_iterable is the preferred way to perform such list flattening.
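To make the two recommendations concrete, here is a minimal sketch. The sample data and variable names are made up for illustration; only sum, str.join and itertools.chain.from_iterable come from the discussion above.

```python
import itertools

words = ["a", "b", "c"]          # illustrative data, not from the question
lists = [[1, 2], [3], [4, 5]]    # illustrative data, not from the question

# sum() refuses to concatenate strings and raises TypeError.
error_message = None
try:
    sum(words, "")
except TypeError as exc:
    error_message = str(exc)

# The recommended way to concatenate strings.
joined = "".join(words)

# The recommended way to flatten a list of lists.
flat = list(itertools.chain.from_iterable(lists))
```

sum(lists, []) would return the same flattened list here; the difference only shows up in running time as the input grows.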
But sum(x, []) when x is a list of lists is definitely bad.
But should it stay that way?
I compared 3 approaches:
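import time
import itertools

a = [list(range(1, 1000)) for _ in range(1000)]

# 1. built-in sum() with an empty list as the start value
start = time.time()
sum(a, [])
print(time.time() - start)

# 2. itertools.chain.from_iterable
start = time.time()
list(itertools.chain.from_iterable(a))
print(time.time() - start)

# 3. custom accumulation using in-place addition
start = time.time()
z = []
for s in a:
    z += s
print(time.time() - start)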
Results:
- sum() on the list of lists: 10.46647310256958. Okay, we knew.
- itertools.chain: 0.07705187797546387
- custom accumulated sum using in-place addition: 0.057044029235839844 (can be faster than itertools.chain, as you see)
So sum is way behind because it performs result = result + b instead of result += b.
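The difference between the two accumulation strategies can be sketched in pure Python. Both function names below are hypothetical, and the first is only an approximation of what the built-in does for lists, not its actual C implementation.

```python
def sum_by_rebinding(lists, start):
    """Approximates sum(lists, start): builds a brand-new list at every step."""
    result = start
    for item in lists:
        # list + list allocates a new list and copies both operands,
        # so the growing result is copied again on each iteration.
        result = result + item
    return result


def sum_in_place(lists, start):
    """Hypothetical alternative: extends a single list in place."""
    result = list(start)  # copy so the caller's start value is left alone
    for item in lists:
        result += item    # for lists this only appends the new items
    return result


data = [[i, i + 1] for i in range(5)]  # small illustrative input
rebinding_result = sum_by_rebinding(data, [])
in_place_result = sum_in_place(data, [])
```

With k lists of n items each, the rebinding version copies about n * k * (k + 1) / 2 elements in total, while the in-place version copies about n * k. For the benchmark above (k = 1000, n = 999) that is roughly 500 million element copies against roughly 1 million, which is consistent with the gap in the measured times.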
So now my question:
Why can't sum use this accumulative approach when available?
(That would be transparent for already existing applications and would make it possible to use the sum built-in to flatten lists efficiently.)
Recommended answer
We could try to make sum() smarter, but Alex Martelli and Guido van Rossum wanted to keep it focused on arithmetic summations.
FWIW, you should get reasonable performance with this simple code:
result = []
for seq in mylists:
    result += seq
For your other question, "why can't sum use this accumulative approach when available?", see this comment for builtin_sum() in Python/bltinmodule.c:
/* It's tempting to use PyNumber_InPlaceAdd instead of
   PyNumber_Add here, to avoid quadratic running time
   when doing 'sum(list_of_lists, [])'.  However, this
   would produce a change in behaviour: a snippet like

     empty = []
     sum([[x] for x in range(10)], empty)

   would change the value of empty. */
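The behaviour change described in that comment can be reproduced with a small experiment. The inplace_sum function below is a hypothetical stand-in for a sum() that used in-place addition; it is not part of Python.

```python
def inplace_sum(iterable, start):
    # Hypothetical sum() variant that accumulates with +=.
    result = start
    for item in iterable:
        result += item  # for a list, this mutates the object passed as start
    return result


# The real built-in leaves the start value untouched.
empty = []
total = sum([[x] for x in range(10)], empty)

# The in-place variant silently modifies the caller's list.
empty_in_place = []
total_in_place = inplace_sum([[x] for x in range(10)], empty_in_place)
```

After running this, empty is still an empty list, while empty_in_place holds all ten numbers and is the very same object as total_in_place. Code that reuses a start list across calls would therefore see different results if sum() switched to in-place addition.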