Could sum be faster on lists?

Question

This is a follow-up to an earlier question.

First, you'll notice that you cannot call sum on a list of strings to concatenate them. Python tells you to use str.join instead, and that is good advice: repeatedly applying + to strings performs badly however you arrange it.
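A minimal sketch of that first point, assuming CPython. The list name words is illustrative only; the built-in rejects a str start value, while str.join does the concatenation directly.

```python
words = ["spam", "eggs", "ham"]  # illustrative data, not from the question

error = None
try:
    # sum() refuses to concatenate strings and raises TypeError
    sum(words, "")
except TypeError as exc:
    error = str(exc)
    print("sum() failed:", error)

# str.join is the supported way to concatenate a list of strings
joined = "".join(words)
print(joined)
```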

The "cannot use sum" restriction doesn't apply to lists, although itertools.chain.from_iterable is the preferred way to flatten a list of lists.

But sum(x, []) performs very badly when x is a list of lists.

But should it stay that way?

I compared three approaches:

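import time
import itertools

# 1000 lists of 999 integers each
a = [list(range(1, 1000)) for _ in range(1000)]

# 1. built-in sum() with an empty list as the start value
start = time.time()
sum(a, [])
print(time.time() - start)

# 2. itertools.chain.from_iterable
start = time.time()
list(itertools.chain.from_iterable(a))
print(time.time() - start)

# 3. explicit loop with in-place addition
start = time.time()
z = []
for s in a:
    z += s
print(time.time() - start)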

Results (in seconds):

  • sum() on the list of lists: 10.46647310256958. Okay, we knew.
  • itertools.chain: 0.07705187797546387
  • custom accumulated sum using in-place addition: 0.057044029235839844 (can be faster than itertools.chain, as you see)

So sum is way behind because it performs result = result + b instead of result += b.
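To make the difference concrete, here is a rough sketch of the two accumulation strategies. The function names flatten_with_add, flatten_with_iadd and copies_made are hypothetical helpers written for this illustration, not part of Python. The counting helper only models element copies and ignores allocation details.

```python
def flatten_with_add(lists):
    """Accumulate with `result = result + b`, as described for sum()."""
    result = []
    for b in lists:
        # Builds a brand-new list each time, copying everything gathered so far.
        result = result + b
    return result


def flatten_with_iadd(lists):
    """Accumulate with `result += b`, which extends the same list in place."""
    result = []
    for b in lists:
        result += b
    return result


def copies_made(lists):
    """Count the elements written by the `result = result + b` strategy."""
    copied = 0
    length = 0
    for b in lists:
        length += len(b)   # size of the new list built at this step
        copied += length   # every element of that new list is written
    return copied


data = [[i, i + 1] for i in range(0, 10, 2)]  # five lists of two elements
print(flatten_with_add(data) == flatten_with_iadd(data))
print(copies_made(data), "element copies versus", sum(len(b) for b in data))
```

For k lists of n elements each, the first strategy writes n * k * (k + 1) / 2 elements in total. With the benchmark's 1000 lists of 999 elements that is about 500 million element copies, against 999,000 for the in-place version.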

So now my question:

Why can't sum use this accumulative approach when available?

(That would be transparent to existing applications and would make it possible to use the sum built-in to flatten lists efficiently.)

Recommended answer

We could try to make sum() smarter, but Alex Martelli and Guido van Rossum wanted to keep it focused on arithmetic summations.

FWIW, you should get reasonable performance with this simple code:

result = []
for seq in mylists:
    result += seq

For your other question, "why can't sum use this accumulative approach when available?", see this comment for builtin_sum() in Python/bltinmodule.c:

    /* It's tempting to use PyNumber_InPlaceAdd instead of
       PyNumber_Add here, to avoid quadratic running time
       when doing 'sum(list_of_lists, [])'.  However, this
       would produce a change in behaviour: a snippet like

         empty = []
         sum([[x] for x in range(10)], empty)

       would change the value of empty. */
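The behaviour change that comment describes can be reproduced in pure Python. In the sketch below, sum_inplace is a hypothetical stand-in for a sum() that accumulates with +=; it is not the real built-in and not how CPython is implemented.

```python
def sum_inplace(iterable, start):
    """Hypothetical sum() variant that accumulates with += (illustration only)."""
    result = start
    for item in iterable:
        result += item  # for lists this mutates `start` itself
    return result


# The real built-in leaves the start value untouched.
empty = []
total = sum([[x] for x in range(10)], empty)
print(empty)   # still an empty list

# The in-place variant extends the caller's list as a side effect.
empty2 = []
total2 = sum_inplace([[x] for x in range(10)], empty2)
print(empty2)  # now holds the flattened result
```

With the in-place variant, the returned object and the caller's start list are the same object, which is the side effect the comment warns about.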
