Summing with a for loop faster than with reduce?


Question

I wanted to see how much faster reduce is than a for loop for simple numerical operations. Here's what I found (using the standard timeit library):

In [54]: print(setup)
from operator import add, iadd
r = range(100)

In [55]: print(stmt1)    
c = 0
for i in r:
    c+=i        

In [56]: timeit(stmt1, setup)
Out[56]: 8.948904991149902

In [58]: print(stmt3)
reduce(add, r)    

In [59]: timeit(stmt3, setup)
Out[59]: 13.316915035247803

Looking a bit further:

In [68]: timeit("1+2", setup)
Out[68]: 0.04145693778991699

In [69]: timeit("add(1,2)", setup)
Out[69]: 0.22807812690734863

What's going on here? Obviously, reduce does loop faster than for, but the function call seems to dominate. Shouldn't the reduce version run almost entirely in C? Using iadd(c, i) in the for loop version makes it run in ~24 seconds. Why would operator.add be so much slower than +? I was under the impression that + and operator.add run the same C code (I checked to make sure operator.add wasn't just calling + in Python or anything like that).

BTW, just using sum runs in ~2.3 seconds.

In [70]: print(sys.version)
2.7.1 (r271:86882M, Nov 30 2010, 09:39:13) 
[GCC 4.0.1 (Apple Inc. build 5494)]

Recommended answer

reduce(add, r) must invoke the add() function once for every element after the first (99 calls for range(100), since the first element seeds the result), so the overhead of the function calls adds up. reduce uses PyEval_CallObject to invoke add on each iteration:

for (;;) {
    ...
    if (result == NULL)
        result = op2;
    else {
        /* Here it packs the previous result and the next value from
           range(100) into the args tuple that is passed to add(): */
        PyTuple_SetItem(args, 0, result);
        PyTuple_SetItem(args, 1, op2);
        if ((result = PyEval_CallObject(func, args)) == NULL)
            goto Fail;
    }
}
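The comparison from the question can be reproduced with a small script. This is only a sketch adapted to Python 3 (an assumption; the question used Python 2.7, where reduce is a builtin), and the helper names loop_sum and reduce_sum are made up for illustration. Absolute timings depend on the interpreter version and machine.

```python
from functools import reduce  # reduce is a builtin only in Python 2
from operator import add
from timeit import timeit

r = range(100)


def loop_sum(values):
    """Sum with an explicit for loop; each addition is inline bytecode."""
    total = 0
    for value in values:
        total += value
    return total


def reduce_sum(values):
    """Sum with reduce; add() is called once per element after the first."""
    return reduce(add, values)


if __name__ == "__main__":
    # Time each approach on the same input; sum() is the builtin.
    for fn in (loop_sum, reduce_sum, sum):
        seconds = timeit(lambda: fn(r), number=20000)
        print("%-10s %.4f s" % (fn.__name__, seconds))
```

All three compute the same value; only the cost of each step differs (inline bytecode, a Python-level call into add, or a C loop inside sum).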

Update: in response to a question in the comments.

When you type 1 + 2 in Python source code, the bytecode compiler performs the addition at compile time and replaces that expression with the constant 3:

import byteplay  # third-party bytecode inspection library (Python 2)

f1 = lambda: 1 + 2
c1 = byteplay.Code.from_code(f1.func_code)
print c1.code

1           1 LOAD_CONST           3
            2 RETURN_VALUE         
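The same check can be done without byteplay by using the standard dis module. The sketch below assumes Python 3 on CPython; the opcode names it prints vary between versions, so it only tests whether any run-time binary operation is left in the bytecode.

```python
import dis

f1 = lambda: 1 + 2

# Collect the opcode names the compiler emitted for the lambda body.
opnames = [ins.opname for ins in dis.get_instructions(f1)]
print(opnames)

# If the constant was folded at compile time, no BINARY_* opcode remains.
has_binary_op = any(name.startswith("BINARY_") for name in opnames)
print("addition happens at run time:", has_binary_op)
```

This is why timing "1+2" in the question measures almost nothing: there is no addition left to perform when the statement runs.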

If you add two variables, a + b, the compiler generates bytecode that loads the two variables and performs a BINARY_ADD, which is far faster than calling a function to perform the addition:

f2 = lambda a, b: a + b
c2 = byteplay.Code.from_code(f2.func_code)
print c2.code

1           1 LOAD_FAST            a
            2 LOAD_FAST            b
            3 BINARY_ADD           
            4 RETURN_VALUE         
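To compare like with like, both forms can be disassembled and timed with variables instead of literals, so that nothing is folded away. This is a sketch assuming Python 3 and the standard dis and timeit modules; the names f_plus, f_call and opnames are illustrative only, and the exact opcode names (for example BINARY_ADD versus BINARY_OP) depend on the CPython version.

```python
import dis
from operator import add
from timeit import timeit

f_plus = lambda a, b: a + b      # inline binary-add bytecode
f_call = lambda a, b: add(a, b)  # goes through a function call


def opnames(func):
    """Return the opcode names in func's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


plus_ops = opnames(f_plus)
call_ops = opnames(f_call)
print(plus_ops)
print(call_ops)

# Time both with variables so the compiler cannot fold the expression.
setup = "from operator import add; a, b = 1, 2"
print("a + b    :", timeit("a + b", setup, number=200000))
print("add(a, b):", timeit("add(a, b)", setup, number=200000))
```

The a + b version contains a binary-operation opcode and no call, while the add(a, b) version has to look up add and go through the call machinery on every evaluation.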
