Why is subtraction faster than addition in Python?

Question

I was optimising some Python code, and tried the following experiment:

import time

start = time.clock()
x = 0
for i in range(10000000):
    x += 1
end = time.clock()

print '+=',end-start

start = time.clock()
x = 0
for i in range(10000000):
    x -= -1
end = time.clock()

print '-=',end-start

The second loop is reliably faster, anywhere from a whisker to 10%, depending on the system I run it on. I've tried varying the order of the loops, number of executions etc, and it still seems to work.
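Note that `time.clock` was removed in Python 3.8, so a minimal sketch of the same experiment today would use `timeit`, which handles repetition and avoids some measurement pitfalls (on a modern interpreter the two loops may show no difference, or the opposite one):

```python
import timeit

# Time the two equivalent loops; smaller range, repeated, to keep it quick.
t_add = timeit.timeit("x = 0\nfor i in range(1000000): x += 1", number=5)
t_sub = timeit.timeit("x = 0\nfor i in range(1000000): x -= -1", number=5)
print(f"+= {t_add:.3f}s  -= {t_sub:.3f}s")
```

The absolute numbers depend on the machine and CPython version; only the relative difference between the two loops matters here.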

Stranger still,

for i in range(10000000, 0, -1):

(that is, running the loop backwards) is faster than

for i in range(10000000):

even though the contents of the loop are identical.

What gives, and is there a more general programming lesson here?
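The forward-versus-backward claim can be checked the same way; a minimal sketch (on modern CPython the two directions usually differ by no more than measurement noise):

```python
import timeit

# Time an empty forward and backward loop over the same number of iterations.
fwd = timeit.timeit("for i in range(1000000): pass", number=5)
bwd = timeit.timeit("for i in range(1000000, 0, -1): pass", number=5)
print(f"forward {fwd:.3f}s  backward {bwd:.3f}s")
```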

Answer

I can reproduce this on my Q6600 (Python 2.6.2); increasing the range to 100000000:

('+=', 11.370000000000001)
('-=', 10.769999999999998)

First, some observations:


  • This is 5% for a trivial operation. That's significant.
  • The speed of the native addition and subtraction opcodes is irrelevant. It's in the noise floor, completely dwarfed by the bytecode evaluation: one or two native instructions amid thousands.
  • The bytecode generates exactly the same number of instructions; the only difference is INPLACE_ADD vs. INPLACE_SUBTRACT and +1 vs -1.
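The claim about identical instruction counts can be checked with the `dis` module; a minimal sketch (the opcode names vary by version: `INPLACE_ADD`/`INPLACE_SUBTRACT` on Python 2, a single `BINARY_OP` with different arguments on CPython 3.11+):

```python
import dis

def add_loop():
    x = 0
    for i in range(10):
        x += 1
    return x

def sub_loop():
    x = 0
    for i in range(10):
        x -= -1
    return x

# Both functions should compile to the same number of instructions,
# differing only in the in-place operation and the constant (1 vs -1).
add_ops = [ins.opname for ins in dis.get_instructions(add_loop)]
sub_ops = [ins.opname for ins in dis.get_instructions(sub_loop)]
print(len(add_ops), len(sub_ops))
```

Running `dis.dis(add_loop)` and `dis.dis(sub_loop)` side by side makes the single-opcode difference visible directly.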

Looking at the Python source, I can make a guess. This is handled in ceval.c, in PyEval_EvalFrameEx. INPLACE_ADD has a significant extra block of code, to handle string concatenation. That block doesn't exist in INPLACE_SUBTRACT, since you can't subtract strings. That means INPLACE_ADD contains more native code. Depending (heavily!) on how the code is being generated by the compiler, this extra code may be inline with the rest of the INPLACE_ADD code, which means additions can hit the instruction cache harder than subtraction. This could be causing extra L2 cache hits, which could cause a significant performance difference.

This is heavily dependent on the system you're on (different processors have different amounts of cache and cache architectures), the compiler in use, including the particular version and compilation options (different compilers will decide differently which bits of code are on the critical path, which determines how assembly code is lumped together), and so on.

Also, the difference is reversed in Python 3.0.1 (+: 15.66, -: 16.71); no doubt this critical function has changed a lot.
