为什么提早归还比别的要慢? [英] Why is early return slower than else?

查看:64
本文介绍了为什么提早归还比别的要慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是对我几天前给出的答案的后续问题. 编辑:该问题的OP似乎已经使用了我发布给他的代码来问同一问题,但我没有意识到.道歉.虽然提供的答案有所不同!

This is a follow-up question to an answer I gave a few days back. it seems that the OP of that question already used the code I posted to him to ask the same question, but I was unaware of it. Apologies. The answers provided are different though!

我基本上观察到:

>>> def without_else(param=False):
...     if param:
...         return 1
...     return 0
>>> def with_else(param=False):
...     if param:
...         return 1
...     else:
...         return 0
>>> from timeit import Timer as T
>>> T(lambda : without_else()).repeat()
[0.3011460304260254, 0.2866089344024658, 0.2871549129486084]
>>> T(lambda : with_else()).repeat()
[0.27536892890930176, 0.2693932056427002, 0.27011704444885254]
>>> T(lambda : without_else(True)).repeat()
[0.3383951187133789, 0.32756996154785156, 0.3279120922088623]
>>> T(lambda : with_else(True)).repeat()
[0.3305950164794922, 0.32186388969421387, 0.3209099769592285]

...或者换句话说:无论是否触发if条件,使用else子句都更快.

...or in other words: having the else clause is faster regardless of the if condition being triggered or not.

我认为这与两者生成的不同字节码有关,但是有人能详细确认/解释吗?

I assume it has to do with different bytecode generated by the two, but is anybody able to confirm/explain in detail?

编辑:似乎并不是每个人都可以重现我的计时,所以我认为在我的系统上提供一些信息可能很有用.我正在运行默认安装了python的Ubuntu 11.10 64位. python生成以下版本信息:

Seems not everybody is able to reproduce my timings, so I thought it might be useful to give some info on my system. I'm running Ubuntu 11.10 64 bit with the default python installed. python generates the following version information:

Python 2.7.2+ (default, Oct  4 2011, 20:06:09) 
[GCC 4.6.1] on linux2

这是Python 2.7中反汇编的结果:

Here are the results of the disassembly in Python 2.7:

>>> dis.dis(without_else)
  2           0 LOAD_FAST                0 (param)
              3 POP_JUMP_IF_FALSE       10

  3           6 LOAD_CONST               1 (1)
              9 RETURN_VALUE        

  4     >>   10 LOAD_CONST               2 (0)
             13 RETURN_VALUE        
>>> dis.dis(with_else)
  2           0 LOAD_FAST                0 (param)
              3 POP_JUMP_IF_FALSE       10

  3           6 LOAD_CONST               1 (1)
              9 RETURN_VALUE        

  5     >>   10 LOAD_CONST               2 (0)
             13 RETURN_VALUE        
             14 LOAD_CONST               0 (None)
             17 RETURN_VALUE        

推荐答案

这是一个纯粹的猜测,我还没有找到一种简单的方法来检查它是否正确,但是我有一个理论适合您.

This is a pure guess, and I haven't figured out an easy way to check whether it is right, but I have a theory for you.

我尝试了您的代码并获得了相同的结果,without_else()反复比with_else()慢一点:

I tried your code and get the same of results, without_else() is repeatedly slightly slower than with_else():

>>> T(lambda : without_else()).repeat()
[0.42015745017874906, 0.3188967452567226, 0.31984281521812363]
>>> T(lambda : with_else()).repeat()
[0.36009842032996175, 0.28962249392031936, 0.2927151355828528]
>>> T(lambda : without_else(True)).repeat()
[0.31709728471076915, 0.3172671387005721, 0.3285821242644147]
>>> T(lambda : with_else(True)).repeat()
[0.30939889008243426, 0.3035132258429485, 0.3046679117038593]

考虑到字节码是相同的,唯一的区别是函数的名称.尤其是时序测试会在全局名称上进行查找.尝试重命名without_else(),区别消失:

Considering that the bytecode is identical, the only difference is the name of the function. In particular the timing test does a lookup on the global name. Try renaming without_else() and the difference disappears:

>>> def no_else(param=False):
    if param:
        return 1
    return 0

>>> T(lambda : no_else()).repeat()
[0.3359846013948413, 0.29025818923918223, 0.2921801513879245]
>>> T(lambda : no_else(True)).repeat()
[0.3810395594970828, 0.2969634408842694, 0.2960104566362247]

我的猜测是without_elseglobals()中的其他内容存在哈希冲突,因此全局名称查找稍微慢一些.

My guess is that without_else has a hash collision with something else in globals() so the global name lookup is slightly slower.

编辑:具有7个或8个键的字典可能具有32个插槽,因此without_else__builtins__发生哈希冲突:

Edit: A dictionary with 7 or 8 keys probably has 32 slots, so on that basis without_else has a hash collision with __builtins__:

>>> [(k, hash(k) % 32) for k in globals().keys() ]
[('__builtins__', 8), ('with_else', 9), ('__package__', 15), ('without_else', 8), ('T', 21), ('__name__', 25), ('no_else', 28), ('__doc__', 29)]

澄清散列的工作原理:

__builtins__散列为-1196389688,这减小了表大小(32)的模数,这意味着它存储在表的#8插槽中.

__builtins__ hashes to -1196389688 which reduced modulo the table size (32) means it is stored in the #8 slot of the table.

without_else哈希到505688136,将模32的模数减为8,因此发生冲突.要解决此Python计算问题,请执行以下操作:

without_else hashes to 505688136 which reduced modulo 32 is 8 so there's a collision. To resolve this Python calculates:

开始于:

j = hash % 32
perturb = hash

重复此操作,直到找到可用的插槽:

Repeat this until we find a free slot:

j = (5*j) + 1 + perturb;
perturb >>= 5;
use j % 2**i as the next table index;

赋予它17作为下一个索引.幸运的是,它是免费的,因此循环仅重复一次.哈希表的大小为2的幂,因此2**i是哈希表的大小,i是哈希值j中使用的位数.

which gives it 17 to use as the next index. Fortunately that's free so the loop only repeats once. The hash table size is a power of 2, so 2**i is the size of the hash table, i is the number of bits used from the hash value j.

对该表的每个探测都可以找到以下之一:

Each probe into the table can find one of these:

  • 插槽为空,在这种情况下,探测停止,我们知道 值不在表格中.

  • The slot is empty, in that case the probing stops and we know the value is not in the table.

该插槽尚未使用,但在过去使用过,在这种情况下,我们可以尝试 下一个如上计算得出的值.

The slot is unused but was used in the past in which case we go try the next value calculated as above.

插槽已满,但表中存储的完整哈希值不完整 与我们要查找的键的哈希值相同(这就是 发生在__builtins__without_else的情况下.)

The slot is full but the full hash value stored in the table isn't the same as the hash of the key we are looking for (that's what happens in the case of __builtins__ vs without_else).

插槽已满,并且具有我们想要的哈希值,然后是Python 检查以查看键和我们要查找的对象是否为 同一对象(在这种情况下,这是因为短字符串 可能是标识符被拘禁,因此使用相同的标识符 完全相同的字符串).

The slot is full and has exactly the hash value we want, then Python checks to see if the key and the object we are looking up are the same object (which in this case they will be because short strings that could be identifiers are interned so identical identifiers use the exact same string).

最后,当插槽已满时,哈希值完全匹配,但是键 不是相同的对象,那么只有这样,Python才会尝试 比较它们是否平等.这比较慢,但是在 名称查询实际上不应该发生.

Finally when the slot is full, the hash matches exactly, but the keys are not the identical object, then and only then will Python try comparing them for equality. This is comparatively slow, but in the case of name lookups shouldn't actually happen.

这篇关于为什么提早归还比别的要慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆