Why are explicit calls to magic methods slower than "sugared" syntax?

Question
I was messing around with a small custom data object that needs to be hashable, comparable, and fast, when I ran into an odd-looking set of timing results. Some of the comparisons (and the hashing method) for this object simply delegate to an attribute, so I was using something like:
def __hash__(self):
    return self.foo.__hash__()
However, upon testing, I discovered that hash(self.foo) is noticeably faster. Curious, I tested __eq__, __ne__, and the other magic comparisons, only to discover that all of them ran faster if I used the sugary forms (==, !=, <, etc.). Why is this? I assumed the sugared form would have to make the same function call under the hood, but perhaps this isn't the case?
Timeit results
Setups: thin wrappers around an instance attribute that controls all the comparisons.
Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:13:51) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>>
>>> sugar_setup = '''\
... import datetime
... class Thin(object):
... def __init__(self, f):
... self._foo = f
... def __hash__(self):
... return hash(self._foo)
... def __eq__(self, other):
... return self._foo == other._foo
... def __ne__(self, other):
... return self._foo != other._foo
... def __lt__(self, other):
... return self._foo < other._foo
... def __gt__(self, other):
... return self._foo > other._foo
... '''
>>> explicit_setup = '''\
... import datetime
... class Thin(object):
... def __init__(self, f):
... self._foo = f
... def __hash__(self):
... return self._foo.__hash__()
... def __eq__(self, other):
... return self._foo.__eq__(other._foo)
... def __ne__(self, other):
... return self._foo.__ne__(other._foo)
... def __lt__(self, other):
... return self._foo.__lt__(other._foo)
... def __gt__(self, other):
... return self._foo.__gt__(other._foo)
... '''
Tests
My custom object is wrapping a datetime, so that's what I used, but it shouldn't make any difference. Yes, I'm creating the datetimes within the tests, so there's obviously some associated overhead there, but that overhead is constant from one test to the next, so it shouldn't make a difference. I've omitted the __ne__ and __gt__ tests for brevity, but those results were essentially identical to the ones shown here.
>>> test_hash = '''\
... for i in range(1, 1000):
... hash(Thin(datetime.datetime.fromordinal(i)))
... '''
>>> test_eq = '''\
... for i in range(1, 1000):
... a = Thin(datetime.datetime.fromordinal(i))
... b = Thin(datetime.datetime.fromordinal(i+1))
... a == a # True
... a == b # False
... '''
>>> test_lt = '''\
... for i in range(1, 1000):
... a = Thin(datetime.datetime.fromordinal(i))
... b = Thin(datetime.datetime.fromordinal(i+1))
... a < b # True
... b < a # False
... '''
Results
>>> min(timeit.repeat(test_hash, explicit_setup, number=1000, repeat=20))
1.0805227295846862
>>> min(timeit.repeat(test_hash, sugar_setup, number=1000, repeat=20))
1.0135617737162192
>>> min(timeit.repeat(test_eq, explicit_setup, number=1000, repeat=20))
2.349765956168767
>>> min(timeit.repeat(test_eq, sugar_setup, number=1000, repeat=20))
2.1486044757355103
>>> min(timeit.repeat(test_lt, explicit_setup, number=500, repeat=20))
1.156479287717275
>>> min(timeit.repeat(test_lt, sugar_setup, number=500, repeat=20))
1.0673696685109917
- Hash:
- Explicit: 1.0805227295846862
- Sugared: 1.0135617737162192
- Equal:
- Explicit: 2.349765956168767
- Sugared: 2.1486044757355103
- Less Than:
- Explicit: 1.156479287717275
- Sugared: 1.0673696685109917
Two reasons:

- The API lookups look at the type only. They don't look at self.foo.__hash__, they look for type(self.foo).__hash__. That's one less dictionary to look in.
- The C slot lookup is faster than a pure-Python attribute lookup (which would go through __getattribute__); instead, looking up the method object (including the descriptor binding) is done entirely in C, bypassing __getattribute__.
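The type-only lookup is easy to observe directly. This minimal sketch (not from the original answer) shows that the hash() builtin ignores a __hash__ attribute stored in the instance dict, while an explicit attribute call finds it:

```python
class Plain:
    pass

obj = Plain()
obj.__hash__ = lambda: 42  # shadow the method on the instance, not the type

# Explicit attribute lookup checks the instance dict first, so it finds the lambda.
print(obj.__hash__())  # 42

# The hash() builtin goes straight to type(obj).__hash__, skipping the instance,
# so it falls back to object's default id-based hash.
print(hash(obj) == object.__hash__(obj))  # True
```

This also explains why special methods must be defined on the class to take effect: the implicit invocation never consults the instance at all.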
So you'd have to cache the type(self._foo).__hash__ lookup locally, and even then the call would not be as fast as from C code. Just stick to the standard library functions if speed is at a premium.
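For illustration, here is a sketch of what that local caching could look like; CachedThin and _foo_hash are hypothetical names introduced here, not part of the question's code, and as noted above this still won't beat hash(self._foo):

```python
import datetime

class CachedThin(object):
    # Cache the type-level slot lookup once, at class-definition time.
    _foo_hash = datetime.datetime.__hash__

    def __init__(self, f):
        self._foo = f

    def __hash__(self):
        # Call the cached unbound method; no per-call attribute lookup on _foo.
        return CachedThin._foo_hash(self._foo)

d = datetime.datetime.fromordinal(1)
print(hash(CachedThin(d)) == hash(d))  # True: same underlying hash value
```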
Another reason to avoid calling the magic methods directly is that the comparison operators do more than just call one magic method; the methods have reflected versions too. For x < y, if x.__lt__ isn't defined or x.__lt__(y) returns the NotImplemented singleton, y.__gt__(x) is consulted as well.
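A minimal demonstration of that reflected fallback (the class names here are invented for the example): a direct x.__lt__(y) call exposes the NotImplemented sentinel, while the < operator goes on to try the other operand's __gt__:

```python
class Stubborn:
    def __lt__(self, other):
        return NotImplemented  # refuse to compare

class Accommodating:
    def __gt__(self, other):
        return True  # the reflected method answers for the other operand

x = Stubborn()
y = Accommodating()

print(x.__lt__(y))  # NotImplemented -- the direct call stops here
print(x < y)        # True -- the operator fell back to y.__gt__(x)
```

So code that calls __lt__ directly doesn't just pay a speed penalty; it can also get a different (wrong) result than the operator would.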