Profiled performance of len(set) vs. set.__len__() in Python 3
Question
While profiling my Python application, I've discovered that len() seems to be very expensive when used on sets. See the code below:
import cProfile

def lenA(s):
    for i in range(1000000):
        len(s)

def lenB(s):
    for i in range(1000000):
        s.__len__()

def main():
    s = set()
    lenA(s)
    lenB(s)

if __name__ == "__main__":
    cProfile.run("main()", "stats")
According to the profiler's stats below, lenA() seems to be 14 times slower than lenB():
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.986 1.986 3.830 3.830 .../lentest.py:5(lenA)
1000000 1.845 0.000 1.845 0.000 {built-in method len}
1 0.273 0.273 0.273 0.273 .../lentest.py:9(lenB)
Am I missing something? Currently I use __len__() instead of len(), but the code looks dirty :(
Answer
Obviously, len has some overhead, since it does a function call and translates AttributeError to TypeError. Also, set.__len__ is such a simple operation that it's bound to be very fast in comparison to just about anything, but I still don't find anything like the 14x difference when using timeit:
In [1]: s = set()
In [2]: %timeit s.__len__()
1000000 loops, best of 3: 197 ns per loop
In [3]: %timeit len(s)
10000000 loops, best of 3: 130 ns per loop
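The overhead mentioned above, a real function call plus error translation, can be observed directly: calling __len__ on an object that lacks it raises AttributeError, while len() on the same object surfaces a TypeError. A minimal sketch (the NoLen class is a made-up illustration):

```python
class NoLen:
    """A class that deliberately defines no __len__."""
    pass

obj = NoLen()

# Direct attribute lookup fails with AttributeError
try:
    obj.__len__()
except AttributeError:
    print("AttributeError from direct __len__ lookup")

# len() reports the same missing method as a TypeError
try:
    len(obj)
except TypeError:
    print("TypeError from len()")
```
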
You should always just call len, not __len__. If the call to len is the bottleneck in your program, you should rethink its design, e.g. cache sizes somewhere or calculate them without calling len.
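As a hypothetical sketch of that caching advice, compute the sizes once outside the hot loop and reuse them, so len() is no longer called per iteration (total_size and the loop count are made-up names for illustration):

```python
def total_size(batches):
    # Cache each batch's size once, up front
    sizes = [len(b) for b in batches]
    total = 0
    for _ in range(1000):       # simulated repeated passes over the data
        total = sum(sizes)      # reuses the cached sizes, no len() calls
    return total

print(total_size([{1, 2}, {3, 4, 5}]))  # 5
```
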