为什么 [] 比 list() 快? [英] Why is [] faster than list()?

查看:31
本文介绍了为什么 [] 比 list() 快?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近比较了 []list() 的处理速度,并惊讶地发现 [] 运行更多比 list() 快三倍.我使用 {}dict() 运行了相同的测试,结果几乎相同:[]{} 都需要大约 0.128 秒/百万周期,而 list()dict() 每个大约需要 0.428 秒/百万周期.

这是为什么?[]{}(可能还有 ()'')立即传回一个副本一些空的股票文字,而它们明确命名的对应物(list()dict()tuple()str()) 完全去创建一个对象,不管它们是否真的有元素?

我不知道这两种方法有何不同,但我很想知道.我在文档或 SO 上找不到答案,结果发现搜索空括号比我预期的要麻烦.

我通过调用 timeit.timeit("[]")timeit.timeit("list()")timeit 得到我的计时结果.timeit("{}")timeit.timeit("dict()"),分别用于比较列表和字典.我正在运行 Python 2.7.9.

我最近发现为什么 if True 比 if 1 慢?" 比较了 if Trueif 1 的性能,似乎触及了类似的文字与全局场景;也许也值得考虑.

解决方案

因为 []{}文字语法.Python 可以创建字节码来创建列表或字典对象:

<预><代码>>>>导入文件>>>dis.dis(compile('[]', '', 'eval'))1 0 BUILD_LIST 03 RETURN_VALUE>>>dis.dis(compile('{}', '', 'eval'))1 0 BUILD_MAP 03 RETURN_VALUE

list()dict() 是独立的对象.它们的名称需要解析,必须涉及堆栈以推送参数,必须存储帧以供稍后检索,并且必须进行调用.这一切都需要更多时间.

对于空的情况,这意味着您至少有一个 LOAD_NAME(必须搜索全局命名空间以及 builtins 模块) 后跟一个 CALL_FUNCTION,必须保留当前帧:

<预><代码>>>>dis.dis(compile('list()', '', 'eval'))1 0 LOAD_NAME 0(列表)3 CALL_FUNCTION 06 RETURN_VALUE>>>dis.dis(compile('dict()', '', 'eval'))1 0 LOAD_NAME 0 (字典)3 CALL_FUNCTION 06 RETURN_VALUE

您可以使用 timeit 单独计时名称查找:

<预><代码>>>>导入时间>>>timeit.timeit('list', number=10**7)0.30749011039733887>>>timeit.timeit('dict', number=10**7)0.4215109348297119

时间差异可能是字典哈希冲突.从调用这些对象的时间中减去这些时间,并将结果与​​使用文字的时间进行比较:

<预><代码>>>>timeit.timeit('[]', number=10**7)0.30478692054748535>>>timeit.timeit('{}', number=10**7)0.31482696533203125>>>timeit.timeit('list()', number=10**7)0.9991960525512695>>>timeit.timeit('dict()', number=10**7)1.0200958251953125

因此,每 1000 万次调用需要额外的 1.00 - 0.31 - 0.30 == 0.39 秒.

您可以通过将全局名称别名为本地名称来避免全局查找成本(使用 timeit 设置,您绑定到名称的所有内容都是本地名称):

<预><代码>>>>timeit.timeit('_list', '_list = list', number=10**7)0.1866450309753418>>>timeit.timeit('_dict', '_dict = dict', number=10**7)0.19016098976135254>>>timeit.timeit('_list()', '_list = list', number=10**7)0.841480016708374>>>timeit.timeit('_dict()', '_dict = dict', number=10**7)0.7233691215515137

但你永远无法克服CALL_FUNCTION成本.

I recently compared the processing speeds of [] and list() and was surprised to discover that [] runs more than three times faster than list(). I ran the same test with {} and dict() and the results were practically identical: [] and {} both took around 0.128sec / million cycles, while list() and dict() took roughly 0.428sec / million cycles each.

Why is this? Do [] and {} (and probably () and '', too) immediately pass back a copies of some empty stock literal while their explicitly-named counterparts (list(), dict(), tuple(), str()) fully go about creating an object, whether or not they actually have elements?

I have no idea how these two methods differ but I'd love to find out. I couldn't find an answer in the docs or on SO, and searching for empty brackets turned out to be more problematic than I'd expected.

I got my timing results by calling timeit.timeit("[]") and timeit.timeit("list()"), and timeit.timeit("{}") and timeit.timeit("dict()"), to compare lists and dictionaries, respectively. I'm running Python 2.7.9.

I recently discovered "Why is if True slower than if 1?" that compares the performance of if True to if 1 and seems to touch on a similar literal-versus-global scenario; perhaps it's worth considering as well.

解决方案

Because [] and {} are literal syntax. Python can create bytecode just to create the list or dictionary objects:

>>> import dis
>>> dis.dis(compile('[]', '', 'eval'))
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
>>> dis.dis(compile('{}', '', 'eval'))
  1           0 BUILD_MAP                0
              3 RETURN_VALUE        

list() and dict() are separate objects. Their names need to be resolved, the stack has to be involved to push the arguments, the frame has to be stored to retrieve later, and a call has to be made. That all takes more time.

For the empty case, that means you have at the very least a LOAD_NAME (which has to search through the global namespace as well as the builtins module) followed by a CALL_FUNCTION, which has to preserve the current frame:

>>> dis.dis(compile('list()', '', 'eval'))
  1           0 LOAD_NAME                0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
>>> dis.dis(compile('dict()', '', 'eval'))
  1           0 LOAD_NAME                0 (dict)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        

You can time the name lookup separately with timeit:

>>> import timeit
>>> timeit.timeit('list', number=10**7)
0.30749011039733887
>>> timeit.timeit('dict', number=10**7)
0.4215109348297119

The time discrepancy there is probably a dictionary hash collision. Subtract those times from the times for calling those objects, and compare the result against the times for using literals:

>>> timeit.timeit('[]', number=10**7)
0.30478692054748535
>>> timeit.timeit('{}', number=10**7)
0.31482696533203125
>>> timeit.timeit('list()', number=10**7)
0.9991960525512695
>>> timeit.timeit('dict()', number=10**7)
1.0200958251953125

So having to call the object takes an additional 1.00 - 0.31 - 0.30 == 0.39 seconds per 10 million calls.

You can avoid the global lookup cost by aliasing the global names as locals (using a timeit setup, everything you bind to a name is a local):

>>> timeit.timeit('_list', '_list = list', number=10**7)
0.1866450309753418
>>> timeit.timeit('_dict', '_dict = dict', number=10**7)
0.19016098976135254
>>> timeit.timeit('_list()', '_list = list', number=10**7)
0.841480016708374
>>> timeit.timeit('_dict()', '_dict = dict', number=10**7)
0.7233691215515137

but you never can overcome that CALL_FUNCTION cost.

这篇关于为什么 [] 比 list() 快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆