Is the defaultdict in Python's collections module really faster than using setdefault?

Question

I've seen other Python programmers use defaultdict from the collections module for the following use case:

from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

def main():
    d = defaultdict(list)
    for k, v in s:
        d[k].append(v)

I've typically approached this problem by using setdefault instead:

def main():
    d = {}
    for k, v in s:
        d.setdefault(k, []).append(v)
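
For what it's worth, both versions build the same mapping, so only the speed is in question. A minimal check, written for Python 3 (the function names `group_with_defaultdict` and `group_with_setdefault` are made up for this sketch):

```python
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

def group_with_defaultdict(pairs):
    # Missing keys get a fresh list from the default factory.
    d = defaultdict(list)
    for k, v in pairs:
        d[k].append(v)
    return d

def group_with_setdefault(pairs):
    # setdefault stores the supplied list only if the key is missing.
    d = {}
    for k, v in pairs:
        d.setdefault(k, []).append(v)
    return d

grouped_a = group_with_defaultdict(s)
grouped_b = group_with_setdefault(s)

# A defaultdict compares equal to a plain dict with the same items.
print(grouped_a == grouped_b)  # True
print(grouped_b)  # {'yellow': [1, 3], 'blue': [2, 4], 'red': [1]}
```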

The docs do in fact claim that using defaultdict is faster, but I've seen the opposite when testing it myself:

$ python -mtimeit -s "from withsetdefault import main; s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)];" "main()"
100000 loops, best of 3: 4.51 usec per loop
$ python -mtimeit -s "from withdefaultdict import main; s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)];" "main()"
100000 loops, best of 3: 5.38 usec per loop

Is there something wrong with how I've set up the tests?

For reference, I'm using Python 2.7.3 [GCC 4.2.1 (Apple Inc. build 5666)

Recommended answer

Yes, there is something "wrong":

You have put the creation of the (default)dict into the timed statement instead of the setup. Constructing a new defaultdict is more expensive than constructing a normal dict, and that's usually not the bottleneck you should be profiling in a program. After all, you build your data structures once but use them many times.
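
To see the construction cost on its own, one option is to time only the creation of each container, with no insertions at all. This is a rough sketch for Python 3; the variable names and the iteration count are arbitrary choices for illustration, and the numbers will differ between interpreter versions and machines:

```python
import timeit

# Arbitrary number of iterations for this sketch.
N = 100000

# Time construction only: no keys are ever inserted.
construct_defaultdict = timeit.timeit(
    "defaultdict(list)",
    setup="from collections import defaultdict",
    number=N,
)
construct_dict = timeit.timeit("{}", number=N)

print("defaultdict(list):", construct_defaultdict)
print("{}:               ", construct_dict)
```

With only five insertions per call in the original test, a per-call construction difference of this kind can account for a noticeable share of the measured time.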

If you run your tests as shown below, you'll see that defaultdict operations are indeed faster:

>>> import timeit
>>> setup1 = """from collections import defaultdict
... s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
... d = defaultdict(list)"""
>>> stmt1 = """for k, v in s:
...     d[k].append(v)"""
>>> setup2 = """s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
... d = {}"""
>>> stmt2 = """for k, v in s:
...     d.setdefault(k, []).append(v)"""
>>> timeit.timeit(setup=setup1, stmt=stmt1)
1.0283400125194078
>>> timeit.timeit(setup=setup2, stmt=stmt2)
1.7767367580925395

Python 2.7.3 on Win7 x64.
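
One likely contributor to the per-operation difference: the argument to setdefault is evaluated on every call, so `d.setdefault(k, [])` builds a new empty list even when the key already exists, whereas defaultdict calls its factory only for missing keys. The Python 3 sketch below counts those calls; `counting_factory`, `eager_list`, and `calls` are hypothetical helpers added purely for illustration:

```python
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

# Counters for how often each list-producing helper runs.
calls = {"factory": 0, "eager": 0}

def counting_factory():
    # Called by defaultdict only when a key is missing.
    calls["factory"] += 1
    return []

def eager_list():
    # Called on every loop iteration, because the argument to
    # setdefault is evaluated before the method runs.
    calls["eager"] += 1
    return []

d1 = defaultdict(counting_factory)
for k, v in s:
    d1[k].append(v)

d2 = {}
for k, v in s:
    d2.setdefault(k, eager_list()).append(v)

print(calls)  # {'factory': 3, 'eager': 5}
```

The three distinct keys trigger the factory three times, while the setdefault version creates a list for all five pairs and throws two of them away.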
