处理 Python 字典的成本有多高? [英] How expensive are Python dictionaries to handle?

查看:14
本文介绍了处理 Python 字典的成本有多高?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

正如标题所述,处理 Python 词典的成本有多高?创建、插入、更新、删除,应有尽有.

As the title states, how expensive are Python dictionaries to handle? Creation, insertion, updating, deletion, all of it.

渐近时间复杂度本身很有趣,但它们与例如元组或普通列表.

Asymptotic time complexities are interesting themselves, but also how they compare to e.g. tuples or normal lists.

推荐答案

dicts (就像 sets 当你不需要为每个键关联一个值时但只需记录密钥是否存在)都经过了大量优化.从 N 个键或键/值对创建 dictO(N),取为 O(1),put 分摊 O(1) 等等.对于任何非微型容器来说,真的没有比这更好的事情了!

dicts (just like sets when you don't need to associate a value to each key but simply record if a key is present or absent) are pretty heavily optimized. Creating a dict from N keys or key/value pairs is O(N), fetching is O(1), putting is amortized O(1), and so forth. Can't really do anything substantially better for any non-tiny container!

对于小型容器,您可以使用基于 timeit 的基准轻松检查边界.例如:

For tiny containers, you can easily check the boundaries with timeit-based benchmarks. For example:

$ python -mtimeit -s'empty=()' '23 in empty'
10000000 loops, best of 3: 0.0709 usec per loop
$ python -mtimeit -s'empty=set()' '23 in empty'
10000000 loops, best of 3: 0.101 usec per loop
$ python -mtimeit -s'empty=[]' '23 in empty'
10000000 loops, best of 3: 0.0716 usec per loop
$ python -mtimeit -s'empty=dict()' '23 in empty'
10000000 loops, best of 3: 0.0926 usec per loop

这表明检查空列表或元组中的成员资格比检查空集或字典中的成员资格要快 20-30 纳秒;当每一纳秒都很重要时,这些信息可能与您有关.向上移动一点...:

this shows that checking membership in empty lists or tuples is faster, by a whopping 20-30 nanoseconds, than checking membership in empty sets or dicts; when every nanosecond matters, this info might be relevant to you. Moving up a bit...:

$ python -mtimeit -s'empty=range(7)' '23 in empty'
1000000 loops, best of 3: 0.318 usec per loop
$ python -mtimeit -s'empty=tuple(range(7))' '23 in empty'
1000000 loops, best of 3: 0.311 usec per loop
$ python -mtimeit -s'empty=set(range(7))' '23 in empty'
10000000 loops, best of 3: 0.109 usec per loop
$ python -mtimeit -s'empty=dict.fromkeys(range(7))' '23 in empty'
10000000 loops, best of 3: 0.0933 usec per loop

您会看到,对于 7 项容器(不包括感兴趣的容器),性能平衡发生了变化,现在 dicts 和 set 具有数百纳秒的优势.当感兴趣的项目存在时:

you see that for 7-items containers (not including the one of interest) the balance of performance has shifted, and now dicts and sets have the advantages by HUNDREDS of nanoseconds. When the item of interest IS present:

$ python -mtimeit -s'empty=range(7)' '5 in empty'
1000000 loops, best of 3: 0.246 usec per loop
$ python -mtimeit -s'empty=tuple(range(7))' '5 in empty'
1000000 loops, best of 3: 0.25 usec per loop
$ python -mtimeit -s'empty=dict.fromkeys(range(7))' '5 in empty'
10000000 loops, best of 3: 0.0921 usec per loop
$ python -mtimeit -s'empty=set(range(7))' '5 in empty'
10000000 loops, best of 3: 0.112 usec per loop

dicts 和 set 不会获得太多,但是 tuples 和 list 可以,尽管 dicts 和 set 仍然要快得多.

dicts and sets don't gain much, but tuples and list do, even though dicts and set remain vastly faster.

依此类推——timeit 使运行微基准测试变得非常容易(严格来说,仅适用于纳秒很重要的极少数情况,但是,很容易这样做,检查其他情况并没有太大困难;-)

And so on, and so forth -- timeit makes it trivially easy to run micro-benchmarks (strictly speaking, warranted only for those exceedingly rare situations where nanoseconds DO matter, but, easy enough to do, that it's no big hardship to check for OTHER cases;-).

这篇关于处理 Python 字典的成本有多高?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆