OrderedDict vs Dict in Python

Question
In Tim Peters' answer to "Are there any reasons not to use an ordered dictionary?", he says:
OrderedDict is a subclass of dict.
It's not a lot slower, but at least doubles the memory over using a plain dict.
Now, while going through a particular question, I tried some sample checks using ipython, and both of them contradict the earlier reasoning:

- both dict and OrderedDict are of the same size
- operating on an OrderedDict easily takes around 7-8 times longer than operating on a dict (hence a lot slower)

Can someone explain to me where I'm going wrong in my reasoning?
Create a large Dict and OrderedDict and compare sizes:
import sys
import random
from collections import OrderedDict

test_dict = {}
test_ordered_dict = OrderedDict()

for key in range(10000):
    test_dict[key] = random.random()
    test_ordered_dict[key] = random.random()

sys.getsizeof(test_dict)
786712
sys.getsizeof(test_ordered_dict)
786712
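For what it's worth, this equality no longer holds on Python 3: there OrderedDict defines its own __sizeof__ (and maintains a doubly linked list internally), so the two containers report different sizes. A minimal re-run of the same check on a Python 3 interpreter, as a sketch:

```python
import sys
import random
from collections import OrderedDict

test_dict = {}
test_ordered_dict = OrderedDict()

for key in range(10000):
    test_dict[key] = random.random()
    test_ordered_dict[key] = random.random()

# On Python 3, OrderedDict.__sizeof__ accounts for its linked-list
# bookkeeping, so it reports a noticeably larger footprint than dict.
print(sys.getsizeof(test_dict))
print(sys.getsizeof(test_ordered_dict))
```

The exact byte counts vary between CPython versions; the point is only that the ordered variant reports strictly more.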
Check the time taken for the insertions using %timeit:
import sys
import random
from collections import OrderedDict

def operate_on_dict(r):
    test_dict = {}
    for key in range(r):
        test_dict[key] = random.random()

def operate_on_ordered_dict(r):
    test_ordered_dict = OrderedDict()
    for key in range(r):
        test_ordered_dict[key] = random.random()

%timeit for x in range(100): operate_on_ordered_dict(100)
100 loops, best of 3: 9.24 ms per loop
%timeit for x in range(100): operate_on_dict(100)
1000 loops, best of 3: 1.23 ms per loop
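The same comparison can be reproduced outside IPython with the standard timeit module (the absolute numbers are machine- and version-dependent; it is the ratio that matters):

```python
import random
import timeit
from collections import OrderedDict

def operate_on_dict(r):
    test_dict = {}
    for key in range(r):
        test_dict[key] = random.random()

def operate_on_ordered_dict(r):
    test_ordered_dict = OrderedDict()
    for key in range(r):
        test_ordered_dict[key] = random.random()

# timeit.timeit returns the total seconds for `number` calls as a float.
t_dict = timeit.timeit(lambda: operate_on_dict(100), number=100)
t_odict = timeit.timeit(lambda: operate_on_ordered_dict(100), number=100)
print(t_dict, t_odict)
```

Note that on Python 3, where OrderedDict is implemented in C, the gap is much smaller than the 7-8x seen above on Python 2's pure-Python implementation.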
I think the problem with the size is due to the fact that there's no __sizeof__ method defined in the Python 2.x implementation of OrderedDict, so it simply falls back to dict's __sizeof__ method.
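This is easy to verify directly: a name defined on the class itself (rather than inherited) shows up in the class's own __dict__. On Python 2 the name is absent from OrderedDict's class dict, which is exactly the fallback described above; on Python 3 (where this sketch runs) it is present:

```python
from collections import OrderedDict

# A method defined directly on a class (not inherited) appears
# in that class's own __dict__.
print('__sizeof__' in OrderedDict.__dict__)  # True on Python 3
print('__sizeof__' in dict.__dict__)         # dict always defines its own
```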
To prove this, I've created a class A that extends list and adds an extra method foo, to check whether that affects the reported size.
class A(list):
    def __getitem__(self, k):
        return list.__getitem__(self, k)

    def foo(self):
        print 'abcde'
>>> a = A(range(1000))
>>> b = list(range(1000))
But sys.getsizeof still returns the same size for both:
>>> sys.getsizeof(a), sys.getsizeof(b)
(9120, 9120)
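This result makes sense once you note where foo actually lives: methods are stored on the class object, not on each instance, so the per-instance size reported by sys.getsizeof is unaffected by how many methods the class defines. A quick check (Python 3 syntax, reusing the names from the example above):

```python
import sys

class A(list):
    def __getitem__(self, k):
        return list.__getitem__(self, k)

    def foo(self):
        print('abcde')

a = A(range(1000))

# foo is an attribute of the class object, not of the instance:
print('foo' in A.__dict__)   # True
print('foo' in vars(a))      # False: the instance dict stays empty
```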
Of course, A is going to be slow because its methods run in Python while list's methods run in pure C.
>>> %%timeit
... for _ in xrange(1000):
...     a[_]
...
1000 loops, best of 3: 449 µs per loop

>>> %%timeit
... for _ in xrange(1000):
...     b[_]
...
10000 loops, best of 3: 52 µs per loop
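This Python-level vs C-level dispatch overhead can likewise be measured without IPython magics (absolute numbers will vary by machine, but the subclass should be several times slower):

```python
import timeit

class A(list):
    # Overriding __getitem__ forces every index operation through
    # a Python-level call instead of list's C implementation.
    def __getitem__(self, k):
        return list.__getitem__(self, k)

a = A(range(1000))
b = list(range(1000))

t_a = timeit.timeit(lambda: [a[i] for i in range(1000)], number=100)
t_b = timeit.timeit(lambda: [b[i] for i in range(1000)], number=100)
print(t_a, t_b)
```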
And this seems to be fixed in Python 3, where there's a well-defined __sizeof__ method now:
def __sizeof__(self):
    sizeof = _sys.getsizeof
    n = len(self) + 1                       # number of links including root
    size = sizeof(self.__dict__)            # instance dictionary
    size += sizeof(self.__map) * 2          # internal dict and inherited dict
    size += sizeof(self.__hardroot) * n     # link objects
    size += sizeof(self.__root) * n         # proxy objects
    return size
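The effect of defining __sizeof__ is easy to demonstrate with a toy pair of subclasses: sys.getsizeof simply calls the object's __sizeof__ and adds a fixed GC-header overhead, so an override changes the reported number directly. A hypothetical sketch:

```python
import sys

class Plain(list):
    pass

class Padded(list):
    # Claim 100 extra bytes on top of whatever list reports.
    def __sizeof__(self):
        return super().__sizeof__() + 100

p = Plain()
q = Padded()

# Both instances have an identical memory layout; only the reported
# size differs, by exactly the 100 bytes claimed in the override.
print(sys.getsizeof(q) - sys.getsizeof(p))  # 100
```

This is the same mechanism Python 3's OrderedDict uses, except its override computes a real estimate of the linked-list bookkeeping instead of a made-up constant.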