在Python 3.6及更高版本中按位置访问字典项 [英] Accessing dictionary items by position in Python 3.6+ efficiently

查看:128
本文介绍了在Python 3.6及更高版本中按位置访问字典项的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我了解字典是在Python 3.6+中排序的插入,作为3.6中的实现细节,而3.7+中则为官方.

I understand dictionaries are insertion ordered in Python 3.6+, as an implementation detail in 3.6 and official in 3.7+.

鉴于它们是有序的,因此似乎不存在通过插入顺序检索字典中第 i th 项的方法,这似乎很奇怪.可用的仅解决方案似乎具有O( n )复杂度,或者:

Given they are ordered, it seems strange that no methods exist to retrieve the ith item of a dictionary by insertion order. The only solutions available appear to have O(n) complexity, either:

  1. 通过O( n )进程转换为列表,然后使用list.__getitem__.
  2. enumerate字典项在循环中,并在达到所需索引时返回该值.同样,时间复杂度为O( n ).
  1. Convert to a list via an O(n) process and then use list.__getitem__.
  2. enumerate dictionary items in a loop and return the value when the desired index is reached. Again, with O(n) time complexity.

由于从list获取项目的复杂度为O(1),有没有办法用字典实现相同的复杂度?使用常规dictcollections.OrderedDict都可以.

Since getting an item from a list has O(1) complexity, is there a way to achieve the same complexity with dictionaries? Either with regular dict or collections.OrderedDict would work.

如果不可能,是出于结构原因阻止了这种方法,还是仅仅是尚未考虑/实现的功能?

If it's not possible, is there a structural reason preventing such a method, or is this just a feature which has not yet been considered / implemented?

推荐答案

对于OrderedDict而言,其本质上是O(n),因为排序记录在

For an OrderedDict it's inherently O(n) because the ordering is recorded in a linked list.

对于内置dict,有一个向量(一个连续的数组)而不是一个链表,但是最后几乎是一样的:向量包含几种假人",特殊的内部值表示没有".密钥已存储在此处"或曾经存储在此处但不再存储的密钥".这样一来,例如,删除密钥就变得非常便宜(只需用虚拟值覆盖密钥即可).

For the builtin dict, there's a vector (a contiguous array) rather than a linked list, but pretty much the same thing in the end: the vector contains a few kind of "dummies", special internal values that mean "no key has been stored here yet" or "a key used to be stored here but no longer". That makes, e.g., deleting a key extremely cheap (just overwrite the key with a dummy value).

但是,如果不在其上添加辅助数据结构,就无法跳过虚拟变量而不一次移动它们.由于Python使用一种开放式寻址的形式来解决冲突,并将负载因子保持在2/3以下,因此向量的条目 中至少有三分之一是虚拟变量. the_vector[i]可以在O(1)时间访问,但实际上与第i个非虚拟条目没有可预测的关系.

But without adding auxiliary data structures on top of that, there's no way to skip over the dummies without marching over them one at a time. Because Python uses a form of open addressing for collision resolution, and keeps the load factor under 2/3, at least a third of the vector's entries are dummies. the_vector[i] can be accessed in O(1) time, but really has no predictable relation to the i'th non-dummy entry.

这篇关于在Python 3.6及更高版本中按位置访问字典项的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆