Dictionary .keys() and .values() should return a set [with Python 3000 in mind]


Problem description

This has been bothering me for a while. I just want to find out whether it is
just me or whether others have thought of this too: why shouldn't the
keyset of a dictionary be represented as a set instead of a list? I
know that sets were introduced a lot later and that lists/dictionaries were
used instead, but I think "the only correct way" now is for the
dictionary keys and values to be sets. Presently {1:0,2:0,3:0}.keys()
will produce [1,2,3], but it could also produce [3,1,2] or [3,2,1] on a
different machine architecture. The Python documentation states that:
"""
Keys and values are listed in an _arbitrary_ (my emphasis) order which
is non-random, varies across Python implementations, and depends on the
dictionary's history of insertions and deletions.
"""

So on the same machine it will be the case that {1:0, 2:0,
3:0}.keys() == {3:0, 2:0, 1:0}.keys() is True. But if there are two lists
of keys of the same dictionary, and one was pickled on another machine with
a different "arbitrary non-random" ordering, then the two lists will not be
equal to each other. It seems like the problem could be solved by
returning a set instead.
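
The comparison problem described above can be sketched without relying on any particular interpreter's ordering. This is only an illustration: the two key lists are written out by hand as hypothetical stand-ins for what keys() might return on two machines, not produced by real dictionaries.

```python
# Hypothetical key lists, standing in for the result of keys() on
# two machines that happen to enumerate the same dictionary differently.
keys_on_machine_a = [1, 2, 3]
keys_on_machine_b = [3, 1, 2]

# Compared as lists, order matters, so the two results differ.
lists_equal = keys_on_machine_a == keys_on_machine_b

# Compared as sets, only membership matters, so they agree.
sets_equal = set(keys_on_machine_a) == set(keys_on_machine_b)

print(lists_equal, sets_equal)  # False True
```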

The same thing goes for values(). Here most people will argue that
values are not necessarily unique, so a list is more appropriate, but
in fact the values are unique; it is just that more than one key could
map to the same value. What is 'seen' in the dictionary is the mapping
rule. Also, the values are not ordered and should not be indexable:
they should be a set. Just as with keys(), an ordered list of values
from a dictionary on one machine will not necessarily be equal to
the list of values on another machine, while in fact they
should be.
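
To make the trade-off concrete, here is a small sketch of what collapsing values into a set would mean. The dictionary `stock` is made up for illustration. Note one assumption the set conversion relies on: every value must be hashable, which holds for the integers used here but not for values such as lists.

```python
# Made-up example data: two keys map to the same value.
stock = {"apples": 3, "pears": 3, "plums": 7}

values_as_list = sorted(stock.values())   # one entry per key
values_as_set = set(stock.values())       # one entry per distinct value

print(values_as_list)             # [3, 3, 7]
print(values_as_set == {3, 7})    # True
```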

On a more fundamental and general level, a dictionary is actually an
explicit function, also called a 'map'. A set of keys, traditionally
called the 'domain', is mapped to a set of values, traditionally called the
'range'. This mapping is at most a surjective function (i.e. two
or more keys can map to the same value, and every value is mapped to by
some key). Notice that the traditional counterparts to keys and
values are sets and not just lists. This seems like theory babble, but
in the case of Python, staying 'true' to the theory is usually a
GoodThing(tm).
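
The function view can be sketched in a few lines. The names `mapping`, `domain`, `value_range` and `preimages` are chosen here for illustration only, and the sketch assumes hashable values.

```python
mapping = {"a": 1, "b": 1, "c": 2}

domain = set(mapping)                # the keys
value_range = set(mapping.values())  # the distinct values

# For each value, collect the keys that map to it (its preimage).
preimages = {}
for key, value in mapping.items():
    preimages.setdefault(value, set()).add(key)

# Surjective onto value_range: every value has at least one key mapping to it.
is_surjective = all(preimages[v] for v in value_range)

print(domain == {"a", "b", "c"}, value_range == {1, 2}, is_surjective)
```

Two keys sharing the value 1 shows why the mapping is generally not injective, while every value still has a non-empty preimage.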

I love Python primarily because it does something in only one, correct,
and reasonable way. The same principle should probably apply to Python
itself, including its built-in data structures.

Of course compatibility will be broken. Any code relying on a
certain ordering of the result of keys() or values() would need to be
updated. One could argue, though, that such code was not entirely correct
in the first place to assume that, and should be fixed anyway.

Obviously this fix should not be in Python 2.X but perhaps would be
worth considering for Python 3000. What does everyone think?

Solution

There are a few good reasons.
1 - golden handcuffs. Breaking old code is bad 90% of the time.
2 - creating a set MAY be slower.

Python's sets seem to imply that they will always be a hash map. In
this case, some creative hash map "mapping" could allow one to create a
set without calculating hash codes (make the set hashmap have the same
dimensions and rules as the dictionary one).
If there was intent to allow Python implementations to use trees for
the set, then a list is far faster to create (O(n) time instead of
O(n log n)).

3 - using a set is sometimes slower (just as using a list is sometimes
slower).
I can't speak for your code, but this is the most common use of keys in
my coding:

    # d is some dictionary
    keys = d.keys()
    keys.sort()
    for k in keys:
        pass  # blah

Sets cannot be sorted, while lists can. If keys() returned a set, then
I'd have to turn it into a list every time.

There's potential to add "views" to Python (a key view of a dictionary
being a frozenset containing the dictionary's keys, which is updated
whenever the dictionary updates), but I think that's another topic
which is out of the scope of your original idea.
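
For comparison, Python 3 did end up with dictionary views: there, keys() returns a view object that tracks the dictionary and supports set-style comparison. The sketch below assumes a Python 3 interpreter; under Python 2, where keys() returns a list snapshot, the results would differ.

```python
d = {"a": 1, "b": 2}
key_view = d.keys()          # a live view in Python 3, not a copy

d["c"] = 3                   # mutate the dictionary after taking the view
sees_new_key = "c" in key_view

# Key views compare like sets, so insertion order does not matter.
same_keys = key_view == {"c": 0, "b": 0, "a": 0}.keys()

print(sees_new_key, same_keys)  # True True
```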


cm************@yaho.com wrote:

> I can't speak for your code, but this is the most common use of keys in
> my coding:
> # d is some dictionary
> keys = d.keys()
> keys.sort()
> for k in keys:
>     pass  # blah

This you can rewrite quite effectively as:

    for k in sorted(d):
        pass  # blah

--Scott David Daniels
sc***********@acm.org
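
As a side note on the sorting objection: sorted() accepts any iterable and returns a new list, so it would work the same way if keys() returned a set. A minimal sketch, with a made-up dictionary `d`:

```python
d = {"pear": 2, "apple": 5, "fig": 1}

# Iterating a dict yields its keys, so sorted(d) sorts the keys.
from_dict = sorted(d)

# The same call works on a set of the keys.
from_set = sorted(set(d))

print(from_dict)              # ['apple', 'fig', 'pear']
print(from_dict == from_set)  # True
```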


>1 - golden handcuffs. Breaking old code is bad 90% of the time

I agree with you on that, mostly for code that counted on list methods
of the result of keys(), like you later show with sort. But there is a
side note: old code that assumed a particular ordering of the keys or
values is broken anyway. So even if ks={1:0,2:0,3:0}.keys() returns
[1,2,3] on my machine, I should not do something like
'my_savings_account + ks[0]'. That code should be fixed anyway, since on
a different machine it might produce a different value for ks[0].

>2 - creating a set MAY be slower.

Creating a set from the dictionary's keys should not be a lot slower:
because the keys are already unique, there is no need to check each key
against the other keys; just return them as a set. I assume this is
what you meant by "make the set hashmap have the same dimensions and
rules as the dictionary one". Perhaps a dictionary would internally
just copy its keys to the set and return it rather than construct a
set from scratch (with duplication checks and all).
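
Whether building a set from the keys is noticeably slower than building a list is something one can measure rather than guess. The sketch below uses the standard timeit module; the dictionary size and repeat count are arbitrary choices, and the numbers will vary by machine and interpreter, so no result is claimed here.

```python
import timeit

d = dict.fromkeys(range(10000), 0)   # arbitrary test dictionary

# Time copying the keys into a list versus into a set.
list_seconds = timeit.timeit(lambda: list(d), number=200)
set_seconds = timeit.timeit(lambda: set(d), number=200)

print("list: %.4fs  set: %.4fs" % (list_seconds, set_seconds))
```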

>3 - using a set is sometimes slower

Again, it depends on how it is used. In your case you argue that you
usually sort the keys anyway, so a list is more convenient. But
different use cases can call for different operations on the keys()
after they have been retrieved. What if someone wants to do an
intersection to find common keys with another dictionary? Then a set
would be more appropriate. The intent of the set() type was to not tempt
anyone into assuming an ordering of keys() just because a list is
indexable, and to eliminate the need for a long footnote in the
documentation of the dict type that talks about 'an arbitrary
non-random ordering'. It takes a while just to grasp what that means...
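
The intersection use case mentioned above might look like this. The two dictionaries are invented for the example; converting with set() works whether keys() returns a list or something set-like.

```python
prices = {"apple": 1.2, "pear": 0.8, "fig": 3.0}
stock = {"pear": 10, "fig": 0, "plum": 4}

# Keys present in both dictionaries.
common = set(prices) & set(stock)

print(sorted(common))  # ['fig', 'pear']
```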

In general I believe that a small performance penalty is acceptable in
order to have a correct and consistent data type, especially for Python,
i.e. I might not argue the same for Perl or C.

-Nick V.


