Python dict如何具有多个具有相同哈希值的密钥? [英] How can Python dict have multiple keys with same hash?

查看:201
本文介绍了Python dict如何具有多个具有相同哈希值的密钥?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试了解引擎盖下的python哈希函数。我创建了一个自定义类,其中所有实例返回相同的哈希值。

  class C(object):
def __hash __(self):
return 42

我只是假设上述类只有一个实例可以在一个集合中,但实际上是一个集合可以有多个具有相同哈希值的元素。

  c,d = C(),C()
x = {c :'c',d:'d'}
print x
#{< __ main __。C object at 0x83e98cc>:'c',< __ main __ C object at 0x83e98ec& }
#注意,dict有2个元素

我还尝试了一点,发现如果我覆盖 __ eq __ 方法,使得类的所有实例比较相等,那么该集只允许一个实例。



$($)$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $
p,q = D(),D()
y = {p:'p',q:'q'}
print y
#{& 0x8817acc>]:'q'}
#注意dict只有1个元素

所以我很好奇知道一个dict如何具有多个元素具有相同的哈希。谢谢!



注意:编辑问题来给出dict(而不是set)的例子,因为答案中的所有讨论都是关于dicts的。但同样适用于套;

解决方案

有关Python散列如何工作的详细说明,请参阅我的答案<一个href =http://stackoverflow.com/questions/8271139>为什么早期返回比其他的更慢?



基本上它使用散列在桌子上挑一个槽。如果插槽中有一个值并且哈希匹配,则比较这些项目以查看它们是否相等。



如果哈希不匹配或项目isn不等于,然后尝试另一个插槽。有一个公式来选择这个(我在引用的答案中描述),并逐渐拉出哈希值的未使用部分;但一旦使用它们,它将最终通过散列表中的所有插槽。这确保最终我们找到一个匹配的项目或一个空的插槽。当搜索找到一个空的插槽时,它插入该值或放弃(取决于我们是添加还是获取一个值)。



重要的是要注意的是没有列表或桶:只有一个具有特定数量的插槽的哈希表,每个散列用于生成一系列候选插槽。


I am trying to understand python hash function under the hood. I created a custom class where all instances return the same hash value.

class C(object):
    def __hash__(self):
        return 42

I just assumed that only one instance of the above class can be in a set at any time, but in fact a set can have multiple elements with same hash.

c, d = C(), C()
x = {c: 'c', d: 'd'}
print x
# {<__main__.C object at 0x83e98cc>:'c', <__main__.C object at 0x83e98ec>:'d'}
# note that the dict has 2 elements

I experimented a little more and found that if I override the __eq__ method such that all the instances of the class compare equal, then the set only allows one instance.

class D(C):
    def __eq__(self, other):
        return hash(self) == hash(other)

p, q = D(), D()
y = {p:'p', q:'q'}
print y
# {<__main__.D object at 0x8817acc>]: 'q'}
# note that the dict has only 1 element

So I am curious to know how can a dict have multiple elements with same hash. Thanks!

Note: Edited the question to give example of dict (instead of set) because all the discussion in the answers is about dicts. But the same applies to sets; sets can also have multiple elements with same hash value.

解决方案

For a detailed description of how Python's hashing works see my answer to Why is early return slower than else?

Basically it uses the hash to pick a slot in the table. If there is a value in the slot and the hash matches, it compares the items to see if they are equal.

If the hash doesn't match or the items aren't equal, then it tries another slot. There's a formula to pick this (which I describe in the referenced answer), and it gradually pulls in unused parts of the hash value; but once it has used them all up, it will eventually work its way through all slots in the hash table. That guarantees eventually we either find a matching item or an empty slot. When the search finds an empty slot, it inserts the value or gives up (depending whether we are adding or getting a value).

The important thing to note is that there are no lists or buckets: there is just a hash table with a particular number of slots, and each hash is used to generate a sequence of candidate slots.

这篇关于Python dict如何具有多个具有相同哈希值的密钥?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆