How is the return value of __hash__ used?

Problem description

Suppose I write a class, but don't define a __hash__ for it. Then __hash__(self) defaults to id(self) (self's memory address), according to the documentation.

However, I don't see in the documentation how this value is used.
So if my __hash__ were simply return 1, which would cause the hash of all instances of my class to be the same, they would all get bucketed into the same underlying hash bucket (which I assume is implemented in C). However, this does not mean that the return value of __hash__ is being used as the key to bin elements in this underlying hash table.
So really, my question is: what happens to the value returned by __hash__? Is it used as the key directly, or is its hash (or the result of some other computation performed on it) used as the key to the hash table?

In case it matters, I'm on Python 2.7.

EDIT: To clarify, I'm not asking about how hash collisions are handled. In Python, this seems to be done with linear chaining. Instead, I'm asking how the return value of __hash__ translates into the memory address (?) of the corresponding bucket.

Solution

Since Python's hash tables have a size that is a power of two, the lower bits of the hash value determine the location in the hash table (or at least the location of the initial probe).
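
To connect this with the `return 1` case from the question, here is a minimal sketch. The class name `ConstantHash` and the table size of 8 are assumptions chosen for illustration; the real table size depends on the interpreter and on how many entries the dict holds.

```python
class ConstantHash(object):
    """Hypothetical class whose instances all hash to 1."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 1

    def __eq__(self, other):
        return isinstance(other, ConstantHash) and self.name == other.name


table_size = 8            # assumed power-of-two table size
mask = table_size - 1     # keeps the low 3 bits of the hash

a, b = ConstantHash('a'), ConstantHash('b')

# Every instance gets the same initial slot, because the hashes are identical.
first_slots = [hash(obj) & mask for obj in (a, b)]

# The dict still keeps the two keys apart, using __eq__ after the collision.
d = {a: 'first', b: 'second'}
```

Both instances start probing at the same slot, so the dict has to rely on further probes and on `__eq__` to tell them apart. Lookups stay correct but get slower as more such keys are added.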

The sequence of probes into a table of size n is given by:

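def gen_probes(hashvalue, n):
    'Same sequence of probes used in the current dictionary design'
    mask = n - 1    # n is a power of two, so the mask keeps the low bits
    PERTURB_SHIFT = 5
    # The C code casts the signed hash to an unsigned size_t (two's
    # complement). Masking to 64 bits mimics that on a 64-bit build;
    # negating the hash would pick different slots than the example
    # table shows.
    hashvalue &= 0xFFFFFFFFFFFFFFFF
    i = hashvalue & mask
    yield i         # initial probe: low bits of the hash
    perturb = hashvalue
    while True:
        # Later probes mix in the higher bits of the hash via perturb.
        i = (5 * i + perturb + 1) & 0xFFFFFFFFFFFFFFFF
        yield i & mask
        perturb >>= PERTURB_SHIFT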
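
As a rough illustration of how this recurrence behaves, the sketch below lists the first nine probes for a hash value of 1 in a table of size 8. The helper name `probes` is hypothetical; it repeats the same recurrence so the snippet runs on its own, and it assumes a non-negative hash value.

```python
from itertools import islice


def probes(hashvalue, n):
    """Hypothetical helper: same recurrence, for a non-negative hash."""
    mask = n - 1
    i = hashvalue & mask
    yield i
    perturb = hashvalue
    while True:
        i = (5 * i + perturb + 1) & 0xFFFFFFFFFFFFFFFF
        yield i & mask
        perturb >>= 5


first_nine = list(islice(probes(1, 8), 9))
# first_nine == [1, 7, 4, 5, 2, 3, 0, 1, 6]

# Once perturb has been shifted down to zero, the sequence cycles
# through every slot, so an empty slot is always found eventually.
visited_all_slots = set(first_nine) == set(range(8))
```

So for an object whose `__hash__` returns 1, the first probe is slot 1, and the following probes walk through the rest of the table.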

For example, the dictionary:

d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}

is stored as an array of size 8 with each entry in the form (hash, key, value):

entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]
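
The slot positions in this table can be checked against the stored hashes. The sketch below simply masks each hash value listed above with 7 (table size 8 minus 1). These hash values come from the run shown here and will differ on other builds or runs. Python's `&` treats negative integers as two's complement, so no manual conversion is needed.

```python
stored = [
    (-8522787127447073495, 'barry'),
    (-9092791511155847987, 'timmy'),
    (-6480567542315338377, 'guido'),
]

mask = 8 - 1  # table size 8, so keep the low 3 bits

# Initial probe slot for each key: the low bits of its hash.
slots = {key: h & mask for h, key in stored}
# slots == {'barry': 1, 'timmy': 5, 'guido': 7}
```

Each key landed on its initial probe slot, which means no collisions occurred in this example.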

The C source code for key insertion in Python's dictionaries can be found here: http://hg.python.org/cpython/file/cd87afe18ff8/Objects/dictobject.c#l550
