How is the return value of __hash__ used?
Problem description
Suppose I write a class, but don't define a __hash__ for it. Then __hash__(self) defaults to id(self) (self's memory address), according to the documentation.
However, I don't see in the documentation how this value is used.
So if my __hash__ were simply return 1, which would cause the hash of all instances of my class to be the same, they would all get bucketed into the same underlying hash bucket (which I assume is implemented in C). However, this does not mean that the return value of __hash__ is used as the key to bin elements in this underlying hash table.
So really, my question is: what happens to the value returned by __hash__? Is it used as the key directly, or is its hash (or the result of some other computation performed on it) used as the key to the hash table?
In case it matters, I'm on Python 2.7.
EDIT: To clarify, I'm not asking about how hash collisions are handled. In Python, this seems to be done with linear chaining. Instead, I'm asking how the return value of __hash__ translates into the memory address (?) of the corresponding bucket.
Solution
Since Python's hash tables have a size that is a power of two, the lower bits of the hash value determine the location in the hash table (or at least the location of the initial probe).
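To make the low-bits point concrete, here is a minimal sketch. The names AlwaysOne and initial_slot are hypothetical, invented for illustration, and the table size of 8 is an assumption (the real size depends on how many entries the dict holds).

```python
class AlwaysOne(object):
    """Hypothetical class whose instances all hash to 1."""
    def __hash__(self):
        return 1


def initial_slot(obj, table_size):
    """Index of the first probe: the low bits of hash(obj).

    table_size is assumed to be a power of two, so table_size - 1
    is a mask that keeps only the low bits.
    """
    return hash(obj) & (table_size - 1)


a, b = AlwaysOne(), AlwaysOne()

# Both instances start probing at the same slot.
slots = [initial_slot(a, 8), initial_slot(b, 8)]

# The dict still holds two separate entries: a and b compare unequal,
# so the second insert moves on to another slot in the probe sequence.
d = {a: 'first', b: 'second'}
```

So a constant __hash__ does not break lookups; it only forces every instance through the same probe sequence, which makes dict operations slower as the number of instances grows.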
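The sequence of probes into a table of size n is given by:
def gen_probes(hashvalue, n):
    'Same sequence of probes used in the current dictionary design'
    mask = n - 1                      # n is a power of two, so this keeps the low bits
    PERTURB_SHIFT = 5
    # Reinterpret the signed hash as an unsigned value, as the C code's
    # cast to size_t does (this assumes a 64-bit build).
    hashvalue &= 0xFFFFFFFFFFFFFFFF
    i = hashvalue & mask              # the initial probe uses only the low bits
    yield i
    perturb = hashvalue
    while True:
        # Later probes mix in the higher bits of the hash through perturb.
        i = (5 * i + perturb + 1) & 0xFFFFFFFFFFFFFFFF
        yield i & mask
        perturb >>= PERTURB_SHIFT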
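As a rough illustration of what that sequence looks like, the sketch below restates the generator in a trimmed form so it runs on its own. The name probes is hypothetical, and the snippet assumes non-negative hash values so that sign handling can be left out.

```python
from itertools import islice


def probes(hashvalue, n, perturb_shift=5):
    """Probe indices for a non-negative hashvalue in a table of size n.

    n is assumed to be a power of two.
    """
    mask = n - 1
    i = hashvalue & mask
    yield i
    perturb = hashvalue
    while True:
        i = (5 * i + perturb + 1) & 0xFFFFFFFFFFFFFFFF
        yield i & mask
        perturb >>= perturb_shift


# Hash 1 in a table of size 8.
first_probes = list(islice(probes(1, 8), 9))
# → [1, 7, 4, 5, 2, 3, 0, 1, 6]

# Hash 33 has the same low three bits as 1, so it starts at the same slot,
# but its higher bits send it elsewhere on the third probe.
other_probes = list(islice(probes(33, 8), 3))
# → [1, 7, 5]
```

Once perturb has been shifted down to zero, the recurrence i = 5*i + 1 cycles through every slot of a power-of-two table, so a free slot is always found eventually.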
For example, the dictionary:
d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}
is stored as an array of size 8 with each entry in the form (hash, key, value):
entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]
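The slot positions in that table can be checked by hand. The sketch below takes the three hash values shown above as given (string hashes differ between interpreter builds and runs, so they will not necessarily match your own) and masks each one with n - 1 = 7. Python's & on a negative int follows two's-complement semantics, so the low bits agree with what the C code sees after its unsigned cast.

```python
# Hash values copied from the example table, not recomputed.
hashes = {
    'barry': -8522787127447073495,
    'timmy': -9092791511155847987,
    'guido': -6480567542315338377,
}
values = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}

table_size = 8
table = [None] * table_size

for key, h in hashes.items():
    slot = h & (table_size - 1)      # low three bits pick the slot
    # No two of these keys share a slot, so the first probe always succeeds.
    assert table[slot] is None
    table[slot] = (h, key, values[key])

occupied = {i: entry[1] for i, entry in enumerate(table) if entry is not None}
# → {1: 'barry', 5: 'timmy', 7: 'guido'}
```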
The C source code for key insertion in Python's dictionaries can be found here: http://hg.python.org/cpython/file/cd87afe18ff8/Objects/dictobject.c#l550