Python的哈希函数顺序背后的逻辑是什么? [英] What's the logic behind Python's hash function order?

查看:168
本文介绍了Python的哈希函数顺序背后的逻辑是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们知道,某些Python的数据结构使用哈希表来存储setdictionary之类的项目.因此,这些对象中没有顺序.但似乎,对于某些数字序列而言,这是不正确的.

As we know, Some of Python's data structures use hash tables for storing items like set or dictionary. So there is no order in these objects. But it seems that, for some sequences of numbers that's not true.

例如,请考虑以下示例:

For example consider the following examples :

>>> set([7,2,5,3,6])
set([2, 3, 5, 6, 7])

>>> set([4,5,3,0,1,2])
set([0, 1, 2, 3, 4, 5])

但是,如果我们进行一些小的更改,则无法排序:

But it isn't sorted if we make a small change :

>>> set([8,2,5,3,6])
set([8, 2, 3, 5, 6])

所以问题是:Python的哈希函数如何在整数序列上工作?

So the question is: How does Python's hash function work on integer sequences?

推荐答案

尽管SO中有很多关于hash及其顺序的问题,但是没有人解释哈希函数的算法.

Although there are a lot of questions in SO about hash and its order,but no one of them explains the algorithm of hash function.

所以您在这里需要知道的是python如何计算哈希表中的索引.

So all you need here is know that how python calculate the indices in hash table.

如果您通过 hashtable.c 在CPython源代码中的文件中,您会在_Py_hashtable_set函数中看到以下几行内容,该行显示了python计算哈希表键索引的方式:

If you go through the hashtable.c file in CPython source code you'll see the following lines in _Py_hashtable_set function which shows the way python calculate the index of hash table keys :

key_hash = ht->hash_func(key);
index = key_hash & (ht->num_buckets - 1);

因此,因为整数的哈希值是整数本身*(-1除外),所以索引基于数据结构的数目和长度(ht->num_buckets - 1),并且它是按位计算的,且介于和号码.

So as the hash value of integers is the integer itself * (except for -1) the index is based on the number and the length of your data structure (ht->num_buckets - 1) and it calculated with Bitwise-and between (ht->num_buckets - 1) and the number.

现在考虑下面的示例,其中set使用哈希表:

Now consider the following example with set that use hash-table :

>>> set([0,1919,2000,3,45,33,333,5])
set([0, 33, 3, 5, 45, 333, 2000, 1919])

对于数字33,我们有:

33 & (ht->num_buckets - 1) = 1

实际上是:

'0b100001' & '0b111'= '0b1' # 1 the index of 33

注意在这种情况下,(ht->num_buckets - 1)8-1=70b111.

Note in this case (ht->num_buckets - 1) is 8-1=7 or 0b111.

对于1919:

'0b11101111111' & '0b111' = '0b111' # 7 the index of 1919

对于333:

'0b101001101' & '0b111' = '0b101' # 5 the index of 333

以及上述示例:

>>> set([8,2,5,3,6])
set([8, 2, 3, 5, 6])

'0b1000' & '0b100'='0b0' # for 8
'0b110' & '0b100'='0b100' # for 8


*类int的哈希函数:


* The hash function for class int :

class int:
    def __hash__(self):
        value = self
        if value == -1:
            value = -2
        return value

这篇关于Python的哈希函数顺序背后的逻辑是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆