What is the default __hash__ in Python?

Problem description

I quite often use funky stuff as dictionary keys, so I am wondering what the right way to do it is, and that means implementing good hash methods for my objects. I am aware of other questions asked here, like good way to implement hash, but I'd like to understand how the default __hash__ works for custom objects, and whether it is possible to rely on it.

I have noticed that mutables are explicitly unhashable, since hash({}) raises an error ... but strangely, custom classes are hashable:

>>> class Object(object): pass
>>> o = Object()
>>> hash(o)

So, does anybody know how this default hash function works? By understanding this, I'd like to know:

Can I rely on this default hash if I use objects of the same type as keys of a dictionary? e.g.:

key1 = MyObject()
key2 = MyObject()
key3 = MyObject()
{key1: 1, key2: 'blabla', key3: 456}

Can I rely on it if I use objects of different types as keys in a dictionary? e.g.:

{int: 123, MyObject(10): 'bla', 'plo': 890}

And in the last case, how can I make sure that my custom hashes don't clash with the built-in hashes? e.g.:

{int: 123, MyObject(10): 'bla', MyObjectWithCustomHash(123): 890}

Solution

What you can rely on: custom objects have a default hash() that is based in some way on the identity of the object. That is, any object using the default hash will have a constant value for that hash over its lifetime, and different objects may or may not have different hash values.
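
As a minimal sketch of that guarantee (MyObject here is a hypothetical class that defines neither __eq__ nor __hash__, so it uses the defaults), the following shows the hash staying constant and distinct instances acting as distinct keys:

```python
class MyObject(object):
    """Hypothetical class relying on the default __eq__ and __hash__."""
    pass


key1 = MyObject()
key2 = MyObject()

# The default hash stays the same for as long as the object is alive.
first_hash = hash(key1)
assert hash(key1) == first_hash

# Default equality is identity, so two instances are two separate keys.
table = {key1: 1, key2: 'blabla'}
assert table[key1] == 1
assert table[key2] == 'blabla'
assert len(table) == 2

# Changing an attribute does not change the default hash.
key1.value = 10
assert hash(key1) == first_hash
assert table[key1] == 1
```

Because the default hash ignores the object's contents, mutating an instance after using it as a key does not break the lookup.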

You cannot rely on any particular relationship between the value returned by id() and the value returned by hash(). In the standard C implementation of Python 2.6 and earlier they were the same; in Python 2.7-3.2, hash(x) == id(x)/16.

Edit: originally I wrote that in releases 3.2.3 and later, or 2.7.3 and later, the hash value may be randomised, and that in Python 3.3 the relationship would always be randomised. In fact, that randomisation at present only applies to hashing strings, so the divide-by-16 relationship may continue to hold for now, but don't bank on it.
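
If you are curious what your own interpreter does, a rough check like the one below can report it. This is only a sketch: the Plain class is made up for illustration, sys.implementation requires Python 3.3 or later, and the two id() comparisons may print True or False depending on the implementation and version, which is exactly why they should not be relied on.

```python
import sys


class Plain(object):
    """Hypothetical class that uses the default __hash__."""
    pass


o = Plain()

# Something you can rely on: repeated calls agree while the object is alive.
stable = hash(o) == hash(o)

# Implementation details: either of these may or may not hold.
matches_id = hash(o) == id(o)
matches_id_div_16 = hash(o) == id(o) // 16

print(sys.implementation.name, stable, matches_id, matches_id_div_16)
```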

Hash collisions don't usually matter: for a dictionary lookup to find an object, the key must have the same hash and must also compare equal. Collisions only matter if you get a very high proportion of them, as in the denial-of-service attack that led to recent versions of Python being able to randomise the hash calculation.
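
To illustrate why a clash with a built-in hash is harmless, here is a small sketch. MyObjectWithCustomHash is the hypothetical class name from the question; its __hash__ and __eq__ below are assumptions made for the example, with the hash deliberately chosen to collide with that of an int:

```python
class MyObjectWithCustomHash(object):
    """Hypothetical class whose hash deliberately equals that of an int."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Same hash as the wrapped integer, so the two collide on purpose.
        return hash(self.value)

    def __eq__(self, other):
        # Equal only to instances of this class holding the same value.
        if not isinstance(other, MyObjectWithCustomHash):
            return NotImplemented
        return self.value == other.value


custom = MyObjectWithCustomHash(123)
table = {123: 'int key', custom: 'custom key'}

assert hash(custom) == hash(123)       # the hashes collide
assert len(table) == 2                 # yet both keys are kept apart
assert table[123] == 'int key'
assert table[custom] == 'custom key'

# An equal (but not identical) object finds the same entry.
assert table[MyObjectWithCustomHash(123)] == 'custom key'
```

The dictionary only treats two keys as the same when the hashes match and the keys compare equal, so the int and the custom object never overwrite each other.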
