Difference between hash() and id()


Problem description

I have two user-defined objects, say a and b.
Both these objects have the same hash values.
However, the id(a) and id(b) are unequal.

Moreover,

>>> a is b
False
>>> a == b
True

From this observation, can I infer the following?

  • Unequal objects may have the same hash values.
  • Equal objects need to have the same id values.
  • Whenever obj1 is obj2 is evaluated, the id values of both objects are compared, not their hash values.
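
For concreteness, here is a minimal sketch of one way to end up in the situation described above. The class name Point and its x and y attributes are hypothetical and not taken from the original question; the only assumption is that the user-defined class defines both __eq__ and __hash__ based on the same data.

```python
class Point:
    """Hypothetical value-like class: equality and hash both derive from (x, y)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Let Python try the reflected comparison for unrelated types.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same data that __eq__ compares.
        return hash((self.x, self.y))


a = Point(1, 2)
b = Point(1, 2)

print(a == b)              # True: same value
print(hash(a) == hash(b))  # True: equal objects hash equally
print(a is b)              # False: two distinct objects
print(id(a) == id(b))      # False: both are alive, so their ids differ
```

In this sketch, equality and hash agree while identity does not, which matches the observation in the question.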

Solution

There are three concepts to grasp when trying to understand id, hash and the == and is operators: identity, value and hash value. Not all objects have all three.

  1. All objects have an identity, though even this can be a little slippery in some cases. The id function returns a number corresponding to an object's identity (in CPython, it returns the memory address of the object, but other interpreters may return something else). If two objects that exist at the same time have the same identity, they're actually two references to the same object. The is operator compares items by identity: a is b is equivalent to id(a) == id(b).

    Identity can get a little confusing when you deal with objects that are cached somewhere in their implementation. For instance, the objects for small integers and some strings in CPython are not remade each time they're used. Instead, existing objects are returned any time they're needed. You should not rely on this in your code though, because it's an implementation detail of CPython (other interpreters may do it differently or not at all).

  2. All objects also have a value, though this is a bit more complicated. Some objects do not have a meaningful value other than their identity (so value and identity may be synonymous, in some cases). Value can be defined as what the == operator compares, so any time a == b, you can say that a and b have the same value. Container objects (like lists) have a value that is defined by their contents, while some other kinds of objects will have values based on their attributes. Objects of different types can sometimes have the same values, as with numbers: 0 == 0.0 == 0j == decimal.Decimal("0") == fractions.Fraction(0) == False (yep, bools are numbers in Python, for historical reasons).

    If a class doesn't define an __eq__ method (to implement the == operator), it will inherit the default version from object and its instances will be compared solely by their identities. This is appropriate when otherwise identical instances may have important semantic differences. For instance, two different sockets connected to the same port of the same host need to be treated differently if one is fetching an HTML webpage and the other is getting an image linked from that page, so they don't have the same value.

  3. In addition to a value, some objects have a hash value, which means they can be used as dictionary keys (and stored in sets). The function hash(a) returns the object a's hash value, a number based on the object's value. The hash of an object must remain the same for the lifetime of the object, so it only makes sense for an object to be hashable if its value is immutable (either because it's based on the object's identity, or because it's based on contents of the object that are themselves immutable).

    Multiple different objects may have the same hash value, though well designed hash functions will avoid this as much as possible. Storing objects with the same hash in a dictionary is much less efficient than storing objects with distinct hashes (each hash collision requires more work). Instances of user-defined classes are hashable by default (since their default value is their identity, which is immutable). If you write an __eq__ method in a custom class, Python 3 will disable this default hash implementation by setting __hash__ to None, since your __eq__ function will define a new meaning of value for its instances. You'll need to write a __hash__ method as well, if you want your class to still be hashable. If you inherit from a hashable class but don't want to be hashable yourself, you can set __hash__ = None in the class body.
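
The three concepts above can be illustrated with a short sketch. It assumes Python 3. All class names (Plain, ValueOnly, ValueAndHash, Unhashable) and the is_hashable helper are hypothetical, introduced only for illustration.

```python
from fractions import Fraction


def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class Plain:
    """No __eq__ defined: compared by identity, hashable by default."""


class ValueOnly:
    """Defines __eq__ only, so Python 3 sets __hash__ to None."""

    def __init__(self, v):
        self.v = v

    def __eq__(self, other):
        if not isinstance(other, ValueOnly):
            return NotImplemented
        return self.v == other.v


class ValueAndHash(ValueOnly):
    """Restores hashability with a __hash__ consistent with __eq__."""

    def __hash__(self):
        return hash(self.v)


class Unhashable(ValueAndHash):
    """Inherits from a hashable class but opts out of hashing."""

    __hash__ = None


# Identity: `is` agrees with comparing ids of objects that are alive together.
p, q = Plain(), Plain()
same_identity = (p is q) == (id(p) == id(q))

# Value: without __eq__, two distinct instances are never equal.
plain_equal = p == q

# Hash value: equal hashable objects collapse to one set element.
unique_count = len({ValueAndHash(1), ValueAndHash(1)})

# Equal numbers of different types share a hash, so they share a dict slot.
numbers = {0: "int", 0.0: "float", Fraction(0): "fraction", False: "bool"}

print(same_identity, plain_equal, unique_count, len(numbers))
print(is_hashable(p), is_hashable(ValueOnly(1)), is_hashable(Unhashable(1)))
```

Deriving __hash__ from the same attribute that __eq__ compares is a design choice in this sketch; it keeps equal instances in the same dictionary slot, as hash-based containers require.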
