Changing hashCode of object stored in hash-based collection
I have a hash-based collection of objects, such as HashSet or HashMap. What issues can I run into when the implementation of hashCode() is such that it can change over time because it's computed from some mutable fields?
How does it affect Hibernate? Is there any reason why having hashCode() return the object's ID by default is bad? All not-yet-persisted objects have id=0, if that matters.
What is a reasonable implementation of hashCode for Hibernate-mapped entities? Once set, the ID is immutable, but that is not true at the moment an entity is saved to the database.
I'm not worried about the performance of a HashSet with a dozen entities with key=0. What I care about is whether it's safe for my application and Hibernate to use the ID as the hash code, because the ID changes when it is generated on persist.
If the hash code of the same object changes over time, the results are basically unpredictable. Hash collections use the hash code to assign objects to buckets -- if your hash code suddenly changes, the collection obviously doesn't know, so it can fail to find an existing object because it hashes to a different bucket now.
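To make the bucket problem concrete, here is a minimal sketch. The Entity class is hypothetical and stands in for a mapped entity whose hashCode() returns its id; setting the field directly simulates the id being generated on persist. With the HashSet implementation in typical JDKs, the lookup fails after the change even though the object is still in the set.

```java
import java.util.HashSet;
import java.util.Set;

public class MutableHashDemo {
    // Hypothetical entity whose equals/hashCode depend on a mutable id.
    static final class Entity {
        long id; // 0 until "persisted"

        Entity(long id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Entity && ((Entity) o).id == id;
        }

        @Override
        public int hashCode() { return Long.hashCode(id); }
    }

    // Returns true if the set can still find the entity after its id changes.
    static boolean foundAfterMutation() {
        Set<Entity> set = new HashSet<>();
        Entity e = new Entity(0);
        set.add(e);   // stored in the bucket chosen for hash code 0
        e.id = 42;    // simulates id generation on persist
        return set.contains(e); // looks in the bucket for hash code 42
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
    }
}
```

The set still holds a reference to the object (its size is unchanged), but contains() and remove() look in the wrong bucket, so the entry is effectively unreachable by lookup until the id is set back.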
Returning an object's ID by itself isn't bad, but if many of them have id=0 as you mentioned, it will reduce the performance of the hash table: all objects with the same hash code go into the same bucket, so your hash table is now no better than a linear list.
Update: Theoretically, your hash code can change as long as nobody else is aware of it -- this implies exactly what @bestsss mentioned in his comment, which is to remove your object from any collections that may be holding it and insert it again once the hash code has changed. In practice, a better alternative is to generate your hash code from the actual content fields of your object rather than relying on the database ID.
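Both options from the update can be sketched as follows. The class names and the email field are assumptions made for illustration: User hashes on a business key that is assumed to be unique and never modified, while Row hashes on its id and is therefore removed from the set before the id changes and added again afterwards.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class StableHashDemo {
    // Hypothetical entity: equals/hashCode use an immutable business key,
    // not the database-generated id.
    static final class User {
        long id;                    // 0 until persisted; not part of hashCode
        private final String email; // assumed unique and never modified

        User(String email) { this.email = Objects.requireNonNull(email); }

        @Override
        public boolean equals(Object o) {
            return o instanceof User && ((User) o).email.equals(email);
        }

        @Override
        public int hashCode() { return email.hashCode(); }
    }

    // Hypothetical entity that hashes on its mutable id.
    static final class Row {
        long id;

        @Override
        public boolean equals(Object o) {
            return o instanceof Row && ((Row) o).id == id;
        }

        @Override
        public int hashCode() { return Long.hashCode(id); }
    }

    // Content-based hash: assigning the id later does not disturb the set.
    static boolean foundAfterIdAssigned() {
        Set<User> users = new HashSet<>();
        User u = new User("a@example.com");
        users.add(u);
        u.id = 42; // simulates id generation on persist
        return users.contains(u);
    }

    // Id-based hash: remove while the old hash code is still valid,
    // change the id, then insert again.
    static boolean foundAfterReinsert() {
        Set<Row> rows = new HashSet<>();
        Row r = new Row();
        rows.add(r);
        rows.remove(r);
        r.id = 42;
        rows.add(r);
        return rows.contains(r) && rows.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned() + " " + foundAfterReinsert());
    }
}
```

The remove-and-reinsert approach only works if the code that changes the id knows about every collection holding the object, which is why hashing on stable content is usually easier to keep correct.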