Changing hashCode of object stored in hash-based collection


Question

I have a hash-based collection of objects, such as HashSet or HashMap. What issues can I run into when hashCode() is computed from mutable fields, so that its value can change over time?

How does it affect Hibernate? Is there any reason why having hashCode() return the object's ID by default is bad? All not-yet-persisted objects have id=0, if that matters.

What is a reasonable implementation of hashCode for Hibernate-mapped entities? Once set, the ID is immutable, but that does not hold at the moment an entity is saved to the database.

I'm not worried about the performance of a HashSet holding a dozen entities with key=0. What I care about is whether it's safe for my application and for Hibernate to use the ID as the hash code, given that the ID changes when it is generated on persist.

Solution

If the hash code of the same object changes over time, the results are basically unpredictable. Hash collections use the hash code to assign objects to buckets -- if your hash code suddenly changes, the collection obviously doesn't know, so it can fail to find an existing object because it hashes to a different bucket now.
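
To make the failure concrete, here is a minimal sketch in plain Java, without Hibernate. The `Entity` class and its `id` field are hypothetical stand-ins for a mapped entity, and the in-place assignment to `id` only simulates what happens when an ID is generated on persist.

```java
import java.util.HashSet;
import java.util.Set;

public class MutableHashDemo {
    // Hypothetical entity whose equals/hashCode depend on a mutable id.
    static final class Entity {
        long id; // 0 until the entity is "persisted"

        Entity(long id) { this.id = id; }

        @Override public boolean equals(Object o) {
            return o instanceof Entity && ((Entity) o).id == id;
        }

        @Override public int hashCode() { return Long.hashCode(id); }
    }

    // Adds an entity, changes its id in place, then asks the set for it again.
    static boolean foundAfterMutation() {
        Set<Entity> set = new HashSet<>();
        Entity e = new Entity(0);
        set.add(e);      // stored under the hash code for id = 0
        e.id = 42;       // simulated id assignment; the set is not notified
        return set.contains(e);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints "false"
    }
}
```

The object is still physically in the set (its size is still 1), but the lookup uses the new hash code and misses it.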

Returning an object's ID by itself isn't bad, but if many of them have id=0 as you mentioned, it will reduce the performance of the hash table: all objects with the same hash code go into the same bucket, so your hash table is now no better than a linear list.

Update: Theoretically, your hash code can change as long as nobody else is aware of it -- this implies exactly what @bestsss mentioned in his comment, which is to remove your object from any collections that may be holding it and insert it again once the hash code has changed. In practice, a better alternative is to generate your hash code from the actual content fields of your object rather than relying on the database ID.
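
Both options from the update can be sketched as follows. This is only an illustration under assumptions: `IdEntity`, `Customer`, and the `email` natural key are hypothetical names, and assigning `id` directly stands in for the ID being generated on persist.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class StableHashDemo {
    // Option 1: id-based hash code, with remove/re-insert around the change.
    static final class IdEntity {
        long id;

        IdEntity(long id) { this.id = id; }

        @Override public boolean equals(Object o) {
            return o instanceof IdEntity && ((IdEntity) o).id == id;
        }

        @Override public int hashCode() { return Long.hashCode(id); }
    }

    static boolean foundAfterReinsert() {
        Set<IdEntity> set = new HashSet<>();
        IdEntity e = new IdEntity(0);
        set.add(e);
        set.remove(e); // remove while the old hash code is still valid
        e.id = 42;     // simulated id assignment
        set.add(e);    // re-insert under the new hash code
        return set.contains(e);
    }

    // Option 2: hash code from a content field that persisting does not change.
    static final class Customer {
        long id;            // assigned later; ignored by equals/hashCode
        final String email; // hypothetical natural key

        Customer(String email) { this.email = Objects.requireNonNull(email); }

        @Override public boolean equals(Object o) {
            return o instanceof Customer && ((Customer) o).email.equals(email);
        }

        @Override public int hashCode() { return email.hashCode(); }
    }

    static boolean foundAfterIdAssigned() {
        Set<Customer> set = new HashSet<>();
        Customer c = new Customer("a@example.com");
        set.add(c);
        c.id = 42;     // hash code is unaffected by the id
        return set.contains(c);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterReinsert());   // prints "true"
        System.out.println(foundAfterIdAssigned()); // prints "true"
    }
}
```

Option 1 only works if every collection holding the object is known and updated. Option 2 avoids that bookkeeping, provided the chosen content fields really do stay constant while the object is in a collection.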
