Order-independent Hash Algorithm


Question

I am currently working on a collection library for my custom programming language. I already have several data types (Collection, List, Map, Set) and implementations for them (mutable and immutable), but what I have been missing so far is hashCode and equals. These are no problem for Lists, since they are ordered collections, but they play a special role for Sets and Maps. Two Sets are considered equal if they have the same size and the same elements, and the order in which the Sets maintain those elements should make no difference to their equality. Because of the equals-hashCode contract, the hashCode implementation also has to reflect this behavior, meaning that two Sets with the same elements but different ordering should have the same hash code. (The same applies to Maps, which are technically a Set of key-value pairs.)

Example (pseudocode):

let set1: Set<String> = [ "a", "b", "c" ]
let set2: Set<String> = [ "b", "c", "a" ]
set1 == set2       // should return true
set1.hashCode == set2.hashCode // should also return true

How would I implement a reasonably good hash algorithm for which the hashCodes in the above example return the same value?

Recommended answer

The JDK itself proposes the following solution to this problem. The contract of the java.util.Set interface states:


Returns the hash code value for this set. The hash code of a set is defined to be the sum of the hash codes of the elements in the set, where the hash code of a null element is defined to be zero. This ensures that s1.equals(s2) implies that s1.hashCode()==s2.hashCode() for any two sets s1 and s2, as required by the general contract of Object.hashCode().
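The rule quoted above can be sketched in a few lines of Java. This is only an illustration of the summing idea, not the JDK's own source; the class name `SumHash` and the method `sumHash` are made up for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper illustrating the "sum of element hash codes" rule.
final class SumHash {
    // Adds up the hash codes of all elements; a null element counts as 0.
    // Addition is commutative, so iteration order does not affect the result.
    static int sumHash(Collection<?> elements) {
        int h = 0;
        for (Object e : elements) {
            if (e != null) {
                h += e.hashCode(); // int overflow simply wraps around
            }
        }
        return h;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "b", "c");
        List<String> b = Arrays.asList("b", "c", "a");
        System.out.println(sumHash(a) == sumHash(b));                  // prints true
        System.out.println(sumHash(a) == new HashSet<>(a).hashCode()); // prints true
    }
}
```

The second check compares the sketch against a real `java.util.HashSet`, whose hash code follows the same contract.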

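An alternative to summing the entries' hash codes is to combine them with another commutative and associative operation, for example the ^ (XOR) operator. Any such operation gives the same result regardless of the order in which the elements are visited.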
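A minimal sketch of the XOR variant, again in Java and again with made-up names (`XorHash`, `xorHash`):

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper illustrating XOR as the order-independent combiner.
final class XorHash {
    // XORs the hash codes of all elements; a null element counts as 0.
    static int xorHash(Collection<?> elements) {
        int h = 0;
        for (Object e : elements) {
            if (e != null) {
                h ^= e.hashCode();
            }
        }
        return h;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "b", "c");
        List<String> b = Arrays.asList("b", "c", "a");
        System.out.println(xorHash(a) == xorHash(b)); // prints true
    }
}
```

One trade-off to keep in mind: with plain XOR, two elements that happen to share a hash code cancel each other out, so it may be worth scrambling each element's hash before combining.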

The Scala language uses an ordering-invariant version of the MurmurHash algorithm (cf. the private scala.util.hashing.MurmurHash3 class) to implement the hashCode (or ##) method of its immutable sets and similar collections.
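The general shape of such an ordering-invariant MurmurHash can be sketched in Java as follows. This is a loose illustration, not a port of Scala's code, and it is not guaranteed to produce the same values as Scala: it first folds the element hashes into a few order-independent accumulators (sum, XOR, product, count) and only then runs MurmurHash3-style mixing over those accumulators. The class name, method names, and seed are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch of an order-independent, Murmur-style set hash.
final class UnorderedMurmur {
    // MurmurHash3 (32-bit) block scrambling, without the trailing rotate/add.
    private static int mixLast(int hash, int data) {
        int k = data * 0xcc9e2d51;
        k = Integer.rotateLeft(k, 15);
        k *= 0x1b873593;
        return hash ^ k;
    }

    // Full MurmurHash3 (32-bit) mixing step for one block of data.
    private static int mix(int hash, int data) {
        int h = mixLast(hash, data);
        h = Integer.rotateLeft(h, 13);
        return h * 5 + 0xe6546b64;
    }

    // MurmurHash3 finalizer: spreads the remaining bits across the whole int.
    private static int avalanche(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    static int unorderedHash(Collection<?> elements, int seed) {
        // Each accumulator uses a commutative operation, so order is irrelevant.
        int sum = 0, xor = 0, product = 1, count = 0;
        for (Object e : elements) {
            int h = (e == null) ? 0 : e.hashCode();
            sum += h;
            xor ^= h;
            product *= (h | 1); // force odd so the product never collapses to 0
            count++;
        }
        // Only the accumulators are mixed, in a fixed sequence.
        int h = mix(seed, sum);
        h = mix(h, xor);
        h = mixLast(h, product);
        return avalanche(h ^ count);
    }

    public static void main(String[] args) {
        int seed = 0x5E7; // arbitrary seed chosen for this example
        List<String> a = Arrays.asList("a", "b", "c");
        List<String> b = Arrays.asList("b", "c", "a");
        System.out.println(unorderedHash(a, seed) == unorderedHash(b, seed)); // prints true
    }
}
```

Compared with a plain sum or XOR, the extra mixing is meant to spread similar inputs over the full int range, at the cost of a little more work per hash.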
