What is the best algorithm for overriding GetHashCode?


Question

In .NET, the GetHashCode method is used in many places throughout the base class libraries. Implementing it properly is especially important for finding items quickly in a collection and for determining equality.

Is there a standard algorithm or best practice for implementing GetHashCode for my custom classes so that I don't degrade performance?

Recommended answer

I usually go with something like the implementation given in Josh Bloch's fabulous Effective Java. It's fast and creates a pretty good hash which is unlikely to cause collisions. Pick two different prime numbers, e.g. 17 and 23, start from one and multiply by the other as you fold in each field:

public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        // Suitable nullity checks etc, of course :)
        hash = hash * 23 + field1.GetHashCode();
        hash = hash * 23 + field2.GetHashCode();
        hash = hash * 23 + field3.GetHashCode();
        return hash;
    }
}
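
The "nullity checks" mentioned in the comment could look something like the sketch below. The Person type and its fields are hypothetical (they are not from the answer), and letting a null field contribute 0 is an assumed convention rather than anything the answer prescribes. Equals is overridden alongside GetHashCode so that equal objects produce equal hash codes.

```csharp
using System;

// Hypothetical example type; the names are not from the original answer.
public sealed class Person
{
    private readonly string firstName; // may be null
    private readonly string lastName;  // may be null
    private readonly int age;

    public Person(string firstName, string lastName, int age)
    {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    public override bool Equals(object obj)
    {
        var other = obj as Person;
        return other != null
            && firstName == other.firstName
            && lastName == other.lastName
            && age == other.age;
    }

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            // Assumed convention: a null field contributes 0.
            hash = hash * 23 + (firstName == null ? 0 : firstName.GetHashCode());
            hash = hash * 23 + (lastName == null ? 0 : lastName.GetHashCode());
            hash = hash * 23 + age.GetHashCode();
            return hash;
        }
    }
}
```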

As noted in comments, you may find it's better to pick a large prime to multiply by instead. Apparently 486187739 is good. Although most examples I've seen with small numbers tend to use primes, there are at least some similar algorithms where non-prime numbers are often used. In the not-quite-FNV example later, for instance, I've used numbers which apparently work well, but the initial value isn't a prime. (The multiplication constant is prime, though. I don't know quite how important that is.)

This is better than the common practice of XORing hash codes for two main reasons. Suppose we have a type with two int fields:

XorHash(x, x) == XorHash(y, y) == 0 for all x, y
XorHash(x, y) == XorHash(y, x) for all x, y
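
As a rough illustration of those two properties, the sketch below puts an XOR combiner next to the multiply-and-add combiner for a pair of int values. The HashDemo class and the name MultiplyAddHash are made up for this example; XorHash mirrors the name used in the lines above.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class HashDemo
{
    // The XOR combination the answer argues against.
    public static int XorHash(int x, int y)
    {
        return x.GetHashCode() ^ y.GetHashCode();
    }

    // The multiply-and-add combination from the answer, for two int fields.
    public static int MultiplyAddHash(int x, int y)
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            hash = hash * 23 + x.GetHashCode();
            hash = hash * 23 + y.GetHashCode();
            return hash;
        }
    }
}
```

With these helpers, XorHash(5, 5) and XorHash(7, 7) are both 0, and XorHash(1, 2) equals XorHash(2, 1), whereas MultiplyAddHash yields a different value for each of those inputs.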

By the way, the earlier algorithm is the one currently used by the C# compiler for anonymous types.

This page gives quite a few options. I think for most cases the above is "good enough", and it's incredibly easy to remember and get right. The FNV alternative is similarly simple, but uses different constants and XOR instead of ADD as the combining operation. It looks something like the code below. The normal FNV algorithm operates on individual bytes, though, so a true FNV would need one iteration per byte instead of one per 32-bit hash value. FNV is also designed for variable lengths of data, whereas here it is always applied to the same number of field values. Comments on this answer suggest that this code doesn't actually work as well (in the sample case tested) as the addition approach above.

// Note: Not quite FNV!
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = (int) 2166136261;
        // Suitable nullity checks etc, of course :)
        hash = (hash * 16777619) ^ field1.GetHashCode();
        hash = (hash * 16777619) ^ field2.GetHashCode();
        hash = (hash * 16777619) ^ field3.GetHashCode();
        return hash;
    }
}
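
For comparison, a byte-wise version could be sketched as follows. It follows the FNV-1 order (multiply by the prime, then XOR in the byte), which matches the order used in the code above. The Fnv class, the Combine helper, and the choice to feed each hash code low byte first are assumptions made for this illustration, not part of the answer.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class Fnv
{
    private const uint OffsetBasis = 2166136261;
    private const uint Prime = 16777619;

    // 32-bit FNV-1: multiply by the prime, then XOR in one byte at a time.
    public static uint Fnv1(byte[] data)
    {
        unchecked // Overflow is fine, just wrap
        {
            uint hash = OffsetBasis;
            foreach (byte b in data)
            {
                hash = (hash * Prime) ^ b;
            }
            return hash;
        }
    }

    // Feeds each field hash code into FNV-1 one byte at a time.
    public static int Combine(params int[] fieldHashes)
    {
        unchecked
        {
            uint hash = OffsetBasis;
            foreach (int h in fieldHashes)
            {
                for (int shift = 0; shift < 32; shift += 8)
                {
                    byte b = (byte)(h >> shift); // low byte first (assumed order)
                    hash = (hash * Prime) ^ b;
                }
            }
            return (int)hash;
        }
    }
}
```

A GetHashCode override could then return something like Fnv.Combine(field1.GetHashCode(), field2.GetHashCode(), field3.GetHashCode()), at the cost of four multiplications per field instead of one.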

One thing to be aware of: ideally, you should prevent your equality-sensitive (and thus hash-code-sensitive) state from changing after the object has been added to a collection that depends on the hash code.

As per the documentation:


You can override GetHashCode for immutable reference types. In general, for mutable reference types, you should override GetHashCode only if:


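  • You can compute the hash code from fields that are not mutable; or

  • You can ensure that the hash code of a mutable object does not change while the object is contained in a collection that relies on its hash code.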
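
To see why this matters, here is a small sketch of what can go wrong when neither condition holds. MutablePoint and MutationDemo are hypothetical types invented for the example. The point's hash code depends on mutable properties, and one of them is changed while the point sits in a HashSet<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable type whose hash code depends on mutable state.
public sealed class MutablePoint
{
    public int X { get; set; }
    public int Y { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutablePoint;
        return other != null && X == other.X && Y == other.Y;
    }

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            hash = hash * 23 + X.GetHashCode();
            hash = hash * 23 + Y.GetHashCode();
            return hash;
        }
    }
}

public static class MutationDemo
{
    // Returns whether the set still finds the point after it was mutated.
    public static bool FoundAfterMutation()
    {
        var point = new MutablePoint { X = 1, Y = 2 };
        var set = new HashSet<MutablePoint> { point };
        point.X = 100; // changes the hash code while the object is in the set
        return set.Contains(point);
    }
}
```

Here the set looks the point up under its new hash code and no longer finds it, even though the very same object is still stored inside. Making the fields read-only, or never mutating an object while it is used as a key, avoids the problem.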
