C# - Generic HashCode implementation for classes

Question

I'm looking at how to build the best hash code for a class, and I've seen several algorithms. I saw this one: Hash Code implementation. The GetHashCode methods of .NET classes seem to be similar (as you can see by reflecting the code).

So the question is: why not create a static class like the following one to build a hash code automatically, just by passing the fields we consider to be a "key"?

// Old version, see edit
public static class HashCodeBuilder
{
    public static int Hash(params object[] keys)
    {
        if (object.ReferenceEquals(keys, null))
        {
            return 0;
        }

        int num = 42;

        checked
        {
            for (int i = 0, length = keys.Length; i < length; i++)
            {
                num += 37;
                if (object.ReferenceEquals(keys[i], null))
                { }
                else if (keys[i].GetType().IsArray)
                {
                    foreach (var item in (IEnumerable)keys[i])
                    {
                        num += Hash(item);
                    }
                }
                else
                {
                    num += keys[i].GetHashCode();
                }
            }
        }

        return num;
    }
}

And use it like this:

// Old version, see edit
public sealed class A : IEquatable<A>
{
    public A()
    { }

    public string Key1 { get; set; }
    public string Key2 { get; set; }
    public string Value { get; set; }

    public override bool Equals(object obj)
    {
        return this.Equals(obj as A);
    }

    public bool Equals(A other)
    {
        return object.ReferenceEquals(other, null)
            ? false
            : Key1 == other.Key1 && Key2 == other.Key2;
    }

    public override int GetHashCode()
    {
        return HashCodeBuilder.Hash(Key1, Key2);
    }
}

Wouldn't that be much simpler than always writing a custom method? Am I missing something?

Following all the remarks, I ended up with the following code:

public static class HashCodeBuilder
{
    public static int Hash(params object[] args)
    {
        if (args == null)
        {
            return 0;
        }

        int num = 42;

        unchecked
        {
            foreach(var item in args)
            {
                if (ReferenceEquals(item, null))
                { }
                else if (item.GetType().IsArray)
                {
                    foreach (var subItem in (IEnumerable)item)
                    {
                        num = num * 37 + Hash(subItem);
                    }
                }
                else
                {
                    num = num * 37 + item.GetHashCode();
                }
            }
        }

        return num;
    }
}


public sealed class A : IEquatable<A>
{
    public A()
    { }

    public string Key1 { get; set; }
    public string Key2 { get; set; }
    public string Value { get; set; }

    public override bool Equals(object obj)
    {
        return this.Equals(obj as A);
    }

    public bool Equals(A other)
    {
        if(ReferenceEquals(other, null))
        {
            return false;
        }
        else if(ReferenceEquals(this, other))
        {
            return true;
        }

        return Key1 == other.Key1
            && Key2 == other.Key2;
    }

    public override int GetHashCode()
    {
        return HashCodeBuilder.Hash(Key1, Key2);
    }
}

Recommended answer

Your Equals method is broken: it assumes that two objects with the same hash code are necessarily equal. That's simply not the case.
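To make that point concrete, here is a minimal sketch using a hypothetical `Point` type (not part of the question's code) whose hash deliberately collides when the coordinates are swapped. The `BrokenEquals` method is an assumption about what a hash-based equality check might look like; it is shown only to contrast with a field-by-field comparison.

```csharp
using System;

// Hypothetical type for illustration only; it does not appear in the question.
public sealed class Point : IEquatable<Point>
{
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public int X { get; }
    public int Y { get; }

    // Deliberately weak hash: (1, 2) and (2, 1) produce the same value.
    public override int GetHashCode()
    {
        return X ^ Y;
    }

    // Broken: treats equal hash codes as proof of equality.
    public bool BrokenEquals(Point other)
    {
        return !ReferenceEquals(other, null)
            && GetHashCode() == other.GetHashCode();
    }

    // Correct: compares the key fields themselves.
    public bool Equals(Point other)
    {
        return !ReferenceEquals(other, null)
            && X == other.X
            && Y == other.Y;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Point);
    }
}
```

With this sketch, `new Point(1, 2)` and `new Point(2, 1)` share a hash code, so `BrokenEquals` reports them as equal while `Equals` does not. Equal objects must have equal hash codes, but the reverse does not hold.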

Your hash code method looked okay at a quick glance, but could actually do with some work (see below). It means boxing any value type values and creating an array any time you call it, but other than that it's okay (as SLaks pointed out, there are some issues around the collection handling). You might want to consider writing some generic overloads which would avoid those performance penalties for common cases (1, 2, 3 or 4 arguments, perhaps). You might also want to use a foreach loop instead of a plain for loop, just to be idiomatic.
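One possible shape for such generic overloads is sketched below. The class name `HashCodeHelper`, the private `ItemHash` helper, and the constants 17 and 31 are illustrative assumptions, not something taken from the question. Because the parameters are generic rather than `object`, value types are not boxed, and no `params` array is allocated.

```csharp
// Hypothetical helper for illustration; names and constants are assumptions.
public static class HashCodeHelper
{
    // Two-argument overload: no params array, no boxing of value types.
    public static int Hash<T1, T2>(T1 first, T2 second)
    {
        unchecked // let the arithmetic wrap around instead of throwing
        {
            int hash = 17;
            hash = hash * 31 + ItemHash(first);
            hash = hash * 31 + ItemHash(second);
            return hash;
        }
    }

    // Three-argument overload following the same pattern.
    public static int Hash<T1, T2, T3>(T1 first, T2 second, T3 third)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + ItemHash(first);
            hash = hash * 31 + ItemHash(second);
            hash = hash * 31 + ItemHash(third);
            return hash;
        }
    }

    // Null references contribute 0; for non-nullable value types the
    // null check is always false.
    private static int ItemHash<T>(T value)
    {
        return value == null ? 0 : value.GetHashCode();
    }
}
```

Under these assumptions, the `GetHashCode` override in class `A` could call `HashCodeHelper.Hash(Key1, Key2)`; a `params object[]` overload could remain as a fallback for longer argument lists.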

You could do the same sort of thing for equality, but it would be slightly harder and messier.

For the hash code itself, you're only ever adding values. I suspect you were trying to do this sort of thing:

int hash = 17;
hash = hash * 31 + firstValue.GetHashCode();
hash = hash * 31 + secondValue.GetHashCode();
hash = hash * 31 + thirdValue.GetHashCode();
return hash;

But that multiplies the hash by 31; it doesn't add 31. Currently your hash code will always return the same result for the same values, whether or not they're in the same order, which isn't ideal.
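The difference can be seen in a small sketch. The class `HashOrderDemo` and its two methods are hypothetical, simplified to `int` inputs so the results are easy to follow; `AdditiveHash` mirrors the add-only approach of the question's first version (using `unchecked` here rather than `checked`, an assumption made so the sketch cannot throw on overflow), and `MultiplicativeHash` mirrors the multiply-then-add pattern above.

```csharp
// Hypothetical demo class; both methods are simplified for illustration.
public static class HashOrderDemo
{
    // Add-only combination: addition is commutative, so the order of the
    // values cannot affect the result.
    public static int AdditiveHash(params int[] values)
    {
        unchecked
        {
            int num = 42;
            foreach (int value in values)
            {
                num += 37;
                num += value.GetHashCode();
            }
            return num;
        }
    }

    // Multiply-then-add combination: each step scales the running hash
    // before adding the next item, so position matters.
    public static int MultiplicativeHash(params int[] values)
    {
        unchecked
        {
            int hash = 17;
            foreach (int value in values)
            {
                hash = hash * 31 + value.GetHashCode();
            }
            return hash;
        }
    }
}
```

Here `AdditiveHash(1, 2)` and `AdditiveHash(2, 1)` are equal, whereas `MultiplicativeHash(1, 2)` and `MultiplicativeHash(2, 1)` differ. For a class like `A`, the additive version would give the same hash whenever the values of `Key1` and `Key2` are swapped.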

It seems there's some confusion over what hash codes are used for. I suggest that anyone who isn't sure reads the documentation for Object.GetHashCode and then Eric Lippert's blog post about hashing and equality.
