Should the hash code of null always be zero, in .NET


Problem description

Given that collections like System.Collections.Generic.HashSet<> accept null as a set member, one can ask what the hash code of null should be. It looks like the framework uses 0:

// nullable struct type
int? i = null;
i.GetHashCode();  // gives 0
EqualityComparer<int?>.Default.GetHashCode(i);  // gives 0

// class type
CultureInfo c = null;
EqualityComparer<CultureInfo>.Default.GetHashCode(c);  // gives 0

This can be (a little) problematic with nullable enums. If we define

enum Season
{
  Spring,
  Summer,
  Autumn,
  Winter,
}

then the Nullable<Season> (also called Season?) can take just five values, but two of them, namely null and Season.Spring, have the same hash code.
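The collision can be checked directly. This is a minimal sketch, assuming an int-backed enum and the default comparer; the class name SeasonHashDemo and its methods are made up for illustration and are not part of the framework.

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

// Hypothetical helper class, for illustration only.
static class SeasonHashDemo
{
    // Hash code that the default comparer assigns to a Season? value.
    public static int HashOf(Season? s)
    {
        return EqualityComparer<Season?>.Default.GetHashCode(s);
    }

    // True when null and Season.Spring hash to the same value.
    public static bool NullCollidesWithSpring()
    {
        return HashOf(null) == HashOf(Season.Spring);
    }
}
```

Season.Spring has the underlying value 0, which is also what the default comparer returns for null, so NullCollidesWithSpring() is expected to return true.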

It is tempting to write a "better" equality comparer like this:

class NewNullEnumEqComp<T> : EqualityComparer<T?> where T : struct
{
  public override bool Equals(T? x, T? y)
  {
    return Default.Equals(x, y);
  }
  public override int GetHashCode(T? x)
  {
    return x.HasValue ? Default.GetHashCode(x) : -1;
  }
}

But is there any reason why the hash code of null should be 0?

EDIT/ADDITION:

Some people seem to think this is about overriding Object.GetHashCode(). It really is not, actually. (The authors of .NET did make an override of GetHashCode() in the Nullable<> struct which is relevant, though.) A user-written implementation of the parameterless GetHashCode() can never handle the situation where the object whose hash code we seek is null.

This is about implementing the abstract method EqualityComparer<T>.GetHashCode(T) (http://msdn.microsoft.com/en-us/library/ms132131) or otherwise implementing the interface method IEqualityComparer<T>.GetHashCode(T) (http://msdn.microsoft.com/en-us/library/ms132155). Now, while creating these links to MSDN, I see that the documentation says these methods throw an ArgumentNullException if their sole argument is null. Surely this is a mistake on MSDN? None of .NET's own implementations throw exceptions. Throwing in that case would effectively break any attempt to add null to a HashSet<>, unless HashSet<> does something extraordinary when dealing with a null item (I will have to test that).

NEW EDIT/ADDITION:

Now I have tried debugging. With HashSet<>, I can confirm that with the default equality comparer, the values Season.Spring and null end up in the same bucket. This can be determined by very carefully inspecting the private array members m_buckets and m_slots. Note that the indices are always, by design, offset by one.

The code I gave above does not, however, fix this. As it turns out, HashSet<> will never even ask the equality comparer when the value is null. This is from the source code of HashSet<>:

    // Workaround Comparers that throw ArgumentNullException for GetHashCode(null).
    private int InternalGetHashCode(T item) {
        if (item == null) { 
            return 0;
        } 
        return m_comparer.GetHashCode(item) & Lower31BitMask; 
    }

This means that, at least for HashSet<>, it is not even possible to change the hash of null. Instead, a solution is to change the hash of all the other values, like this:

class NewerNullEnumEqComp<T> : EqualityComparer<T?> where T : struct
{
  public override bool Equals(T? x, T? y)
  {
    return Default.Equals(x, y);
  }
  public override int GetHashCode(T? x)
  {
    return x.HasValue ? 1 + Default.GetHashCode(x) : /* not seen by HashSet: */ 0;
  }
}
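One way to observe this behaviour without a debugger is to record whether the comparer is ever handed null. The following is only a sketch: RecordingComparer and HashSetNullDemo are hypothetical names, and whether the flag stays unset depends on the HashSet<> implementation in use (the source quoted above suggests it does on that framework version).

```csharp
using System;
using System.Collections.Generic;

enum Season { Spring, Summer, Autumn, Winter }

// Hypothetical comparer: same hashing as NewerNullEnumEqComp, but it also
// records whether GetHashCode was ever called with a null argument.
class RecordingComparer : EqualityComparer<Season?>
{
    public bool SawNull { get; private set; }

    public override bool Equals(Season? x, Season? y)
    {
        return Default.Equals(x, y);
    }

    public override int GetHashCode(Season? x)
    {
        if (!x.HasValue) SawNull = true;
        return x.HasValue ? 1 + Default.GetHashCode(x) : 0;
    }
}

// Hypothetical driver, for illustration only.
static class HashSetNullDemo
{
    // Builds a set containing null and Season.Spring using the recording comparer.
    public static RecordingComparer Run(out HashSet<Season?> set)
    {
        var comparer = new RecordingComparer();
        set = new HashSet<Season?>(comparer) { null, Season.Spring };
        return comparer;
    }
}
```

After calling Run, inspect SawNull on the returned comparer. If HashSet<> short-circuits null as in the quoted source, the property remains false even though null is a member of the set.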

Solution

So long as the hash code returned for nulls is consistent for the type, you should be fine. The only requirement for a hash code is that two objects that are considered equal share the same hash code.

Returning 0 or -1 for null will work, so long as you choose one and return it all the time. Ideally, non-null values should not hash to whatever value you use for null.

Similar questions:

GetHashCode on null fields?

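What should GetHashCode return when object's identifier is null? (http://stackoverflow.com/questions/5078149/what-should-gethashcode-return-when-objects-identifier-is-null)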

The "Remarks" section of this MSDN entry goes into more detail about the hash code. Tellingly, the documentation does not cover or discuss null values at all, not even in the community content.

To address your issue with the enum, either re-implement the hash code to return non-zero, add a default "unknown" enum entry equivalent to null, or simply don't use nullable enums.
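The second option might look like the sketch below. SeasonOrUnknown and UnknownSeasonDemo are hypothetical names introduced here, and the claim that all five hashes are distinct assumes that an int-backed enum hashes to its underlying value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical replacement for Season?: Unknown plays the role of null,
// so null never has to be stored in the set.
enum SeasonOrUnknown { Unknown, Spring, Summer, Autumn, Winter }

// Hypothetical helper class, for illustration only.
static class UnknownSeasonDemo
{
    // Counts how many distinct hash codes the default comparer produces
    // across all members of SeasonOrUnknown.
    public static int DistinctHashCount()
    {
        var hashes = new HashSet<int>();
        foreach (SeasonOrUnknown s in Enum.GetValues(typeof(SeasonOrUnknown)))
        {
            hashes.Add(EqualityComparer<SeasonOrUnknown>.Default.GetHashCode(s));
        }
        return hashes.Count;
    }
}
```

Placing Unknown first makes it the default value of the enum, which mirrors how an unassigned Season? would be null.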

Interesting find, by the way.

Another problem I see with this in general is that a hash code cannot represent a nullable type of 4 bytes or more without at least one collision (and more as the type size increases). For example, the hash code of an int is just the int itself, so it uses the full int range. What value in that range do you choose for null? Whichever one you pick will collide with the hash code of the int that has that value.
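For int? the unavoidable collision is easy to exhibit. A minimal sketch, with NullableIntHashDemo being a made-up name used only for this example:

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class NullableIntHashDemo
{
    // int.GetHashCode returns the value itself, so every possible
    // 32-bit hash is already taken by some int.
    public static bool IntHashIsIdentity(int value)
    {
        return value.GetHashCode() == value;
    }

    // True when a null int? and the given int value share a hash code.
    public static bool NullCollidesWith(int value)
    {
        int? none = null;
        int? some = value;
        return none.GetHashCode() == some.GetHashCode();
    }
}
```

Because Nullable<> hashes null to 0, NullCollidesWith(0) returns true while NullCollidesWith(1) returns false. Picking a different value for null would only move the collision to a different int.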

Collisions in and of themselves are not necessarily a problem, but you need to know they are there. Hash codes are only used in some circumstances. As stated in the docs on MSDN, hash codes are not guaranteed to be different for different objects, so you should not expect them to be.
