与词典和HashSet的GetHash code方法 [英] GetHashCode method with Dictionary and HashSet

查看:138
本文介绍了与词典和HashSet的GetHash code方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个关于如何的问题词典和C#HashSet的工作。按照我的理解,GetHash code采用的是哈希表来确定键的唯一性。

在下面的MSDN页面,它指出:

  

一个哈希code是用来插入,并确定在基于散列的集合中的对象,如字典类,Hashtable类,或从DictionaryBase类派生的类型的数值。

链接: MSDN Object.GetHash code

如果是这样的话,为什么的containsKey和包含返回false CAR2当它具有相同的散列code作为CAR1?如果我的理解是正确的,如果有什么的MSDN说的是正确的,不能同时那些返回true?

 类节目
{
    静态无效的主要(字串[] args)
    {
        //创建一个字典和HashSet的
        字典<汽车,INT> carDictionary =新字典<汽车,INT>();
        HashSet的<车> carSet =新的HashSet<车>();

        //创建3汽车(2通用和1思域)
        汽车CAR1 =新车();
        汽车CAR2 =新车();
        汽车car3 =新思域();

        //测试哈希值
        INT TEST1 = car1.GetHash code(); // 22008501
        INT测试2 = car2.GetHash code(); // 22008501
        INT TEST3 = car3.GetHash code(); // 12048305


        //添加1通用汽车和1个公民既词典和HashSet的
        carDictionary.Add(CAR1,1);
        carDictionary.Add(car3,1);
        carSet.Add(CAR1);
        carSet.Add(car3);

        //为什么这两种假的?
        布尔dictTest1 = carDictionary.ContainsKey(CAR2); // 假
        布尔setTest1 = carSet.Contains(CAR2); // 假

        //测试平等是有道理的
        布尔种皮= car1.Equals(CAR2); // 假
        布尔TESTB = car1.Equals(car3); // 假
    }
}

类车
{
    公众覆盖INT GetHash code()
    {
        返回22008501;
    }
}

一流的公民:汽车
{
    公众覆盖INT GetHash code()
    {
        返回12048305;
    }
}
 

解决方案

由于的containsKey的逻辑与此类似。

  //这是回答的任择议定书的问题,简化的模型,真正的是更为复杂的。
私人名单,其中,名单,其中,KeyValuePair< TKEY的,TValue>>> _buckets = // ....

公共BOOL的containsKey(TKEY的键)
{
    名单< KeyValuePair< TKEY的,TValue>>斗= _buckets [key.GetHash code()%_buckets.Length]。
    的foreach(斗VAR项)
    {
        如果(key.Equals(item.Key))
            返回true;
    }
    返回false;
}
 

所有GetHash code的作用是让桶的钥匙会去的,但它仍然必须经过该桶的每个成员,并找到通过等于方法。这就是为什么有好的哈希codeS是非常重要的,在一个桶中的项目少越快的foreach 部分将是。最好的散列code将每桶只有一个项目。


下面是实际code 以包含在HashSet的(词典的的containsKey是非常相似,但更为复杂)

 私人INT [] m_buckets;
私人插槽[] m_slots;

公共BOOL包含(T项目){
    如果(m_buckets!= NULL){
        INT散列code = InternalGetHash code(项目);
        //见注,HashSet的级别描述为什么 -  1出现在for循环
        的for(int i = m_buckets [散列code%m_buckets.Length]  -  1; I> = 0; i = m_slots [I]。接下来){
            如果(m_slots [I] .hash code ==哈希code和;&安培; m_comparer.Equals(m_slots [I]。价值,项目)){
                返回true;
            }
        }
    }
    //要么m_buckets为空或找不到
    返回false;
}

私人诠释InternalGetHash code(T项目){
    如果(项目== NULL){
        返回0;
    }
    返回m_comparer.GetHash code(项目)及Lower31BitMask;
}

内部结构插槽{
    内部INT散列code; //低31位的散列code,-1,如果未使用
    内部的T值;
    内部INT下一个;接下来进入,-1,如果最后//指数
}
 

I have a question about how the Dictionary and HashSet work in C#. According to my understanding, GetHashCode is used in hash tables to determine key uniqueness.

On the following MSDN page, it states:

A hash code is a numeric value that is used to insert and identify an object in a hash-based collection such as the Dictionary class, the Hashtable class, or a type derived from the DictionaryBase class.

Link: MSDN Object.GetHashCode

If that is the case, why does ContainsKey and Contains return false for car2 when it has the same hashcode as car1? If my understanding is correct and if what MSDN says is correct, shouldn't both of those return true?

class Program
{
    static void Main(string[] args)
    {            
        // Create a Dictionary and HashSet
        Dictionary<Car, int> carDictionary = new Dictionary<Car, int>();
        HashSet<Car> carSet = new HashSet<Car>();

        // Create 3 Cars (2 generic and 1 Civic)
        Car car1 = new Car();
        Car car2 = new Car();
        Car car3 = new Civic();

        // Test hash values
        int test1 = car1.GetHashCode(); // 22008501
        int test2 = car2.GetHashCode(); // 22008501
        int test3 = car3.GetHashCode(); // 12048305


        // Add 1 generic car and 1 Civic to both Dictionary and HashSet
        carDictionary.Add(car1, 1);
        carDictionary.Add(car3, 1);
        carSet.Add(car1);
        carSet.Add(car3);

        // Why are both of these false?
        bool dictTest1 = carDictionary.ContainsKey(car2);  // false
        bool setTest1 = carSet.Contains(car2); // false

        // Testing equality makes sense
        bool testA = car1.Equals(car2); // false
        bool testB = car1.Equals(car3); // false
    }
}

class Car
{
    public override int GetHashCode()
    {
        return 22008501;
    }
}

class Civic : Car
{
    public override int GetHashCode()
    {
        return 12048305;
    }
}

解决方案

Because the logic of ContainsKey is similar to this.

//This is a simplified model for answering the OP's question, the real one is more complex.
private List<List<KeyValuePair<TKey,TValue>>> _buckets = //....

public bool ContainsKey(TKey key)
{
    List<KeyValuePair<TKey,TValue>> bucket = _buckets[key.GetHashCode() % _buckets.Length];
    foreach(var item in bucket)
    {
        if(key.Equals(item.Key))
            return true;
    }
    return false;
}

All GetHashCode does is get the bucket your key would go in, it still must go through each member of that bucket and find the exact match via the Equals method. That is why having good hash codes is important, the less items in a bucket the faster the foreach part will be. The best possible hashcode will have only one item per bucket.


Here is the actual code for Contains on a HashSet (Dictionary's ContainsKey is very similar but more complex)

private int[] m_buckets;
private Slot[] m_slots;

public bool Contains(T item) {
    if (m_buckets != null) {
        int hashCode = InternalGetHashCode(item);
        // see note at "HashSet" level describing why "- 1" appears in for loop
        for (int i = m_buckets[hashCode % m_buckets.Length] - 1; i >= 0; i = m_slots[i].next) {
            if (m_slots[i].hashCode == hashCode && m_comparer.Equals(m_slots[i].value, item)) {
                return true;
            }
        }
    }
    // either m_buckets is null or wasn't found
    return false;
}

private int InternalGetHashCode(T item) {
    if (item == null) {
        return 0;
    } 
    return m_comparer.GetHashCode(item) & Lower31BitMask;
}

internal struct Slot {
    internal int hashCode;      // Lower 31 bits of hash code, -1 if unused
    internal T value;
    internal int next;          // Index of next entry, -1 if last
}

这篇关于与词典和HashSet的GetHash code方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆