如何处理Swift中字典的哈希冲突 [英] How to handle hash collisions for Dictionaries in Swift

查看:612
本文介绍了如何处理Swift中字典的哈希冲突的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

TLDR



我的自定义结构实现了可破解协议。但是,当在字典中插入键时发生哈希冲突时,它们不会自动处理。我如何克服这个问题?



背景



我以前曾问这个问题如何在Swift中实现Hashable协议对于Int数组(自定义字符串结构体)。后来我添加了我自己的答案,这似乎是有效的。



然而,最近我在使用字典时发现了 hashValue 碰撞的微妙问题。



最基本的例子



我已经将代码简化为以下示例。



自定义结构

  struct MyStructure :Hashable {

var id:Int

init(id:Int){
self.id = id
}

var hashValue:Int {
get {
//设计为id = 1产生一个hashValue冲突,id = 2
如果id == 1 {
return 2
}
return id
}
}
}

func ==(lhs:MyStructure,rhs:MyStructure) - > Bool {
return lhs.hashValue == rhs.hashValue
}

注意全局函数重载相等运算符(==),以符合可衡量协议,这是Hashable协议所要求的。



细微词典关键问题



如果我创建一个字典 MyStructure 作为关键

  var dictionary = [MyStructure:String]()

let ok = MyStructure(id:0)// hashValue = 0
let collision1 = MyStructure(id:1)// hashValue = 2
let collision2 = MyStructure(id:2)// hashValue = 2

dictionary [ok] =some text
dictionary [collision1] =other text
字典[collision2] =更多文本

打印(字典)// [MyStructure(id:2):更多文本,MyStructure(id:0):一些文本]
print(dictionary.count)// 2

相等的哈希值会导致 collision1 键被 collision2 键覆盖。没有警告。如果这样的碰撞只发生在100个字母的字典中,那么很容易被忽略。 (花了我一段时间才能注意到这个问题。)



字典文字的明显问题



如果我用一个字典文字重复这个,这个问题变得更加明显,因为一个致命的错误被抛出。

  let ok = MyStructure(id:0)// hashValue = 0 
let collision1 = MyStructure(id:1)// hashValue = 2
let collision2 = MyStructure(id:2)// hashValue = 2

let dictionaryLiteral = [
ok:some text,
collision1:other text,
collision2:more text
]
//致命错误:字典文字包含重复键



问题



我的印象是, hashValue 不必总是返回唯一的值。例如, Mattt Thompson说


关于实现自定义哈希
函数的最常见的误解之一是...认为散列值必须是不同的。


尊敬的SO用户 @Gaffa说处理哈希冲突的一种方式是




考虑哈希码不唯一,并且使用
的等式比较器实际数据来确定唯一性。


在我看来,问题在撰写本文时,do Swift可变哈希函数需要返回唯一值?尚未得到充分的回答。



阅读Swift Dictiona后ry 问题哈希冲突如何处理?,我假设Swift自动处理与 Dictionary 的哈希冲突。但显然,如果我正在使用自定义类或结构,这是不正确的。



这个评论让我认为答案是如何实现Equatable协议,但我不知道我应该如何改变它。

  func ==(lhs:MyStructure,rhs:MyStructure) - > Bool {
return lhs.hashValue == rhs.hashValue
}

是这个函数调用每个字典键查找,还是仅当有哈希冲突? (更新:请参阅此问题



当发生哈希冲突(且仅当出现)时,该怎么做才能确定唯一性?

解决方案

  func ==(lhs:MyStructure,rhs:MyStructure) - > Bool {
return lhs.hashValue == rhs.hashValue
}




注意全局函数重载相等运算符(==),以符合Hashable协议所要求的可衡量协议。


您的问题是不正确的等级实现。



散列表(如Swift Dictionary或Set )需要单独的等式哈希实现。



哈希让你靠近你要找的对象; 等级可以让您获得正确的对象。



您的代码对于哈希等于使用相同的实现,这将保证冲突。 >

要解决此问题,请执行等式以匹配精确的对象值(但是您的模型定义了相等性)。例如:

  func ==(lhs:MyStructure,rhs:MyStructure) - > Bool {
return lhs.id == rhs.id
}


TLDR

My custom structure implements the Hashable Protocol. However, when hash collisions occur while inserting keys in a Dictionary, they are not automatically handled. How do I overcome this problem?

Background

I had previously asked this question How to implement the Hashable Protocol in Swift for an Int array (a custom string struct). Later I added my own answer, which seemed to be working.

However, recently I have detected a subtle problem with hashValue collisions when using a Dictionary.

Most basic example

I have simplified the code down as far as I can to the following example.

Custom structure

struct MyStructure: Hashable {

    var id: Int

    init(id: Int) {
        self.id = id
    }

    var hashValue: Int {
        get {
            // contrived to produce a hashValue collision for id=1 and id=2
            if id == 1 {
                return 2 
            }
            return id
        }
    }
}

func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
    return lhs.hashValue == rhs.hashValue
}

Note the global function to overload the equality operator (==) in order to conform to the Equatable Protocol, which is required by the Hashable Protocol.

Subtle Dictionary key problem

If I create a Dictionary with MyStructure as the key

var dictionary = [MyStructure : String]()

let ok = MyStructure(id: 0)            // hashValue = 0
let collision1 = MyStructure(id: 1)    // hashValue = 2
let collision2 = MyStructure(id: 2)    // hashValue = 2

dictionary[ok] = "some text"
dictionary[collision1] = "other text"
dictionary[collision2] = "more text"

print(dictionary) // [MyStructure(id: 2): more text, MyStructure(id: 0): some text]
print(dictionary.count) // 2

the equal hash values cause the collision1 key to be overwritten by the collision2 key. There is no warning. If such a collision only happened once in a dictionary with 100 keys, then it could easily be missed. (It took me quite a while to notice this problem.)

Obvious problem with Dictionary literal

If I repeat this with a dictionary literal, though, the problem becomes much more obvious because a fatal error is thrown.

let ok = MyStructure(id: 0)            // hashValue = 0
let collision1 = MyStructure(id: 1)    // hashValue = 2
let collision2 = MyStructure(id: 2)    // hashValue = 2

let dictionaryLiteral = [
    ok : "some text",
    collision1 : "other text",
    collision2 : "more text"
]
// fatal error: Dictionary literal contains duplicate keys

Question

I was under the impression that it was not necessary for hashValue to always return a unique value. For example, Mattt Thompson says,

One of the most common misconceptions about implementing a custom hash function comes from ... thinking that hash values must be distinct.

And the respected SO user @Gaffa says that one way to handle hash collisions is to

Consider hash codes to be non-unique, and use an equality comparer for the actual data to determine uniqueness.

In my opinion, the question Do swift hashable protocol hash functions need to return unique values? has not been adequately answered at the time of this writing.

After reading the Swift Dictionary question How are hash collisions handled?, I assumed that Swift automatically handled hash collisions with Dictionary. But apparently that is not true if I am using a custom class or structure.

This comment makes me think the answer is in how the Equatable protocol is implemented, but I am not sure how I should change it.

func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
    return lhs.hashValue == rhs.hashValue
}

Is this function called for every dictionary key lookup or only when there is a hash collision? (Update: see this question)

What should I do to determine uniqueness when (and only when) a hash collision occurs?

解决方案

func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
    return lhs.hashValue == rhs.hashValue
}

Note the global function to overload the equality operator (==) in order to conform to the Equatable Protocol, which is required by the Hashable Protocol.

Your problem is an incorrect equality implementation.

A hash table (such as a Swift Dictionary or Set) requires separate equality and hash implementations.

hash gets you close to the object you're looking for; equality gets you the exact object you're looking for.

Your code uses the same implementation for hash and equality, and this will guarantee a collision.

To fix the problem, implement equality to match exact object values (however your model defines equality). E.g.:

func ==(lhs: MyStructure, rhs: MyStructure) -> Bool {
    return lhs.id == rhs.id
}

这篇关于如何处理Swift中字典的哈希冲突的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆