Go 中复杂键字典的唯一性,但 Julia 中没有? [英] Unicity of complex key dictionaries in Go but not in Julia?
问题描述
在 GO 中,当我使用结构作为映射的键时,键是唯一的.
In GO when I use a struct as a key for a map, there is an unicity of the keys.
例如,以下代码生成一个只有一个键的地图:map[{x 1}:1]
For example, the following code produce a map with only one key : map[{x 1}:1]
package main
import (
"fmt"
)
type MyT struct {
A string
B int
}
func main() {
dic := make(map[MyT]int)
for i := 1; i <= 10; i++ {
dic[MyT{"x", 1}] = 1
}
fmt.Println(dic)
}
// result : map[{x 1}:1]
我尝试在 Julia 中做同样的事情,但我有一个奇怪的惊喜:
I Tried to do the same in Julia and I had a strange surprise :
这个 Julia 代码,类似于 GO 代码,生成一个包含 10 个键的字典!
This Julia code, similar to the GO one, produces a dictionary whith 10 keys !
type MyT
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
println(dic)
# Dict(MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1)
println(keys(dic))
# MyT[MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1)]
那么我做错了什么?
感谢@DanGetz 的解决方案!:
Thank you @DanGetz for the solution ! :
immutable MyT # or struct MyT with julia > 0.6
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
println(dic) # Dict(MyT("x", 1)=>1)
println(keys(dic)) # MyT[MyT("x", 1)]
推荐答案
可变值在 Julia 中 by identity 散列,因为如果没有关于类型表示什么的额外知识,就无法知道两个值是否具有相同的结构是否意味着相同的事情.如果您在将一个值用作字典键之后对其进行变异,则按值对可变对象进行散列可能会特别成问题 - 这在按身份进行散列时不是问题,因为即使修改了可变对象的身份也保持不变.另一方面,按值对不可变对象进行散列是完全安全的——因为它们不能被变异,因此这是不可变类型的默认行为.在给定的示例中,如果您使 MyT
不可变,您将自动获得您期望的行为:
Mutable values hash by identity in Julia, since without additional knowledge about what a type represents, one cannot know if two values with the same structure mean the same thing or not. Hashing mutable objects by value can be especially problematic if you mutate a value after using it as a dictionary key – this is not a problem when hashing by identity since the identity of a mutable object remains the same even when it is modified. On the other hand, it's perfectly safe to hash immutable objects by value – since they cannot be mutated, and accordingly that is the default behavior for immutable types. In the given example, if you make MyT
immutable you will automatically get the behavior you're expecting:
immutable MyT # `struct MyT` in 0.6
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
julia> dic
Dict{MyT,Int64} with 1 entry:
MyT("x", 1) => 1
julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
MyT("x", 1)
对于包含 String
和 Int
值的类型,您想将其用作哈希键,不变性可能是正确的选择.事实上,不可变性通常是正确的选择,这就是为什么在 0.6 中引入结构类型的关键字已更改为 struct
表示不可变结构, mutable struct
表示可变结构——原则上人们会首先找到更短、更简单的名称,所以这应该是更好的默认选择——即不变性.
For a type holding a String
and an Int
value that you want to use as a hash key, immutability is probably the right choice. In fact, immutability is the right choice more often than not, which is why the keywords introducing structural types has been change in 0.6 to struct
for immutable structures and mutable struct
for mutable structures – on the principle that people will reach for the shorter, simpler name first, so that should be the better default choice – i.e. immutability.
正如@ntdef 所写,您可以通过重载 Base.hash
函数来更改您的类型的散列行为.然而,他的定义在某些方面是不正确的(这可能是我们未能更突出、更彻底地记录这一点的错):
As @ntdef has written, you can change the hashing behavior of your type by overloading the Base.hash
function. However, his definition is incorrect in a few respects (which is probably our fault for failing to document this more prominently and thoroughly):
- 您要重载的
Base.hash
的方法签名是Base.hash(::T, ::UInt)
. Base.hash(::T, ::UInt)
方法必须返回UInt
值.- 如果要重载
Base.hash
,则还应重载Base.==
以匹配.
- The method signature of
Base.hash
that you want to overload isBase.hash(::T, ::UInt)
. - The
Base.hash(::T, ::UInt)
method must return aUInt
value. - If you are overloading
Base.hash
, you should also overloadBase.==
to match.
所以这将是一个正确的方法来让你的可变类型按值散列(重新定义 MyT
需要新的 Julia 会话):
So this would be a correct way to make your mutable type hash by value (new Julia session required to redefine MyT
):
type MyT # `mutable struct MyT` in 0.6
A::String
B::Int64
end
import Base: ==, hash
==(x::MyT, y::MyT) = x.A == y.A && x.B == y.B
hash(x::MyT, h::UInt) = hash((MyT, x.A, x.B), h)
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
julia> dic
Dict{MyT,Int64} with 1 entry:
MyT("x", 1) => 1
julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
MyT("x", 1)
手动执行这有点烦人,但是 AutoHashEquals 包会自动执行此操作,采用它的乏味.您需要做的就是在 type
定义前加上 @auto_hash_equals
宏:
This is kind of annoying to do manually, but the AutoHashEquals package automates this, taking the tedium out of it. All you need to do is prefix the type
definition with the @auto_hash_equals
macro:
using AutoHashEquals
@auto_hash_equals type MyT # `@auto_hash_equals mutable struct MyT` in 0.6
A::String
B::Int64
end
底线:
如果您的类型应该具有基于值的相等性和散列,请认真考虑使其不可变.
If you have a type that should have value-based equality and hashing, seriously consider making it immutable.
如果您的类型确实必须是可变的,那么请认真考虑将其用作哈希键是否是一个好主意.
If your type really has to be mutable, then think hard about whether it's a good idea to use as a hash key.
如果您确实需要使用可变类型作为具有基于值的相等性和散列语义的散列键,请使用 AutoHashEquals
包.
If you really need to use a mutable type as a hash key with value-based equality and hashing semantics, use the AutoHashEquals
package.
这篇关于Go 中复杂键字典的唯一性,但 Julia 中没有?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!