Strange, unexpected behavior (disappearing/changing values) when using Hash default value, e.g. Hash.new([])

Question

Consider this code:

h = Hash.new(0)  # New hash pairs will by default have 0 as values
h[1] += 1  #=> {1=>1}
h[2] += 2  #=> {2=>2}

That’s all fine, but:

h = Hash.new([])  # Empty array as default value
h[1] <<= 1  #=> {1=>[1]}                  ← Ok
h[2] <<= 2  #=> {1=>[1,2], 2=>[1,2]}      ← Why did `1` change?
h[3] << 3   #=> {1=>[1,2,3], 2=>[1,2,3]}  ← Where is `3`?

At this point I expect the hash to be:

{1=>[1], 2=>[2], 3=>[3]}

but it’s far from that. What is happening and how can I get the behavior I expect?

Answer

First, note that this behavior applies to any default value that is subsequently mutated (e.g. hashes and strings), not just arrays.

TL;DR: Use Hash.new { |h, k| h[k] = [] } if you want the simplest, most idiomatic solution.
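To illustrate the point that strings are affected in the same way as arrays, here is a minimal sketch. The method name string_default_demo is made up for this example and is not part of any library.

```ruby
# Hypothetical helper: shows a mutable String default being shared by all keys.
def string_default_demo
  h = Hash.new(''.dup)   # one mutable string serves as the default for every key
  h[:a] << 'x'           # appends to the shared default, stores nothing
  h[:b] << 'y'           # appends to that same string again
  { a: h[:a], b: h[:b], stored_keys: h.size }
end

p string_default_demo  # :a and :b both read back "xy"; stored_keys is 0
```

Both lookups return the same accumulated string, and the hash itself stays empty, which mirrors the array case in the question.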


What doesn’t work

Why Hash.new([]) doesn’t work

Let’s look more in-depth at why Hash.new([]) doesn’t work:

h = Hash.new([])
h[0] << 'a'  #=> ["a"]
h[1] << 'b'  #=> ["a", "b"]
h[1]         #=> ["a", "b"]

h[0].object_id == h[1].object_id  #=> true
h  #=> {}

We can see that our default object is being reused and mutated (it is passed as the one and only default value, so the hash has no way of getting a fresh, new default). But why are there no keys or values in the hash, despite h[1] still giving us a value? Here’s a hint:

h[42]  #=> ["a", "b"]

The array returned by each [] call is just the default value, which we’ve been mutating all this time so now contains our new values. Since << doesn’t assign to the hash (there can never be assignment in Ruby without an = present), we’ve never put anything into our actual hash. Instead we have to use <<= (which is to << as += is to +):

h[2] <<= 'c'  #=> ["a", "b", "c"]
h             #=> {2=>["a", "b", "c"]}

This is the same as:

h[2] = (h[2] << 'c')
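One way to confirm this for yourself is to look at the default object directly with Hash#default. The sketch below assumes nothing beyond core Ruby; shared_default_demo is a hypothetical name used only for illustration.

```ruby
# Hypothetical helper: inspects the single default object behind Hash.new([]).
def shared_default_demo
  h = Hash.new([])
  h[:a] << 1   # mutates the default array; no key is stored
  {
    same_object: h[:a].equal?(h.default),  # identity check, not just equality
    default: h.default.dup,                # snapshot of the mutated default
    keys: h.keys
  }
end

p shared_default_demo  # same_object is true, default is [1], keys is []
```

The lookup for a missing key hands back the very object stored as the default, so any in-place change is visible through every other missing key.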

Why Hash.new { [] } doesn’t work

Using Hash.new { [] } solves the problem of reusing and mutating the original default value (as the block given is called each time, returning a new array), but not the assignment problem:

h = Hash.new { [] }
h[0] << 'a'   #=> ["a"]
h[1] <<= 'b'  #=> ["b"]
h             #=> {1=>["b"]}
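To make the "called each time" part visible, the following sketch counts how often the block runs. The counter and the method name fresh_array_demo are assumptions introduced for this example.

```ruby
# Hypothetical helper: counts block invocations for Hash.new { [] }.
def fresh_array_demo
  calls = 0
  h = Hash.new { calls += 1; [] }
  h[0] << 'a'     # block call 1: a new array is mutated, then discarded
  first = h[0]    # block call 2: a different, empty array
  second = h[0]   # block call 3: yet another array
  { calls: calls, first: first, distinct: !first.equal?(second), size: h.size }
end

p fresh_array_demo  # calls is 3, first is [], distinct is true, size is 0
```

Every lookup of a missing key builds a throwaway array, so the 'a' appended in the first call is lost as soon as that statement finishes.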


What does work

The assignment way

If we remember to always use <<=, then Hash.new { [] } is a viable solution, but it’s a bit odd and non-idiomatic (I’ve never seen <<= used in the wild). It’s also prone to subtle bugs if << is inadvertently used.

The mutable way

The documentation for Hash.new states (emphasis my own):

If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block’s responsibility to store the value in the hash if required.

So we must store the default value in the hash from within the block if we wish to use << instead of <<=:

h = Hash.new { |h, k| h[k] = [] }
h[0] << 'a'  #=> ["a"]
h[1] << 'b'  #=> ["b"]
h            #=> {0=>["a"], 1=>["b"]}

This effectively moves the assignment from our individual calls (which would use <<=) to the block passed to Hash.new, removing the burden of unexpected behavior when using <<.
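As a small usage sketch of this pattern, here is a grouping helper. The name group_by_initial is hypothetical, and in real code Enumerable#group_by would usually do the same job; the point is only to show << working as expected with a storing block.

```ruby
# Hypothetical helper: groups words by first letter using a storing default block.
def group_by_initial(words)
  groups = Hash.new { |hash, key| hash[key] = [] }
  words.each { |word| groups[word[0]] << word }  # << is safe: the block stored the array
  groups
end

p group_by_initial(%w[apple avocado banana])
```

Each first letter gets its own array the first time it is seen, and later words with the same letter append to that stored array.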

Note that there is one functional difference between this method and the others: this way assigns the default value upon reading (as the assignment always happens inside the block). For example:

h1 = Hash.new { |h, k| h[k] = [] }
h1[:x]
h1  #=> {:x=>[]}

h2 = Hash.new { [] }
h2[:x]
h2  #=> {}
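If the store-on-read side effect is unwanted in some places, Hash#fetch and Hash#key? can be used there, since neither consults the default block. A minimal sketch, with the hypothetical method name read_without_storing:

```ruby
# Hypothetical helper: reads from a storing-default hash without creating keys.
def read_without_storing
  h = Hash.new { |hash, key| hash[key] = [] }
  probe = h.fetch(:missing, [])   # fetch ignores the default block
  present = h.key?(:missing)      # key? does not run it either
  size_before = h.size
  h[:missing]                     # plain [] runs the block and stores []
  { probe: probe, present: present, size_before: size_before, size_after: h.size }
end

p read_without_storing  # probe is [], present is false, sizes are 0 then 1
```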

The immutable way

You may be wondering why Hash.new([]) doesn’t work while Hash.new(0) works just fine. The key is that Numerics in Ruby are immutable, so we naturally never end up mutating them in-place. If we treated our default value as immutable, we could use Hash.new([]) just fine too:

h = Hash.new([].freeze)
h[0] += ['a']  #=> ["a"]
h[1] += ['b']  #=> ["b"]
h[2]           #=> []
h              #=> {0=>["a"], 1=>["b"]}

However, note that ([].freeze + [].freeze).frozen? == false. So, if you want to ensure that the immutability is preserved throughout, then you must take care to re-freeze the new object.
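A sketch of what re-freezing could look like, and of what happens if << is used on the frozen default by mistake. The method name frozen_default_demo is made up for this example. It rescues RuntimeError because FrozenError (Ruby 2.5 and later) is a subclass of it and older versions raise RuntimeError directly.

```ruby
# Hypothetical helper: frozen default, with and without re-freezing the results.
def frozen_default_demo
  h = Hash.new([].freeze)
  h[0] += ['a']                   # Array#+ returns a new, unfrozen array
  h[1] = (h[1] + ['b']).freeze    # re-freeze to keep the stored value immutable
  blocked =
    begin
      h[2] << 'c'                 # in-place mutation of the frozen default
      false
    rescue RuntimeError           # FrozenError is a RuntimeError subclass
      true
    end
  { unfrozen: !h[0].frozen?, refrozen: h[1].frozen?, blocked: blocked, hash: h }
end

p frozen_default_demo
```

Here an accidental << fails loudly instead of silently corrupting the shared default, which is the main practical benefit of freezing it.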

Of all the ways, I personally prefer this way—immutability generally makes reasoning about things much simpler (this is, after all, the only method that has no possibility of hidden or subtle unexpected behavior).


Footnote on the claim that there can never be assignment in Ruby without an = present: this isn’t strictly true. Methods like instance_variable_set bypass it, but they must exist for metaprogramming since the l-value in = cannot be dynamic.
