Why do these two methods yield different results?

Problem description

According to all documentation, you can append an element to an array using << or .push or +=, and the result ought to be the same. I have found it isn't. Can anybody explain to me what I am getting wrong? (I am using Ruby 2.3.1.)

I have got a number of hashes. All of them contain the same keys. I would like to combine them to form one hash with all the collected values in an array. This is straightforward, you iterate through all the hashes and make a new one, collecting all the values like this:

  def merge_hashes(arg)
    # arg is array of Hashes - keys must be identical
    return {} unless arg
    keys = (arg[0] ? arg[0].keys : [])

    result = keys.product([[]]).to_h # value for each key is empty array.

    arg.each do |h|
      h.each { |k,v| result[k] += [v] }
    end

    result
  end

If instead of using += I use .push or <<, I get completely weird results.

Using the following test array:

a_of_h = [{"1"=>10, "2"=>10, "3"=>10, "4"=>10, "5"=>10, "6"=>10, "7"=>10, "8"=>10, "9"=>10, "10"=>10}, {"1"=>100, "2"=>100, "3"=>100, "4"=>100, "5"=>100, "6"=>100, "7"=>100, "8"=>100, "9"=>100, "10"=>100}, {"1"=>1000, "2"=>1000, "3"=>1000, "4"=>1000, "5"=>1000, "6"=>1000, "7"=>1000, "8"=>1000, "9"=>1000, "10"=>1000}, {"1"=>10000, "2"=>10000, "3"=>10000, "4"=>10000, "5"=>10000, "6"=>10000, "7"=>10000, "8"=>10000, "9"=>10000, "10"=>10000}] 

I get

merge_hashes(a_of_h)
 => {"1"=>[10, 100, 1000, 10000], "2"=>[10, 100, 1000, 10000], "3"=>[10, 100, 1000, 10000], "4"=>[10, 100, 1000, 10000], "5"=>[10, 100, 1000, 10000], "6"=>[10, 100, 1000, 10000], "7"=>[10, 100, 1000, 10000], "8"=>[10, 100, 1000, 10000], "9"=>[10, 100, 1000, 10000], "10"=>[10, 100, 1000, 10000]} 

as I expect, but if I use h.each { |k,v| result[k] << v } instead I get

buggy_merge_hashes(a_of_h)
 => {"1"=>[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000], "2"=>[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000], "3"=>[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000], "4"=>[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000], "5"=>[10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000, 10000], ...}

(I cut the rest.)

What is it I don't know here?

Solution

<< and #push are destructive operations (they change the receiver).

+ (and consequently += as well) is a non-destructive operation (it returns a new object, leaving the receiver unchanged).

While they seem to be doing the same thing, this apparently small difference is crucial.
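To make the distinction concrete, here is a small sketch (the variable names are made up for illustration) that keeps a second reference to an array and watches what each operation does to it:

```ruby
original = [1, 2]
same_ref = original             # a second name for the same object

plus_result = original + [3]    # non-destructive: builds a new array
untouched = original.dup        # snapshot: original is still [1, 2] here

original << 3                   # destructive: mutates the receiver in place
original.push(4)                # push mutates the receiver as well

p plus_result                   # [1, 2, 3]
p untouched                     # [1, 2]
p same_ref                      # [1, 2, 3, 4]
puts same_ref.equal?(original)  # true
```

Anything else holding a reference to the receiver sees the change made by << or push, but never the result of +.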

This comes into play due to another error: all of your subarrays in result start off as the same object. If you add to one of them, you add to all of them.
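The sharing can be checked directly. This sketch uses a shortened, hypothetical key list with the same initialisation as in the question:

```ruby
keys = %w[a b c]
result = keys.product([[]]).to_h   # same initialisation as in the question

# Count how many distinct Array objects the hash values refer to.
distinct = result.values.map(&:object_id).uniq.size
puts distinct                      # 1

result["a"] << 10                  # append through one key only
p result.values                    # [[10], [10], [10]]
```

product pairs every key with the single [] that was passed in, so all keys end up pointing at one array.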

Why is this not an issue if you use +=? Because result[k] += [v] is the same as result[k] = result[k] + [v] (I'm lying here, there's a subtle difference, but it is not relevant here, so just accept that they're the same for now to avoid more confusion :D ); and as + is non-destructive, result[k] + [v] is a different object than result[k]; when you update the value in the hash with this assignment, you are no longer using the starting [] object, and the reference-sharing error can't bite you any more.
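A quick way to see the reassignment at work, again with a hypothetical two-key hash built the same way as in the question:

```ruby
result = %w[a b].product([[]]).to_h
shared = result["a"]               # the one array both keys start with

result["a"] += [10]                # i.e. result["a"] = result["a"] + [10]

puts result["a"].equal?(shared)    # false: "a" now holds a new array
puts result["b"].equal?(shared)    # true:  "b" still holds the original
p shared                           # []
p result["a"]                      # [10]
```

The shared array is never modified; each key simply gets rebound to a fresh array the first time += is applied to it.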

A better way to create the empty arrays that go into result would be one of these:

result = Array.new(keys.size) { [] }
result = keys.map { [] }

which will create a new array object for each element.
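Since result in the question is a hash, the fresh arrays still have to be paired with the keys. One possible way to do that (zip is my choice here, not something from the question) is sketched below, along with the block-less form of Array.new for contrast:

```ruby
keys = %w[a b c]

fresh = Array.new(keys.size) { [] }   # the block runs once per slot
result = keys.zip(fresh).to_h         # pair each key with its own array

result["a"] << 10
p result.values                       # [[10], [], []]

# For contrast: without a block, every slot refers to the same object.
aliased = Array.new(3, [])
aliased[0] << 10
p aliased                             # [[10], [10], [10]]
```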

However, I would write it all quite differently:

a_of_h.each_with_object(Hash.new { |h, k| h[k] = [] }) { |h, r|
  h.each { |k, v| r[k] << v }
}

each_with_object will give the passed object to the block as an additional argument (here r, for result), and will return it when the method is done. The argument (the object that will be in r) will be a hash with a default_proc: every time we try to get a key that's not inside yet, it will insert a new array there (i.e. instead of trying to pre-populate our result object, do it on demand). Then we just go through each of the hashes in your array, and insert the value into the result hash without worrying if the key is there or not.
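Wrapped in a method and run on a shortened sample (the method name is borrowed from the question, the sample data is made up), this approach might look like this:

```ruby
def merge_hashes(hashes)
  # The default proc stores a brand-new array the first time a key is read.
  hashes.each_with_object(Hash.new { |h, k| h[k] = [] }) do |hash, result|
    hash.each { |k, v| result[k] << v }
  end
end

sample = [{ "1" => 10, "2" => 10 }, { "1" => 100, "2" => 100 }, { "1" => 1000 }]
merged = merge_hashes(sample)

p merged["1"]            # [10, 100, 1000]
p merged["2"]            # [10, 100]
puts merged.key?("3")    # false
```

Unlike the original, this version does not require every hash to have the same keys. Note that the returned hash keeps its default proc, so reading a missing key with [] later on would insert an empty array; setting merged.default_proc = nil turns that off if it is not wanted.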
