Merge array of hashes to get hash of arrays of values


Problem description

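This is the opposite of "Turning a Hash of Arrays into an Array of Hashes in Ruby" (http://stackoverflow.com/questions/1640979/turning-a-hash-of-arrays-into-an-array-of-hashes-in-ruby).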

I want to elegantly and/or efficiently turn an array of hashes into a single hash in which each key maps to an array of all the values seen for that key:

hs = [
  { a:1, b:2 },
  { a:3, c:4 },
  { b:5, d:6 }
]
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>[4], :d=>[6] }

This terse code almost works, but fails to create an array when there are no duplicates:

def collect_values( hashes )
  hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
end
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>4, :d=>6 }
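
The reason for this failure is that Hash#merge only calls its block for keys present in both hashes; a key that appears in just one hash keeps its bare value. A minimal sketch to confirm this (the names `first`, `second`, and `calls` are only for illustration):

```ruby
first  = { a: 1, b: 2 }
second = { a: 3, c: 4 }

# Record which keys the conflict block is actually invoked for.
calls = []
merged = first.merge(second) do |key, old_value, new_value|
  calls << key
  [*old_value, *new_value]
end

p calls   # only the shared key reaches the block
p merged  # :b and :c keep their bare (non-array) values
```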

This code works, but can you write a better version?

def collect_values( hashes )
  # Requires Ruby 1.8.7+ for Object#tap
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end

Solutions that only work in Ruby 1.9 are acceptable, but should be noted as such.

Update: Benchmarking Results

Here are the results of benchmarking the various answers below (and a few more of my own), using three different arrays of hashes:

  • one where each hash has distinct keys, so no merging ever occurs:
    [{:a=>1}, {:b=>2}, {:c=>3}, {:d=>4}, {:e=>5}, {:f=>6}, {:g=>7}, ...]

  • one where every hash has the same key, so maximum merging occurs:
    [{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}, {:a=>5}, {:a=>6}, {:a=>7}, ...]

  • and one that is a mix of unique and shared keys:
    [{:c=>1}, {:d=>1}, {:c=>2}, {:f=>1}, {:c=>1, :d=>1}, {:h=>1}, {:c=>3}, ...]

               user     system      total        real
Phrogz 2a  0.577000   0.000000   0.577000 (  0.576000)
Phrogz 2b  0.624000   0.000000   0.624000 (  0.620000)
Glenn 1    0.640000   0.000000   0.640000 (  0.641000)
Phrogz 1   0.671000   0.000000   0.671000 (  0.668000)
Michael 1  0.702000   0.000000   0.702000 (  0.700000)
Michael 2  0.717000   0.000000   0.717000 (  0.726000)
Glenn 2    0.765000   0.000000   0.765000 (  0.764000)
fl00r      0.827000   0.000000   0.827000 (  0.836000)
sawa       0.874000   0.000000   0.874000 (  0.868000)
Tokland 1  0.873000   0.000000   0.873000 (  0.876000)
Tokland 2  1.077000   0.000000   1.077000 (  1.073000)
Phrogz 3   2.106000   0.093000   2.199000 (  2.209000)

The fastest code is this method that I added:

def collect_values(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end
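
As a quick check, here is a small sketch applying that method to the sample input from the top of the question. Nothing new is introduced beyond the explanatory comment and the `result` variable:

```ruby
def collect_values(hashes)
  # r[k] ||= [] creates the array the first time a key is seen,
  # so even a key that occurs only once ends up wrapped in an array.
  {}.tap { |r| hashes.each { |h| h.each { |k, v| (r[k] ||= []) << v } } }
end

hs = [
  { a: 1, b: 2 },
  { a: 3, c: 4 },
  { b: 5, d: 6 }
]

result = collect_values(hs)
p result
```

Because the result starts life as a plain `{}`, it carries no default proc, so later lookups of missing keys simply return nil.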

I've accepted glenn mcdonald's answer (http://stackoverflow.com/questions/5490952/merge-array-of-hashes-to-get-hash-of-arrays-of-values/5491741#5491741) because it was competitive in speed, reasonably terse, and (most importantly) because it pointed out the danger of building the result with a Hash that has a self-modifying default proc: it is convenient during construction, but it can introduce unwanted changes when the user indexes the hash later on.

Finally, here's the benchmark code, in case you want to run your own comparisons:

require 'prime'   # To generate the third hash
require 'facets'  # For tokland1's map_by
AZSYMBOLS = (:a..:z).to_a
TESTS = {
  '26 Distinct Hashes'   => AZSYMBOLS.zip(1..26).map{|a| Hash[*a] },
  '26 Same-Key Hashes'   => ([:a]*26).zip(1..26).map{|a| Hash[*a] },
  '26 Mixed-Keys Hashes' => (2..27).map do |i|
    factors = i.prime_division.transpose
    Hash[AZSYMBOLS.values_at(*factors.first).zip(factors.last)]
  end
}

def phrogz1(hashes)
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end
def phrogz2a(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end
def phrogz2b(hashes)
  hashes.each_with_object({}){ |h,r| h.each{ |k,v| (r[k]||=[]) << v } }
end
def phrogz3(hashes)
  result = hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
  result.each{ |k,v| result[k] = [v] unless v.is_a? Array }
end
def glenn1(hs)
  hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}
end
def glenn2(hs)
  hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}
end
def fl00r(hs)
  h = Hash.new{|h,k| h[k]=[]}
  hs.map(&:to_a).flatten(1).each{|v| h[v[0]] << v[1]}
  h
end
def sawa(a)
  a.map(&:to_a).flatten(1).group_by{|k,v| k}.each_value{|v| v.map!{|k,v| v}}
end
def michael1(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.each_with_object(h) do |h, result|
    h.each{ |k, v| result[k] << v }
  end
end
def michael2(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.inject(h) do |result, h|
    h.each{ |k, v| result[k] << v }
    result
  end
end
def tokland1(hs)
  hs.map(&:to_a).flatten(1).map_by{ |k, v| [k, v] }
end
def tokland2(hs)
  Hash[hs.map(&:to_a).flatten(1).group_by(&:first).map{ |k, vs|
    [k, vs.map{|o|o[1]}]
  }]
end

require 'benchmark'
N = 10_000
Benchmark.bm do |x|
  x.report('Phrogz 2a'){ TESTS.each{ |n,h| N.times{ phrogz2a(h) } } }
  x.report('Phrogz 2b'){ TESTS.each{ |n,h| N.times{ phrogz2b(h) } } }
  x.report('Glenn 1  '){ TESTS.each{ |n,h| N.times{ glenn1(h)   } } }
  x.report('Phrogz 1 '){ TESTS.each{ |n,h| N.times{ phrogz1(h)  } } }
  x.report('Michael 1'){ TESTS.each{ |n,h| N.times{ michael1(h) } } }
  x.report('Michael 2'){ TESTS.each{ |n,h| N.times{ michael2(h) } } }
  x.report('Glenn 2  '){ TESTS.each{ |n,h| N.times{ glenn2(h)   } } }
  x.report('fl00r    '){ TESTS.each{ |n,h| N.times{ fl00r(h)    } } }
  x.report('sawa     '){ TESTS.each{ |n,h| N.times{ sawa(h)     } } }
  x.report('Tokland 1'){ TESTS.each{ |n,h| N.times{ tokland1(h) } } }
  x.report('Tokland 2'){ TESTS.each{ |n,h| N.times{ tokland2(h) } } }
  x.report('Phrogz 3 '){ TESTS.each{ |n,h| N.times{ phrogz3(h)  } } }

end
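
The 'Mixed-Keys' test set is probably the least obvious part of the benchmark code: each integer from 2 to 27 is factored, each prime factor p selects the symbol at index p of the alphabet, and the exponent of that factor becomes the value. The following sketch prints the first few generated hashes. The names `symbols` and `mixed` are only for illustration, and on recent Ruby versions `prime` ships as a bundled gem, so it may need to be installed separately.

```ruby
require 'prime'

symbols = (:a..:z).to_a  # same contents as AZSYMBOLS in the benchmark

mixed = (2..8).map do |i|
  # e.g. 6.prime_division => [[2, 1], [3, 1]]; transpose => [[2, 3], [1, 1]]
  primes, exponents = i.prime_division.transpose
  Hash[symbols.values_at(*primes).zip(exponents)]
end

p mixed
```

The output should line up with the third sample array listed with the benchmark results, where 6 = 2 × 3 yields the only two-key hash among the first few entries.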

Solution

Take your pick:

hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}

hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}

I'm strongly against messing with the defaults for hashes, as the other suggestions do, because then checking for a value modifies the hash, which seems very wrong to me.
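
To make that concern concrete, here is a minimal sketch (variable names are illustrative) contrasting a hash built with a self-populating default proc against a plain hash filled with `||=`. Merely reading a missing key from the first one inserts that key:

```ruby
with_default = Hash.new { |h, k| h[k] = [] }
with_default[:a] << 1

plain = {}
(plain[:a] ||= []) << 1

# Two lookups that look read-only...
with_default[:missing]
plain[:missing]

p with_default  # now also contains :missing => []
p plain         # unchanged by the lookup
```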
