How to merge array of hashes to get hash of arrays of values

Question

This is the opposite of "Turning a Hash of Arrays into an Array of Hashes in Ruby".

Elegantly and/or efficiently turn an array of hashes into a hash where the values are arrays of all values:

hs = [
  { a:1, b:2 },
  { a:3, c:4 },
  { b:5, d:6 }
]
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>[4], :d=>[6] }

This terse code almost works, but fails to create an array when there are no duplicates:

def collect_values( hashes )
  hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
end
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>4, :d=>6 }
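The terse version fails because `Hash#merge` calls its block only for keys present in both hashes; a key that appears once is copied over unchanged and never wrapped. The sketch below shows that behaviour, then one possible way to keep the merge-based approach by wrapping each value in an array first. The helper name `collect_values_via_merge` is hypothetical and not taken from any of the answers.

```ruby
hs = [{ a: 1, b: 2 }, { a: 3, c: 4 }, { b: 5, d: 6 }]

# Hash#merge invokes the block only for keys that exist in both hashes,
# so :c and :d below never pass through it.
calls = []
{ a: 1, c: 4 }.merge({ a: 3, d: 6 }) { |key, x, y| calls << key; [*x, *y] }
# calls now holds only :a

# Hypothetical variant: wrap every value in an array before merging,
# so that keys seen only once still end up with an array value.
def collect_values_via_merge(hashes)
  hashes.inject({}) do |acc, h|
    wrapped = {}
    h.each { |k, v| wrapped[k] = [v] }
    acc.merge(wrapped) { |_, x, y| x + y }
  end
end
```

This builds an intermediate hash per input, so it is unlikely to compete on speed with the `||=` versions; it only illustrates why the original merge call falls short.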

This code works, but can you write a better version?

def collect_values( hashes )
  # Requires Ruby 1.8.7+ for Object#tap
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end

Solutions that only work in Ruby 1.9 are acceptable, but should be noted as such.


Here are the results of benchmarking the various answers below (and a few more of my own), using three different arrays of hashes:

  • one where each hash has distinct keys, so no merging ever occurs:
    [{:a=>1}, {:b=>2}, {:c=>3}, {:d=>4}, {:e=>5}, {:f=>6}, {:g=>7}, ...]

  • one where every hash has the same key, so maximum merging occurs:
    [{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}, {:a=>5}, {:a=>6}, {:a=>7}, ...]

  • and one that is a mix of unique and shared keys:
    [{:c=>1}, {:d=>1}, {:c=>2}, {:f=>1}, {:c=>1, :d=>1}, {:h=>1}, {:c=>3}, ...]

               user     system      total        real
Phrogz 2a  0.577000   0.000000   0.577000 (  0.576000)
Phrogz 2b  0.624000   0.000000   0.624000 (  0.620000)
Glenn 1    0.640000   0.000000   0.640000 (  0.641000)
Phrogz 1   0.671000   0.000000   0.671000 (  0.668000)
Michael 1  0.702000   0.000000   0.702000 (  0.700000)
Michael 2  0.717000   0.000000   0.717000 (  0.726000)
Glenn 2    0.765000   0.000000   0.765000 (  0.764000)
fl00r      0.827000   0.000000   0.827000 (  0.836000)
sawa       0.874000   0.000000   0.874000 (  0.868000)
Tokland 1  0.873000   0.000000   0.873000 (  0.876000)
Tokland 2  1.077000   0.000000   1.077000 (  1.073000)
Phrogz 3   2.106000   0.093000   2.199000 (  2.209000)

The fastest code is this method that I added:

def collect_values(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end

I've accepted glenn mcdonald's answer because it was competitive in terms of speed and reasonably terse, but most importantly because it pointed out the danger of using a Hash with a self-modifying default proc for convenient construction: the returned hash may be changed unexpectedly when the user indexes it later on.
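The hazard described above can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative and do not come from any of the answers.

```ruby
# A hash built with a self-modifying default proc, as in the
# Hash.new{ |h,k| h[k]=[] } implementations.
with_default = Hash.new { |h, k| h[k] = [] }
with_default[:a] << 1

# Merely reading a missing key stores a new empty array under it.
with_default[:missing]
keys_after_read = with_default.keys

# A plain hash populated with ||= has no such side effect on reads.
plain = {}
(plain[:a] ||= []) << 1
plain[:missing]
plain_keys = plain.keys
```

After the read, `keys_after_read` contains `:missing` while `plain_keys` does not. Lookups through `key?` or `fetch` do not invoke the default proc, but a caller who receives the hash has no obvious reason to avoid `[]`.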

Finally, here's the benchmark code, in case you want to run your own comparisons:

require 'prime'   # To generate the third hash
require 'facets'  # For tokland1's map_by
AZSYMBOLS = (:a..:z).to_a
TESTS = {
  '26 Distinct Hashes'   => AZSYMBOLS.zip(1..26).map{|a| Hash[*a] },
  '26 Same-Key Hashes'   => ([:a]*26).zip(1..26).map{|a| Hash[*a] },
  '26 Mixed-Keys Hashes' => (2..27).map do |i|
    factors = i.prime_division.transpose
    Hash[AZSYMBOLS.values_at(*factors.first).zip(factors.last)]
  end
}

def phrogz1(hashes)
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end
def phrogz2a(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end
def phrogz2b(hashes)
  hashes.each_with_object({}){ |h,r| h.each{ |k,v| (r[k]||=[]) << v } }
end
def phrogz3(hashes)
  result = hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
  result.each{ |k,v| result[k] = [v] unless v.is_a? Array }
end
def glenn1(hs)
  hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}
end
def glenn2(hs)
  hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}
end
def fl00r(hs)
  h = Hash.new{|h,k| h[k]=[]}
  hs.map(&:to_a).flatten(1).each{|v| h[v[0]] << v[1]}
  h
end
def sawa(a)
  a.map(&:to_a).flatten(1).group_by{|k,v| k}.each_value{|v| v.map!{|k,v| v}}
end
def michael1(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.each_with_object(h) do |h, result|
    h.each{ |k, v| result[k] << v }
  end
end
def michael2(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.inject(h) do |result, h|
    h.each{ |k, v| result[k] << v }
    result
  end
end
def tokland1(hs)
  hs.map(&:to_a).flatten(1).map_by{ |k, v| [k, v] }
end
def tokland2(hs)
  Hash[hs.map(&:to_a).flatten(1).group_by(&:first).map{ |k, vs|
    [k, vs.map{|o|o[1]}]
  }]
end

require 'benchmark'
N = 10_000
Benchmark.bm do |x|
  x.report('Phrogz 2a'){ TESTS.each{ |n,h| N.times{ phrogz2a(h) } } }
  x.report('Phrogz 2b'){ TESTS.each{ |n,h| N.times{ phrogz2b(h) } } }
  x.report('Glenn 1  '){ TESTS.each{ |n,h| N.times{ glenn1(h)   } } }
  x.report('Phrogz 1 '){ TESTS.each{ |n,h| N.times{ phrogz1(h)  } } }
  x.report('Michael 1'){ TESTS.each{ |n,h| N.times{ michael1(h) } } }
  x.report('Michael 2'){ TESTS.each{ |n,h| N.times{ michael2(h) } } }
  x.report('Glenn 2  '){ TESTS.each{ |n,h| N.times{ glenn2(h)   } } }
  x.report('fl00r    '){ TESTS.each{ |n,h| N.times{ fl00r(h)    } } }
  x.report('sawa     '){ TESTS.each{ |n,h| N.times{ sawa(h)     } } }
  x.report('Tokland 1'){ TESTS.each{ |n,h| N.times{ tokland1(h) } } }
  x.report('Tokland 2'){ TESTS.each{ |n,h| N.times{ tokland2(h) } } }
  x.report('Phrogz 3 '){ TESTS.each{ |n,h| N.times{ phrogz3(h)  } } }
end
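Before comparing timings, it may be worth confirming that the implementations actually agree on the result. The sketch below copies three of the methods from the benchmark and checks them against the example from the question; the `sample`, `expected`, and `mismatches` names are hypothetical additions.

```ruby
# Three of the benchmarked implementations, copied unchanged.
def phrogz1(hashes)
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end
def phrogz2a(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end
def glenn1(hs)
  hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}
end

sample   = [{ a: 1, b: 2 }, { a: 3, c: 4 }, { b: 5, d: 6 }]
expected = { a: [1, 3], b: [2, 5], c: [4], d: [6] }

# Collect the names of any implementations whose output differs.
# Hash#== compares contents only, so a default proc does not affect it.
mismatches = [:phrogz1, :phrogz2a, :glenn1].reject do |name|
  send(name, sample) == expected
end
```

An empty `mismatches` array means the three methods produce the same hash for this input.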

Answer

Take your pick:

hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}

hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}

I'm strongly against messing with the defaults for hashes, as the other suggestions do, because then checking for a value modifies the hash, which seems very wrong to me.
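For readers who still prefer the default-proc style while building the result, one possible hedge is to clear the proc before returning the hash. This is a sketch under assumptions rather than anything proposed in the answers: the method name `collect_values_without_default` is hypothetical, and assigning `nil` to `default_proc` requires Ruby 2.0 or later.

```ruby
# Hypothetical variant: rely on the default proc only during construction,
# then remove it so that later lookups cannot add keys.
def collect_values_without_default(hashes)
  result = Hash.new { |h, k| h[k] = [] }
  hashes.each { |h| h.each { |k, v| result[k] << v } }
  result.default_proc = nil   # Ruby 2.0+; lookups now return nil for missing keys
  result
end

collected = collect_values_without_default([{ a: 1 }, { a: 3, c: 4 }])
lookup    = collected[:missing]   # does not create a :missing key
```

The two one-liners above avoid the issue entirely by never installing a default proc, which is simpler than installing one and removing it.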
