How to merge array of hashes to get hash of arrays of values

Question

This is the opposite of "Turning a Hash of Arrays into an Array of Hashes in Ruby".

Elegantly and/or efficiently turn an array of hashes into a hash where the values are arrays of all values:

hs = [
  { a:1, b:2 },
  { a:3, c:4 },
  { b:5, d:6 }
]
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>[4], :d=>[6] }

This terse code almost works, but fails to create an array when there are no duplicates:

def collect_values( hashes )
  hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
end
collect_values( hs )
#=> { :a=>[1,3], :b=>[2,5], :c=>4, :d=>6 }

This code works, but can you write a better version?

def collect_values( hashes )
  # Requires Ruby 1.8.7+ for Object#tap
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end
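
The terse merge-based version can also be repaired by wrapping every incoming value in an array before merging, so that the conflict block only ever concatenates arrays. This is a minimal sketch, assuming Ruby 2.4+ for `Hash#transform_values` (so it is newer than the Ruby versions discussed here); the method name `collect_values_by_merge` is hypothetical:

```ruby
# Hypothetical helper: merge-based collection that always yields arrays.
# Assumes Ruby 2.4+ (Hash#transform_values).
def collect_values_by_merge(hashes)
  hashes.inject({}) do |acc, h|
    # Wrap each value so single occurrences still end up in an array.
    wrapped = h.transform_values { |v| [v] }
    # On a key conflict both sides are arrays, so concatenate them.
    acc.merge(wrapped) { |_key, old, new| old + new }
  end
end

hs = [{ a: 1, b: 2 }, { a: 3, c: 4 }, { b: 5, d: 6 }]
result = collect_values_by_merge(hs)
```

Because it allocates a new hash on every merge, this is likely to be slower than the in-place `||=` variants; it is shown only to illustrate why the original one-liner fell short.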

Solutions that only work in Ruby 1.9 are acceptable, but should be noted as such.


Here are the results of benchmarking the various answers below (and a few more of my own), using three different arrays of hashes:

  • one where each hash has distinct keys, so no merging ever occurs:
    [{:a=>1}, {:b=>2}, {:c=>3}, {:d=>4}, {:e=>5}, {:f=>6}, {:g=>7}, ...]

  • one where every hash has the same key, so maximum merging occurs:
    [{:a=>1}, {:a=>2}, {:a=>3}, {:a=>4}, {:a=>5}, {:a=>6}, {:a=>7}, ...]

  • and one that is a mix of unique and shared keys:
    [{:c=>1}, {:d=>1}, {:c=>2}, {:f=>1}, {:c=>1, :d=>1}, {:h=>1}, {:c=>3}, ...]

               user     system      total        real
Phrogz 2a  0.577000   0.000000   0.577000 (  0.576000)
Phrogz 2b  0.624000   0.000000   0.624000 (  0.620000)
Glenn 1    0.640000   0.000000   0.640000 (  0.641000)
Phrogz 1   0.671000   0.000000   0.671000 (  0.668000)
Michael 1  0.702000   0.000000   0.702000 (  0.700000)
Michael 2  0.717000   0.000000   0.717000 (  0.726000)
Glenn 2    0.765000   0.000000   0.765000 (  0.764000)
fl00r      0.827000   0.000000   0.827000 (  0.836000)
sawa       0.874000   0.000000   0.874000 (  0.868000)
Tokland 1  0.873000   0.000000   0.873000 (  0.876000)
Tokland 2  1.077000   0.000000   1.077000 (  1.073000)
Phrogz 3   2.106000   0.093000   2.199000 (  2.209000)

The fastest code is this method that I added:

def collect_values(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end

I've accepted glenn mcdonald's answer because it was competitive in terms of speed and reasonably terse, but most importantly because it pointed out the danger of using a Hash with a self-modifying default proc for convenient construction: it can introduce unwanted changes when the user indexes the hash later on.

Finally, here's the benchmark code, in case you want to run your own comparisons:

require 'prime'   # To generate the third hash
require 'facets'  # For tokland1's map_by
AZSYMBOLS = (:a..:z).to_a
TESTS = {
  '26 Distinct Hashes'   => AZSYMBOLS.zip(1..26).map{|a| Hash[*a] },
  '26 Same-Key Hashes'   => ([:a]*26).zip(1..26).map{|a| Hash[*a] },
  '26 Mixed-Keys Hashes' => (2..27).map do |i|
    factors = i.prime_division.transpose
    Hash[AZSYMBOLS.values_at(*factors.first).zip(factors.last)]
  end
}

def phrogz1(hashes)
  Hash.new{ |h,k| h[k]=[] }.tap do |result|
    hashes.each{ |h| h.each{ |k,v| result[k]<<v } }
  end
end
def phrogz2a(hashes)
  {}.tap{ |r| hashes.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
end
def phrogz2b(hashes)
  hashes.each_with_object({}){ |h,r| h.each{ |k,v| (r[k]||=[]) << v } }
end
def phrogz3(hashes)
  result = hashes.inject({}){ |a,b| a.merge(b){ |_,x,y| [*x,*y] } }
  result.each{ |k,v| result[k] = [v] unless v.is_a? Array }
end
def glenn1(hs)
  hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}
end
def glenn2(hs)
  hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}
end
def fl00r(hs)
  h = Hash.new{|h,k| h[k]=[]}
  hs.map(&:to_a).flatten(1).each{|v| h[v[0]] << v[1]}
  h
end
def sawa(a)
  a.map(&:to_a).flatten(1).group_by{|k,v| k}.each_value{|v| v.map!{|k,v| v}}
end
def michael1(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.each_with_object(h) do |h, result|
    h.each{ |k, v| result[k] << v }
  end
end
def michael2(hashes)
  h = Hash.new{|h,k| h[k]=[]}
  hashes.inject(h) do |result, h|
    h.each{ |k, v| result[k] << v }
    result
  end
end
def tokland1(hs)
  hs.map(&:to_a).flatten(1).map_by{ |k, v| [k, v] }
end
def tokland2(hs)
  Hash[hs.map(&:to_a).flatten(1).group_by(&:first).map{ |k, vs|
    [k, vs.map{|o|o[1]}]
  }]
end

require 'benchmark'
N = 10_000
Benchmark.bm do |x|
  x.report('Phrogz 2a'){ TESTS.each{ |n,h| N.times{ phrogz2a(h) } } }
  x.report('Phrogz 2b'){ TESTS.each{ |n,h| N.times{ phrogz2b(h) } } }
  x.report('Glenn 1  '){ TESTS.each{ |n,h| N.times{ glenn1(h)   } } }
  x.report('Phrogz 1 '){ TESTS.each{ |n,h| N.times{ phrogz1(h)  } } }
  x.report('Michael 1'){ TESTS.each{ |n,h| N.times{ michael1(h) } } }
  x.report('Michael 2'){ TESTS.each{ |n,h| N.times{ michael2(h) } } }
  x.report('Glenn 2  '){ TESTS.each{ |n,h| N.times{ glenn2(h)   } } }
  x.report('fl00r    '){ TESTS.each{ |n,h| N.times{ fl00r(h)    } } }
  x.report('sawa     '){ TESTS.each{ |n,h| N.times{ sawa(h)     } } }
  x.report('Tokland 1'){ TESTS.each{ |n,h| N.times{ tokland1(h) } } }
  x.report('Tokland 2'){ TESTS.each{ |n,h| N.times{ tokland2(h) } } }
  x.report('Phrogz 3 '){ TESTS.each{ |n,h| N.times{ phrogz3(h)  } } }

end
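
To make the construction of the '26 Mixed-Keys Hashes' set less opaque, here is a small sketch of the same idea over a shorter range. Each integer is factored into primes; the primes pick letters by index and the exponents become the values. The variable names `letters` and `mixed` are illustrative only:

```ruby
require 'prime' # provides Integer#prime_division

letters = (:a..:z).to_a

# Build one hash per integer from its prime factorization,
# e.g. 6 = 2^1 * 3^1 maps to { letters[2] => 1, letters[3] => 1 }.
mixed = (2..8).map do |i|
  primes, exponents = i.prime_division.transpose
  Hash[letters.values_at(*primes).zip(exponents)]
end
```

Small primes such as 2 and 3 recur in many factorizations, so keys like `:c` and `:d` are shared across hashes, while larger primes yield keys that appear only rarely.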

Solution

Take your pick:

hs.reduce({}) {|h,pairs| pairs.each {|k,v| (h[k] ||= []) << v}; h}

hs.map(&:to_a).flatten(1).reduce({}) {|h,(k,v)| (h[k] ||= []) << v; h}

I'm strongly against messing with the defaults for hashes, as the other suggestions do, because then checking for a value modifies the hash, which seems very wrong to me.
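
The side effect is easy to reproduce. In this minimal sketch (variable names are illustrative), merely reading a missing key from a hash built with a self-modifying default proc inserts that key, whereas the `||=` style leaves a plain hash behind:

```ruby
# Hash whose default proc stores a new array on every miss.
with_default = Hash.new { |h, k| h[k] = [] }
with_default[:a] << 1
with_default[:missing]          # only a read, but it also inserts :missing
keys_with_default = with_default.keys

# Plain hash populated with ||=, as in the two one-liners above.
plain = {}
(plain[:a] ||= []) << 1
plain[:missing]                 # returns nil and leaves the hash untouched
keys_plain = plain.keys
```

If the default-proc style is still preferred during construction, one possible mitigation is to clear the proc with `default_proc = nil` before returning the hash; as far as I know this requires Ruby 2.0 or later.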
