Ruby - Show Deltas Between 2 Arrays of Hashes Based on a Subset of Hash Keys


Problem description

I'm attempting to compare two arrays of hashes with very similar hash structure (identical and always-present keys) and return the deltas between the two. Specifically, I'd like to capture the following:

  • Hashes part of array1 that do not exist in array2
  • Hashes part of array2 that do not exist in array1
  • Hashes which appear in both data sets

When whole hashes are compared, this can typically be achieved by simply doing the following:

deltas_old_new = (array1-array2)
deltas_new_old = (array2-array1)

The problem for me (which has turned into a 2-3 hour struggle!) is that I need to identify the deltas based on the values of 3 keys within the hash ('id', 'ref', 'name'). The values of these 3 keys are effectively what makes up a unique entry in my data, but I must retain the other key/value pairs of the hash (e.g. 'extra' and numerous other key/value pairs not shown for brevity).
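To make the limitation concrete, here is a minimal sketch. The two records and the `key_fields` variable are hypothetical and only for illustration. Array#- compares entire hashes, so a record whose 'extra' value differs is reported as a delta even though its 3 identifying keys match:

```ruby
# Hypothetical records: same 'id', 'ref', 'name', but a different 'extra'.
a = [{ 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'old note' }]
b = [{ 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'new note' }]

# Plain subtraction compares every key/value pair of each hash.
whole_hash_delta = a - b

# Comparing only the identifying keys treats the two records as the same entry.
key_fields = %w[id ref name]
b_identities = b.map { |h| h.values_at(*key_fields) }
key_delta = a.reject { |h| b_identities.include?(h.values_at(*key_fields)) }

p whole_hash_delta.size # the record counts as a delta
p key_delta.size        # no delta when only the 3 keys are compared
```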

Sample data:

array1 = [{'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'Not Sorted On 5'},
          {'id' => '2', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7'},
          {'id' => '3', 'ref' => '1003', 'name' => 'WA', 'extra' => 'Not Sorted On 9'},
          {'id' => '7', 'ref' => '1007', 'name' => 'OR', 'extra' => 'Not Sorted On 11'}]

array2 = [{'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'Not Sorted On 5'},
          {'id' => '3', 'ref' => '1003', 'name' => 'WA', 'extra' => 'Not Sorted On 9'},
          {'id' => '8', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7'},
          {'id' => '5', 'ref' => '1005', 'name' => 'MT', 'extra' => 'Not Sorted On 10'},
          {'id' => '12', 'ref' => '1012', 'name' => 'TX', 'extra' => 'Not Sorted On 85'}]

Expected results (3 separate arrays of hashes):

Object containing data in array1 but not in array2:

[{'id' => '2', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7'},
 {'id' => '7', 'ref' => '1007', 'name' => 'OR', 'extra' => 'Not Sorted On 11'}]

Object containing data in array2 but not in array1:

[{'id' => '8', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7'},
 {'id' => '5', 'ref' => '1005', 'name' => 'MT', 'extra' => 'Not Sorted On 10'},
 {'id' => '12', 'ref' => '1012', 'name' => 'TX', 'extra' => 'Not Sorted On 85'}]

Object containing data in BOTH array1 and array2:

[{'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'Not Sorted On 5'},
 {'id' => '3', 'ref' => '1003', 'name' => 'WA', 'extra' => 'Not Sorted On 9'}]

I've made numerous attempts at iterating over the arrays and using Hash#keep_if based on the 3 keys, as well as merging both data sets into a single array and then attempting to de-dup based on array1, but I keep coming up empty-handed. Thank you in advance for your time and assistance!

Recommended answer

This isn't very pretty, but it works. It creates a third array containing every unique item (unique by 'id', 'ref', and 'name') from array1 and array2 combined, and iterates through that.
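As a small illustration of the de-duplication step, Array#uniq accepts a block and compares items by the block's return value, keeping the first occurrence. The records below are hypothetical, and Hash#values_at is used here as a shorter way to build the identifier than the sub-hash used in the answer's code:

```ruby
# Hypothetical combined list: the first two entries share id/ref/name.
combined = [
  { 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'from first' },
  { 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'from second' },
  { 'id' => '8', 'ref' => '1002', 'name' => 'NY', 'extra' => 'from second' }
]

# Uniqueness is decided by the block's value only; 'extra' is ignored.
unique = combined.uniq { |item| item.values_at('id', 'ref', 'name') }

p unique.map { |item| item['extra'] }
```

Because the first occurrence wins, putting array1 first in `array1 + array2` means that an entry present in both arrays is represented by the array1 copy.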

Then, since include? doesn't allow a custom matcher, we can fake it by using detect to look for an item in the array whose sub-hash of the 3 keys matches. We'll wrap that in a custom method so we can call it with either array1 or array2 instead of writing it twice.
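A minimal sketch of the difference, using hypothetical records: include? matches only on full equality, while detect takes a block, so the comparison can be limited to the 3 keys. It returns the matching item, or nil when nothing matches, which makes the result usable as a truthy/falsy check.

```ruby
# Hypothetical data: the target differs from the stored record only in 'extra'.
records = [{ 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'stored' }]
target  = { 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'incoming' }
missing = { 'id' => '9', 'ref' => '1009', 'name' => 'NV', 'extra' => 'incoming' }

# Full-hash equality fails because 'extra' differs.
full_match = records.include?(target)

# detect with a block compares only the identifying keys.
found = records.detect do |r|
  r.values_at('id', 'ref', 'name') == target.values_at('id', 'ref', 'name')
end
not_found = records.detect do |r|
  r.values_at('id', 'ref', 'name') == missing.values_at('id', 'ref', 'name')
end

p full_match
p found
p not_found
```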

Finally, we loop through our array3 and determine whether the item came from array1, array2, or both of them and add to the corresponding output array.

array1 = [{'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'Not Sorted On 5'},
          {'id' => '2', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7'},
          {'id' => '3', 'ref' => '1003', 'name' => 'WA', 'extra' => 'Not Sorted On 9'},
          {'id' => '7', 'ref' => '1007', 'name' => 'OR', 'extra' => 'Not Sorted On 11'}]

array2 = [{'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'Not Sorted On 5'},
          {'id' => '3', 'ref' => '1003', 'name' => 'WA', 'extra' => 'Not Sorted On 9'},
          {'id' => '8', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7'},
          {'id' => '5', 'ref' => '1005', 'name' => 'MT', 'extra' => 'Not Sorted On 10'},
          {'id' => '12', 'ref' => '1012', 'name' => 'TX', 'extra' => 'Not Sorted On 85'}]

# combine the arrays into 1 array that contains items in both array1 and array2 to loop through
array3 = (array1 + array2).uniq { |item| { 'id' => item['id'], 'ref' => item['ref'], 'name' => item['name'] } }

# Array#include? doesn't allow a custom matcher, so we can fake it by using Array#detect
def is_included_in(array, object)
  object_identifier = { 'id' => object['id'], 'ref' => object['ref'], 'name' => object['name'] }

  array.detect do |item|
    { 'id' => item['id'], 'ref' => item['ref'], 'name' => item['name'] } == object_identifier
  end
end

# output array initializing
array1_only = []
array2_only = []
array1_and_array2 = []

# loop through all items in both array1 and array2 and check if it was in array1 or array2
# if it was in both, add to array1_and_array2, otherwise, add it to the output array that
# corresponds to the input array
array3.each do |item|
  in_array1 = is_included_in(array1, item)
  in_array2 = is_included_in(array2, item)

  if in_array1 && in_array2
    array1_and_array2.push item
  elsif in_array1
    array1_only.push item
  else
    array2_only.push item
  end
end


puts array1_only.inspect        # => [{"id"=>"2", "ref"=>"1002", "name"=>"NY", "extra"=>"Not Sorted On 7"}, {"id"=>"7", "ref"=>"1007", "name"=>"OR", "extra"=>"Not Sorted On 11"}]
puts array2_only.inspect        # => [{"id"=>"8", "ref"=>"1002", "name"=>"NY", "extra"=>"Not Sorted On 7"}, {"id"=>"5", "ref"=>"1005", "name"=>"MT", "extra"=>"Not Sorted On 10"}, {"id"=>"12", "ref"=>"1012", "name"=>"TX", "extra"=>"Not Sorted On 85"}]
puts array1_and_array2.inspect  # => [{"id"=>"1", "ref"=>"1001", "name"=>"CA", "extra"=>"Not Sorted On 5"}, {"id"=>"3", "ref"=>"1003", "name"=>"WA", "extra"=>"Not Sorted On 9"}]
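For larger inputs, one possible variation (a sketch, not part of the original answer) is to precompute the identifying values of each array in a Set, so each membership check no longer rescans the whole array. The `identity` and `deltas` helpers and their `keys` parameter are hypothetical names introduced here for illustration.

```ruby
require 'set'

# Hypothetical helper: the values that make an entry unique.
def identity(hash, keys = %w[id ref name])
  hash.values_at(*keys)
end

# Hypothetical helper: returns [first_only, second_only, in_both].
# Entries found in both arrays are taken from the first array.
def deltas(first, second)
  first_ids  = first.map { |h| identity(h) }.to_set
  second_ids = second.map { |h| identity(h) }.to_set

  in_both, first_only = first.partition { |h| second_ids.include?(identity(h)) }
  second_only = second.reject { |h| first_ids.include?(identity(h)) }

  [first_only, second_only, in_both]
end

array1 = [{ 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'Not Sorted On 5' },
          { 'id' => '2', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7' },
          { 'id' => '3', 'ref' => '1003', 'name' => 'WA', 'extra' => 'Not Sorted On 9' },
          { 'id' => '7', 'ref' => '1007', 'name' => 'OR', 'extra' => 'Not Sorted On 11' }]

array2 = [{ 'id' => '1', 'ref' => '1001', 'name' => 'CA', 'extra' => 'Not Sorted On 5' },
          { 'id' => '3', 'ref' => '1003', 'name' => 'WA', 'extra' => 'Not Sorted On 9' },
          { 'id' => '8', 'ref' => '1002', 'name' => 'NY', 'extra' => 'Not Sorted On 7' },
          { 'id' => '5', 'ref' => '1005', 'name' => 'MT', 'extra' => 'Not Sorted On 10' },
          { 'id' => '12', 'ref' => '1012', 'name' => 'TX', 'extra' => 'Not Sorted On 85' }]

array1_only, array2_only, array1_and_array2 = deltas(array1, array2)

p array1_only.map { |h| h['id'] }       # => ["2", "7"]
p array2_only.map { |h| h['id'] }       # => ["8", "5", "12"]
p array1_and_array2.map { |h| h['id'] } # => ["1", "3"]
```

One behavioral difference to be aware of: this sketch does not de-duplicate within a single input array, whereas the uniq call above collapses repeated entries.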
