Array Merge (Union)

Question

I have two arrays I need to merge, and using the union (|) operator is painfully slow. Are there any other ways to accomplish an array merge?

Also, the arrays are filled with objects, not strings.

An example of an object in the array:

#<Article 
 id: 1, 
 xml_document_id: 1, 
 source: "<article><domain>events.waikato.ac</domain><excerpt...", 
 created_at: "2010-02-11 01:32:46", 
 updated_at: "2010-02-11 01:41:28"
>

Where source is a short piece of XML.

Edit

Sorry! By "merge" I mean that I need to avoid inserting duplicates.

A => [1, 2, 3, 4, 5]
B => [3, 4, 5, 6, 7]
A.magic_merge(B) #=> [1, 2, 3, 4, 5, 6, 7]

Bear in mind that the integers are actually Article objects, and the union operator appears to take forever.

Recommended answer

Here's a script that benchmarks two merge techniques: using the pipe operator (a1 | a2), and using concatenate-and-uniq ((a1 + a2).uniq). Two additional benchmarks give the time of concatenate and uniq individually.

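require 'benchmark'

# Build two arrays of one million random integers each.
a1 = []; a2 = []
[a1, a2].each do |a|
  1000000.times { a << rand(999999) }
end

# Union operator: merges and removes duplicates in one step.
puts "Merge with pipe:"
puts Benchmark.measure { a1 | a2 }

# Concatenate first, then remove duplicates.
puts "Merge with concat and uniq:"
puts Benchmark.measure { (a1 + a2).uniq }

# Cost of the concatenation alone.
puts "Concat only:"
puts Benchmark.measure { a1 + a2 }

# Cost of uniq alone, on an array concatenated outside the timed block.
puts "Uniq only:"
b = a1 + a2
puts Benchmark.measure { b.uniq }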

On my machine (Ubuntu Karmic, Ruby 1.8.7), I get output like this:

Merge with pipe:
  1.000000   0.030000   1.030000 (  1.020562)
Merge with concat and uniq:
  1.070000   0.000000   1.070000 (  1.071448)
Concat only:
  0.010000   0.000000   0.010000 (  0.005888)
Uniq only:
  0.980000   0.000000   0.980000 (  0.981700)

This shows that the two techniques are very similar in speed, and that uniq accounts for almost all of the time. That makes sense intuitively: uniq has to hash and compare every element, so it is O(n) at best with a sizeable per-element cost, whereas concatenation, although it also copies every element, is a plain memory copy and is negligible by comparison (about 0.006 s above).
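To make the per-element cost of uniq visible, here is a minimal sketch that counts how often uniq asks each object for its hash. The Probe class and its hash_calls counter are hypothetical names invented for this illustration; they are not part of the question's code.

```ruby
# Hypothetical probe class: counts how many times #hash is invoked.
class Probe
  class << self
    attr_accessor :hash_calls
  end
  self.hash_calls = 0

  attr_reader :key

  def initialize(key)
    @key = key
  end

  # Called by Array#uniq to place the object in a hash table.
  def hash
    Probe.hash_calls += 1
    key.hash
  end

  # Called by Array#uniq to confirm that two objects are duplicates.
  def eql?(other)
    other.is_a?(Probe) && key == other.key
  end
end

# 1000 objects, but only 500 distinct keys.
items = Array.new(1000) { |i| Probe.new(i % 500) }

Probe.hash_calls = 0
unique = items.uniq

puts "unique elements: #{unique.size}"
puts "hash calls:      #{Probe.hash_calls}"
```

Every element has to be hashed at least once, so the work grows with the length of the array, and any slowness in hash is multiplied by the number of elements.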

So, if you really want to speed this up, look at how hash and eql? are implemented for the objects in your arrays, since Array#| and Array#uniq use those methods (not <=>) to detect duplicates. I believe that most of the time is being spent hashing and comparing objects to ensure that no two elements of the final array are equal.
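As a rough sketch of what that could look like, the class below decides equality from the id alone, so that Array#| never has to touch the XML in source. This Article is a plain Ruby stand-in written for illustration, not the model from the question, and it assumes that two articles with the same id should count as duplicates.

```ruby
# Hypothetical stand-in for the Article model in the question.
# Assumption: two articles with the same id are duplicates.
class Article
  attr_reader :id, :source

  def initialize(id, source)
    @id = id
    @source = source
  end

  # Array#| and Array#uniq bucket elements by #hash ...
  def hash
    id.hash
  end

  # ... and confirm duplicates with #eql?.
  def eql?(other)
    other.is_a?(Article) && id == other.id
  end
  alias_method :==, :eql?
end

a = [1, 2, 3, 4, 5].map { |i| Article.new(i, "<article/>") }
b = [3, 4, 5, 6, 7].map { |i| Article.new(i, "<article/>") }

merged = a | b
puts merged.map(&:id).inspect  # prints [1, 2, 3, 4, 5, 6, 7]
```

Both methods only look at an integer, so the cost per element stays small no matter how large the source string is.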
