Array Merge (Union)

Question

I have two arrays I need to merge, and the union (|) operator is painfully slow. Are there any other ways to accomplish an array merge?

Also, the arrays are filled with objects, not strings.

An example of the objects within the array:

#<Article 
 id: 1, 
 xml_document_id: 1, 
 source: "<article><domain>events.waikato.ac</domain><excerpt...", 
 created_at: "2010-02-11 01:32:46", 
 updated_at: "2010-02-11 01:41:28"
>

Where source is a short piece of XML.

Edit

Sorry! By 'merge' I mean I need to not insert duplicates.

A = [1, 2, 3, 4, 5]
B = [3, 4, 5, 6, 7]
A.magic_merge(B) #=> [1, 2, 3, 4, 5, 6, 7]

Bear in mind that the integers are actually Article objects, and the union operator appears to take forever.

Answer

Here's a script which benchmarks two merge techniques: using the pipe operator (a1 | a2), and using concatenate-and-uniq ((a1 + a2).uniq). Two additional benchmarks give the time of concatenate and uniq individually.

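require 'benchmark'

# Build two arrays of one million random integers each.
a1 = []; a2 = []
[a1, a2].each do |a|
  1000000.times { a << rand(999999) }
end

# Union via the pipe operator.
puts "Merge with pipe:"
puts Benchmark.measure { a1 | a2 }

# Union via concatenation followed by uniq.
puts "Merge with concat and uniq:"
puts Benchmark.measure { (a1 + a2).uniq }

# Concatenation on its own.
puts "Concat only:"
puts Benchmark.measure { a1 + a2 }

# uniq on its own; the concatenation happens outside the timed block.
puts "Uniq only:"
b = a1 + a2
puts Benchmark.measure { b.uniq }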

On my machine (Ubuntu Karmic, Ruby 1.8.7), I get output like this:

Merge with pipe:
  1.000000   0.030000   1.030000 (  1.020562)
Merge with concat and uniq:
  1.070000   0.000000   1.070000 (  1.071448)
Concat only:
  0.010000   0.000000   0.010000 (  0.005888)
Uniq only:
  0.980000   0.000000   0.980000 (  0.981700)

This shows that the two techniques are very similar in speed, and that uniq is by far the larger component of the operation. That makes sense intuitively: uniq has to hash and compare every element, so it is O(n) at best with a noticeable per-element cost. Concatenation is also linear in the array lengths, but it amounts to a bulk copy of references, which is why it barely registers in the timings.
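To make the per-element cost concrete, here is a rough plain-Ruby sketch of a hash-based dedup pass. It is not the C implementation behind Array#uniq, only an illustration of the same idea; merge_by_key and the Row struct are names I made up for the example, and the sample data mirrors the A and B arrays from the question.

```ruby
# Hypothetical record type standing in for the real objects.
Row = Struct.new(:id, :source)

# Merge two arrays, keeping the first element seen for each key.
# The block returns the key that defines "duplicate" (an assumption
# of this sketch; it is not how the built-in methods pick a key).
def merge_by_key(a, b)
  seen = {}
  result = []
  (a + b).each do |item|
    key = yield(item)
    next if seen.key?(key)   # one hash lookup per element
    seen[key] = true
    result << item
  end
  result
end

a = [1, 2, 3, 4, 5].map { |i| Row.new(i, "<article>#{i}</article>") }
b = [3, 4, 5, 6, 7].map { |i| Row.new(i, "<article>#{i}</article>") }

merged = merge_by_key(a, b) { |row| row.id }
puts merged.map { |row| row.id }.inspect
```

Keying on a cheap value such as an integer id keeps each lookup fast. Keying on a long string would mean hashing the whole string for every element.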

So, if you really want to speed this up, you need to look at how the hash and eql? methods are implemented for the objects in your arrays, since those are what Array#| and Array#uniq use to detect duplicates. I believe that most of the time is being spent hashing and comparing objects to ensure that no two elements in the final array are equal.
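As an illustration, here is a minimal sketch of a class whose hash and eql? depend only on a cheap integer id. The Article class below is a plain-Ruby stand-in I wrote for the example, not the ActiveRecord model from the question, and whether id alone is the right notion of equality for your data is an assumption you would need to check.

```ruby
# Hypothetical stand-in for the Article model; not ActiveRecord.
class Article
  attr_reader :id, :source

  def initialize(id, source)
    @id = id
    @source = source
  end

  # Two articles count as the same element when their ids match.
  def eql?(other)
    other.is_a?(Article) && id == other.id
  end
  alias_method :==, :eql?

  # Must agree with eql?: equal objects need equal hash values.
  def hash
    id.hash
  end
end

a = [1, 2, 3, 4, 5].map { |i| Article.new(i, "<article>#{i}</article>") }
b = [3, 4, 5, 6, 7].map { |i| Article.new(i, "<article>#{i}</article>") }

merged = a | b
merged_ids = merged.map { |article| article.id }
puts merged_ids.inspect
```

With hash and eql? this cheap, the union never has to look at the XML in source at all.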
