[only equal operator] What are the fast algorithms to find duplicate elements in a collection and group them?


Problem description


Suppose we have a collection of elements, and these elements only have an equality operator, so it is impossible to sort them.

How can you pick out the elements that have duplicates and put them into groups with the fewest comparisons? Preferably in C++, but the algorithm is more important than the language. For example, given {E1,E2,E3,E4,E4,E2,E6,E4,E3}, I wish to extract {E2,E2}, {E3,E3}, {E4,E4,E4}. What data structure and algorithm would you choose?

EDIT

In my scenario, if binary data 1 is equal to binary data 2, we can say these two elements are identical. But only = and != are meaningful.

element 1:

4 0 obj
<< /Type /Pages /Kids 5 0 R /Count 1 >>
stream
.....binary data 1....
endstream
endobj

element 2:

5 0 obj
<< /Type /Pages /Kids 5 0 R /Count 1 >>
stream
.....binary data 2....
endstream
endobj

Solution

It is sufficient to find any predicate P such that P(a,a) == false; (P(a,b) && P(b,a)) == false, i.e. the two never hold together; P(a,b) && P(b,c) implies P(a,c); and !P(a,b) && !P(b,a) implies a == b. Less-than satisfies these properties, as does greater-than, but they are far from the only possibilities.
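To make the four conditions concrete, here is a minimal sketch that brute-force checks them over a small sample. The helper name `is_strict_total_order` is an illustrative assumption, not part of the original answer, and passing the check only shows that the predicate behaves correctly on the sample supplied, not in general.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: tests the four conditions on every pair and
// triple drawn from `sample`. Returns false on the first violation.
template <typename T, typename Pred>
bool is_strict_total_order(const std::vector<T>& sample, Pred P) {
    for (const T& a : sample) {
        if (P(a, a)) return false;  // P(a,a) must be false
        for (const T& b : sample) {
            // P(a,b) and P(b,a) must never hold together
            if (P(a, b) && P(b, a)) return false;
            // if neither holds, the elements must be equal
            if (!P(a, b) && !P(b, a) && !(a == b)) return false;
            for (const T& c : sample) {
                // transitivity: P(a,b) and P(b,c) imply P(a,c)
                if (P(a, b) && P(b, c) && !P(a, c)) return false;
            }
        }
    }
    return true;
}
```

With a sample such as {"E1", "E2", "E3", "E2"}, both `std::less<std::string>` and `std::greater<std::string>` pass, while a predicate that compares only string lengths fails the last condition, because distinct strings of the same length are neither before nor after each other.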

You can now sort your collection by predicate P, and all elements which are equal will be adjacent. In your case, define P(E1,E2)=true, P(E2,E3)=true, etc.
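The sort-then-scan idea can be sketched as follows. This is only a sketch under stated assumptions: the `Blob` alias, the function `group_duplicates`, and the choice of byte-wise lexicographic comparison as the predicate P are all illustrative and do not come from the original answer. The lexicographic choice relies on the question's statement that two elements are identical exactly when their binary data are equal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed element type: the raw bytes of one stream.
using Blob = std::vector<std::uint8_t>;

// Assumed predicate P: byte-wise lexicographic "comes before".
inline bool P(const Blob& a, const Blob& b) {
    return std::lexicographical_compare(a.begin(), a.end(),
                                        b.begin(), b.end());
}

// Sorts a copy of the input by P, then collects every run of
// adjacent equal elements that has at least two members.
inline std::vector<std::vector<Blob>> group_duplicates(std::vector<Blob> items) {
    std::sort(items.begin(), items.end(), P);
    std::vector<std::vector<Blob>> groups;
    auto first = items.begin();
    while (first != items.end()) {
        // The end of the current run is found with operator== only.
        auto last = std::find_if(first, items.end(),
                                 [&](const Blob& x) { return !(x == *first); });
        if (last - first > 1) groups.emplace_back(first, last);
        first = last;
    }
    return groups;
}
```

Sorting costs O(n log n) predicate calls and the scan adds at most n equality checks. Applied to the question's example {E1,E2,E3,E4,E4,E2,E6,E4,E3}, with each element given distinct bytes, this yields three groups corresponding to E2, E3 and E4.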
