算法来查找列表的重复次数可含有任何数目的重复 [英] Algorithm to find a repeated number in a list that may contain any number of repeats

查看:98
本文介绍了算法来查找列表的重复次数可含有任何数目的重复的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

关闭它作为一个重复之前,请仔细阅读这个问题,但如果它是一个诚实的重复,我会很乐意去了解它。这是<一个一般化href="http://stackoverflow.com/questions/6420467/find-any-of-multiple-repeated-integers-in-a-list">Find在一个列表多个可能重复的整数中的任何一个。

Please read this question carefully before closing it as a duplicate, though if it is an honest duplicate I'll be happy to know about it. This is a generalization of Find any one of multiple possible repeated integers in a list.

由于任何一组的取值   整数,以及任何阵列的 A   长度 N + 1 每个条目   从拍摄的取值,什么是最好的   算法找到一些(必须有   至少一个)的重复录入的 A

Given any set S of N distinct integers, and any array A of length N+1 each entry of which is taken from S, what is the best algorithm to find some (there must be at least one) repeated entry of A?

注意:的有可能是在多个重复条目的 A ,然后进入任何可以重复多次,

NOTE: There could be multiple repeated entries in A, and any entry could be repeated multiple times.

由于尼莫所指出的,琐碎的解决方案采用 O(1)空间和 O(N ^ 2)的时间。我正在寻找一个解决方案,提高了时间不影响空间太多了。为了precise,溶液(S)我在寻找:

As Nemo points out, the trivial solution takes O(1) space and O(N^2) time. I'm looking for a solution that improves the time without compromising the space too much. To be precise, the solution(s) I'm looking for:

  • 返回一个值 出现在 A 至少两次,
  • 在最多使用的 O(日志N)空间的没有的修改 A
  • 在不到 O(N ^ 2)时间
  • Returns a value a that appears in A at least twice,
  • Uses at most O(log N) space without modifying A, and
  • Takes less than O(N^2) time

编辑:的集合的取值是有保证的阵列的 A 至少有一个重复的条目。对于这个问题,不要以为你的取值给你们的有序集合。您可以查询的取值(布尔返回 s在s ),还可以查询的 A (请致电 A [1] ),但仅此而已。任何解决方案进行排序的 A 取值超出了空间的限制。

The set S is there to ensure that the array A has at least one repeated entry. For this problem, do not assume that you have S given to you as an ordered set. You can query S (boolean to return true is s in S and false otherwise) and you can query A (call for A[i]), but that's all. Any solution that sorts A or S exceeds the space limit.

这概括无效我的<一个href="http://stackoverflow.com/questions/6420467/find-any-of-multiple-repeated-integers-in-a-list/6420503#6420503">pointer解决方案原来的问题(其中有 O(1)空间和 O(N)的时间),和空间的限制,我堂堂无效<一href="http://stackoverflow.com/questions/6420467/find-any-of-multiple-repeated-integers-in-a-list/6420498#6420498">fiver's解决方案(其中有 O(N)空间和时间)。

This generalization invalidates my pointer solution to the original question (which has O(1) space and O(N) time), and the space constraint I'm imposing invalidates fiver's solution (which has O(N) space and time).

推荐答案

这个算法类似于贾斯汀西蒙的,但关键是如何计算的S只用O(1)空间中位数(或第k个元素)有效的。

This algorithm is similar to Justin Simon's, but the key point is how to compute the median (or the kth element) of S using just O(1) space efficiently.

下面是密钥算法,这是随机化:

Here is that key algorithm, which is randomized:

设置下等于S的最小单元和上等于S的最大元素选择一个随机元素x选自S也就是说之间下限和上限(此加至多为O(n)预期时间)。计算x的等级(O(n)的时间)。如果x的等级太低,设置较低的x的继任者(O(n)的时间),否则设置上限等于x的predecessor(O(n)的时间)。重复,直到下等于上。

Set lower equal to the minimum element of S and upper equal to the maximum element of S. Pick a random element x from S that is between lower and upper (this costs at most O(n) expected time). Compute the rank of x (O(n) time). If x's rank is too low, set lower to the successor of x (O(n) time), else set upper equal to the predecessor of x (O(n) time). Repeat until lower equals upper.

请注意,每个迭代成本为O(n)的预期,有O(LG N)次迭代的期望,以便在预期的时间成本为O(n LG电子n)和空间使用情况为O(1),因为我们只保存下和上。

Note that each iteration costs O(n) in expectation and there are O(lg n) iterations in expectation so the expected time cost is O(n lg n) and space usage is O(1) since we only store lower and upper.

利用这种能力来选择第k个元素,我们就可以使用在<建议的鸽巢原理href="http://stackoverflow.com/questions/6420467/find-any-one-of-multiple-possible-repeated-integers-in-a-list">the原来的问题找到的S越来越小的片断,包含了太多的元素都是不同的,使用O(LG N)A型和O(1)空间的线性扫描存储在每个区域元素的相关款项。每个这样的迭代成本为O(n)除为O(n LG N)找到第k个元素的成本,并有O(LG N)次迭代,所以总成本为O(n LG ^ 2 N)。

Using this ability to select the kth element, we can then use the pigeonhole principle as suggested in the original question to find increasingly small segments of S that contain too many elements to all be distinct, using O(lg n) linear scans of A and O(1) space to store the relevant sums of elements in each region. Each such iteration costs O(n) in addition to the O(n lg n) cost of finding the kth element, and there are O(lg n) iterations, so the total cost is O(n lg^2 n).

这篇关于算法来查找列表的重复次数可含有任何数目的重复的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆