Algorithm to determine if array contains n...n+m?


Question

I saw this question on Reddit, where no satisfactory solution was presented, and I thought it would be a perfect question to ask here. This was in a thread about interview questions:

Write a method that takes an int array of size m, and returns (True/False) if the array consists of the numbers n...n+m-1, all numbers in that range and only numbers in that range. The array is not guaranteed to be sorted. (For instance, {2,3,4} would return true. {1,3,1} would return false, {1,2,4} would return false.)

The problem I had with this one is that my interviewer kept asking me to optimize (faster than O(n log n), less memory, etc.), to the point where he claimed you could do it in one pass of the array using a constant amount of memory. I never figured that one out.

Along with your solutions, please indicate whether they assume that the array contains unique items. Also indicate whether your solution assumes the sequence starts at 1. (I've modified the question slightly to allow cases where it goes 2, 3, 4...)

Edit: I am now of the opinion that there does not exist a linear-time, constant-space algorithm that handles duplicates. Can anyone verify this?

The duplicate problem boils down to testing whether the array contains duplicates in O(n) time and O(1) space. If this can be done, you can simply test first and, if there are no duplicates, run the algorithms posted. So can you test for dupes in O(n) time and O(1) space?

Recommended answer

Under the assumption that numbers less than one are not allowed and there are no duplicates, there is a simple summation identity for this: the sum of the numbers from 1 to m in increments of 1 is (m * (m + 1)) / 2. You can then sum the array and compare the result against this identity.
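As a rough illustration of the identity above, the sketch below compares the array's sum with m * (m + 1) / 2. The function name `sum_matches_1_to_m` is made up for this example, and the check is only meaningful under the stated assumptions (no value below 1, no duplicates); without them it can give false positives.

```python
def sum_matches_1_to_m(values):
    """Return True if sum(values) equals 1 + 2 + ... + m, with m = len(values).

    This is only a sufficient test when the caller already knows that the
    values are distinct integers >= 1 (the assumptions stated in the answer).
    """
    m = len(values)
    return sum(values) == m * (m + 1) // 2


print(sum_matches_1_to_m([3, 1, 2]))  # True: a permutation of 1..3
print(sum_matches_1_to_m([1, 2, 4]))  # False: the sum is 7, not 6
print(sum_matches_1_to_m([2, 2, 2]))  # True, although wrong: duplicates break the assumption
```

The last call shows why the no-duplicates assumption matters: {2, 2, 2} has the same sum as {1, 2, 3}.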

Given the guarantees above, plus the guarantee that no number is above m or less than n (which can be checked in O(N)), you can find out whether there is a dupe.

The idea in pseudo-code:
0) Start at N = 0
1) Take the N-th element in the list.
2) If it is not where it would be if the list had been sorted, check where it should be.
3) If the place where it should be already has the same number, you have a dupe - RETURN TRUE
4) Otherwise, swap the numbers (to put the first number in the right place).
5) With the number you just swapped with, is it in the right place?
6) If no, go back to step 2.
7) Otherwise, start at step 1 with N = N + 1. If this would be past the end of the list, you have no dupes.

And, yes, that runs in O(N), although it may look like O(N ^ 2).
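The pseudo-code steps above might look like the following in Python. This is a minimal sketch rather than the answerer's own code: the name `has_duplicate` and the explicit parameter `n` (the smallest expected value) are assumptions of this example, and the caller is assumed to have already verified that every value lies in n..n+len(values)-1. The list is reordered in place.

```python
def has_duplicate(values, n):
    """Detect a duplicate by swapping each element toward its sorted position.

    Precondition (not checked here): every value is in n .. n + len(values) - 1.
    The list is modified in place.
    """
    i = 0
    while i < len(values):
        target = values[i] - n  # index this value would occupy if sorted
        if target == i:
            i += 1  # already in place, move on to the next position
        elif values[target] == values[i]:
            return True  # the target slot already holds the same number
        else:
            # Put values[i] where it belongs and re-examine what came back.
            values[i], values[target] = values[target], values[i]
    return False


print(has_duplicate([3, 1, 2], 1))  # False
print(has_duplicate([1, 3, 1], 1))  # True
```

Every swap places one element in its final position, and an element that is already in place is never moved again, so there are at most N - 1 swaps in total. That is why the nested-looking loop stays linear.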

This solution works under the assumption that you can modify the array; it then uses an in-place Radix sort (which achieves O(N) speed).

Other mathy solutions have been put forth, but I'm not sure any of them have been proved. There are a bunch of sums that might be useful, but most of them run into a blowup in the number of bits required to represent the sum, which will violate the constant extra space guarantee. I also don't know if any of them are capable of producing a distinct number for a given set of numbers. I think a sum of squares might work, which has a known formula to compute it (see Wolfram's)

So, it has been mentioned to maybe use sum + sum of squares. No one knew if this worked or not, and I realized that the plain sum only becomes an issue when (x + y) = (n + m), such as the fact that 2 + 2 = 1 + 3. Squares also have this issue thanks to Pythagorean triples (so 3^2 + 4^2 + 25^2 == 5^2 + 7^2 + 24^2, and the sum of squares doesn't work). If we use Fermat's last theorem, we know this can't happen for n^3. But we also don't know if there is no x + y + z = n for this (unless we do and I don't know it). So there is no guarantee that this doesn't break too, and if we continue down this path we quickly run out of bits.

In my glee, however, I forgot to note that you can break the sum of squares, but in doing so you create a normal sum that isn't valid. I don't think you can break both at once, but, as has been noted, we don't have a proof either way.

I must say, finding counterexamples is sometimes a lot easier than proving things! Consider the following sequences, all of which have a sum of 28 and a sum of squares of 140:

[1, 2, 3, 4, 5, 6, 7]
[1, 1, 4, 5, 5, 6, 6] 
[2, 2, 3, 3, 4, 7, 7]

I could not find any such examples of length 6 or less. If you want an example that has the proper min and max values too, try this one of length 8:

[1, 3, 3, 4, 4, 5, 8, 8]
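The counterexamples above are easy to check mechanically. The short script below recomputes the sum and the sum of squares for each sequence; the helper name `sum_and_sum_of_squares` is made up for this illustration.

```python
def sum_and_sum_of_squares(values):
    """Return the pair (sum of values, sum of squared values)."""
    return sum(values), sum(v * v for v in values)


length_7 = [
    [1, 2, 3, 4, 5, 6, 7],
    [1, 1, 4, 5, 5, 6, 6],
    [2, 2, 3, 3, 4, 7, 7],
]
for seq in length_7:
    print(seq, sum_and_sum_of_squares(seq))  # each line ends with (28, 140)

# Length-8 example whose min and max also match those of 1..8.
print(sum_and_sum_of_squares(list(range(1, 9))))         # (36, 204)
print(sum_and_sum_of_squares([1, 3, 3, 4, 4, 5, 8, 8]))  # (36, 204)
```

So a check based on sum, sum of squares, minimum, and maximum together still accepts an array containing duplicates.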

----------

Simpler approach (modifying hazzen's idea):

An integer array of length m contains all the numbers from n to n+m-1 exactly once iff

  • every array element is between n and n+m-1
  • there are no duplicates

(Reason: there are only m values in the given integer range, so if the array contains m unique values in this range, it must contain every one of them once.)
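The two conditions translate directly into a reference check if extra memory is not a concern. The sketch below uses a `set`, so it needs O(m) additional space and is not the constant-space solution the interviewer asked for; it is useful mainly as a baseline to test other approaches against. The name `is_consecutive_range`, taking n as the array's minimum, and treating an empty array as valid are all assumptions of this example.

```python
def is_consecutive_range(values):
    """Check that values is a permutation of n .. n+m-1, where m = len(values).

    n is taken to be min(values), since the problem does not supply it.
    Uses O(m) extra space for the set, so this is a baseline only.
    """
    if not values:
        return True  # arbitrary convention for the empty array
    n = min(values)
    m = len(values)
    every_value_in_range = all(n <= v < n + m for v in values)
    no_duplicates = len(set(values)) == m
    return every_value_in_range and no_duplicates


print(is_consecutive_range([2, 3, 4]))  # True
print(is_consecutive_range([1, 3, 1]))  # False
print(is_consecutive_range([1, 2, 4]))  # False
```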

If you are allowed to modify the array, you can check both in one pass through the list with a modified version of hazzen's algorithm idea (there is no need to do any summation):

  • For all array indexes i from 0 to m-1 do
      1. If array[i] < n or array[i] >= n+m => RETURN FALSE ("value out of range found")
      2. Calculate j = array[i] - n (this is the 0-based position of array[i] in a sorted array with values from n to n+m-1)
      3. While j is not equal to i
          1. If array[i] is equal to array[j] => RETURN FALSE ("duplicate found")
          2. Swap array[i] with array[j]
          3. If array[i] < n or array[i] >= n+m => RETURN FALSE ("value out of range found")
          4. Recalculate j = array[i] - n
  • RETURN TRUE

I'm not sure if the modification of the original array counts against the maximum allowed additional space of O(1), but if it doesn't, this should be the solution the original poster wanted.
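One possible Python rendering of this in-place check follows. It is a sketch under a few assumptions: the function name `is_range_in_place` is invented here, n is passed in by the caller, and the range test is repeated after each swap so that j can never index outside the array when an unchecked value is swapped into position i.

```python
def is_range_in_place(array, n):
    """Return True if array is a permutation of n .. n+m-1, with m = len(array).

    Reorders array in place; uses O(1) extra space besides the array itself.
    """
    m = len(array)
    for i in range(m):
        if not (n <= array[i] < n + m):
            return False  # value out of range found
        j = array[i] - n  # 0-based position of array[i] in sorted order
        while j != i:
            if array[i] == array[j]:
                return False  # duplicate found
            array[i], array[j] = array[j], array[i]
            # The value just swapped into position i has not been checked yet.
            if not (n <= array[i] < n + m):
                return False  # value out of range found
            j = array[i] - n
    return True


print(is_range_in_place([2, 3, 4], 2))  # True
print(is_range_in_place([1, 3, 1], 1))  # False
print(is_range_in_place([1, 2, 4], 1))  # False
```

If n is not known in advance, `min(array)` would supply it at the cost of one extra pass over the array.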
