Finding kth smallest element in union of 2 sorted arrays


Question

I think this question has been asked many times, but there still isn't any clear solution! Anyway, this is what I found as a good answer in O(k) (possibly O(log m + log n) too). But I don't understand the part where, if M_B > M_A (or the other way round), we should be throwing away the elements after M_B. Here it's the reverse: it throws away the elements which are before M_B. Can anyone please explain why?

http://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15451-s01/recitations/rec03/rec03.ps

My other question is about doing k/2 ... we should be doing it, but it isn't obvious to me.

Example
A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
k= 6

Answer is 9 (A[1])
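As a baseline for checking the example, here is a minimal sketch of the plain O(k) approach: merge the two arrays lazily and stop at the kth item. The function name `kth_smallest_by_merge` is made up for illustration, and k is 1-indexed here to match the example above.

```python
from heapq import merge
from itertools import islice


def kth_smallest_by_merge(a, b, k):
    """Return the k-th smallest (1-indexed) element of two sorted lists."""
    # merge() yields the union in sorted order without building it fully;
    # islice() skips the first k - 1 items and next() takes the k-th.
    return next(islice(merge(a, b), k - 1, None))


A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
print(kth_smallest_by_merge(A, B, 6))  # 9
```

This does O(k) comparisons, so it is only a reference point for the logarithmic approach being asked about.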

Here is what I think: if I want to solve it in O(log k), I need to throw away k/2 elements each time. Base case: if k < 2, return the 2nd smallest element from A[0], A[1], B[0], B[1]. Otherwise, compare A[k/2] and B[k/2]. If A[k/2] < B[k/2], then the kth smallest element will be in A[1 ... n] and B[1 ... k/2] ... okay, here I threw away k/2 elements (I can do something similar for A[k/2] > B[k/2]). So now the question is: next time, is the index still k, or is it k/2?

Is what I'm doing right?

Recommended answer

That algorithm isn't bad (in my opinion it's better than the one which is usually referenced here on SO, because it's a lot simpler), but it has one huge flaw: it requires that both vectors have at least k elements. (The problem says that they both have the same number of elements, n, but never specifies that n ≥ k; the function doesn't even let you tell it how big the vectors are. However, that's easily solved. I'll leave it as an exercise for now. In general, we'd need an algorithm like this to work on differently-sized arrays, and it does; we just need to be clear on the preconditions.)

The use of floor and ceil is nice and specific, but maybe confusing. Let's just look at this in the most general way. Also, the solution quoted seems to assume that arrays are 1-indexed (i.e. A[1] is the first element, not A[0]). The description I'm about to write, however, uses a more C-like pseudocode, so it assumes that A[0] is the first element. Consequently, I'm going to write it to find element k in the combined set, which is the (k+1)th element. And finally, the solution I'm about to describe differs subtly from the solution presented, which will be apparent in the end condition. IMHO, it's slightly better.

OK, if x is element k in a sequence, there are exactly k elements in the sequence smaller than x. (We won't deal with the case where there are repeated elements, but it's not much different. See note 3.)

Suppose that we know that A and B each have an element k. (Remember, this means they each have at least k + 1 elements.) Select any non-negative integer less than k; we'll call it i. And let j be k - i - 1 (so that i + j == k - 1). [See note 1, below.] Now, look at elements A[i] and B[j]. Let's say A[i] is smaller, since we just have to change all the names in the other case. Remember that we're assuming all the elements are different. So here's what we know at this point:

1) There are i elements in A which are < A[i]

2) There are j elements in B which are < B[j]

3) A[i] < B[j]

4) From (2) and (3), we know that:

5) There are at most j elements in B which are < A[i]

6) From (1) and (5), we know that:

7) There are at most i + j elements altogether which are < A[i]

8) But i + j is k - 1, so actually we know:

9) Element k of the merged array must be greater than A[i] (because A[i] is at most element i + j).

Since we know that the answer must be greater than A[i], we can discard A[0] through A[i] (actually, we just increment an array pointer, but effectively we discard them). However, we've now discarded i + 1 elements from the original problem. So out of the new set of elements (in the shortened A and the original B), we need element k - (i + 1), instead of element k.
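A single discard step can be sketched roughly as follows. The helper name `discard_step` is hypothetical, and the sketch slices the lists for readability where a real implementation would just advance an offset. It assumes both lists are sorted, 0 <= i < k, and each list has at least k + 1 elements. Using `<=` rather than `<` is one way to break ties, in line with note 3.

```python
def discard_step(a, b, k, i):
    """Apply one discard step; return the shortened (a, b) and the new k.

    Hypothetical helper. Assumes a and b are sorted, 0 <= i < k, and
    each list has at least k + 1 elements.
    """
    j = k - i - 1  # chosen so that i + j == k - 1
    if a[i] <= b[j]:
        # a[0] through a[i] cannot be element k, so drop those i + 1 items.
        return a[i + 1:], b, k - (i + 1)
    # Symmetric case: drop b[0] through b[j], which is j + 1 items.
    return a, b[j + 1:], k - (j + 1)


A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
# Element 5 (0-indexed) is the 6th smallest; try i = 2, which gives j = 2.
new_a, new_b, new_k = discard_step(A, B, 5, 2)
print(new_b, new_k)  # [7, 18, 22, 27, 33] 2
```

Here B[2] = 5 is smaller than A[2] = 15, so three elements of B are dropped and the remaining problem asks for element 2 of what is left, which is still 9.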

Now, let's check the precondition. We said that both A and B had an element k to start with, so they both have at least k + 1 elements. In the new problem we want to know whether the shortened A and the original B each have at least k - i elements. Clearly B does, because k - i is no greater than k. Also, we removed i + 1 elements from A. Originally it had at least k + 1 elements, so now it has at least k - i elements. So we're OK there.

Finally, let's check the termination condition. At the beginning I said that we choose non-negative integers i and j so that i + j == k - 1. That's not possible if k == 0, but it can be done for k == 1. So we only need to do something special once k reaches 0, in which case what we need to do is return min(A[0], B[0]). [This is a much simpler termination condition than in the algorithm you looked at; see note 2.]

So what's a good strategy for picking i? We'll end up removing either i + 1 or k - i elements from the problem, and we'd like that to be as close to half of the elements as possible. So we should choose i = floor((k - 1) / 2). Although it might not be immediately obvious, that will make j = floor(k / 2).
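Putting the discard step, the termination condition and this choice of i together, the whole procedure might look like the sketch below. The function name `kth_element` and the offset variables are illustrative, not taken from the answer or the linked notes, and the code relies on the stated precondition that both sorted lists have at least k + 1 elements.

```python
def kth_element(a, b, k):
    """Return element k (0-indexed) of the sorted union of a and b.

    Sketch only: assumes a and b are sorted and that each of them has
    at least k + 1 elements, as the description requires.
    """
    lo_a = lo_b = 0  # offsets that play the role of the array pointers
    while k > 0:
        i = (k - 1) // 2
        j = k - 1 - i  # always equals k // 2
        if a[lo_a + i] <= b[lo_b + j]:
            lo_a += i + 1  # discard a[lo_a] through a[lo_a + i]
            k -= i + 1
        else:
            lo_b += j + 1  # discard b[lo_b] through b[lo_b + j]
            k -= j + 1
    # Termination: element 0 of what remains is the smaller of the two heads.
    return min(a[lo_a], b[lo_b])


A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
print(kth_element(A, B, 5))  # 9, the 6th smallest
```

Each pass removes roughly half of the remaining k from the problem, so the loop runs O(log k) times.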

I'm leaving out the bit where I solve the case where A and B have fewer elements than the precondition requires. It's not complicated; I'd encourage you to think about it yourself.
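One possible way to handle shorter arrays, which is not necessarily what the answer had in mind, is to treat every index past the end of a list as positive infinity, so that the precondition holds for the padded lists. The sketch below assumes numeric elements (so they compare against `math.inf`) and 0 <= k < len(a) + len(b); the name `kth_element_any_length` and the helper `at` are made up for illustration.

```python
import math


def kth_element_any_length(a, b, k):
    """Return element k (0-indexed) of the sorted union of a and b.

    Sketch under assumptions: a and b are sorted lists of numbers of any
    lengths, and 0 <= k < len(a) + len(b).
    """
    if not 0 <= k < len(a) + len(b):
        raise IndexError("k is out of range for the combined lists")

    def at(seq, idx):
        # Positions past the end behave like +infinity padding.
        return seq[idx] if idx < len(seq) else math.inf

    lo_a = lo_b = 0
    while k > 0:
        i = (k - 1) // 2
        j = k - 1 - i
        if at(a, lo_a + i) <= at(b, lo_b + j):
            lo_a += i + 1
            k -= i + 1
        else:
            lo_b += j + 1
            k -= j + 1
    return min(at(a, lo_a), at(b, lo_b))


print(kth_element_any_length([], [1, 2, 3], 2))      # 3
print(kth_element_any_length([10], [1, 2, 3, 4], 4)) # 10
```

Because k stays smaller than the number of real elements remaining, the two probed positions are never both padding, so padding is never discarded in place of a real element.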

[1] The algorithm you were looking at selects i + j == k (if k is even), and drops either i or j elements. Mine selects i + j == k - 1 (always), which might make one of them smaller, but then it drops i + 1 or j + 1 elements. So it should converge slightly more rapidly.

[2] The difference between selecting i + j == k (theirs) and i + j == k - 1 (mine) is apparent in the end condition. In their formulation, both i and j must be positive, because if one of them were 0, there is a risk of dropping 0 elements, which would be an infinite recursive loop. So in their formulation, the minimum possible value of k is 2, not 1, and so their termination case has to handle k == 1, which involves comparing between four elements, rather than two. For what it's worth, I believe the best solution of "find the second smallest element out of two sorted vectors" is: min(max(A[0], B[0]), min(A[1], B[1])), which requires three comparisons. This doesn't make their algorithm slower; just more complicated.
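The three-comparison formula from note 2 is easy to try out directly. The wrapper name `second_smallest` below is made up for illustration, and it assumes both lists are sorted and have at least two elements each.

```python
def second_smallest(a, b):
    """Second smallest element of two sorted lists with >= 2 items each."""
    # Three comparisons in total: one max and two mins.
    return min(max(a[0], b[0]), min(a[1], b[1]))


A = [2, 9, 15, 22, 24, 25, 26, 30]
B = [1, 4, 5, 7, 18, 22, 27, 33]
print(second_smallest(A, B))  # 2
```

The larger of the two heads is always a candidate; the only other candidates are the second elements, and the smaller of those covers the case where both of the two smallest values come from the same list.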

[3] Suppose elements could repeat. Actually this doesn't change anything. The algorithm still works. Why? Well, we could pretend that every element in A was actually a pair of its actual value and its actual index, and similarly for every element in B, and that we use the index as a tie breaker when comparing values within a vector. Between vectors, we give preference to all the elements in A if A[i] ≤ B[j]; otherwise to all the elements in B. This doesn't change the actual code at all, because we never have to do any comparison differently, but it makes all the inequalities in the proof valid.
