找到包含所有元素的最短子数组 [英] Find shortest subarray containing all elements

查看:70
本文介绍了找到包含所有元素的最短子数组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设您有一组数字和另一组数字.您必须以最小的复杂度找到包含所有数字的最短子数组.

Suppose you have an array of numbers, and another set of numbers. You have to find the shortest subarray containing all numbers with minimal complexity.

数组可以有重复项,我们假设这组数字没有.它不是有序的 - 子数组可以包含任何顺序的数字集.

The array can have duplicates, and let's assume the set of numbers does not. It's not ordered - the subarray may contain the set of number in any order.

例如:

Array: 1 2 5 8 7 6 2 6 5 3 8 5
Numbers: 5 7

那么最短的子数组显然是Array[2:5](python记法).

Then the shortest subarray is obviously Array[2:5] (python notation).

另外,如果您出于某种原因(一种在线算法)想避免对数组进行排序,您会怎么做?

Also, what would you do if you want to avoid sorting the array for some reason (a la online algorithms)?

推荐答案

线性时间解决方案的证明

我将写 right-extension 表示将范围的右端点增加 1,而 left-contraction 表示将范围的左端点增加 1. 这个答案是 Aasmund Eldhuset 的答案 的轻微变化.这里的区别在于,一旦我们找到最小的 j 使得 [0, j] 包含所有有趣的数字,我们此后只考虑包含所有有趣数字的范围.(可以这样解释 Aasmund 的答案,但也可以将其解释为允许由于左收缩而丢失单个有趣的数字——该算法的正确性尚未确定.)

Proof of a linear-time solution

I will write right-extension to mean increasing the right endpoint of a range by 1, and left-contraction to mean increasing the left endpoint of a range by 1. This answer is a slight variation of Aasmund Eldhuset's answer. The difference here is that once we find the smallest j such that [0, j] contains all interesting numbers, we thereafter consider only ranges that contain all interesting numbers. (It's possible to interpret Aasmund's answer this way, but it's also possible to interpret it as allowing a single interesting number to be lost due to a left-contraction -- an algorithm whose correctness has yet to be established.)

基本思想是对于每个位置 j,我们将找到结束于位置 j 的最短满足范围,假设我们知道结束于位置 j-1 的最短满足范围.

The basic idea is that for each position j, we will find the shortest satisfying range ending at position j, given that we know the shortest satisfying range ending at position j-1.

修复了基本情况下的一个小故障.

Fixed a glitch in the base case.

基本情况:找到最小的 j',使得 [0, j'] 包含所有有趣的数字.通过构造,可以没有范围 [0, k <;j'] 包含所有有趣的数字,因此我们无需进一步担心它们.现在找到 smallestlargest i 使得 [i, j'] 包含所有有趣的数字(即保持 j' 固定).这是在位置 j' 处结束的最小满足范围.

Base case: Find the smallest j' such that [0, j'] contains all interesting numbers. By construction, there can be no ranges [0, k < j'] that contain all interesting numbers so we don't need to worry about them further. Now find the smallestlargest i such that [i, j'] contains all interesting numbers (i.e. hold j' fixed). This is the smallest satisfying range ending at position j'.

为了找到终止于任意位置 j 的最小满足范围,我们可以将终止于位置 j-1 的最小满足范围右扩展 1 个位置.这个范围也必然包含所有有趣的数字,尽管它可能不是最小长度.我们已经知道这是一个令人满意的范围的事实意味着我们不必担心将范围向后"扩展到左侧,因为这只能在其最小长度上增加范围(即使解决方案更糟.我们需要考虑的唯一操作是保留包含所有有趣数字的属性的左收缩.因此,在此属性保持的情况下,范围的左端点应尽可能前移.当没有更多的左收缩可以执行时,我们的最小长度满足范围以 j 结束(因为进一步的左收缩显然不能使范围再次满足),我们就完成了.

To find the smallest satisfying range ending at any arbitrary position j, we can right-extend the smallest satisfying range ending at position j-1 by 1 position. This range will necessarily also contain all interesting numbers, though it may not be minimal-length. The fact that we already know this is a satisfying range means that we don't have to worry about extending the range "backwards" to the left, since that can only increase the range over its minimal length (i.e. make the solution worse). The only operations we need to consider are left-contractions that preserve the property of containing all interesting numbers. So the left endpoint of the range should be advanced as far as possible while this property holds. When no more left-contractions can be performed, we have the minimal-length satisfying range ending at j (since further left-contractions clearly cannot make the range satisfying again) and we are done.

由于我们对每个最右边的位置 j 执行此操作,因此我们可以在所有最右边的位置上取最小长度范围来找到整体最小值.这可以使用嵌套循环来完成,其中 j 在每个外循环循环中前进.显然 j 前进了 1 n 次.由于在任何时间点我们只需要 j 的前一个值的最佳范围的最左边位置,我们可以将其存储在 i 中,并随时更新它.i 从 0 开始,总是 <= j <= n,并且只向上推进 1,这意味着它最多可以推进 n 次.i 和 j 最多前进 n 次,这意味着该算法是线性时间的.

Since we perform this for each rightmost position j, we can take the minimum-length range over all rightmost positions to find the overall minimum. This can be done using a nested loop in which j advances on each outer loop cycle. Clearly j advances by 1 n times. Since at any point in time we only ever need the leftmost position of the best range for the previous value of j, we can store this in i and just update it as we go. i starts at 0, is at all times <= j <= n, and only ever advances upwards by 1, meaning it can advance at most n times. Both i and j advance at most n times, meaning that the algorithm is linear-time.

在以下伪代码中,我将两个阶段合并为一个循环.如果我们已经达到拥有所有有趣数字的阶段,我们只会尝试收缩左侧:

In the following pseudo-code, I've combined both phases into a single loop. We only try to contract the left side if we have reached the stage of having all interesting numbers:

# x[0..m-1] is the array of interesting numbers.
# Load them into a hash/dictionary:
For i from 0 to m-1:
    isInteresting[x[i]] = 1

i = 0
nDistinctInteresting = 0
minRange = infinity
For j from 0 to n-1:
    If count[a[j]] == 0 and isInteresting[a[j]]:
        nDistinctInteresting++
    count[a[j]]++

    If nDistinctInteresting == m:
        # We are in phase 2: contract the left side as far as possible
        While count[a[i]] > 1 or not isInteresting[a[i]]:
            count[a[i]]--
            i++

        If j - i < minRange:
            (minI, minJ) = (i, j)

count[]isInteresting[] 是散列/字典(如果涉及的数字很小,则为普通数组).

count[] and isInteresting[] are hashes/dictionaries (or plain arrays if the numbers involved are small).

这篇关于找到包含所有元素的最短子数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆